[trace-agent] adding output of 'trace-agent -info' #3164

Merged
talwai merged 3 commits into master from christian/traceagentstatusinfo
Feb 2, 2017

Conversation

Contributor

@ufoot ufoot commented Feb 1, 2017

What does this PR do?

When calling datadog-agent info, this patch makes a subcall to trace-agent -info to show trace agent (APM) status.

Motivation

This makes it easier to monitor the agent status and provides a quick way to know what's happening.

Testing Guidelines

  • Check that datadog-agent info displays relevant trace-agent information.
  • Check the return code of that command (should be 0 on success, obviously)
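
The two checks above could be scripted roughly as follows. This is only a minimal sketch: `check_info` is a hypothetical helper (not part of the patch) that runs whatever info command it is given, prints the output, and reports the exit code.

```shell
#!/bin/sh
# Hypothetical helper: run an info command, show its output, and report
# whether it exited with 0. The command to run is passed as arguments.
check_info() {
    rc=0
    # Capture stdout and stderr; remember the exit code if the command fails.
    output=$("$@" 2>&1) || rc=$?
    printf '%s\n' "$output"
    if [ "$rc" -eq 0 ]; then
        echo "OK: exit code 0"
    else
        echo "FAIL: exit code $rc"
    fi
    return "$rc"
}
```

On a packaged Linux install this might be called as `check_info /etc/init.d/datadog-agent info`; that path is an assumption here, so adjust it to wherever the init script lives.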

Additional Notes

  • Wondering whether Debian support is enough or if all platforms should be covered...
  • What should happen when the agent is not running: should we report an error (return code 1), or return 0 since APM is not part of the base, core install?

IMPORTANT -> must NOT ship before: DataDog/datadog-trace-agent#214

@ufoot ufoot requested a review from talwai February 1, 2017 16:46
@@ -121,6 +121,10 @@ case $action in
RETURN_VALUE=$(($RETURN_VALUE || $?))
python agent/ddagent.py info
RETURN_VALUE=$(($RETURN_VALUE || $?))
if [ -x ./bin/trace-agent ]; then
Contributor

can skip this file, we don't ship with the source install

@@ -111,6 +112,10 @@ case $1 in
RETURN_VALUE=$(($RETURN_VALUE || $?))
$FORWARDERPATH info
RETURN_VALUE=$(($RETURN_VALUE || $?))
if [ -x $TRACEAGENTPATH ]; then
Contributor

can skip this file, we don't ship with osx

su $AGENTUSER -c "$TRACEAGENTPATH -info"
TRACEAGENT_RETURN=$?
fi
exit $(($COLLECTOR_RETURN+$DOGSTATSD_RETURN+$FORWARDER_RETURN+$TRACEAGENT_RETURN))
Contributor

as mentioned in chat, i'd prefer if we don't include $TRACEAGENT_RETURN in the exit code. we mark it as an optional process, disabled by default. since info is sometimes used as a healthcheck - i think we should consider the agent "healthy" even when trace-agent -info exits non-zero. @olivielpeau any thoughts on this?

Member

The way I see it, the exit code should reflect trace agent's status if it's enabled and discard it/ignore it otherwise.

Contributor Author

Yes @truthbk, that really makes sense. @talwai do we have a way to do that beyond parsing the conf for apm_enable=true & reading the env var DD_APM_ENABLED ?

Contributor

> do we have a way to do that beyond parsing the conf for apm_enable=true & reading the env var DD_APM_ENABLED ?

this is the only way. i agree with @truthbk's logic in principle if this is not too messy to implement
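
For illustration, the "count it only when enabled" logic discussed here might look roughly like the sketch below. It is not what was merged. Both function names are hypothetical, the conf key is assumed to be spelled `apm_enabled` (the thread writes `apm_enable=true`), and letting the env var take precedence over the conf file is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when APM looks enabled, 1 otherwise.
# Assumptions: DD_APM_ENABLED overrides the conf file, and the conf file
# contains a line such as "apm_enabled: true" (key spelling assumed).
apm_is_enabled() {
    conf_file=$1
    case "${DD_APM_ENABLED:-}" in
        true|yes|1) return 0 ;;
        false|no|0) return 1 ;;
    esac
    [ -r "$conf_file" ] || return 1
    grep -Eiq '^[[:space:]]*apm_enabled[[:space:]]*[:=][[:space:]]*(true|yes|1)[[:space:]]*$' "$conf_file"
}

# Hypothetical helper: print the combined exit code.
# $1 = core agent result, $2 = trace agent result, $3 = conf file path.
# The trace agent result is counted only when APM is enabled.
info_exit_code() {
    if apm_is_enabled "$3"; then
        echo $(( $1 + $2 ))
    else
        echo "$1"
    fi
}
```

In an init script, the last line of the `info` case could then become something like `exit "$(info_exit_code "$CORE_RETURN" "$TRACEAGENT_RETURN" /etc/dd-agent/datadog.conf)"`, where `CORE_RETURN` and the conf path are placeholders for this example.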

Contributor

talwai commented Feb 1, 2017

- Dropped source & OS/X patches, does not make sense yet
- Still returning 0 if trace-agent fails to get its status.
  Indeed, some might use this script as a health check, and
  we don't want it to fail just because the trace-agent is
  not working, as it's not a base core component (yet).
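
The behaviour described in this commit message (show the trace-agent info, but keep its result out of the exit code) can be sketched as below. This is an illustration under assumptions, not the merged script: `run_info` and its two string parameters are hypothetical, and the commands are passed in as strings so the example runs without a real agent installed.

```shell
#!/bin/sh
# Hypothetical sketch: run the core info command and the trace-agent info
# command, but let only the core command decide the exit code.
# $1 = core info command line, $2 = trace-agent info command line.
run_info() {
    core_cmd=$1
    trace_cmd=$2
    core_rc=0
    sh -c "$core_cmd" || core_rc=$?
    # The trace agent is opt-in: report a failure, but do not propagate it.
    sh -c "$trace_cmd" || echo "trace-agent info failed (ignored)" >&2
    return "$core_rc"
}
```

With this shape, a health check built on the info command keeps passing when only the trace agent is down, which is the trade-off argued for in this thread.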
@ufoot ufoot force-pushed the christian/traceagentstatusinfo branch from f87187b to 9c0c8dc Compare February 1, 2017 17:01
Member

truthbk commented Feb 1, 2017

I'm not convinced the trace agent status should be ignored if it's indeed enabled. I'd like to hear @olivielpeau's input on this one.

Member

@olivielpeau olivielpeau left a comment

Had one comment, other than that this looks good to me 👍

@truthbk: thanks for the ping! Since running the trace-agent is opt-in and is configured in datadog.conf, I think it's a good thing not to use the exit code of the trace info command, for now at least.

@@ -176,6 +177,11 @@ case "$1" in
DOGSTATSD_RETURN=$?
su $AGENTUSER -c "$FORWARDERPATH info"
FORWARDER_RETURN=$?
TRACEAGENT_RETURN=0
if [ -x $TRACEAGENTPATH ];
Member

applies to the 2 other init files as well: I don't think we need to check that the trace agent binary exists and is executable, these init files are only used by the packaged Linux agent and it'll always ship the trace agent going forward, with correct exec permissions. Unless I'm missing something.

@olivielpeau
Member

Also if the plan is to ship this for the 5.11.0 agent release please set the related milestone on the PR :)

@ufoot ufoot added this to the 5.11.0 milestone Feb 2, 2017
@talwai talwai merged commit 90dfc24 into master Feb 2, 2017
talwai pushed a commit that referenced this pull request Feb 2, 2017
* [trace-agent] adding output of 'trace-agent -info' on 'datadog-agent info'

* [trace-agent] added support for CentOS

- Dropped source & OS/X patches, does not make sense yet
- Still returning 0 if trace-agent fails to get its status.
  Indeed, some might use this script as a health check, and
  we don't want it to fail just because the trace-agent is
  not working, as it's not a base core component (yet).

* [trace-agent] not checking for trace-agent binary, should be always shipped
@ufoot ufoot deleted the christian/traceagentstatusinfo branch February 2, 2017 17:31
degemer added a commit that referenced this pull request Feb 7, 2017
* master: (67 commits)
  [network] dont combine connection states (#3158)
  renames function in line with other checks
  [couchbase] Modified service_check_tags in couchbase.py to include user-specified tags. (#3079)
  [docker] fix image tag extraction
  Fix tests, refactor how we collect container and volume states
  test_docker_daemon.py: fix syntax errors
  Beginning work on docker_daemon tests.
  Add 5 opt-in checks to docker_daemon
  add mention of office hours (#3171)
  updates psycopg2 to 2.6.2 (#3170)
  [trace-agent] adding output of 'trace-agent -info' (#3164)
  [riak] Change default value in configuration example to match default value from the code
  move setting parameter to instance level
  riak security support
  [dns_check][ci] Bring back assertions on metrics (#3162)
  [powerdns_recursor] adds support for v4. (#3166)
  [tcp_check] Add custom tags to response_time gauge
  catch can't connect instead of failing on nodata found (#3127)
  [php-fpm] add http_host tag
  [dns_check] Document NXDOMAIN usage in yaml example file
  ...
4 participants