-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[service_discovery][jmx] trying to pick-up JMX changes with SD. #3010
Conversation
dd70cb1
to
7c96a62
Compare
JMX_GRACE_SECS = 2 | ||
SERVICE_DISCOVERY_PREFIX = 'SD-' | ||
SD_PIPE_NAME = "dd-service_discovery" | ||
SD_PIPE_UNIX_PATH = "/tmp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick note: I think using /opt/datadog-agent/run
is more appropriate and easier to adapt to Windows, unless you have a good reason to use /tmp
. 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Named pipes are only supported on windows in a special path (and I don't even think all versions will support it anyways), so we'll need a specific path for windows anyway.
I used /tmp
to avoid issues with the source installs.
TODO: I could probably add some tests for the |
177649c
to
b9f8d01
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one @truthbk ! This will be a major improvement for Java apps monitoring :)
Left a few comments.
|
||
if not os.path.exists(pipe_name): | ||
os.mkfifo(pipe_name) | ||
self.sd_pipe = os.open(pipe_name, os.O_RDWR) # RW to avoid blocking (will only W) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we should create the fifo if SD is not enabled
@@ -1,4 +1,3 @@ | |||
# (C) Datadog, Inc. 2010-2016 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep this guy around
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how I got rid of that! 😓
try: | ||
yaml = config_to_yaml(check_config) | ||
# generated["{}_{}".format(check_name, idx)] = yaml | ||
generated["{}_{}".format(check_name, 0)] = yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the _0
for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this was actually a product of an original misunderstanding on my part. But then I saw some utility for it (although I have to take another look at this code) - for example simultaneous containers for the same service, but with different tags or whatever, that could have different configurations. On the JMX side we'd use the check name as the key in the map, so this would allow us to support that in the future, that's why we have the commented line on the top. We can discuss and possibly remove.
log.debug("YAML generated: %s", yaml) | ||
except Exception as e: | ||
log.exception("Unable to generate YAML config for %s: %s", check_name, e) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except:
log.exception("Unable to generate YAML config for %s", check_name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
@@ -123,6 +123,9 @@ gce_updated_hostname: yes | |||
# and modify its value. | |||
# sd_template_dir: /datadog/check_configs | |||
# | |||
# JMX Service Disocvery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a better description would be nice, something like Enable service discovery for JMX checks
?
|
||
return yaml_output | ||
|
||
def dump_yaml(path, config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't seem to be used anywhere
@@ -216,9 +248,17 @@ def run(self, config=None): | |||
if self._agentConfig.get('service_discovery'): | |||
self.sd_backend = get_sd_backend(self._agentConfig) | |||
|
|||
# Initialize Supervisor proxy (unix specific) | |||
self.supervisor_proxy = self._get_supervisor_socket(self._agentConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to init that if SD is not enabled?
# Load JMX configs if available | ||
jmx_sd_configs = generate_jmx_configs(self._agentConfig, hostname) | ||
if jmx_sd_configs: | ||
self._submit_jmx_service_discovery(jmx_sd_configs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, this will trigger even without SD, is that expected?
25c5199
to
edb94e0
Compare
LGTM |
[service_discovery] adding named pipe for IPC. [util] if file doesnt exist, create it. [service_discovery][jmxfetch] add some service discovery configurability. [jmxfetch][service_discovery] increase max heap size for SD to 512. [service_discovery][jmxfetch] create jmx service discovery option in config. [service_discovery][jmx] if jmx is not up, try to bring it up - wait for it to boot too. [service_discovery] if the previous index was None, consider that a change (bootup changes to apply). [service_discovery][jmx] cleaning up service discovery/supervisor interactions. [service_discovery][jmx] supervisor_proxy should not be None, cleanup. [service_discovery][jmxfetch] avoid double-reload of checks at start-up. Refactor.
…pt/datadog-agent/run/
63d8ce5
to
aec4d71
Compare
No changes to the logic, just added the test. We're green, so merging! Thanks @hkaj |
* master: (53 commits) [nginx] Update example config [service_discovery] Add a Zookeeper service discovery implementation. [aggregator] if sample rate is bad, fix it but still parse tags. (#3073) [yarn] whitelist authorized application_tags Alex poe/update jmx with refresh beans (#3068) [config] Fix `_is_affirmative` when passed argument is `None` (#3063) Send all configured tags with process checks. (#2976) fix flake8 errors [flare] ignore whitespace before proxy credentials [core] add a switch to disable profiling, but still use developer mode (#2898) [tests] allow tests to use the additional_checksd parameter (#3056) [service_discovery][jmx] trying to pick-up JMX changes with SD. (#3010) [install_script] Make `dd-agent` group of `datadog.conf` (#3036) [postgres] Allow disable postgresql.database_size (#3035) [core] Fixes IndexError for process lookup (#3043) remove warning message leaking password strings (#3053) trap psutil.NoSuchProcess exception (#3052) Fix grammar and casing in exception text (#3050) allow override of kubelet host with KUBERNETES_KUBELET_HOST env var [service discovery] properly handle config reload for removed containers ...
What does this PR do?
Enables service discovery for JMX integrations.
Once a service discovery index config is detected, we parse the config from JSON (as stored in etcd/consul) to YAML and the relayed over to JMXfetch over a named pipe
dd-service_discovery
that lives in/tmp
by default.If multiple changes are detected, these are all submitted to JMXfetch by a single write operation to avoid race conditions. The pipe should take care of locking from the reading and writing ends, and by dropping all relevant configs in a single write operation we can guarantee all configs will be read on the JMXfetch in a single iteration and thus help avoid having multiple versions of the same config in the pipe at any given time.
Motivation
Enabling Service Discovery for JMXfetch integrations.
Testing Guidelines
An overview on testing
is available in our contribution guidelines.
Additional Notes
Related PR: DataDog/jmxfetch#113