Threads grow indefinitely #460
Which version of the Agent/JMXFetch are you running?
Helm chart version 3.30.10. For JMXFetch, we just saw the release 0.47.9 - not sure if the thread leak mentioned there is the issue we encountered?
Closing as most likely fixed by #432.
We run DataDog using a Helm chart in k8s. We recently encountered a situation where a DataDog process grew to tens of thousands of threads, which caused crashes for all other JVM processes running on the same node. Given that we use DataDog to monitor a lot of nodes, this caused a lot of crashes.
During the investigation, we used the resource metrics from the process dashboard and saw that a number of processes had unbounded growth in thread counts. Here is one example:
We took the PID associated with the process above, checked it on the host, and saw it was jmxfetch:
The DataDog agent has this configuration for some Debezium monitoring:
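A minimal sketch of this kind of JMX check, as supplied through the Datadog Helm chart, looks roughly like the following; the check file name, host, and port here are illustrative assumptions, not our actual values:

```yaml
# Illustrative sketch only: a generic JMX check passed to the agent via the
# Datadog Helm chart's extra check configuration (datadog.confd).
# The file name, host, and port are assumptions, not the values from our setup.
datadog:
  confd:
    debezium.yaml: |-
      init_config:
        is_jmx: true                    # hand this check off to JMXFetch
        collect_default_metrics: true
      instances:
        - host: debezium-connect        # assumed Debezium Connect service name
          port: 1099                    # assumed JMX remote port
```

A check declared this way stays active for as long as it is present in the chart values, independent of whether the target service still exists.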
A week before our production systems were affected, we decommissioned the Debezium setup but did not remove the DataDog monitoring. We think there might be an edge case in jmxfetch when a monitored service is removed after initially existing, which causes the growth in threads. After restarting all of the agents, we have not seen the same issue.
It might be similar in nature to the issue reported here.