Socket listener only processes first 1 or 2 batches of metrics with errors #12176
Comments
Hi,
Based on the timestamps you have provided: while there do appear to be a number of debug messages from the serializer saying it was unable to serialize a field, those fields are skipped and processing of the other fields does continue. You mentioned #5858; however, PR #5943 implemented those debug messages and ensured that processing of the other fields would continue. Which metrics are no longer showing up? How long did you let this run? What is sending the data to the socket listener? How often is it sending or generating new metrics?
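For context, those serializer messages are debug-level output and only appear when debug logging is enabled in the agent section of telegraf.conf. A minimal sketch of that setting (enabling it here is just for illustration):

[agent]
  ## Emit debug-level log messages, including the serializer notices mentioned above.
  debug = true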
Metrics are being continuously sent from collectd at intervals of 10 seconds. After the initial burst of metrics is processed, the buffer stays empty, but metrics are certainly still being sent.
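For reference, a minimal sketch of the collectd side of such a setup, assuming the stock network plugin is used to ship values to the listener (the hostname, port, and interval are illustrative, not taken from the reporter's environment):

# collectd.conf (illustrative only)
Interval 10

LoadPlugin network
<Plugin network>
  # Send encoded metrics over UDP to the Telegraf socket_listener
  Server "telegraf.example.internal" "25826"
</Plugin>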
Here are some more logs, where InfluxDB was turned off at the beginning of the run and then turned back on after a couple of minutes: https://gist.github.com/7Hazard/1ef922b590592b4029e59255e287088b
This should have no effect on how Telegraf processes metrics from the socket listener. All this would do is increase the number of metrics in the buffer and produce some error messages about the output not being available; that is what the write errors in your logs show.
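As a side note, the size of that in-memory buffer and the write cadence are governed by the agent settings. A minimal sketch, with the stock default values rather than the reporter's actual configuration:

[agent]
  ## Maximum number of metrics held in memory while the output is unreachable;
  ## once the limit is reached, the oldest metrics are dropped.
  metric_buffer_limit = 10000
  ## Maximum number of metrics sent to the output per write.
  metric_batch_size = 1000
  ## How often buffered metrics are flushed to the output.
  flush_interval = "10s"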
It says it in the logs, the
I asked which metrics are no longer showing up, and it is still not clear to me what you think is missing. It would be ideal if you could provide a way to reproduce what you think is the issue.
This appears to have been a networking issue in the Kubernetes cluster I was working in, where the health check for the UDP port was trying TCP instead, which prevented packets from being sent. I figured this out after I deduced that it wasn't an issue on Telegraf's side.
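For anyone hitting something similar: in such a setup the key detail is to declare the listener port as UDP in the cluster configuration so that probes and routing do not assume TCP. A hypothetical sketch of a Kubernetes Service exposing the port (names and port number are assumptions, not taken from the reporter's cluster):

# Illustrative Service definition; the important line is protocol: UDP.
apiVersion: v1
kind: Service
metadata:
  name: telegraf
spec:
  selector:
    app: telegraf
  ports:
    - name: collectd
      protocol: UDP
      port: 25826
      targetPort: 25826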
Thanks for following up!
Relevant telegraf.conf
Logs from Telegraf
https://gist.github.com/7Hazard/c3e7b49b8d2981cb99a6b8a0cc9e3238
System info
Docker Image - telegraf:1.24-alpine
Docker
FROM telegraf:1.24-alpine
COPY telegraf.conf /etc/telegraf/telegraf.conf
COPY collectd-types.db /usr/share/collectd/types.db
Steps to reproduce
These collectd metrics are forwarded from GitHub Enterprise to Telegraf.
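A minimal sketch of a comparable setup for reproduction, assuming a socket_listener input receiving collectd packets over UDP and an InfluxDB output (the port, URL, database name, and types.db path are illustrative assumptions, not the reporter's values):

# Illustrative telegraf.conf fragment, not the reporter's actual configuration.
[[inputs.socket_listener]]
  ## Listen for collectd's binary protocol on UDP.
  service_address = "udp://:25826"
  data_format = "collectd"
  ## Types database copied into the image by the Dockerfile above.
  collectd_typesdb = ["/usr/share/collectd/types.db"]

[[outputs.influxdb]]
  ## Assumed InfluxDB v1 endpoint and database name.
  urls = ["http://influxdb:8086"]
  database = "telegraf"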
Expected behavior
Metrics are continually sent to InfluxDB, not just during the first batches after Telegraf starts.
Actual behavior
The first batches of received metrics are processed with some errors and sent to InfluxDB successfully, but after that no metrics are sent to InfluxDB at all.
Additional info
Might be related to #5858