When Telegraf is run as a Linux daemon (through either Systemd or SysvInit), it does not stop when told to. It must be killed manually using kill -9. I've included my telegraf.conf below, but this also occurs with the default telegraf.conf. (The longest I've left it is overnight: it was still running ~16 hours after being told to stop, so it doesn't seem to be waiting for the next collect cycle.)
When using Systemd (on Ubuntu), during normal operation it lists two processes in the cgroup, /bin/sh and /usr/bin/telegraf, the first being the parent of the second. When systemctl stop telegraf is executed, the first (/bin/sh) exits but the second does not; it will still be running hours later.
Relevant telegraf.conf:
# Configuration for telegraf agent
[agent]
## Default data collection interval for all inputs
interval = "10m"
## Rounds collection interval to 'interval'
## ie, if interval="10s" then always collect on :00, :10, :20, etc.
round_interval = true
## Telegraf will send metrics to outputs in batches of at
## most metric_batch_size metrics.
metric_batch_size = 1000
## For failed writes, telegraf will cache metric_buffer_limit metrics for each
## output, and will flush this buffer on a successful write. Oldest metrics
## are dropped first when this buffer fills.
metric_buffer_limit = 10000
## Collection jitter is used to jitter the collection by a random amount.
## Each plugin will sleep for a random time within jitter before collecting.
## This can be used to avoid many plugins querying things like sysfs at the
## same time, which can have a measurable effect on the system.
collection_jitter = "0s"
## Default flushing interval for all outputs. You shouldn't set this below
## interval. Maximum flush_interval will be flush_interval + flush_jitter
flush_interval = "10s"
## Jitter the flush interval by a random amount. This is primarily to avoid
## large write spikes for users running a large number of telegraf instances.
## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
flush_jitter = "0s"
## Run telegraf in debug mode
debug = false
## Run telegraf in quiet mode
quiet = false
## Override default hostname, if empty use os.Hostname()
hostname = "<set but redacted>"
## If set to true, do not set the "host" tag in the telegraf agent.
omit_hostname = false
Bug report
System info:
Steps to reproduce:
ps aux | grep telegraf
ps aux | grep telegraf
ps aux | grep telegraf again
Expected behavior:
Telegraf should exit within a few seconds of being stopped by the OS (at the absolute most, by the next time it runs a collection).
Actual behavior:
It continues running indefinitely.
Additional info:
There is nothing abnormal in the logs; they just show telegraf running...