Changing system time while Telegraf is running produces wrong round_interval timestamps #5335
Comments
@danielnelson as you suggested in #4538, setting |
The time is rounded to the nearest multiple of the precision using Time.Round after the metric has been collected, with halfway values rounded up.
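To make the rounding rule concrete, here is a minimal Go sketch of what Time.Round does to a few timestamps at a precision of one second. It is only an illustration of the standard library call, not Telegraf's actual code; the helper name roundStamp and the sample timestamps are made up for the example, and the precision is passed as a duration string on the assumption that it mirrors how the setting is written in telegraf.conf.

```go
package main

import (
	"fmt"
	"time"
)

// roundStamp parses an RFC 3339 timestamp and a duration string such as "1s",
// then rounds the timestamp to the nearest multiple of that duration using
// time.Time.Round (halfway values round up).
// Hypothetical helper for illustration only; it is not part of Telegraf.
func roundStamp(stamp, precision string) (string, error) {
	t, err := time.Parse(time.RFC3339Nano, stamp)
	if err != nil {
		return "", err
	}
	d, err := time.ParseDuration(precision)
	if err != nil {
		return "", err
	}
	return t.Round(d).UTC().Format(time.RFC3339Nano), nil
}

func main() {
	samples := []string{
		"2019-01-23T17:55:32.499Z", // below the halfway point: rounds down
		"2019-01-23T17:55:32.5Z",   // exactly halfway: rounds up
		"2019-01-23T17:55:59.9Z",   // rounds up into the next minute
	}
	for _, s := range samples {
		out, err := roundStamp(s, "1s")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", s, out)
	}
}
```

Note that rounding only snaps a timestamp to the nearest multiple of the precision. A timestamp that is off by many seconds stays off unless the precision is at least as coarse as the collection interval.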
Thank you, I can work with that for now. I still think this is a bug though because given smaller precisions time is more likely to drift and mess with timestamps after being updated by |
I just wanted to chime in and say that I've now hit this issue too, after rolling Telegraf 1.9.4 out to ~160 Win2016 servers. In my case it is definitely input plugin dependent: I use inputs.mem to get the system-wide memory used percent, and that metric never comes in with this timestamp problem, but metrics from win_perf_counters intermittently do. Examples:
The timestamps for Memory used_percent (from inputs.mem) consistently end in 9 for the same intervals:
I'm going to experiment with the precision config parameter to see if that successfully works around this issue.
Relevant telegraf.conf:
interval = "1m"
round_interval = true
flush_interval = "1m"
System info:
CentOS 7
Telegraf 1.9.3 (git: HEAD 6ad8c8b)
Telegraf 1.9.1 (git: HEAD 2063609) (on a second VM)
InfluxDB shell version: 1.7.3
Steps to reproduce:
Use date to change the system time to a different second mark. For example: sudo date +%T -s "17:55:32" (make sure the current second is not already close to 32, otherwise the second mark does not change)
precision rfc3339
select pid from procstat where time > now() - 5m
Expected behavior:
Actual behavior:
Additional info:
Restarting Telegraf fixes this issue. I found this because I was running some VMs on a computer which was put into power saving mode during lunchtime, and the VMs' clocks desynchronized. Using chronyc makestep produced this unexpected behavior.
This is probably the root cause of other similar issues, because if a system clock drifts away from official time and is later corrected by chronyd, Telegraf will start recording metrics with the wrong timestamps.
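One way to picture what might be happening, as a sketch under stated assumptions rather than a description of Telegraf's internals: suppose the collector aligns itself to an interval boundary once at start-up and afterwards fires every interval of elapsed time. A later step of the wall clock then shifts every following timestamp off the boundary, and only a restart realigns it. The function names, the start time, and the 32-second step below are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// nextBoundary returns the first multiple of interval (in seconds) that is
// strictly after now (Unix seconds). Hypothetical helper for this sketch.
func nextBoundary(now, interval int64) int64 {
	return now - now%interval + interval
}

// simulateTicks models a collector that aligns to an interval boundary once
// at start-up and then fires after every interval of elapsed time.
// Just before tick number stepAfter (zero-based) the wall clock is stepped
// by step seconds. It returns the wall-clock reading, in Unix seconds, seen
// at each of the n ticks. This is an assumed model, not Telegraf's code.
func simulateTicks(start, interval, step int64, stepAfter, n int) []int64 {
	wall := nextBoundary(start, interval) // the first tick lands on a boundary
	ticks := make([]int64, 0, n)
	for i := 0; i < n; i++ {
		if i == stepAfter {
			wall += step // e.g. "date -s" or "chronyc makestep"
		}
		ticks = append(ticks, wall)
		wall += interval // elapsed time between ticks ignores the step
	}
	return ticks
}

func main() {
	start := time.Date(2019, 1, 23, 17, 50, 12, 0, time.UTC).Unix()
	// 1m interval as in the report; clock stepped forward 32 s before the third tick.
	for _, ts := range simulateTicks(start, 60, 32, 2, 5) {
		fmt.Println(time.Unix(ts, 0).UTC().Format(time.RFC3339))
	}
}
```

In this model the first two ticks fall on whole minutes and every tick after the step carries the same 32-second offset, which matches the observation that the wrong timestamps persist until Telegraf is restarted.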
One problem this causes is that a query which uses group by time between 2019-01-23T17:50:00Z and 2019-01-23T17:56:00Z with fill(0) can give you false nulls if there is other data with correct timestamps.