Current situation
At the moment the Envoy ServiceMonitor doesn't configure a scrape interval, so it falls back to Prometheus's default scrape interval, which is 30 seconds.
Impact
Because the scrape interval is so large, we suspect it smooths our latency over that period and we are unable to see any latency spikes. When we compare with our own application-specific monitoring, we see spikes between 50ms ~ 100ms and occasionally 1s in p99 latency. However, when looking at the Envoy latency metrics we can't see any spikes at all for p99.
Ideal future situation
Lower the scrape interval of the Envoy ServiceMonitor so we are able to notice spikes in our latency graphs.
Implementation options
Add a hardcoded lower scrape interval (e.g. 1s, 5s, or 10s) to the Envoy ServiceMonitor, or
make the scrape interval configurable on the Project Contour component (see the sketch below).
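For reference, a minimal sketch of what the first option could look like, assuming a prometheus-operator ServiceMonitor with an explicit `interval`. The namespace, selector labels, and port name here are placeholders for illustration, not the actual manifest Lokomotive ships:

```yaml
# Sketch only: namespace, selector labels, and port name are assumed for illustration.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: envoy
  namespace: projectcontour        # assumed namespace
spec:
  selector:
    matchLabels:
      app: envoy                   # assumed label
  endpoints:
    - port: metrics                # assumed port name
      interval: 5s                 # explicit scrape interval instead of Prometheus's default
```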
Additional information
Currently using Lokomotive v0.4.1; there are no changes relating to this request in v0.5.0.
That's a fair question 👍 should have thought of that myself 😅 I've adjusted the interval manually for now. I'll give it some time to gather data with the new interval and report back.
To test, I lowered the interval from the default 30s to an extreme 1s scrape interval. Below are some screenshots comparing a 1m and a 2s range query over the Envoy data.
This one displays the 1m range query; you can see the spikes only go up to 4000ms.
In the example below, using a smaller time range of 2s, we can see spikes up to 10000ms, whereas the previous chart smooths the spikes down to roughly half the actual latency.
Perhaps a scrape interval of 1s is too aggressive, but the default of 30s smooths the spikes, in this case for latency but really for all the metrics. So, in general, it would be great if we could lower the scrape interval.
Thanks for checking it @niels-s. I wonder if there is some other way to find out about those latency spikes without lowering the scrape interval 🤔 Maybe tuning Envoy histogram buckets?
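If we did go down the histogram-bucket route, Envoy's bootstrap `stats_config` supports custom bucket boundaries via `histogram_bucket_settings`. A rough sketch of the idea, assuming the prefix matcher and bucket values below would need tuning for the actual latency histograms, and that Contour's generated Envoy bootstrap can be extended this way at all:

```yaml
# Sketch only: matcher prefix and bucket boundaries are assumptions, and it is an
# open question whether the Contour-generated Envoy bootstrap can carry this override.
stats_config:
  histogram_bucket_settings:
    - match:
        prefix: "http"                              # assumed: target the HTTP timing histograms
      buckets: [1, 5, 10, 25, 50, 100, 250, 500, 1000]  # finer buckets (ms) around the spikes we care about
```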