[Uptime] Snapshot count lags behind actual monitor states #58079

Closed
andrewvc opened this issue Feb 20, 2020 · 1 comment · Fixed by #58389
Labels
Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability

Comments

@andrewvc
Contributor

In Uptime 7.6, the overhauled snapshot count queries are faster, but they show too many monitors as down.

This patch improves the handling of timespans with snapshot counts. This feature originally worked, but suffered a regression when we increased the default timespan in the query context to 5m. This means that without this patch the counts you get are the maximum total number of monitors that were down over the past 5m, which is not really that useful.
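
To make the regression concrete, here is a small sketch with made-up check data. The monitor IDs, ages, and function names are hypothetical and are not taken from the Uptime code. It contrasts counting every monitor that reported down at any point in the window with counting only each monitor's most recent status.

```python
from typing import Dict, List, Tuple

# (monitor_id, seconds_ago, status): invented sample checks inside a 5m window.
Check = Tuple[str, int, str]

CHECKS: List[Check] = [
    ("web-1", 240, "down"),
    ("web-1", 30, "up"),    # web-1 has recovered
    ("web-2", 200, "down"),
    ("web-2", 20, "down"),  # web-2 is still down
    ("web-3", 100, "up"),
]


def down_at_any_point(checks: List[Check]) -> int:
    """Count monitors that reported down at least once in the window."""
    return len({monitor_id for monitor_id, _, status in checks if status == "down"})


def down_by_latest_status(checks: List[Check]) -> int:
    """Count monitors whose most recent check in the window is down."""
    latest: Dict[str, Tuple[int, str]] = {}
    for monitor_id, age, status in checks:
        # A smaller age means a more recent check.
        if monitor_id not in latest or age < latest[monitor_id][0]:
            latest[monitor_id] = (age, status)
    return sum(1 for _, status in latest.values() if status == "down")
```

With this sample, the first function counts both web-1 and web-2 as down, while the second counts only web-2, which is the number a snapshot is expected to show.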

andrewvc added a commit that referenced this issue Feb 24, 2020
Fixes #58079

This is an improved version of #58078

Note: this is a bugfix targeting 7.6.1. I've decided to open this PR directly against 7.6 in the interest of time. We can forward-port this to 7.x / master later.

This patch improves the handling of timespans with snapshot counts. This feature originally worked, but suffered a regression when we increased the default timespan in the query context to 5m. This means that without this patch the counts you get are the maximum total number of monitors that were down over the past 5m, which is not really that useful.

We now use a scripted metric to always count precisely the number of up/down monitors. On my box this could process 400k summary docs in ~600ms. This should scale as shards are added.

I attempted to keep memory usage relatively low by using simple maps of strings.
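
The actual script from the PR is not reproduced in this thread. As a rough illustration only, the sketch below simulates in plain Python the per-shard and reduce phases that an Elasticsearch scripted metric aggregation is split into. The document fields (`monitor_id`, `timestamp`, `status`), the function names, and the `timestamp|status` string encoding are all assumptions made for this example.

```python
from typing import Dict, Iterable, List

# Hypothetical document shape: monitor id, ISO-8601 UTC timestamp, up/down status.
Doc = Dict[str, str]


def map_shard(docs: Iterable[Doc]) -> Dict[str, str]:
    """Per-shard phase: keep one 'timestamp|status' string per monitor.

    Timestamps share one fixed-width ISO-8601 format, so plain string
    comparison orders the entries chronologically.
    """
    state: Dict[str, str] = {}
    for doc in docs:
        entry = f"{doc['timestamp']}|{doc['status']}"
        current = state.get(doc["monitor_id"])
        if current is None or entry > current:
            state[doc["monitor_id"]] = entry
    return state


def reduce_shards(shard_states: List[Dict[str, str]]) -> Dict[str, int]:
    """Reduce phase: merge shard maps, then count each monitor's latest status."""
    merged: Dict[str, str] = {}
    for state in shard_states:
        for monitor_id, entry in state.items():
            if monitor_id not in merged or entry > merged[monitor_id]:
                merged[monitor_id] = entry
    counts = {"up": 0, "down": 0}
    for entry in merged.values():
        counts[entry.rsplit("|", 1)[1]] += 1
    return counts


# Invented sample data spread over two shards.
shard_a = [
    {"monitor_id": "web-1", "timestamp": "2020-02-24T10:00:00Z", "status": "down"},
    {"monitor_id": "web-2", "timestamp": "2020-02-24T10:04:00Z", "status": "down"},
]
shard_b = [
    {"monitor_id": "web-1", "timestamp": "2020-02-24T10:04:30Z", "status": "up"},
    {"monitor_id": "web-3", "timestamp": "2020-02-24T10:03:00Z", "status": "up"},
]

counts = reduce_shards([map_shard(shard_a), map_shard(shard_b)])
```

Each shard returns at most one short string per monitor, which is the kind of flat map-of-strings state the commit message describes, and each monitor ends up counted exactly once as either up or down.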
andrewvc added a commit to andrewvc/kibana that referenced this issue Feb 24, 2020
@tsullivan tsullivan added the Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability label Feb 24, 2020
@elasticmachine
Contributor

Pinging @elastic/uptime (Team:uptime)

andrewvc added a commit that referenced this issue Feb 24, 2020
andrewvc added a commit to andrewvc/kibana that referenced this issue Feb 24, 2020
…elastic#58389)

elasticmachine added a commit to dhurley14/kibana that referenced this issue Feb 25, 2020
…elastic#58389) (elastic#58415)


Co-authored-by: Elastic Machine <[email protected]>

3 participants