This repository has been archived by the owner on Jan 31, 2024. It is now read-only.

Anomaly thresholds for user defined alert definitions #28

Open
mukeshelastic opened this issue Jul 15, 2020 · 3 comments
Labels
design For design issues [zube]: Ready

Comments

@mukeshelastic

When creating a logs alert, users need to specify a static threshold number. Instead of relying on a static number, users could use machine learning to create a dynamic threshold that helps identify real outliers.

The user experience for creating an anomaly threshold may involve the following user journeys:

When the count of log entries for <alert condition> is <operator> <value from 0 to 100> anomaly score within the last 5 minutes, then take action. For example: when the count of log entries for log.level=error is greater than 95 anomaly score within the last 5 minutes, take action.
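The condition above could be modeled roughly as follows. This is only a sketch: the type names, the `shouldTakeAction` function, and the idea of passing in a single observed score for the window are assumptions made for illustration, not an existing Kibana alerting API.

```typescript
// Hypothetical types for illustration; not an existing Kibana alerting API.
type Comparator =
  | "greater than"
  | "greater than or equal to"
  | "less than"
  | "less than or equal to";

interface AnomalyThresholdCondition {
  filter: string;        // the alert condition, e.g. "log.level=error"
  comparator: Comparator;
  anomalyScore: number;  // user-chosen threshold, 0 to 100
  windowMinutes: number; // look-back window, e.g. 5
}

// Compares the anomaly score observed in the window against the threshold.
function shouldTakeAction(
  condition: AnomalyThresholdCondition,
  observedScore: number
): boolean {
  if (condition.anomalyScore < 0 || condition.anomalyScore > 100) {
    throw new RangeError("anomalyScore must be between 0 and 100");
  }
  switch (condition.comparator) {
    case "greater than":
      return observedScore > condition.anomalyScore;
    case "greater than or equal to":
      return observedScore >= condition.anomalyScore;
    case "less than":
      return observedScore < condition.anomalyScore;
    case "less than or equal to":
      return observedScore <= condition.anomalyScore;
  }
}

// The example from the issue: log.level=error, greater than 95, last 5 minutes.
const errorLogsCondition: AnomalyThresholdCondition = {
  filter: "log.level=error",
  comparator: "greater than",
  anomalyScore: 95,
  windowMinutes: 5,
};
```

With this shape, `shouldTakeAction(errorLogsCondition, 97)` would trigger the action, while an observed score of exactly 95 would not.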

When the user clicks Save in the alert flyout, a single metric ML job is created with a start time of now() - four weeks, using the dataset and index to which the field belongs. The job creation status is shown in the alert flyout. If the job is created successfully, the flyout closes; if the job is not created, an appropriate error message is shown in the alert flyout.
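A rough sketch of the parameters such a job would be created with is shown below. The `MlJobSketch` shape, the `buildJobSketch` helper, the job ID format, and the 5-minute bucket span are all assumptions for illustration; only the four-week look-back, the count metric, and the dataset/index inputs come from the description above.

```typescript
// Four weeks expressed in milliseconds.
const FOUR_WEEKS_MS = 4 * 7 * 24 * 60 * 60 * 1000;

// Hypothetical description of the job to create; not the real ML job schema.
interface MlJobSketch {
  jobId: string;
  indexPattern: string;     // index to which the field belongs
  dataset: string;          // dataset to which the field belongs
  filter: string;           // the alert condition, e.g. "log.level=error"
  detectorFunction: "count";
  bucketSpan: string;       // assumed to match the alert window
  start: Date;              // now() - four weeks
}

function buildJobSketch(
  alertId: string,
  indexPattern: string,
  dataset: string,
  filter: string,
  now: Date = new Date()
): MlJobSketch {
  return {
    jobId: `logs-alert-${alertId}`, // hypothetical naming scheme
    indexPattern,
    dataset,
    filter,
    detectorFunction: "count",
    bucketSpan: "5m",               // assumption, not stated in the issue
    start: new Date(now.getTime() - FOUR_WEEKS_MS),
  };
}
```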

If the job is created successfully, a new read-only entry appears in the ML job list flyout with its status set to enabled. Editing this ML job is not allowed. A link to alert management is shown next to the new job in the ML job list flyout, so that the user can visit the corresponding alert from the ML job list page and update the alert.

If the alert is deleted, the ML job is disabled. If the alert is modified such that its condition or threshold changes, the ML job is updated with the new parameters.
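The lifecycle rules above amount to a small mapping from alert changes to ML job actions. The sketch below uses hypothetical names (`AlertChange`, `JobAction`, `jobActionFor`) and assumes that a modification touching neither the condition nor the threshold leaves the job alone, which the issue does not state explicitly.

```typescript
// Hypothetical model of what happened to the alert.
type AlertChange =
  | { kind: "deleted" }
  | { kind: "modified"; conditionChanged: boolean; thresholdChanged: boolean };

// What should happen to the ML job that backs the alert.
type JobAction = "disable" | "update" | "none";

function jobActionFor(change: AlertChange): JobAction {
  if (change.kind === "deleted") {
    return "disable"; // the job is disabled, not deleted
  }
  // Assumption: other edits (e.g. renaming the alert) do not touch the job.
  return change.conditionChanged || change.thresholdChanged ? "update" : "none";
}
```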

@mukeshelastic mukeshelastic added the design For design issues label Jul 15, 2020
@elasticmachine

Pinging @elastic/observability-design (design)

@weltenwort
Member

weltenwort commented Jul 15, 2020

A thought about how the user sets the anomaly score threshold: How is the user supposed to know what a reasonable value is? Would showing the more abstract intervals like warning, minor, major, and critical be easier?

@sgrodzicki

sgrodzicki commented Jul 27, 2020

> Would showing the more abstract intervals like warning, minor, major, and critical be easier?

I like this idea and we could probably stick with ML's thresholds:

[Image: Screenshot 2020-07-27 at 13 12 49]
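The screenshot is not reproduced here, so the cutoffs below are an assumption: they follow the severity bands commonly shown in the Elastic ML UI (warning from 0, minor from 25, major from 50, critical from 75). Under that assumption, mapping a named severity to a minimum anomaly score, and back, could look like this sketch; the names `Severity`, `SEVERITY_MIN_SCORE`, and `severityForScore` are hypothetical.

```typescript
type Severity = "warning" | "minor" | "major" | "critical";

// Assumed lower bounds per severity; verify against the ML UI before relying on them.
const SEVERITY_MIN_SCORE: Record<Severity, number> = {
  warning: 0,
  minor: 25,
  major: 50,
  critical: 75,
};

// Highest severity whose lower bound the score reaches.
function severityForScore(score: number): Severity {
  if (score < 0 || score > 100) {
    throw new RangeError("score must be between 0 and 100");
  }
  const ordered: Severity[] = ["critical", "major", "minor", "warning"];
  for (const severity of ordered) {
    if (score >= SEVERITY_MIN_SCORE[severity]) {
      return severity;
    }
  }
  return "warning"; // unreachable for valid scores, kept for type safety
}
```

The alert flyout could then offer the four labels and store `SEVERITY_MIN_SCORE[label]` as the numeric threshold, so users never have to pick a raw score.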


5 participants