alerts: threshold values for lowering alert levels #614
Comments
I'm experiencing the same problem. This would be a very useful feature to implement, to avoid continuous alerts when values hover just above and below the threshold.
@rossmcdonald I would like to make a PR for this feature. Can you please tell me generally where I should begin? Which files would be involved?
Perhaps the syntax could simply be an optional extra argument to the alert level functions, e.g. .warn(ALERT_CRITERIA, [RESET_CRITERIA]). Example:
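For illustration, a rough TICKscript sketch of what that optional second argument could look like (the two-argument form and the `used_percent` field are hypothetical, not current Kapacitor syntax):

```
stream
    |from()
        .measurement('disk')
    |alert()
        // first lambda raises the level; the hypothetical second lambda
        // defines when the level may drop back down again
        .warn(lambda: "used_percent" > 80, lambda: "used_percent" < 75)
        .crit(lambda: "used_percent" > 90, lambda: "used_percent" < 85)
```

Keeping the reset criteria optional would leave existing scripts with a single lambda behaving as they do today.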
@melor I think using …

@minhdanh Thanks for stepping up. There is a function … It can be found here: … Then in the … Finally, you will need to parse the ast.LambdaNode expressions into stateful expressions, which is done starting here for the normal expressions: https://github.com/influxdata/kapacitor/blob/master/alert.go#L313
Thank you for your detailed reply, @nathanielc |
@minhdanh The tests can be found in the integrations/ package. Have a look first at TestStream_Alert. All tests basically replay data from a file found in …
Currently
An alert is generated at every alert level change, both when going up (info -> warn -> critical) and when coming down (critical -> warn -> info -> ok).
Problem
When monitoring, for example, free disk space, the value often hovers both below and above the threshold, causing a warn -> ok -> warn -> ok -> warn ... cycle. Flapping() percentages introduce delays and are not optimal for every purpose.
Problem example:
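For illustration, a minimal TICKscript of the kind of single-threshold alert that flaps (the measurement and `free` field names are illustrative):

```
stream
    |from()
        .measurement('disk')
    |alert()
        // a value hovering around 20 flips the alert between OK and
        // WARNING on every threshold crossing
        .warn(lambda: "free" < 20)
```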
Proposed solution
Provide optional, separate threshold functions for resetting a higher severity to a lower one.
Example:
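A rough TICKscript sketch of what such separate reset functions could look like, with thresholds matching the description below (the .warnReset()/.critReset() names and the `free` field are illustrative, not an existing Kapacitor API):

```
stream
    |from()
        .measurement('disk')
    |alert()
        .warn(lambda: "free" < 20)
        // hypothetical reset criteria: stay at WARNING until the value
        // has recovered to 25 or above
        .warnReset(lambda: "free" >= 25)
        .crit(lambda: "free" < 10)
        // likewise, stay at CRITICAL until the value recovers to 15
        .critReset(lambda: "free" >= 15)
```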
In this example the value, once going below 20 and causing the initial "WARNING" alert, would be allowed to fluctuate between 10 and 25 without changing the alert level from warning.