alert with stateChangesOnly doesn't trigger OK on script start #744

Open
phemmer opened this issue Jul 23, 2016 · 4 comments

@phemmer

phemmer commented Jul 23, 2016

There are 2 related scenarios here.

  1. Let's say you have an alert with stateChangesOnly enabled, and Kapacitor has triggered a CRITICAL alert. Then you stop the script, or Kapacitor, or the whole host it's running on. While the script is stopped, the condition clears. When Kapacitor starts back up, it doesn't send an OK alert, so any external systems still see the state as CRITICAL.
  2. Similarly, you have an alert with stateChangesOnly enabled, but a bad TICKscript results in some erroneous alerts. You fix the script and redefine it in Kapacitor. When the new script starts, it comes up in the OK state, but no OK alert is triggered. An alert is triggered if the state is still INFO, WARNING, or CRITICAL, just not for OK. As a result, the false alerts never get cleared.

I think it would be a good idea to allow triggering an OK alert when a script first starts up.

@phemmer changed the title from "alert with stateChangesOnly doesn't clear after re-define" to "alert with stateChangesOnly doesn't trigger OK on script start" on Jul 23, 2016
@nathanielc
Contributor

@phemmer I think the correct solution is to store the alert state persistently instead of losing it on process restart or task restart.

But that is a lot of work, and while it is on the roadmap, it is a ways out.

I like the idea of triggering an OK on start. This will only work, though, if Kapacitor receives new data. For example, say host A goes CRITICAL, then Kapacitor stops and host A recovers. When Kapacitor starts back up, it will not know that host A exists until it receives the first point from host A. At that point it could fire an OK alert, but not before (for the same reason that all state is lost during a restart).
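
To make the behavior concrete, here is a minimal Python sketch of a state-changes-only alerter with per-group state held in memory. It is a toy model, not Kapacitor's implementation: the class `StateChangeAlerter`, the `ok_on_first_point` flag, and the `sent` list are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Level(Enum):
    OK = 0
    INFO = 1
    WARNING = 2
    CRITICAL = 3


class StateChangeAlerter:
    """Toy model of a state-changes-only alert (hypothetical, not Kapacitor code)."""

    def __init__(self, ok_on_first_point=False):
        # Assumption: this flag models the proposed "OK on start" behavior.
        self.ok_on_first_point = ok_on_first_point
        self.last_level = {}  # group -> Level; in memory only, so lost on restart
        self.sent = []        # (group, Level) events emitted so far

    def handle_point(self, group, level):
        previous = self.last_level.get(group)
        self.last_level[group] = level
        if previous is None:
            # First point for this group since start: a non-OK level always
            # fires; OK fires only when the proposed flag is enabled.
            if level is not Level.OK or self.ok_on_first_point:
                self.sent.append((group, level))
        elif level is not previous:
            # Ordinary state change.
            self.sent.append((group, level))


# Before the restart, host A goes CRITICAL.
before = StateChangeAlerter()
before.handle_point("hostA", Level.CRITICAL)

# After a restart the state is gone. With the current behavior, the first
# OK point from host A emits nothing.
after_current = StateChangeAlerter()
after_current.handle_point("hostA", Level.OK)

# With the proposed behavior, the first OK point emits an OK event,
# but only once a point from host A actually arrives.
after_proposed = StateChangeAlerter(ok_on_first_point=True)
after_proposed.handle_point("hostA", Level.OK)
```

Note that the alerter with the flag enabled still emits nothing until `handle_point` is called for a group, which matches the limitation described above.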

Will that still work for your current needs?

@phemmer
Author

phemmer commented Jul 25, 2016

The only reason why I didn't mention the persistence is that it won't solve scenario 2. If the script is changed, what is your persistence key going to be? Meaning how will you know which alert needs an OK to be sent?
Maybe this is something we just don't support. Or maybe we use the persistence and send an OK when whatever is used as the key is not present in the updated script (though this feels a little wrong).
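
As a rough illustration of the second option, the Python sketch below assumes persisted states are keyed by a `(task_id, group)` pair. That key scheme is purely hypothetical, which is exactly the open question here. Given the persisted states and the set of keys the updated script still produces, the hypothetical helper `stale_keys` finds the non-OK alerts that would need an OK sent.

```python
def stale_keys(persisted, active_keys):
    """Return persisted non-OK keys that the updated script no longer produces.

    `persisted` maps a hypothetical (task_id, group) key to a level name;
    `active_keys` is the set of keys seen from the updated script.
    """
    return sorted(
        key
        for key, level in persisted.items()
        if level != "OK" and key not in active_keys
    )


# State saved before the script was redefined (illustrative data only).
persisted = {
    ("cpu_alert", "host=A"): "CRITICAL",
    ("cpu_alert", "host=B"): "OK",
    ("cpu_alert", "host=C"): "WARNING",
}

# The fixed script only produces alerts for host C.
active = {("cpu_alert", "host=C")}

needs_ok = stale_keys(persisted, active)
```

Here only host A would receive an OK: host B was already OK, and host C is still covered by the updated script.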

But anyway, I think having to wait for data before sending an OK is reasonable. That is in fact what I would expect.

@dp1140a

dp1140a commented Jan 17, 2018

It states this ticket is closed, but what is the fix for this issue? We have noted similar behavior with the deadman node issuing a flurry of false-positive alerts upon Kapacitor restart.

@dp1140a reopened this on Jan 17, 2018
@rbetts

rbetts commented Jan 26, 2018

@dp1140a This bug was resolved with #1120. Changelog says this feature was delivered in 1.2.0 (2017-01-23).

What version are you seeing problems with? Can you provide a reproducer and attach it here?
