Metric alertmanager_alerts reports "active" alerts when there are none #1439
Comments
I've seen this discrepancy before too. IIRC the reason is that the metrics are computed based on the in-memory store of alerts, which includes the resolved ones. Those are deleted only every 30 minutes (by default). That being said, I agree that it would be nice if the numbers included only firing alerts. FWIW
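For anyone who wants resolved alerts to drop out of the store sooner, that garbage-collection interval is a startup flag. A minimal sketch, assuming a recent Alertmanager where the flag is named `--alerts.gc-interval` (default 30m); check your version's `--help` output:

```
# Run Alertmanager with a shorter GC interval for resolved alerts,
# so they leave the in-memory store (and the metric) sooner.
alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --alerts.gc-interval=5m
```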
Right, maybe just an additional metric with the label.
👍
see prometheus#1439 Signed-off-by: Patrick Harböck <[email protected]>
Seeing the same issue. We have a 3-node Alertmanager cluster. No alerts are showing in the web interface, but when I query `alertmanager_alerts` we still get a non-zero count of active alerts.
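For reference, the kind of query involved looks roughly like this; `state` is the label Alertmanager attaches to the metric, with values such as `active` and `suppressed`:

```
# Alerts the metric reports as active, per Alertmanager instance.
alertmanager_alerts{state="active"}

# Alerts Alertmanager itself considers suppressed (silenced or inhibited).
alertmanager_alerts{state="suppressed"}
```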
So, any news on this, or a way to work around it?
Any update on this?
Alert metric reports different results to what the user sees via API (prometheus#2943)

Fixes prometheus#1439 and prometheus#2619. The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if that same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under a new name, `alertmanager_marked_alerts`, and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <[email protected]>
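With that change in place, a rough way to compare the repurposed metric against the old behaviour, assuming both series are scraped and carry the usual `state` label (a sketch, not output from this issue):

```
# New semantics of alertmanager_alerts: counts what the API/UI shows.
alertmanager_alerts{state="active"}

# Old semantics, kept under a new name: alerts still held in the
# in-memory store, including resolved ones waiting to be GCed.
alertmanager_marked_alerts{state="active"}
```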
What did you do?
Queried multiple Prometheus instances for `alertmanager_alerts` to get the active alert count for each one.

What did you expect to see?
I was expecting to see 0 active alerts from Alertmanagers that currently don't have any active, unsilenced alerts.

What did you see instead? Under which circumstances?
From some Alertmanager instances, we get >0 alerts back, even though none are showing in the web interface. In this particular case, we did have 2 active but silenced alerts, but the numbers still don't make sense, at least to me. If this is how it's supposed to be, can we please also get a metric for the `silenced` label, so that we could compute `active - suppressed - silenced = actually active` alerts?
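If such a label existed, the arithmetic being asked for could be written roughly as below; note the `silenced` state is hypothetical and is not exported by Alertmanager today:

```
# Hypothetical: assumes a silenced state alongside active and suppressed.
  sum without (state) (alertmanager_alerts{state="active"})
- sum without (state) (alertmanager_alerts{state="suppressed"})
- sum without (state) (alertmanager_alerts{state="silenced"})
```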
Environment
Kubernetes 1.10.5