Metric alertmanager_alerts reports "active" alerts when there are none #1439

Closed
jangrewe opened this issue Jun 27, 2018 · 6 comments · Fixed by #2943

Comments

@jangrewe

What did you do?
Query multiple Prometheus instances for alertmanager_alerts to get the active alert count for each one.

alertmanager_alerts{state="active"}

What did you expect to see?
I was expecting to see 0 active alerts from Alertmanagers that currently don't have any active, unsilenced alerts, e.g.:

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="active"} 0

What did you see instead? Under which circumstances?
Some Alertmanager instances report >0 active alerts, even though none are showing in the web interface.
In this particular case we did have 2 active but silenced alerts, but the numbers still don't add up - at least not to me.

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="active"} 3
alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="suppressed"} 1

If this is how it's supposed to work, can we please also get a metric with a silenced state label, so that active - suppressed - silenced would give the number of actually active alerts?

Environment
Kubernetes 1.10.5

  • System information:
...
  • Alertmanager version:
Branch: HEAD
BuildDate: 20180622-11:58:41
BuildUser: root@bec9939eb862
GoVersion: go1.10.3
Revision: 462c969d85cf1a473587754d55e4a3c4a2abc63c
Version: 0.15.0
  • Prometheus version:
Version: 2.3.1
Revision: 188ca45bd85ce843071e768d855722a9d9dabe03
Branch: HEAD
BuildUser: root@82ef94f1b8f7
BuildDate: 20180619-15:56:22
GoVersion: go1.10.3
  • Alertmanager configuration file:
...
  • Prometheus configuration file:
...
  • Logs:
...
@simonpasquier
Member

I've seen this discrepancy before too. IIRC the reason is that the metrics are computed from the in-memory store of alerts, which includes the resolved ones. Those are deleted only every 30 minutes (by default). That being said, I agree that it would be nice if the numbers included only firing alerts.

FWIW suppressed is the sum of inhibited and silenced alerts.
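
As a rough cross-check (assuming all alerts reaching this Alertmanager come from the one Prometheus server you are querying), you can compare the number of firing alerts Prometheus itself knows about with the gauge exposed by Alertmanager; the gauge can be higher because resolved alerts are still counted until the periodic GC removes them:

# Alerts currently firing according to Prometheus (silenced alerts still show
# up here, since silencing only happens on the Alertmanager side).
count(ALERTS{alertstate="firing"})

# Alerts Alertmanager reports as active; resolved alerts are included until
# they are garbage-collected.
alertmanager_alerts{state="active"}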

@jangrewe
Author

jangrewe commented Jul 16, 2018

Right, maybe just an additional metric with the label firing that shows the number of alerts that would be seen on the index page. That's probably easier to implement than changing the whole purging logic.
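
For illustration, such a series could look like this (the firing state is hypothetical and does not exist in 0.15.0; it would only count alerts visible on the index page):

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="firing"} 0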

@goldyfruit

👍

Pharb added a commit to Pharb/alertmanager that referenced this issue Oct 17, 2018
Pharb added a commit to Pharb/alertmanager that referenced this issue Dec 16, 2018
Pharb added a commit to Pharb/alertmanager that referenced this issue Mar 6, 2019
@gengwg

gengwg commented May 9, 2019

Seeing the same issue. We have a 3-node Alertmanager cluster. There are no alerts showing in the web interface, but when I query alertmanager_alerts there are a lot of active alerts. This is really confusing.

alertmanager_alerts{instance="alertmanager1:9093",job="alertmanager",state="active"}    8
alertmanager_alerts{instance="alertmanager1:9093",job="alertmanager",state="suppressed"}        2
alertmanager_alerts{instance="alertmanager2:9093",job="alertmanager",state="active"}    63
alertmanager_alerts{instance="alertmanager2:9093",job="alertmanager",state="suppressed"}        2
alertmanager_alerts{instance="alertmanager3:9093",job="alertmanager",state="active"}    10
alertmanager_alerts{instance="alertmanager3:9093",job="alertmanager",state="suppressed"}        3

Pharb added a commit to Pharb/alertmanager that referenced this issue May 22, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Jun 17, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Oct 1, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Nov 13, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Nov 18, 2019
@shd4

shd4 commented Jun 29, 2020

So, any news on this, or a way to work around it?

@FrankMormino

Any update on this?

gotjosh added a commit to gotjosh/alertmanager that referenced this issue Jun 15, 2022

Fixes prometheus#1439 and prometheus#2619.

The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if the same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under the new name `alertmanager_marked_alerts` and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <[email protected]>
roidelapluie pushed a commit that referenced this issue Jun 16, 2022
Alert metric reports different results to what the user sees via API (#2943)

Fixes #1439 and #2619.

The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if the same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under the new name `alertmanager_marked_alerts` and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <[email protected]>
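
With that change, the pre-fix behaviour remains available under the new name, so the number of resolved-but-not-yet-GCed alerts an instance is still holding can be estimated by subtracting the new UI-aligned gauge from the old-style one (this assumes both metrics expose the same state label values, so the two series match one-to-one):

# alerts still kept in memory for possible re-use but no longer shown in the UI
alertmanager_marked_alerts{state="active"} - alertmanager_alerts{state="active"}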
qinxx108 pushed a commit to qinxx108/alertmanager that referenced this issue Dec 13, 2022