Metric alertmanager_alerts reports "active" alerts when there are none #1439

Closed
jangrewe opened this issue Jun 27, 2018 · 6 comments · Fixed by #2943

Comments

@jangrewe

What did you do?
Query multiple Prometheus instances for alertmanager_alerts to get the active alert count for each one.

alertmanager_alerts{state="active"}

What did you expect to see?
I was expecting to see 0 active alerts from Alertmanagers that currently don't have any active, unsilenced alerts, e.g.:

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="active"} 0

What did you see instead? Under which circumstances?
Some Alertmanager instances report >0 active alerts, even though none are showing in the web interface.
In this particular case we did have 2 active but silenced alerts, but the numbers still don't add up - at least not to me.

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="active"} 3
alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="suppressed"} 1

If this is how it's supposed to work, can we please also get a metric with a silenced state label, so that active - suppressed - silenced would give the number of actually active alerts?

Environment
Kubernetes 1.10.5

  • System information:
...
  • Alertmanager version:
Branch: HEAD
BuildDate: 20180622-11:58:41
BuildUser: root@bec9939eb862
GoVersion: go1.10.3
Revision: 462c969d85cf1a473587754d55e4a3c4a2abc63c
Version: 0.15.0
  • Prometheus version:
Version: 2.3.1
Revision: 188ca45bd85ce843071e768d855722a9d9dabe03
Branch: HEAD
BuildUser: root@82ef94f1b8f7
BuildDate: 20180619-15:56:22
GoVersion: go1.10.3
  • Alertmanager configuration file:
...
  • Prometheus configuration file:
...
  • Logs:
...
@simonpasquier
Member

I've seen this discrepancy before too. IIRC the reason is that the metrics are computed from the in-memory store of alerts, which includes the resolved ones. Those are deleted only every 30 minutes (by default). That being said, I agree that it would be nice if the numbers included only firing alerts.

FWIW suppressed is the sum of inhibited and silenced alerts.
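
As a rough cross-check (assuming all alerts reaching this Alertmanager come from the one Prometheus server you are querying), you can compare the number of firing alerts Prometheus itself knows about with the gauge exposed by Alertmanager; the gauge can be higher because resolved alerts are still counted until the periodic GC removes them:

# Alerts currently firing according to Prometheus (silenced alerts still show
# up here, since silencing only happens on the Alertmanager side).
count(ALERTS{alertstate="firing"})

# Alerts Alertmanager reports as active; resolved alerts are included until
# they are garbage-collected.
alertmanager_alerts{state="active"}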

@jangrewe
Author

jangrewe commented Jul 16, 2018

Right, maybe just an additional metric with the label firing that shows the number of alerts that would be seen on the index page. That's probably easier to implement than changing the whole purging logic.
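
For illustration, such a series could look like this (the firing state is hypothetical and does not exist in 0.15.0; it would only count alerts visible on the index page):

alertmanager_alerts{instance="alertmanager:80",job="alertmanager",state="firing"} 0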

@goldyfruit

👍

Pharb added a commit to Pharb/alertmanager that referenced this issue Oct 17, 2018
Pharb added a commit to Pharb/alertmanager that referenced this issue Dec 16, 2018
Pharb added a commit to Pharb/alertmanager that referenced this issue Mar 6, 2019
@gengwg

gengwg commented May 9, 2019

Seeing the same issue. We have a 3-node Alertmanager cluster. There are no alerts showing in the web interface, but when I query alertmanager_alerts there are a lot of active alerts. This is really confusing.

alertmanager_alerts{instance="alertmanager1:9093",job="alertmanager",state="active"}    8
alertmanager_alerts{instance="alertmanager1:9093",job="alertmanager",state="suppressed"}        2
alertmanager_alerts{instance="alertmanager2:9093",job="alertmanager",state="active"}    63
alertmanager_alerts{instance="alertmanager2:9093",job="alertmanager",state="suppressed"}        2
alertmanager_alerts{instance="alertmanager3:9093",job="alertmanager",state="active"}    10
alertmanager_alerts{instance="alertmanager3:9093",job="alertmanager",state="suppressed"}        3

Pharb added a commit to Pharb/alertmanager that referenced this issue May 22, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Jun 17, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Oct 1, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Nov 13, 2019
Pharb added a commit to Pharb/alertmanager that referenced this issue Nov 18, 2019
@shd4

shd4 commented Jun 29, 2020

So, any news on this, or a way to work around it?

@FrankMormino

Any update on this?

gotjosh added a commit to gotjosh/alertmanager that referenced this issue Jun 15, 2022

Fixes prometheus#1439 and prometheus#2619.

The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if the same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under the new name `alertmanager_marked_alerts` and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <[email protected]>
roidelapluie pushed a commit that referenced this issue Jun 16, 2022
Alert metric reports different results to what the user sees via API (#2943)

Fixes #1439 and #2619.

The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if the same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under the new name `alertmanager_marked_alerts` and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <[email protected]>
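
With that change, the pre-fix behaviour remains available under the new name, so the number of resolved-but-not-yet-GCed alerts an instance is still holding can be estimated by subtracting the new UI-aligned gauge from the old-style one (this assumes both metrics expose the same state label values, so the two series match one-to-one):

# alerts still kept in memory for possible re-use but no longer shown in the UI
alertmanager_marked_alerts{state="active"} - alertmanager_alerts{state="active"}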
qinxx108 pushed a commit to qinxx108/alertmanager that referenced this issue Dec 13, 2022