Latest Prometheus releases report duplicate metrics in the rundeck exporter #108

BaCaRoZzo · 2025-01-09T17:01:35Z

See prometheus/prometheus#14089 for another example of this issue.

Newer releases of Prometheus log warnings about duplicate metrics in the exporter:

ts=2025-01-09T15:01:16.644Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
ts=2025-01-09T15:06:15.441Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
ts=2025-01-09T15:21:14.829Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=14
ts=2025-01-09T15:31:14.470Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=14
ts=2025-01-09T15:36:13.986Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
ts=2025-01-09T15:46:14.199Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
ts=2025-01-09T15:51:14.408Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=14
ts=2025-01-09T16:16:14.381Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
ts=2025-01-09T16:21:13.518Z caller=scrape.go:1820 level=warn component="scrape manager" scrape_pool=rundeck target=http://rundeck/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7

Using the piped command from a comment on prometheus/prometheus#14089, I can confirm that some metrics are repeated within a single scrape. The metrics that repeat for me are:

rundeck_project_execution_duration_seconds
rundeck_project_execution_status
rundeck_project_start_timestamp

They repeat either with the same value or with different values. The duplication is not continuous: it happens for a few scrapes and then disappears. The following picture shows how often it happened in the last hour, based on the prometheus_target_scrapes_sample_duplicate_timestamp_total metric:

We are currently using exporter release 2.7.0, although I don't see any change in the newer releases of the exporter that would help with this problem. Since we alert on prometheus_target_scrapes_sample_duplicate_timestamp_total, it is causing quite a lot of noise in our notification channels.

Any idea about what can be causing the issue?
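For anyone who wants to repeat the duplicate check without the shell pipeline, here is a minimal Python sketch of the same idea: it counts how often each series (metric name plus label set) appears in one /metrics payload. The sample payload and its label names (project_name, job_name, status) are made up for illustration and are not taken from the exporter's real output.

```python
import re
from collections import defaultdict

# Matches `name{labels} value [timestamp]` or `name value [timestamp]`.
SAMPLE_RE = re.compile(
    r'^(?P<series>[a-zA-Z_:][a-zA-Z0-9_:]*(?:\{.*\})?)'
    r'\s+(?P<value>\S+)(?:\s+(?P<ts>-?\d+))?\s*$'
)


def find_duplicate_series(exposition_text):
    """Return {series: [values, ...]} for series that occur more than once."""
    seen = defaultdict(list)
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            seen[match.group("series")].append(match.group("value"))
    return {series: values for series, values in seen.items() if len(values) > 1}


# Hypothetical payload: the label names are assumptions, not the exporter's.
SAMPLE = """\
# TYPE rundeck_project_execution_status gauge
rundeck_project_execution_status{project_name="demo",job_name="backup",status="succeeded"} 1
rundeck_project_execution_status{project_name="demo",job_name="backup",status="succeeded"} 0
rundeck_project_start_timestamp{project_name="demo",job_name="backup"} 1736434876000
"""

duplicates = find_duplicate_series(SAMPLE)
for series, values in duplicates.items():
    print(f"{series} appears {len(values)} times with values {values}")
```

To run it against a live exporter, the text passed to find_duplicate_series would be the body of the target's /metrics endpoint (http://rundeck/metrics in the logs above).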


phsmith · 2025-01-09T19:45:56Z

Hi @BaCaRoZzo, thanks for reporting this.

OK, I've just tried this in my local environment, and it looks like it only happens when the RUNDECK_PROJECTS_EXECUTIONS_CACHE or --rundeck.projects.executions.cache option is passed to the exporter. It looks like this behavior was hidden in Prometheus versions lower than 2.52.0.

I need to look into it further.
Will keep you posted.

phsmith · 2025-01-11T14:05:05Z

@BaCaRoZzo, can you confirm your Prometheus version?

I ask because after two days of testing (enabling and disabling the exporter cache and playing with the cache TTL), I've only been able to reproduce the duplicate metrics twice. I've also upgraded my Prometheus to version 3.1.0 and set it to scrape the metrics every 15s.

Could you try the latest Prometheus version?

BaCaRoZzo · 2025-01-11T16:00:17Z

Hi @phsmith,

I'm using 2.55.1, the latest 2.x release, because we planned to bump to the latest of the 2.x series to simplify the migration to 3.x.

There's no plan to migrate soon, but I can give it a try and see if that solves the issue.

Which was the version that showed the problem for you?

phsmith · 2025-01-11T17:44:42Z

Got it! I was running Prometheus 2.52.0 when I got the error. Let me try with the same version as you.

By the way, I've noticed that the duplication can be eliminated by having a label in the metrics that is updated each time the exporter metrics are scraped. I'm just working out the best way to do this.

phsmith · 2025-01-13T14:39:22Z

I was able to confirm that Prometheus v2.55.1 (left) has the problem, but v3.1.0 doesn't (right):

BaCaRoZzo · 2025-01-13T17:14:41Z

@phsmith so is this a problem with prometheus itself, with the exporter or both? Looking at the issue I've linked, the problem seemed to be on the exporter side. However, your message hints at a problem on the prometheus side. 🤔

As said, we don't plan to update to 3.x soon, because we are focusing on other, more pressing tasks at the moment. I'll definitely update in a few months, but I can't commit to a date.

So, if a fix for 2.x is coming - assuming that makes sense from your POV - it would be really appreciated.

phsmith · 2025-01-13T17:26:16Z

Yes, I can see the duplicate metrics problem on the Prometheus side with version 2.x, but it is mainly due to the exporter cache option, which sends the same metrics until the cache is invalidated.

I've found a way to fix this on the exporter side and will send a fix tonight.
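To illustrate the general idea (this is not the code from the actual fix, just a rough sketch under my own assumptions): Prometheus rejects two samples of the same series only when they share a timestamp but differ in value, so an exporter can avoid the warning by emitting each series once per scrape and attaching an explicit timestamp to it. The text exposition format allows an optional millisecond timestamp after the value. The function names, the tuple layout, and the label names below are hypothetical.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline as the text format requires."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_samples(samples):
    """Render (name, labels, value, timestamp_ms) tuples as exposition lines.

    Only one line is emitted per series (name + label set). When a series
    occurs several times, the sample with the newest timestamp wins.
    """
    latest = {}
    for name, labels, value, timestamp_ms in samples:
        key = (name, tuple(sorted(labels.items())))
        if key not in latest or timestamp_ms >= latest[key][1]:
            latest[key] = (value, timestamp_ms)

    lines = []
    for (name, label_items), (value, timestamp_ms) in latest.items():
        label_str = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in label_items)
        # Format: name{labels} value timestamp_in_milliseconds
        lines.append(f"{name}{{{label_str}}} {value} {timestamp_ms}")
    return "\n".join(lines) + "\n"


# Hypothetical cached executions: the same series shows up twice.
cached = [
    ("rundeck_project_execution_status", {"project": "demo", "job": "backup"}, 0, 1000),
    ("rundeck_project_execution_status", {"project": "demo", "job": "backup"}, 1, 2000),
    ("rundeck_project_start_timestamp", {"project": "demo", "job": "backup"}, 1736434876000, 2000),
]

output = render_samples(cached)
print(output, end="")
```

One trade-off to keep in mind with explicit timestamps is that Prometheus then stores the sample at the exporter's timestamp instead of the scrape time, unless honor_timestamps is disabled in the scrape config.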

BaCaRoZzo · 2025-01-13T17:45:52Z

@phsmith thanks a lot.


phsmith · 2025-01-13T22:00:47Z

@BaCaRoZzo, I've just released the exporter version v2.8.4 with the fix for this issue.

Please give it a try when you have the chance.

BaCaRoZzo · 2025-01-20T12:43:57Z

@phsmith apologies for the long wait.

The fix seems to be effective. I deployed the new exporter a few hours ago and the alert has not fired since. On that basis, I think we can close this issue.

Thanks so much for your prompt response and the quick fix!

phsmith added a commit that referenced this issue Jan 13, 2025

fix(#108): add timestamp to project_executions metrics

e98b0e8

phsmith added a commit that referenced this issue Jan 13, 2025

fix(#108): add timestamp to project executions metrics (#109)

49d7e05

* chore: update requirements.txt
* docs: update CHANGELOG.md
* fix(#108): add timestamp to project_executions metrics

BaCaRoZzo closed this as completed Jan 20, 2025
