This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Alert and runbook for duplicate samples #1687

Closed
Harkishen-Singh opened this issue Oct 11, 2022 · 1 comment · Fixed by #1688
Labels
Improvement Improvements to the existing features

Comments

@Harkishen-Singh
Member

This issue is based on a discussion in a #promscale Slack thread.

Promscale logs duplicate samples for reasons like:

  1. Prometheus retrying a batch
  2. A misconfigured HA cluster

In both cases, the impact is poor ingestion throughput. We should alert the user early to limit the effect on ingestion, and recommend the necessary mitigation steps in the runbooks.

@Harkishen-Singh Harkishen-Singh added the Improvement Improvements to the existing features label Oct 11, 2022
@paulfantom
Contributor

A good example of such an alert, from the Prometheus mixin:

alert: PrometheusDuplicateTimestamps
annotations:
  description: Prometheus {{$labels.instance}} is dropping {{ printf "%.4g" $value  }}
    samples/s with different values but duplicated timestamp.
  summary: Prometheus is dropping samples with duplicate timestamps.
expr: |
  rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job="prometheus"}[5m]) > 0
for: 10m
labels:
  severity: warning
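
An analogous rule for Promscale could follow the same shape. The sketch below is only illustrative: the alert name, the metric `promscale_ingest_duplicates_total`, and its `kind="sample"` label are assumptions that are not confirmed anywhere in this thread, so they should be checked against the metrics Promscale actually exposes before use.

```yaml
# Sketch only: alert name, metric name, and the kind label are assumed placeholders.
alert: PromscaleDuplicateSamples
annotations:
  description: Promscale {{$labels.instance}} is receiving {{ printf "%.4g" $value }}
    duplicate samples/s.
  summary: Promscale is ingesting duplicate samples.
expr: |
  rate(promscale_ingest_duplicates_total{kind="sample"}[5m]) > 0
for: 10m
labels:
  severity: warning
```

The `for: 10m` window is carried over from the Prometheus example so that a single retried batch does not fire the alert, while a persistently misconfigured HA cluster would.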
