-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ruler: add metric thanos alert sent by alert name #5368
ruler: add metric thanos alert sent by alert name #5368
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this idea and I think this could be a nice addition to have.
@fgouteroux Can you please rebase your PR on main? This should fix the CI issue.
8de0141
to
0edf958
Compare
Signed-off-by: François Gouteroux <[email protected]>
0edf958
to
8481b12
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work! However, what's the use-case here? If it's about tracing where specific alerts came from then I believe Alertmanager itself should be a better place to solve this. There were lots of discussions about the identity of an alert during Prometheus Day @ KubeCon last week. I don't think the alertname
is enough to identify an alert. And I believe that if we'd give users such a metric, the next question would be: what about other labels? Hence, I don't think that this is a good idea
It's not about tracing, it could be usefull to see which alertname is mostly triggered. In my case I'm using one ruler per tenant with a shared alertmanager, so each tenant manage his own alerts rules but have no access to the alertmanager. We only expose the alertmanager api through a custom proxy with authentification. Based on that metrics provided by the alertmanager are not very helpful as we only see the total of alert sent to an integration like PagerDuty. This is why ruler metrics fit to our design and let us provide a grafana dashboard with all ruler metrics related to the tenant. |
Good point. I think alertname is only unique in one rule group. But can be duplicated across groups. |
Understood, closing this as useless. |
Hello thanos team,
It will be great to have the total number of alerts sent by alertmanager and by alert name. It will be useful to make some stats about how often an alert is triggered.
I'm not sure that if it's relevant to add another metric, and about the name too.
Changes
Adding metric: Total number of alerts sent by alertmanager and alert name.
Verification
Trigger an alert
Check the metric thanos_alert_sender_alerts_sent_by_alertname_total in thanos ruler metrics endpoint