Docs: Expand ruler recommended alerting docs #1719

Merged: squat merged 3 commits into thanos-io:master on Nov 12, 2019

Conversation

Contributor

@bill3tt bill3tt commented Nov 5, 2019

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This addition to the documentation would have answered this question that I raised on slack.

Added a sentence to the documentation for the Ruler component.

Verification

Not applicable.

Signed-off-by: Ian Billett <[email protected]>
Member

@FUSAKLA FUSAKLA left a comment

Hi, thanks for the contribution!
I'm not completely sure about naming the Query explicitly. Depending on the strategy, this could also be caused by a failure in any of the StoreAPIs called by the Query. Maybe at least state that it is "most probably" caused by it, but to be honest, the first thing you should probably do is check the logs, which will tell you what the issue was.

docs/components/rule.md (outdated)
Co-Authored-By: Bartlomiej Plotka <[email protected]>
Signed-off-by: Ian Billett <[email protected]>
Contributor Author

bill3tt commented Nov 11, 2019

@FUSAKLA thanks for the review 😄 I've accepted @bwplotka's suggestion.

Member

@FUSAKLA FUSAKLA left a comment

Great, thanks! :)

@@ -81,7 +81,9 @@ The most important metrics to alert on are:
 indicate connection, incompatibility or misconfiguration problems.

 * `prometheus_rule_evaluation_failures_total`. If greater than 0, it means that that rule failed to be evaluated, which results in
-either gap in rule or potentially ignored alert. Alert heavily on this if this happens for longer than your alert thresholds.
+either gap in rule or potentially ignored alert. This metric might indicate problems on the queryAPI endpoint you use.

Member

This new line here breaks the formatting of the list. This makes the line “alert heavily...” appear out of the list, and then the list continues.
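
For readers following the thread, the metric discussed in the diff above could be alerted on with a Prometheus rule roughly like the one below. This is only a sketch: the group name, alert name, `5m` rate window, `10m` hold duration, and `severity` label are assumptions picked for illustration and are not taken from the Thanos docs or this PR.

```yaml
groups:
  - name: thanos-rule-example          # hypothetical group name
    rules:
      - alert: RuleEvaluationFailures  # hypothetical alert name
        # Fires when the failure counter has been increasing,
        # i.e. at least one rule evaluation failed in the window.
        expr: rate(prometheus_rule_evaluation_failures_total[5m]) > 0
        # Assumed hold time; tune it to your own alert thresholds.
        for: 10m
        labels:
          severity: warning            # assumed label value
        annotations:
          summary: Rule evaluation is failing. Check the ruler logs and the query API endpoint it uses.
```

As noted in the review, a failure here may originate in the query endpoint or in a StoreAPI behind it, so the ruler logs are the first place to look when such an alert fires.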

Signed-off-by: Ian Billett <[email protected]>
Contributor Author

bill3tt commented Nov 12, 2019

@squat good spot - updated to remove the newline

@squat squat merged commit 1291d96 into thanos-io:master Nov 12, 2019
Member

squat commented Nov 12, 2019

Thanks :)

IKSIN pushed a commit to monitoring-tools/thanos that referenced this pull request Nov 26, 2019
* updates ruler docs

Signed-off-by: Ian Billett <[email protected]>

* Update docs/components/rule.md

Co-Authored-By: Bartlomiej Plotka <[email protected]>
Signed-off-by: Ian Billett <[email protected]>

* removes newline

Signed-off-by: Ian Billett <[email protected]>
Signed-off-by: Aleksey Sin <[email protected]>
4 participants