Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MimirIngesterHasNotShippedBlocks alert #5396

Merged
merged 2 commits into from
Jul 5, 2023
Merged

Conversation

pracucci
Copy link
Collaborator

@pracucci pracucci commented Jul 1, 2023

What this PR does

Earlier today we've got an issue for which I was expecting MimirIngesterHasNotShippedBlocks to fire, but it didn't. Turns out the reason is that the metric thanos_shipper_last_successful_upload_time is not exported anymore (it's tracked in the code, but not mapped in tsdbMetrics). This PR fixes it.

User-facing changes:

  • Introduced cortex_ingester_shipper_last_successful_upload_timestamp_seconds
  • Removed cortex_ingester_shipper_dir_syncs_total because useless (never used in our mixing)
  • Removed cortex_ingester_shipper_dir_sync_failures_total because it was never incremented (never used in our mixin)

Note to reviewers:

  • The shipper metrics struct has been renamed to shipperMetrics and is now shared across all shipper instances. The metrics tracked are safe to be shared because they're either counters (safe) or cortex_ingester_shipper_last_successful_upload_timestamp_seconds (gauge) which when updated is always set to current time (so we effectively track the "last successful upload time").

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci pracucci marked this pull request as ready for review July 1, 2023 15:35
@pracucci pracucci requested a review from a team as a code owner July 1, 2023 15:35
@krajorama krajorama self-requested a review July 1, 2023 15:53
Copy link
Contributor

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pracucci pracucci force-pushed the fix-shipper-metrics branch from 00eca56 to c95a23f Compare July 1, 2023 16:58
@pracucci
Copy link
Collaborator Author

pracucci commented Jul 1, 2023

Thanks @krajorama for your review. However, after your review I changed the metric name to keep it consistent with the other shipper metrics (so having cortex_ingester_shipper_ prefix instead of cortex_shipper_).

@pracucci pracucci requested a review from krajorama July 1, 2023 17:00
Help: "Total number of TSDB block upload failures",
}),
lastSuccessfulUploadTime: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
Name: "cortex_ingester_shipper_last_successful_upload_time",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_timestamp_seconds is recommended suffix for metric like this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in e266481

pracucci added 2 commits July 5, 2023 09:59
…_ingester_shipper_last_successful_upload_timestamp_seconds

Signed-off-by: Marco Pracucci <[email protected]>
@pracucci pracucci force-pushed the fix-shipper-metrics branch from c95a23f to e266481 Compare July 5, 2023 08:05
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@pracucci pracucci enabled auto-merge (squash) July 5, 2023 08:15
@pracucci pracucci merged commit 7d9b3d4 into main Jul 5, 2023
@pracucci pracucci deleted the fix-shipper-metrics branch July 5, 2023 08:17
@grafanabot
Copy link
Contributor

The backport to r245 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-5396-to-r245 origin/r245
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x 7d9b3d42468c6f39aad236dbb7093895580400e6
# Push it to GitHub
git push --set-upstream origin backport-5396-to-r245
git switch main
# Remove the local backport branch
git branch -D backport-5396-to-r245

Then, create a pull request where the base branch is r245 and the compare/head branch is backport-5396-to-r245.

aknuds1 pushed a commit that referenced this pull request Jul 5, 2023
* Fix MimirIngesterHasNotShippedBlocks alert

Signed-off-by: Marco Pracucci <[email protected]>

* Renamed cortex_ingester_shipper_last_successful_upload_time to cortex_ingester_shipper_last_successful_upload_timestamp_seconds

Signed-off-by: Marco Pracucci <[email protected]>

---------

Signed-off-by: Marco Pracucci <[email protected]>
(cherry picked from commit 7d9b3d4)
aknuds1 added a commit that referenced this pull request Jul 5, 2023
* Fix MimirIngesterHasNotShippedBlocks alert
* Renamed cortex_ingester_shipper_last_successful_upload_time to cortex_ingester_shipper_last_successful_upload_timestamp_seconds

---------

Signed-off-by: Marco Pracucci <[email protected]>
(cherry picked from commit 7d9b3d4)

Co-authored-by: Marco Pracucci <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants