-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid index errors for deleted tenants #2781
Conversation
It does not appear to be an error in s3 when you delete an object that is not present, so the s3 implementation has not been updated. |
67eed34
to
7c3001e
Compare
@@ -8,6 +8,7 @@ | |||
* [BUGFIX] Fix panic in metrics summary api [#2738](https://github.com/grafana/tempo/pull/2738) (@mdisibio) | |||
* [BUGFIX] Fix node role auth IDMSv1 [#2760](https://github.com/grafana/tempo/pull/2760) (@coufalja) | |||
* [BUGFIX] Only search ingester blocks that fall within the request time range. [#2783](https://github.com/grafana/tempo/pull/2783) (@joe-elliott) | |||
* [BUGFIX] Fix incorrect metrics for index failures [#2781](https://github.com/grafana/tempo/pull/2781) (@zalegrala) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did this have any impact besides metrics/extra logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I'm thinking of. We still leave the spurious objects in the backend, so perhaps we can have a conversation about cleaning that up also.
Hello @joe-elliott!
Please, if the current pull request addresses a bug fix, label it with the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-2781-to-release-v2.2 origin/release-v2.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x e960fd03a8ee4a1208dbf4fe449a7700e61ed225
# When the conflicts are resolved, stage and commit the changes
git add . && git cherry-pick --continue If you have the GitHub CLI installed: # Create the PR body template
PR_BODY=$(gh pr view 2781 --json body --template 'Backport e960fd03a8ee4a1208dbf4fe449a7700e61ed225 from #2781{{ "\n\n---\n\n" }}{{ index . "body" }}')
# Push the branch to GitHub and a PR
echo "${PR_BODY}" | gh pr create --title "[release-v2.2] Avoid index errors for deleted tenants" --body-file - --label "type/bug" --label "backport" --base release-v2.2 --milestone release-v2.2 --web Or, if you don't have the GitHub CLI installed (we recommend you install it!): # If you don't have the GitHub CLI installed: Push the branch to GitHub and manually create a PR:
git push --set-upstream origin backport-2781-to-release-v2.2
# Remove the local backport branch
git switch main
git branch -D backport-2781-to-release-v2.2 Unless you've used the GitHub CLI above, now create a pull request where the |
* Catch backend.ErrDoesNotExist from WriteTenantIndex * Handle tenant index deletion when already deleted * Capture notfound from azure.Delete * Capture notfound from gcs.Delete * Update changelog * Add note for why we skip ErrDoesNotExist * Avoid handling the error twice (cherry picked from commit e960fd0)
* Catch backend.ErrDoesNotExist from WriteTenantIndex * Handle tenant index deletion when already deleted * Capture notfound from azure.Delete * Capture notfound from gcs.Delete * Update changelog * Add note for why we skip ErrDoesNotExist * Avoid handling the error twice (cherry picked from commit e960fd0)
* Catch backend.ErrDoesNotExist from WriteTenantIndex * Handle tenant index deletion when already deleted * Capture notfound from azure.Delete * Capture notfound from gcs.Delete * Update changelog * Add note for why we skip ErrDoesNotExist * Avoid handling the error twice (cherry picked from commit e960fd0) Co-authored-by: Zach Leslie <[email protected]>
The tenant index deletion was originally put in as TCO win, but did not have the desired effect and surfaced other issues in the system. Related grafana#2678 Related grafana#2754 Related grafana#2781 Related grafana#2878 Related grafana#3115 Related grafana#3223 Due to the number of issues here, and causing considerable noise on the pager, perhaps the right thing to do is back out the tenant deletion. Raising here for discussion.
What this PR does:
Here we ensure that when Tempo tries to delete a tenant index that has already been deleted, that this is not an error condition, and doesn't not log or count against the metrics. This can happen when an index was deleted because no blocks have been found, but for block paths that still contain spurious objects.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]