Potential race condition between compactor applying retention or compaction and store gateway syncing metas. #564
Comments
This will be fixed by: #1528 help wanted (:
Why don't we trigger a resync with the remote in the reader component (thanos store) if a block (deleted by the compactor) isn't found?
This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.
@khyatisoneji is on it (:
We are super close to merging the fix! But it is not fixed yet.
Fixed by #2136
With compaction and retention logic we have a single "writer" (the compactor) that creates new blocks (compaction) and deletes the blocks that were their source.
The problem with our readers (store) is that they sync only periodically, every X seconds. So a query can hit the store after the compactor has removed a block but before the store has synced, which leaves the store pointing at a block that no longer exists. The Bucket API has no watch logic.
The simplest solution is to defer deletion until some time in the future, to account for the eventual consistency of the store gateway's internal state (and potentially of the bucket itself).
Acceptance criteria:
Unfortunately, this requires heavy modification of the compactor's Plan logic so that it understands edge cases like:
This has to be done and is planned to be done by end of year (EOY).