storage: fix GC of subsumed replicas #31988
I might split the two non-test changes into separate commits, so that we have separate bisection targets in case one of them has unexpected consequences.
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
Ack, will do after this CI cycle.
When considering a subsumed replica R for GC, we need to prove that its left neighbor is generationally up to date. If we notice that this left neighbor is not generationally up to date, it is likely that it needs a GC run itself, so queue it.

Release note: None
When a range is subsumed, there is a chance that it leaves behind replicas that are marked as destroyed with reason "merge pending." The replica GC queue was previously misinterpreting this to mean that the range had already been GC'd and thus the range would never get GC'd, resulting in a proliferation of intersecting snapshot errors.

Release note: None
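To illustrate the misinterpretation described in this commit message, here is a minimal sketch. All names in it (`destroyReason`, `replica`, `alreadyCollectedBuggy`, `alreadyCollectedFixed`) are hypothetical stand-ins and are not taken from the CockroachDB source; the sketch only assumes that a destroyed replica carries a reason, and that "merge pending" is distinct from "already removed."

```go
package main

import "fmt"

// destroyReason is a hypothetical stand-in for why a replica was marked
// destroyed. These names are illustrative, not CockroachDB's actual types.
type destroyReason int

const (
	destroyReasonAlive        destroyReason = iota // replica is not destroyed
	destroyReasonRemoved                           // replica data was already GC'd
	destroyReasonMergePending                      // range was subsumed; data remains
)

// replica is a simplified model holding only what the check needs.
type replica struct {
	rangeID int
	reason  destroyReason
}

// alreadyCollectedBuggy models the old behavior: any destroyed replica is
// treated as already GC'd, so a merge-pending replica is skipped forever.
func alreadyCollectedBuggy(r replica) bool {
	return r.reason != destroyReasonAlive
}

// alreadyCollectedFixed only skips replicas whose data was really removed,
// so a merge-pending replica remains a GC candidate.
func alreadyCollectedFixed(r replica) bool {
	return r.reason == destroyReasonRemoved
}

func main() {
	r := replica{rangeID: 7, reason: destroyReasonMergePending}
	fmt.Println("buggy check skips replica:", alreadyCollectedBuggy(r))
	fmt.Println("fixed check skips replica:", alreadyCollectedFixed(r))
}
```

With the buggy check, the merge-pending replica is never considered for GC, which matches the symptom described above: the leftover replica lingers and keeps intersecting with incoming snapshots.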
Ok, this is split into two separate commits. The test ended up being flaky (because of course it did). I'm going to merge with the flaky test skipped and fix and unskip it in a future commit so that this lands in time for the nightly roachtest run.

bors r=bdarnell
31988: storage: fix GC of subsumed replicas r=bdarnell a=benesch

When a range is subsumed, there is a chance that it leaves behind replicas that are marked as destroyed with reason "merge pending." The replica GC queue was previously misinterpreting this to mean that the range had already been GC'd and thus the range would never get GC'd, resulting in a proliferation of intersecting snapshot errors.

Additionally take the opportunity to teach the replica GC queue to proactively queue a replica's left neighbor when that left neighbor is blocking the replica from being GC'd.

Release note: None

Co-authored-by: Nikhil Benesch <[email protected]>
Build succeeded
Reviewed 3 of 3 files at r2, 2 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
When a range is subsumed, there is a chance that it leaves behind
replicas that are marked as destroyed with reason "merge pending." The
replica GC queue was previously misinterpreting this to mean that the
range had already been GC'd and thus the range would never get GC'd,
resulting in a proliferation of intersecting snapshot errors.
Additionally take the opportunity to teach the replica GC queue to
proactively queue a replica's left neighbor when that left neighbor is
blocking the replica from being GC'd.
Release note: None
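
The left-neighbor behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `replicaInfo`, `gcQueue`, `maybeAdd`, and `canGCSubsumed` are hypothetical names, and the sketch assumes that "generationally up to date" can be modeled as comparing a locally stored generation number against the latest known one.

```go
package main

import "fmt"

// replicaInfo is a hypothetical, simplified view of a replica. The two
// generation fields are an assumed way to model "generationally up to date."
type replicaInfo struct {
	rangeID          int
	localGeneration  int64 // generation of the descriptor stored locally
	latestGeneration int64 // generation of the latest known descriptor
}

// gcQueue is a toy queue that records the range IDs waiting for a GC run.
type gcQueue struct {
	pending []int
}

// maybeAdd enqueues rangeID unless it is already pending.
func (q *gcQueue) maybeAdd(rangeID int) {
	for _, id := range q.pending {
		if id == rangeID {
			return
		}
	}
	q.pending = append(q.pending, rangeID)
}

// canGCSubsumed reports whether a subsumed replica may be GC'd, given its
// left neighbor. If the neighbor is stale, it probably needs a GC run of its
// own, so it is queued proactively and the subsumed replica has to wait.
func (q *gcQueue) canGCSubsumed(left replicaInfo) bool {
	if left.localGeneration < left.latestGeneration {
		q.maybeAdd(left.rangeID)
		return false
	}
	return true
}

func main() {
	q := &gcQueue{}
	stale := replicaInfo{rangeID: 1, localGeneration: 3, latestGeneration: 4}
	fresh := replicaInfo{rangeID: 2, localGeneration: 5, latestGeneration: 5}
	fmt.Println("GC allowed with stale left neighbor:", q.canGCSubsumed(stale))
	fmt.Println("GC allowed with fresh left neighbor:", q.canGCSubsumed(fresh))
	fmt.Println("ranges queued for GC:", q.pending)
}
```

Queueing the stale neighbor rather than only declining the GC means the blocked replica does not have to wait for the neighbor to be picked up by some unrelated pass.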