-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ccl/importccl: TestImportCSVStmt failed under stress #31719
Comments
There are unfortunately still known issues with range merges (cockroachdb#31719, perhaps among others). It's time to use the escape hatch before v2.1 ships. Release note: None
Womp womp. The merge commits before one of the LHS replicas has been replicated to all of the stores. Then the RHS is GC'd on the store with the missing LHS because there is no initialized LHS replica on that store. Then that store gets a Raft snapshot pre-merge, applies the merge, and there's no RHS, so 💥. |
Ok, there are two ways I see of fixing this. The first is to require that the every replica of the LHS is initialized before committing a merge transaction. This would be somewhat akin to the checking of the RHS's applied index, in that I think it would be best implemented as a side-channel RPC to each replica in the range. This is kind of gross, but it has the nice property of being an easily-backportable change. The second is to implement a proper range graveyard. I'm envisioning something like
I'm inclined towards the second approach, but the downside is backwards compatibility. It's not going to be easy to backport, and it's going to be wholly incompatible with the merge implementation that's currently on v2.1 and require a complicated cluster versioning story. Unless we decide to declare bankruptcy on merges for v2.1 and disable the setting entirely. |
Actually I think i see a trivial solution to this problem that’s along the lines of the first approach but requires no new RPCs. I’m going to pursue that in the hopes that it can be backported. |
Ok, I'm not going to have time to try the solution tonight. But here's a sketch of what I think could work. When an AdminMerge request arrives on the LHS, before launching a merge transaction, we look at the Raft status of the replica. If it doesn't exist, then we're not the Raft leader, and we bail out. No big deal. We can retry the merge later. If the Raft status does exist, then we verify that every replica of the LHS has a |
Sounds good to me. Just pointing out that this very similar in spirit to calling |
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix cockroachdb#31719. Release note: None
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix #31719. Release note: None
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix cockroachdb#31719. Release note: None
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix cockroachdb#31719. Release note: None
31842: storage: fix disappearing RHS merge bug r=tschottdorf,bdarnell a=benesch The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix #31719. Release note: None Co-authored-by: Nikhil Benesch <[email protected]>
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix cockroachdb#31719. Release note: None
The strategy used by the replica GC queue to determine whether a subsumed range can be GC'd is flawed. If a replica of the LHS was uninitialized at the time the merge commmitted, there is a small window where the replica GC queue can think that it's safe to clean up an RHS replica when in fact the uninitialized LHS replica could still initialize and apply a merge trigger that required that RHS to be present. Make the replica GC queue's strategy valid by requiring that all replicas of the LHS are initialized before beginning a merge transaction. This closes the window during which a replica of the RHS could be incorrectly GC'd with a patch that is small enough to be backported to v2.1.1. Fix cockroachdb#31719. Release note: None
SHA: https://github.com/cockroachdb/cockroach/commits/d3f39a5cb60eb9cea64635f78e25f67a435057a5
Parameters:
To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=980299&tab=buildLog
The text was updated successfully, but these errors were encountered: