-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: retry restore's AdminSplit on ephemeral errors #85333
Conversation
Informs cockroachdb#84635,cockroachdb#84162 Release note: none
if err == nil { | ||
break | ||
} | ||
err := s.db.AdminSplit(ctx, newSplitKey, expirationTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth moving this retry logic one level deeper? Maybe we want to ask KV but don't we have a lot of callers outside restore that use db.AdminSplit
as well? import, index backfills etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or atleast auditing all the places we call db.AdminSplit
and deciding whether to switch them to this wrapper method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I don't see why the caller needs to handle these replication level ephemeral retries. Lemme ask kv what they think in slack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix!
} | ||
err := s.db.AdminSplit(ctx, newSplitKey, expirationTime) | ||
|
||
if !(kvserver.IsRetriableReplicationChangeError(err)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to add IsLeaseTransferRejectedBecauseTargetMayNeedSnapshotError()
here or inside some lower level e.g. IsRetriableReplicationChangeError()
as this error is what #84891 is failing with.
Closing, as kvserver will handle this internally db2d653 |
Informs #84635,#84162
Release note: none