backup: fix retry of fine-grained backup #43252
Conversation
Signed-off-by: hillium <[email protected]>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review. The full list of commands accepted by this bot can be found here. Reviewers can indicate their review by submitting an approval review.
```go
// TotalSleepInMS returns the total slept time in ms.
func (r *RetryWithBackoffer) TotalSleepInMS() int {
	return r.totalBackoff + r.bo.GetTotalSleep()
}
```
It seems that `r.bo` is always forked, and only the forked `bo`s are updated. So is `r.bo.GetTotalSleep` always 0?
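The question above can be illustrated with a minimal sketch. This is a hypothetical toy `Backoffer` (names and behavior are assumptions for illustration, not the real client-go API): if `Fork` hands out an independent child and only the child ever sleeps, the parent's counter never moves.

```go
package main

import "fmt"

// Backoffer is a toy stand-in: Fork() returns an independent child,
// so sleep time recorded on the child never propagates to the parent.
type Backoffer struct {
	totalSleepMS int
}

// Fork returns a fresh child backoffer with its own counter.
func (b *Backoffer) Fork() *Backoffer {
	return &Backoffer{}
}

// Sleep records a backoff on this backoffer only.
func (b *Backoffer) Sleep(ms int) {
	b.totalSleepMS += ms
}

// GetTotalSleep reports how long this backoffer has slept, in ms.
func (b *Backoffer) GetTotalSleep() int {
	return b.totalSleepMS
}

func main() {
	parent := &Backoffer{}
	child := parent.Fork()
	child.Sleep(100) // only the child accumulates sleep time
	fmt.Println(parent.GetTotalSleep(), child.GetTotalSleep())
	// → 0 100: the parent's GetTotalSleep stays 0
}
```

Under this model, `r.bo.GetTotalSleep()` would indeed always return 0, which is what the review comment is probing.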
Yeah, indeed, but that's the original logic. I'm not sure why it was implemented this way (and have no idea how to improve it), so I just inherited it.
Anyway, unlocking locks has a bounded backoff time, which is the TTL of the lock. I think omitting the retry time it costs won't be really harmful, hopefully...
The sole difference from the original logic is that resolving locks may fail once the accumulated sleep time reaches the sleep time limit. I think it would be fine to eliminate that failure case.
/merge
This pull request has been accepted and is ready to merge. Commit hash: 886bc2b
/run-cherry-picker
In response to a cherrypick label: new pull request created to branch
Signed-off-by: ti-chi-bot <[email protected]>
* cherry-pick 1a94e5d Signed-off-by: hillium <[email protected]>
* run gofmt Signed-off-by: hillium <[email protected]>
* fix a typo Signed-off-by: hillium <[email protected]>
What problem does this PR solve?
Issue Number: close #43236
Problem Summary:
We are using the `tikv.Backoffer` to implement retries, which isn't very customizable, so we cannot fully control the printed error messages and the backoff duration.

What is changed and how it works?

This PR adds a new structure, `RetryWithBackoffer`, to adapt the retry logic to the TiDB backoffer logic; at the same time, it enhances the error message.

This PR also increases the max retry time of fine-grained backup. The reason is simple: given that fine-grained backup has begun, there must be some problem in the cluster, so we need to be more patient.
Check List
Tests
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.