domain,infoschema: make infoschema activity block GC safepoint advancing #58062
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #58062 +/- ##
================================================
+ Coverage 73.1711% 75.0849% +1.9138%
================================================
Files 1674 1720 +46
Lines 461507 472153 +10646
================================================
+ Hits 337690 354516 +16826
+ Misses 103079 95413 -7666
- Partials 20738 22224 +1486
One case this fix cannot cover is when the caller takes an infoschema V2 instance but never uses it. This case is rare, and if we find it we can adjust the caller side, so it should not be a real issue.
/hold
pkg/infoschema/cache.go
// GetAndResetRecentInfoSchemaTS provides the min start ts for infosync.InfoSyncer.
func (h *InfoCache) GetAndResetRecentInfoSchemaTS(now uint64) uint64 {
	ret := atomic.LoadUint64(&h.Data.recentMinTS)
	atomic.StoreUint64(&h.Data.recentMinTS, now)
Can we use atomic.Uint64 instead?
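A minimal sketch of what that suggestion might look like, assuming Go 1.19 or later (where the `atomic.Uint64` type is available). The `schemaData` type and `getAndResetRecentTS` method are hypothetical stand-ins, not the PR's actual code. This sketch also uses `Swap` to combine the load and the store into one atomic step, which goes slightly beyond what the comment asks for.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// schemaData is a hypothetical stand-in for the struct that owns recentMinTS.
type schemaData struct {
	recentMinTS atomic.Uint64
}

// getAndResetRecentTS returns the previously recorded minimum ts and
// resets the field to now in a single atomic operation.
func (d *schemaData) getAndResetRecentTS(now uint64) uint64 {
	return d.recentMinTS.Swap(now)
}

func main() {
	var d schemaData
	d.recentMinTS.Store(100)

	fmt.Println(d.getAndResetRecentTS(200)) // previous value: 100
	fmt.Println(d.recentMinTS.Load())       // value after reset: 200
}
```

With the typed field, callers can no longer read or write `recentMinTS` without going through an atomic method, which is the main benefit over the free functions `atomic.LoadUint64` and `atomic.StoreUint64`.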
Is it possible that an infoschemaV2 is accessed at a lower frequency than ReportMinStartTS, so that a tick of ReportMinStartTS may miss this infoschemaV2 instance, which allows GC to proceed, and then the infoschemaV2 instance becomes unavailable again?
In theory, that is possible, so there is still a potential risk. I have discussed that with @lcwangchao.
But in practice I think this fix is good enough for the most common cases. @MyonKeminta
OK, but I suggest noting these details in the code comments.
Rest LGTM.
Done
@MyonKeminta we need some review from the transaction team, PTAL
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lcwangchao, MyonKeminta, wjhuang2016
/test pull-unit-test-ddlv1
/unhold
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: close #57952
Problem Summary:
What changed and how does it work?
Currently, when we get an infoschema v2 instance, its lifetime is in the range [ts, ts + 10min).
If the caller holds an infoschema and uses it some time later, it might get the error "GC lifetime is shorter than transaction duration". That's because infoschema v2 internally uses the meta package API, and the meta API does not block the GC safepoint from advancing.
In this commit, I add a keepAlive() call to the infoschema API. The keepAlive() function keeps track of the minimal start ts of active infoschema API calls.
The infosync.Syncer calls ReportMinStartTS() periodically, and now it takes the active infoschema ts into consideration, which is updated by the keepAlive() function.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.