Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

infoschema: maintain the snapshotTS in the infocache #25322

Merged
merged 4 commits into from
Jun 11, 2021

Conversation

nolouch
Copy link
Member

@nolouch nolouch commented Jun 10, 2021

Signed-off-by: nolouch [email protected]

What problem does this PR solve?

Issue Number: #25303

Problem Summary:
Infoschema cache is introduced by #24285, which mainly used for stale read and history read. It let executor can get the corresponding history schema. But even if we have the cache, every request still needs to go to TiKV to obtain the schema version to confirm the corresponding schema. which leads one TiKV to have many KV_Get, This may let one TiKV who has the meta region's leader consume too much CPU and become a bottleneck. And intuitively, the request is unbalanced:
image

What is changed and how it works?

  • Cache the relative ts in the infocache. each schema has only corresponded sanpshotTS, which is the oldest ts we have used with the schema.
  • Make the request can get the Infoschema directly with the specified timestamp.

Check List

Tests

  • Unit test

Release note

  • None

@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 10, 2021
@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label Jun 10, 2021
@nolouch nolouch force-pushed the improve-schema-cache branch from e8341ef to f3d4560 Compare June 10, 2021 05:06
@nolouch nolouch added needs-cherry-pick-release-5.1 do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 10, 2021
@nolouch nolouch force-pushed the improve-schema-cache branch 3 times, most recently from 68d656a to 5469654 Compare June 10, 2021 06:18
@nolouch nolouch changed the title inforschema: maintian the snapshotTS in the infocache inforschema: maintain the snapshotTS in the infocache Jun 10, 2021
@nolouch nolouch force-pushed the improve-schema-cache branch from 5469654 to ebdc91a Compare June 10, 2021 06:32
@nolouch nolouch requested review from xhebox and Yisaer June 10, 2021 06:38
@xhebox
Copy link
Contributor

xhebox commented Jun 10, 2021

Why not add an extra cache of snapshotTS->SchemaVersion? Then only https://github.com/pingcap/tidb/blob/ebdc91a4295d490be62b1b56c8fcdd8556048af9/domain/domain.go#L106-L111 needs modifications.

I did not like GetSnapshotInfoSchemaWithCache because the cache is not transparent anymore. Also, preprocess is not the only thing affects stale select. There are other two GetSnapshotInfoSchema in OptimizePreparedPlan() and session.go.

@nolouch
Copy link
Member Author

nolouch commented Jun 10, 2021

/rebuid

@nolouch
Copy link
Member Author

nolouch commented Jun 10, 2021

@xhebox Actually already has snapshots->SchemaVersion, because the snapshot's order should same as SchemaVersion, so I maintain them together. the snapshots->SchemaVersion cache cannot directly use on loadInfoSchema, because loadInfoSchema is also used to check the newest version with TiKV then load them, and it is also used by Reload, I think it will cause loadSchemaInLoop always failed.

@nolouch
Copy link
Member Author

nolouch commented Jun 10, 2021

/run-all-tests

1 similar comment
@nolouch
Copy link
Member Author

nolouch commented Jun 10, 2021

/run-all-tests

@nolouch
Copy link
Member Author

nolouch commented Jun 10, 2021

/run-integration-common-test

@nolouch nolouch marked this pull request as ready for review June 10, 2021 12:08
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2021
@nolouch nolouch changed the title inforschema: maintain the snapshotTS in the infocache infoschema: maintain the snapshotTS in the infocache Jun 10, 2021
@nolouch nolouch force-pushed the improve-schema-cache branch from e903766 to 906a966 Compare June 10, 2021 17:27
@nolouch nolouch force-pushed the improve-schema-cache branch from 906a966 to 5b41099 Compare June 10, 2021 17:29
Copy link
Contributor

@xhebox xhebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a TestGetBySnapshotTS to cover all branches, but I could send a PR for that if you want to merge the PR first.

Rest LGTM.

infoschema/cache.go Outdated Show resolved Hide resolved
infoschema/cache.go Outdated Show resolved Hide resolved
infoschema/cache_test.go Show resolved Hide resolved
infoschema/cache_test.go Show resolved Hide resolved
nolouch added 2 commits June 11, 2021 01:54
Signed-off-by: nolouch <[email protected]>
Signed-off-by: nolouch <[email protected]>
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jun 11, 2021
@nolouch
Copy link
Member Author

nolouch commented Jun 11, 2021

PTAL @JmPotato @Yisaer

Copy link
Member

@JmPotato JmPotato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot
Copy link
Member

@JmPotato: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

In response to this:

LGTM

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Copy link
Contributor

@Yisaer Yisaer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 292 to +296
func (do *Domain) GetSnapshotInfoSchema(snapshotTS uint64) (infoschema.InfoSchema, error) {
// if the snapshotTS is new enough, we can get infoschema directly through sanpshotTS.
if is := do.infoCache.GetBySnapshotTS(snapshotTS); is != nil {
return is, nil
}
Copy link
Contributor

@Yisaer Yisaer Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may cause the infoschema inconsistency in following case.

1. infoschema cache holding `t1` as maxUpdatedSnapshotTS and infoschema for `schema version` v1
2. ddl worker running ddl job with `t2` in txn and increased `schema version` to v2 while infoschema cache have not been awared of this changing yet.
3. load infoschema with snapshotTS t3 (t3 > t2 > t1) and using infoschema with v1 schema version.

Copy link
Contributor

@xhebox xhebox Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Infocache is used everywhere, it is promised to sync the latest infoschema for the current tidb instance(not necessarily the latest for all tidb instances, actually ddl owner tidb can not immediately get the latest, too).

And of course, there are edge cases. It is indeed not a 100% consistent method. But it is also impossible to avoid kvget by promising 100% consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my example, I think we don't need to guarantee that the query must use the schema version v2's infoschema if the changing is not related to the tables we visit.

But I think the following checking is ncessary:

  1. check whether there exists any schema version changed bewtween t1(maxUpdatedSnapshotTS) and t3(snapshotTS)
  2. If exists, check whether the changing is related to the table we vistied.
  3. If related, reloading the infoschema with t3 snapshotTS to use the right infoschema.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my example, I think we don't need to guarantee that the query must use the schema version v2's infoschema if the changing is not related to the tables we visit.

But I think the following checking is necessary:

  1. check whether there exists any schema version changed bewtween t1(maxUpdatedSnapshotTS) and t3(snapshotTS)
  2. If exists, check whether the changing is related to the table we vistied.
  3. If related, reloading the infoschema with t3 snapshotTS to use the right infoschema.

Copy link
Contributor

@Yisaer Yisaer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Yisaer
  • xhebox

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jun 11, 2021
@Yisaer
Copy link
Contributor

Yisaer commented Jun 11, 2021

/merge

@ti-chi-bot
Copy link
Member

@Yisaer: /merge is only allowed for the committers, you can assign this pull request to the committer in list by filling /assign @committer in the comment to help merge this pull request.

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@xhebox
Copy link
Contributor

xhebox commented Jun 11, 2021

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: fd16fb0

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jun 11, 2021
@nolouch
Copy link
Member Author

nolouch commented Jun 11, 2021

/run-common-test

@nolouch
Copy link
Member Author

nolouch commented Jun 11, 2021

/run-all-tests

@ti-chi-bot ti-chi-bot merged commit 3ed6982 into pingcap:master Jun 11, 2021
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jun 11, 2021
@ti-srebot
Copy link
Contributor

cherry pick to release-5.1 in PR #25376

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cherry-pick-release-5.1 sig/sql-infra SIG: SQL Infra size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants