
Check partition placement constraints when local transactions read data #21426

Closed
djshow832 opened this issue Dec 2, 2020 · 4 comments · Fixed by #21840
Labels
sig/transaction SIG:Transaction type/enhancement The issue or PR belongs to an enhancement.

Comments


djshow832 commented Dec 2, 2020

Background

Issue #20827 proposes to forbid local transactions from writing to partitions placed in another DC. The intent is to avoid lost updates, because the local timestamps issued by the two DCs are not monotonically increasing with respect to each other.

In that proposal, local transactions are still allowed to read partitions placed in another DC. However, this may violate linearizability, so this issue proposes to also check placement constraints when reading partitions.

Problem

Consider such a scenario:


A and B each stand for a TiDB instance in one of two data centers. Partition p0 is placed in data center B.

  1. One client connects to B and starts a local transaction Txn1.
  2. Txn1 writes to p0 and commits. The commit ts of Txn1 is 100.
  3. The same client disconnects from B, connects to A, and starts a local transaction Txn2.
  4. Txn2 only reads data from p0. The start ts of Txn2 is 50. Even though Txn2 starts after Txn1 commits, the local TSOs of the two DCs are not monotonically increasing with respect to each other, so the start ts of Txn2 may be less than the commit ts of Txn1.
  5. Txn2 won't see the modifications made by Txn1, which violates linearizability.
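The anomaly falls directly out of snapshot isolation's visibility rule: a read at `startTS` sees only writes with `commitTS <= startTS`. A minimal sketch of how the scenario above produces a stale read (the function name `visible` is illustrative, not TiDB's API):

```go
package main

import "fmt"

// visible reports whether a snapshot read at startTS can see a write
// committed at commitTS under snapshot isolation.
func visible(startTS, commitTS uint64) bool {
	return commitTS <= startTS
}

func main() {
	txn1CommitTS := uint64(100) // commit ts issued by DC B's local TSO
	txn2StartTS := uint64(50)   // start ts issued by DC A's local TSO

	// Txn2 begins in real time after Txn1 commits, yet its snapshot
	// cannot see Txn1's write: linearizability is violated.
	fmt.Println(visible(txn2StartTS, txn1CommitTS)) // false
}
```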

Solutions

To resolve this problem, there are 2 solutions:

  1. Always guarantee that the start ts of Txn2 is greater than the commit ts of Txn1. This is impractical because every local transaction would need to synchronize timestamps across all data centers.
  2. Forbid local transactions from reading data in partitions placed in another DC. This is straightforward.

There are 3 methods to implement solution 2:

  1. Check the placement constraints in the executor every time a partition is read.
  2. Collect all the partitions read in the SQL and check the placement constraints before returning success.
  3. Collect all the partitions read in the transaction and check the placement constraints when committing the transaction.

Method 1 is inefficient because it checks the placement constraints on every partition access, which may happen many times within a single statement.

Method 3 seems fine, but it is not: since Txn2 is a read-only transaction, the client (or the application) might never commit it explicitly, so the check can be deferred indefinitely.
For example:

BEGIN;
SELECT * FROM t PARTITION (p0) WHERE id=1;       # It returns and the client continues.
# After a long time...
BEGIN;       # It commits the previous transaction automatically, which fails due to placement constraints.
# Do something else.
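Method 2 avoids both problems: partitions are recorded as a statement reads them, and a single check runs before the statement returns success. A minimal sketch, with hypothetical names (`stmtContext`, `recordRead`, `checkPlacement`, and the `placementOf` lookup are all illustrative; real placement comes from PD rules):

```go
package main

import "fmt"

// stmtContext accumulates the partitions read by one SQL statement.
type stmtContext struct {
	txnScope       string          // DC of the local transaction, e.g. "dc-a"
	readPartitions map[string]bool // partition names touched by this statement
}

// placementOf maps a partition to the DC its placement rule pins it to
// (hypothetical lookup table standing in for PD placement rules).
var placementOf = map[string]string{"p0": "dc-b", "p1": "dc-a"}

func (sc *stmtContext) recordRead(partition string) {
	sc.readPartitions[partition] = true
}

// checkPlacement runs once per statement, not once per partition access.
func (sc *stmtContext) checkPlacement() error {
	for p := range sc.readPartitions {
		if dc := placementOf[p]; dc != sc.txnScope {
			return fmt.Errorf("local txn in %s cannot read partition %s placed in %s",
				sc.txnScope, p, dc)
		}
	}
	return nil
}

func main() {
	sc := &stmtContext{txnScope: "dc-a", readPartitions: map[string]bool{}}
	sc.recordRead("p0") // the statement reads p0, which is placed in dc-b
	fmt.Println(sc.checkPlacement())
}
```

Because the check fires at statement boundaries, the error surfaces on the SELECT itself rather than on a later implicit commit.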

Implementations

// TODO

@djshow832 djshow832 added type/enhancement The issue or PR belongs to an enhancement. sig/transaction SIG:Transaction labels Dec 2, 2020
Yisaer commented Dec 2, 2020

Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.

We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?


djshow832 commented Dec 2, 2020

> Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.
>
> We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?

That would miss some other operators. AFAIK, PointGetExecutor, BatchPointGetExec, IndexReaderExecutor, and IndexMergeReaderExecutor also read table data, and they may appear without a TableReaderExecutor. @Yisaer


Yisaer commented Dec 3, 2020

> Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.
>
> We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?
>
> That would miss some other operators. AFAIK, PointGetExecutor, BatchPointGetExec, IndexReaderExecutor, and IndexMergeReaderExecutor also read table data, and they may appear without a TableReaderExecutor. @Yisaer

How about checking the constraint in both store/tikv/coprocessor (to cover table scan cases) and the kv.snapshot (to cover point-get cases)?

WDYT? @breeswish @XuHuaiyu


Yisaer commented Dec 7, 2020

After discussion with @djshow832 and @XuHuaiyu, the constraint checking will be placed in executorBuilder.build. For every executor that reads data, we will check the accessed data's placement against the txn-scope.
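Putting the check at executor-build time covers all reading executors (TableReaderExecutor, PointGetExecutor, IndexReaderExecutor, etc.) in one place, since the builder constructs each of them. A hedged sketch of that idea; `checkReadPartitions` and the `partitionDC` lookup are illustrative names, not TiDB's actual API:

```go
package main

import "fmt"

// executorBuilder mimics TiDB's builder, which records the first error
// it hits while constructing an executor tree.
type executorBuilder struct {
	txnScope string // "global", or a DC label for a local transaction
	err      error
}

// partitionDC stands in for the placement-rule lookup (assumed data:
// partition ID 100 is pinned to dc-b).
var partitionDC = map[int64]string{100: "dc-b"}

// checkReadPartitions fails the build if any partition about to be read
// is placed outside the local transaction's DC.
func (b *executorBuilder) checkReadPartitions(partitionIDs []int64) {
	if b.txnScope == "global" {
		return // global transactions may read everything
	}
	for _, id := range partitionIDs {
		if dc := partitionDC[id]; dc != b.txnScope {
			b.err = fmt.Errorf("partition %d is placed in %s, not readable by a local txn in %s",
				id, dc, b.txnScope)
			return
		}
	}
}

func main() {
	b := &executorBuilder{txnScope: "dc-a"}
	b.checkReadPartitions([]int64{100}) // e.g. while building a PointGetExecutor on p0
	fmt.Println(b.err)
}
```

The design choice here is that the statement fails before any request is sent to TiKV, so no per-access or commit-time check is needed.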
