
Check partition placement constraints when local transactions read data #21426

Closed
djshow832 opened this issue Dec 2, 2020 · 4 comments · Fixed by #21840
Labels
sig/transaction SIG:Transaction type/enhancement The issue or PR belongs to an enhancement.

Comments


djshow832 commented Dec 2, 2020

Background

Issue #20827 proposes to forbid local transactions from writing to partitions placed in another DC. The intent is to avoid lost updates, because the local timestamps issued by the two DCs are not monotonically increasing with respect to each other.

In that proposal, local transactions are still allowed to read partitions placed in another DC. However, this may violate linearizability, so this issue proposes to also check placement constraints when reading partitions.

Problem

Consider such a scenario:


A and B each stand for a TiDB instance in one of two data centers. Partition p0 is placed in data center B.

  1. One client connects to B and starts a local transaction Txn1.
  2. Txn1 writes to p0 and commits. The commit ts of Txn1 is 100.
  3. The same client disconnects from B, connects to A, and starts a local transaction Txn2.
  4. Txn2 only reads data from p0. The start ts of Txn2 is 50. Even though Txn2 starts after Txn1 commits, the local TSOs of the two DCs are not monotonically increasing with respect to each other, so the start ts of Txn2 may be less than the commit ts of Txn1.
  5. Txn2 won't see the modifications made by Txn1, which violates linearizability.
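The anomaly falls directly out of snapshot isolation's visibility rule: a read at `startTS` sees only writes with `commitTS <= startTS`. A minimal sketch of how the scenario above produces a stale read (the function name `visible` is illustrative, not TiDB's API):

```go
package main

import "fmt"

// visible reports whether a snapshot read at startTS can see a write
// committed at commitTS under snapshot isolation.
func visible(startTS, commitTS uint64) bool {
	return commitTS <= startTS
}

func main() {
	txn1CommitTS := uint64(100) // commit ts issued by DC B's local TSO
	txn2StartTS := uint64(50)   // start ts issued by DC A's local TSO

	// Txn2 begins in real time after Txn1 commits, yet its snapshot
	// cannot see Txn1's write: linearizability is violated.
	fmt.Println(visible(txn2StartTS, txn1CommitTS)) // false
}
```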

Solutions

To resolve this problem, there are 2 solutions:

  1. Always guarantee that the start ts of Txn2 is greater than the commit ts of Txn1. This is impractical because every local transaction would need to synchronize timestamps across all data centers.
  2. Forbid local transactions from reading data in partitions placed in another DC. This is straightforward.

There are 3 methods to implement solution 2:

  1. Check the placement constraints in the executor every time a partition is read.
  2. Collect all the partitions read in the SQL and check the placement constraints before returning success.
  3. Collect all the partitions read in the transaction and check the placement constraints when committing the transaction.

Method 1 is inefficient because it checks the placement constraints on every partition access, which may happen many times within a single statement.

Method 3 seems fine, but it is not: since Txn2 is a read-only transaction, the client (or the application) might never commit it explicitly, so the check can be deferred indefinitely.
For example:

BEGIN;
SELECT * FROM t PARTITION (p0) WHERE id=1;       # It returns and the client continues.
# After a long time...
BEGIN;       # It commits the previous transaction automatically, which fails due to placement constraints.
# Do something else.
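Method 2 avoids both problems: partitions are recorded as a statement reads them, and a single check runs before the statement returns success. A minimal sketch, with hypothetical names (`stmtContext`, `recordRead`, `checkPlacement`, and the `placementOf` lookup are all illustrative; real placement comes from PD rules):

```go
package main

import "fmt"

// stmtContext accumulates the partitions read by one SQL statement.
type stmtContext struct {
	txnScope       string          // DC of the local transaction, e.g. "dc-a"
	readPartitions map[string]bool // partition names touched by this statement
}

// placementOf maps a partition to the DC its placement rule pins it to
// (hypothetical lookup table standing in for PD placement rules).
var placementOf = map[string]string{"p0": "dc-b", "p1": "dc-a"}

func (sc *stmtContext) recordRead(partition string) {
	sc.readPartitions[partition] = true
}

// checkPlacement runs once per statement, not once per partition access.
func (sc *stmtContext) checkPlacement() error {
	for p := range sc.readPartitions {
		if dc := placementOf[p]; dc != sc.txnScope {
			return fmt.Errorf("local txn in %s cannot read partition %s placed in %s",
				sc.txnScope, p, dc)
		}
	}
	return nil
}

func main() {
	sc := &stmtContext{txnScope: "dc-a", readPartitions: map[string]bool{}}
	sc.recordRead("p0") // the statement reads p0, which is placed in dc-b
	fmt.Println(sc.checkPlacement())
}
```

Because the check fires at statement boundaries, the error surfaces on the SELECT itself rather than on a later implicit commit.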

Implementations

// TODO

@djshow832 djshow832 added type/enhancement The issue or PR belongs to an enhancement. sig/transaction SIG:Transaction labels Dec 2, 2020
Yisaer commented Dec 2, 2020

Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.

We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?


djshow832 commented Dec 2, 2020

> Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.
>
> We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?

That would miss some other operators. AFAIK, PointGetExecutor, BatchPointGetExec, IndexReaderExecutor, and IndexMergeReaderExecutor also read table data, and they may appear without a TableReaderExecutor. @Yisaer


Yisaer commented Dec 3, 2020

> Maybe we can check the placement constraint in TableReaderExecutor.buildResp? This executor is responsible for building the request and sending it to TiKV.
>
> We can check the txn-scope and the requested data range against the placement policy there. I'm not sure whether this can handle all the cases mentioned in this issue. @zz-jason WDYT?
>
> That would miss some other operators. AFAIK, PointGetExecutor, BatchPointGetExec, IndexReaderExecutor, and IndexMergeReaderExecutor also read table data, and they may appear without a TableReaderExecutor. @Yisaer

How about checking the constraint in both store/tikv/coprocessor (to cover table scan cases) and the kv.snapshot (to cover point-get cases)?

WDYT? @breeswish @XuHuaiyu


Yisaer commented Dec 7, 2020

After discussion with @djshow832 and @XuHuaiyu, the constraint checking will be placed in executorBuilder.build. For every executor that reads data, we will check the accessed data's placement against the txn-scope.
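Putting the check at executor-build time covers all reading executors (TableReaderExecutor, PointGetExecutor, IndexReaderExecutor, etc.) in one place, since the builder constructs each of them. A hedged sketch of that idea; `checkReadPartitions` and the `partitionDC` lookup are illustrative names, not TiDB's actual API:

```go
package main

import "fmt"

// executorBuilder mimics TiDB's builder, which records the first error
// it hits while constructing an executor tree.
type executorBuilder struct {
	txnScope string // "global", or a DC label for a local transaction
	err      error
}

// partitionDC stands in for the placement-rule lookup (assumed data:
// partition ID 100 is pinned to dc-b).
var partitionDC = map[int64]string{100: "dc-b"}

// checkReadPartitions fails the build if any partition about to be read
// is placed outside the local transaction's DC.
func (b *executorBuilder) checkReadPartitions(partitionIDs []int64) {
	if b.txnScope == "global" {
		return // global transactions may read everything
	}
	for _, id := range partitionIDs {
		if dc := partitionDC[id]; dc != b.txnScope {
			b.err = fmt.Errorf("partition %d is placed in %s, not readable by a local txn in %s",
				id, dc, b.txnScope)
			return
		}
	}
}

func main() {
	b := &executorBuilder{txnScope: "dc-a"}
	b.checkReadPartitions([]int64{100}) // e.g. while building a PointGetExecutor on p0
	fmt.Println(b.err)
}
```

The design choice here is that the statement fails before any request is sent to TiKV, so no per-access or commit-time check is needed.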
