Always splitting update events if partition key changes #11297
Labels
affects-6.5
This bug affects the 6.5.x(LTS) versions.
affects-7.1
This bug affects the 7.1.x(LTS) versions.
affects-7.5
This bug affects the 7.5.x(LTS) versions.
affects-8.1
This bug affects the 8.1.x(LTS) versions.
area/ticdc
Issues or PRs related to TiCDC.
type/enhancement
The issue or PR belongs to an enhancement.
Background
When the index-value or columns dispatcher distributes data across Kafka partitions based on the key, the consumer processes in the downstream consumer group consume the topic partitions independently. Because the consumers progress at different rates, the downstream data might become inconsistent. Take the following SQL as an example:
If the UPDATE event is not split (ref #11211), and the index-value or columns dispatcher is used:
Then the preceding DML will be distributed to the following partitions (change events contain both new and old values):
Because partitions are consumed in parallel, different consumption orders produce different results, and only one order (p1-1, p2-1, p1-2, p3-1) guarantees that the downstream ends up consistent with the upstream.
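To make the ordering problem concrete, here is a minimal sketch in Go. It does not reproduce the SQL from this issue. It assumes a hypothetical workload (insert a row with key 1, move it to key 2, insert key 1 again, move it to key 3), a hypothetical partition assignment, and a simplified consumer that applies an unsplit UPDATE as "delete the old key, write the new key". The type and function names are illustrative only.

```go
package main

import "fmt"

// Event is a simplified change event (hypothetical, not a TiCDC type).
// For an unsplit UPDATE it carries both the old and the new key.
type Event struct {
	Label    string // position within its partition, e.g. "p1-1"
	IsUpdate bool
	OldKey   int // only meaningful when IsUpdate is true
	NewKey   int
	Val      int
}

// apply replays events in the given order against an empty table
// and returns the resulting key -> value state.
func apply(order []Event) map[int]int {
	table := map[int]int{}
	for _, e := range order {
		if e.IsUpdate {
			delete(table, e.OldKey) // remove the old row, if present
		}
		table[e.NewKey] = e.Val
	}
	return table
}

func main() {
	// Assumed distribution: inserts on key 1 go to p1, while the two
	// unsplit updates go to p2 and p3.
	p11 := Event{Label: "p1-1", NewKey: 1, Val: 2}
	p21 := Event{Label: "p2-1", IsUpdate: true, OldKey: 1, NewKey: 2, Val: 2}
	p12 := Event{Label: "p1-2", NewKey: 1, Val: 3}
	p31 := Event{Label: "p3-1", IsUpdate: true, OldKey: 1, NewKey: 3, Val: 3}

	// Order matching the upstream commit order.
	fmt.Println(apply([]Event{p11, p21, p12, p31})) // map[2:2 3:3]

	// p2 and p3 are consumed before p1: a stale row with key 1 remains.
	fmt.Println(apply([]Event{p21, p31, p11, p12})) // map[1:3 2:2 3:3]
}
```

Both orders respect the per-partition order, yet the second one leaves a row that no longer exists upstream.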
Solution
Design
TiCDC should always split an UPDATE event that changes the partition key into a DELETE event and an INSERT event. The preceding DML events will then be distributed to the following partitions:
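The proposed split can be sketched as follows. This is an illustration under stated assumptions, not TiCDC's implementation: `RowChange`, `splitIfKeyChanged`, and `partitionOf` are hypothetical names, the partition key is modeled as a plain string, and the FNV hash stands in for whatever hash the real dispatcher uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// RowChange is a simplified change event (hypothetical type).
type RowChange struct {
	Type   string // "INSERT", "UPDATE", or "DELETE"
	OldKey string // partition key before the change; empty for INSERT
	NewKey string // partition key after the change; empty for DELETE
}

// splitIfKeyChanged returns the events to emit for e. An UPDATE that
// changes the partition key becomes a DELETE of the old row followed by
// an INSERT of the new row; every other event passes through unchanged.
func splitIfKeyChanged(e RowChange) []RowChange {
	if e.Type != "UPDATE" || e.OldKey == e.NewKey {
		return []RowChange{e}
	}
	return []RowChange{
		{Type: "DELETE", OldKey: e.OldKey},
		{Type: "INSERT", NewKey: e.NewKey},
	}
}

// partitionOf routes an event by the key of the row it touches:
// the old key for a DELETE, the new key otherwise (an assumption).
func partitionOf(e RowChange, numPartitions uint32) uint32 {
	key := e.NewKey
	if e.Type == "DELETE" {
		key = e.OldKey
	}
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % numPartitions
}

func main() {
	update := RowChange{Type: "UPDATE", OldKey: "a=1", NewKey: "a=2"}
	for _, e := range splitIfKeyChanged(update) {
		fmt.Printf("%s -> partition %d\n", e.Type, partitionOf(e, 3))
	}
}
```

With this routing, every event that touches a given key lands in the same partition, so the order of operations on that key is preserved no matter how the partitions are interleaved by consumers.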
Compatibility
To stay compatible with earlier versions and to ensure that behavior does not change for users after an upgrade, we should add a parameter
split-update-partition-key
and set it to false by default.