sink(ticdc): split RowChangeEvent if unique key is updated #9437

Merged

@sdojjy (Member) commented Jul 27, 2023

What problem does this PR solve?

Issue Number: close #9430

What is changed and how it works?

Split and sort RowChangeEvent rows if a unique key is updated.
Each such update is split into a delete and an insert, then the whole transaction is sorted in the order delete > update > insert.
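
A minimal sketch of the split-and-sort idea described above, not the code from this PR. The `Row` type, the `kind` helper, and `splitAndSort` are hypothetical simplifications: a row with no old image is treated as an insert, a row with no new image as a delete, and a single integer column stands in for the unique key.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical, simplified stand-in for a row change event.
// Pre == nil means insert, Cur == nil means delete, both set means update.
type Row struct {
	Pre, Cur map[string]int
}

// kind ranks events so that deletes sort first and inserts last.
func kind(r Row) int {
	switch {
	case r.Cur == nil:
		return 0 // delete
	case r.Pre == nil:
		return 2 // insert
	default:
		return 1 // update
	}
}

// splitAndSort splits every update that changes the unique key column into
// a delete plus an insert, then stable-sorts the transaction by kind.
func splitAndSort(rows []Row, uniqueKey string) []Row {
	out := make([]Row, 0, len(rows))
	for _, r := range rows {
		if r.Pre != nil && r.Cur != nil && r.Pre[uniqueKey] != r.Cur[uniqueKey] {
			out = append(out, Row{Pre: r.Pre}, Row{Cur: r.Cur})
			continue
		}
		out = append(out, r)
	}
	sort.SliceStable(out, func(i, j int) bool { return kind(out[i]) < kind(out[j]) })
	return out
}

func main() {
	// One transaction that swaps unique key values 1 and 2 between two rows.
	txn := []Row{
		{Pre: map[string]int{"uk": 1}, Cur: map[string]int{"uk": 2}},
		{Pre: map[string]int{"uk": 2}, Cur: map[string]int{"uk": 1}},
	}
	for _, r := range splitAndSort(txn, "uk") {
		fmt.Println(kind(r), r.Pre, r.Cur)
	}
}
```

In the swap example, applying the two updates one by one downstream could hit a duplicate-key conflict on the first statement. After splitting and sorting, both deletes run before either insert, so the conflict cannot occur.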

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

`None`.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 27, 2023
@sdojjy sdojjy changed the title sink(ticdc): split RowChangeEvent if unique key is updated (WIP) sink(ticdc): split RowChangeEvent if unique key is updated Jul 27, 2023
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 27, 2023
cdc/model/sink.go (outdated)
}

func (e RowChangedEvents) Less(i, j int) bool {
return len(e[i].Columns)-len(e[i].PreColumns) <
Contributor:

This is not easy to understand at a glance; please add a comment explaining how the condition is derived.

Member Author:

Changed to another implementation.
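
For readers wondering how a comparison on `len(Columns)-len(PreColumns)` could order events at all, here is one plausible reading, written as a standalone sketch. It is not the code from the PR (the quoted line is truncated and was later replaced). The `Event` type and `rank` helper are hypothetical, and the sketch assumes that a delete carries no `Columns`, an insert carries no `PreColumns`, and an update carries the same number of columns on both sides.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical, simplified row event; only the column counts matter here.
type Event struct {
	Name       string
	Columns    []string // new values; assumed empty for a delete
	PreColumns []string // old values; assumed empty for an insert
}

// rank is negative for deletes, zero for updates, and positive for inserts
// under the assumptions stated above.
func rank(e Event) int { return len(e.Columns) - len(e.PreColumns) }

func main() {
	cols := []string{"id", "v"}
	events := []Event{
		{Name: "insert", Columns: cols},
		{Name: "update", Columns: cols, PreColumns: cols},
		{Name: "delete", PreColumns: cols},
	}
	// Sorting by rank yields the order delete, update, insert.
	sort.SliceStable(events, func(i, j int) bool { return rank(events[i]) < rank(events[j]) })
	for _, e := range events {
		fmt.Println(e.Name, rank(e))
	}
}
```

The trick is compact but relies on implicit invariants about column counts, which is likely why the reviewer asked for a comment.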

@@ -719,6 +748,115 @@ func (t *SingleTableTxn) GetCommitTs() uint64 {
return t.CommitTs
}

// TrySplitAndSortUpdateEvent split update events if unique key is updated
func (t *SingleTableTxn) TrySplitAndSortUpdateEvent() error {
if len(t.Rows) < 2 {
Contributor:

Is it possible that the transaction contains only a single update event? Should we split it here?

Member Author:

No, a single update event does not need to be split.

@3AceShowHand (Contributor) commented Aug 10, 2023

What happens if this update event changes unique key columns? I think this update event should also be split.

Contributor:

This method is almost identical to the convertRowChangedEvents method.

After enable-old-value is removed, these two methods can be merged into one.

Member Author:

This PR focuses on the duplicate-key case, whose precondition is that the transaction emits multiple update events.

Member Author:

convertRowChangedEvents checks "enable-old-value" and the handle key.

@sdojjy (Member Author) commented Aug 1, 2023

/retest-required

@sdojjy (Member Author) commented Aug 2, 2023

/test all

@ti-chi-bot ti-chi-bot bot removed the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 2, 2023
@ti-chi-bot ti-chi-bot bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 2, 2023
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Aug 10, 2023
cdc/model/sink.go (outdated)
for i := range updateEvent.Columns {
col := updateEvent.Columns[i]
preCol := updateEvent.PreColumns[i]
if col != nil && (col.Flag.IsUniqueKey() || col.Flag.IsHandleKey()) &&
Contributor:

What does the condition "col.Flag.IsHandleKey()) && preCol != nil && (preCol.Flag.IsUniqueKey()" mean?

Member Author:

Here we check whether the column is part of the handle key or a unique key; if so, we check whether its value has changed.
As for the condition "col.Flag.IsHandleKey()) && preCol != nil && (preCol.Flag.IsUniqueKey()", I think it is safe to split the row, because an update can always be split into a delete plus an insert.
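
The check discussed in this thread can be sketched in isolation as follows. This is an illustrative simplification, not the PR's code: the `Column` type, the integer flag bits, and the `keyChanged` helper are hypothetical, and values are plain strings so that `!=` is a safe comparison.

```go
package main

import "fmt"

// Hypothetical flag bits; the real flag type and its values are not shown in this thread.
const (
	flagUniqueKey = 1 << iota
	flagHandleKey
)

// Column is a simplified stand-in for a column in a row change event.
type Column struct {
	Name  string
	Flag  int
	Value string
}

// isKey reports whether the column is non-nil and part of a unique or handle key.
func isKey(c *Column) bool {
	return c != nil && c.Flag&(flagUniqueKey|flagHandleKey) != 0
}

// keyChanged reports whether any key column has a different value in the
// old image (pre) and the new image (cur) of an update.
func keyChanged(pre, cur []*Column) bool {
	for i := range cur {
		if i >= len(pre) {
			break
		}
		if isKey(cur[i]) && isKey(pre[i]) && cur[i].Value != pre[i].Value {
			return true
		}
	}
	return false
}

func main() {
	pre := []*Column{{Name: "id", Flag: flagHandleKey, Value: "1"}, {Name: "v", Value: "a"}}
	cur := []*Column{{Name: "id", Flag: flagHandleKey, Value: "2"}, {Name: "v", Value: "a"}}
	fmt.Println(keyChanged(pre, cur)) // the handle key value differs, so a split would be triggered
}
```

Because an update is always equivalent to a delete followed by an insert, a check like this can afford to be conservative: a false positive costs an extra statement downstream but does not change the result.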

@sdojjy (Member Author) commented Aug 11, 2023

/retest-required

@ti-chi-bot ti-chi-bot bot added the lgtm label Aug 14, 2023
@ti-chi-bot (Contributor) commented Aug 14, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: asddongmen, nongfushanquan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Aug 14, 2023
@ti-chi-bot (Contributor) commented Aug 14, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-08-10 07:19:55.102367145 +0000 UTC m=+183559.651383127: ☑️ agreed by asddongmen.
  • 2023-08-14 01:16:28.792919307 +0000 UTC m=+507353.341935296: ☑️ agreed by nongfushanquan.

@ti-chi-bot ti-chi-bot bot merged commit 7f42fce into pingcap:master Aug 14, 2023
@ti-chi-bot (Member):

In response to a cherrypick label: new pull request created to branch release-6.5: #9558.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Aug 14, 2023
@ti-chi-bot (Member):

In response to a cherrypick label: new pull request created to branch release-7.1: #9559.

Labels
approved lgtm needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Disordering events in a transaction with TiCDC may cause conflicts in downstream execution
5 participants