Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expression: don't cast collation for in expression is the new collation is disabled #52812

Merged
merged 5 commits into from
Apr 23, 2024

Conversation

wjhuang2016
Copy link
Member

@wjhuang2016 wjhuang2016 commented Apr 22, 2024

What problem does this PR solve?

Issue Number: close #52772

Problem Summary:
If the new collation is disabled, we should not cast collation for in expression, which may lead to panic in the issue case.

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: wjhuang2016 <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed do-not-merge/needs-tests-checked labels Apr 22, 2024
if err != nil && !collate.NewCollationEnabled() {
// args[0] can't be nil here.
// Use the collation of the first argument instead of returning an error.
et = &ExprCollation{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll cause some behavior change 🤔 . For example:

select convert('1' using utf8) collate utf8_bin = convert('1' using utf8) collate utf8_general_ci;

It'll return an error before this PR no matter whether new_collation is enabled or not, but after this PR, it'll return without any error.

For this issue, I propose two other options, but none of them is perfect 🤦 :

  1. Avoid constant propogation in this case. We can replace the NewFunctionInternal with NewFunction in the ColumnSubstituteImpl, and check the err. If it returns an err, we can avoid substitute in this case.

  2. Add back the newExpr.SetCoercibility(v.Coercibility()) removed in expression, cmd: fix ColumnSubstitute and allow some cases to substitute #38826. It sounds better because actually the coercibility (after substitute) should be the Column (2). I'm not sure 100% sure, but maybe we can fix the issue found in expression, cmd: fix ColumnSubstitute and allow some cases to substitute #38826 by Clone the newExpr before SetCoercibility.

Copy link
Member

@YangKeao YangKeao Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal 2 is not correct, because the following case will fail:

create table t (a char(10) collate utf8mb4_bin, b char(10) collate utf8mb4_general_ci);
insert into t values ('a', 'A');
select * from t t1, t t2 where t1.a=t2.b and t2.b='a' collate utf8mb4_general_ci;

It's expected to return 0 row, but it returns 1 row.

Signed-off-by: wjhuang2016 <[email protected]>
Copy link

codecov bot commented Apr 22, 2024

Codecov Report

Merging #52812 (db923bf) into master (aa2c694) will increase coverage by 2.5463%.
Report is 20 commits behind head on master.
The diff coverage is 100.0000%.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #52812        +/-   ##
================================================
+ Coverage   72.3233%   74.8697%   +2.5463%     
================================================
  Files          1473       1478         +5     
  Lines        427451     433899      +6448     
================================================
+ Hits         309147     324859     +15712     
+ Misses        99013      89220      -9793     
- Partials      19291      19820       +529     
Flag Coverage Δ
integration 56.5756% <100.0000%> (?)
unit 71.7213% <100.0000%> (+0.5319%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 53.9957% <ø> (ø)
parser ∅ <ø> (∅)
br 49.8739% <ø> (+8.6646%) ⬆️

Signed-off-by: wjhuang2016 <[email protected]>
Signed-off-by: wjhuang2016 <[email protected]>
@wjhuang2016 wjhuang2016 force-pushed the fix_collation_panic branch from 9440aa5 to 0d7d490 Compare April 23, 2024 02:52
@ti-chi-bot ti-chi-bot bot added the sig/planner SIG: Planner label Apr 23, 2024
@@ -1951,6 +1951,9 @@ func (er *expressionRewriter) castCollationForIn(colLen int, elemCnt int, stkLen
if colLen != 1 {
return
}
if !collate.NewCollationEnabled() {
Copy link
Member

@YangKeao YangKeao Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here to tell why castCollationForIn is disabled when new collation is not enabled? It's not intuitive. IIUC, it works because it didn't set coercibility to 0, so that the following panic (mix collation with 0 coercibility) will not happen.

I personally prefer to also enable the collation check in ColumnSubstituteImpl for old collation, but it may cause big performance impact and need manually workaround (rewriting SQL), so it's not good.
This solution is also fine for me, as I cannot think up any counter example in old collation.

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 23, 2024
@wjhuang2016 wjhuang2016 changed the title expression: allows illegalMixCollationErr when collation disable expression: don't cast collation is the new collation is disabled Apr 23, 2024
@wjhuang2016 wjhuang2016 changed the title expression: don't cast collation is the new collation is disabled expression: don't cast collation for in expression is the new collation is disabled Apr 23, 2024
Signed-off-by: wjhuang2016 <[email protected]>
Copy link

ti-chi-bot bot commented Apr 23, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: winoros, YangKeao

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added approved lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Apr 23, 2024
Copy link

ti-chi-bot bot commented Apr 23, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-04-23 04:54:16.365001698 +0000 UTC m=+63213.104904609: ☑️ agreed by YangKeao.
  • 2024-04-23 15:38:03.061166769 +0000 UTC m=+101839.801069681: ☑️ agreed by winoros.

@ti-chi-bot ti-chi-bot bot merged commit cf5c68e into pingcap:master Apr 23, 2024
23 checks passed
@ti-chi-bot ti-chi-bot added needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels Apr 24, 2024
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-8.1: #52863.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #52864.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Apr 24, 2024
ti-chi-bot bot pushed a commit that referenced this pull request Apr 24, 2024
ti-chi-bot bot pushed a commit that referenced this pull request Apr 24, 2024
@ti-chi-bot ti-chi-bot added the needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. label May 7, 2024
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.5: #53054.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TiDB panic when new collation is disabled and two expressions use different collation
4 participants