Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The result of inl_merge_join with the enum type join column is wrong #24473

Closed
lilinghai opened this issue May 8, 2021 · 11 comments · Fixed by #24775
Closed

The result of inl_merge_join with the enum type join column is wrong #24473

lilinghai opened this issue May 8, 2021 · 11 comments · Fixed by #24775
Assignees
Labels
severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@lilinghai
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE `x` (
  `a` enum('y','b','1','x','0','null') DEFAULT NULL,
  KEY `a` (`a`)
);
insert into x values("x"),("x"),("b"),("y");
SELECT /*+ inl_merge_join (t2,t3) */ t2.a,t3.a FROM x t2 inner join x t3 on t2.a = t3.a;

The plan is

+-----------------------------+---------+-----------+----------------------+-----------------------------------------------------------------------+
| id                          | estRows | task      | access object        | operator info                                                         |
+-----------------------------+---------+-----------+----------------------+-----------------------------------------------------------------------+
| IndexMergeJoin_18           | 5.00    | root      |                      | inner join, inner:IndexReader_16, outer key:rs.x.a, inner key:rs.x.a  |
| ├─IndexReader_31(Build)     | 4.00    | root      |                      | index:IndexFullScan_30                                                |
| │ └─IndexFullScan_30        | 4.00    | cop[tikv] | table:t2, index:a(a) | keep order:false, stats:pseudo                                        |
| └─IndexReader_16(Probe)     | 1.25    | root      |                      | index:Selection_15                                                    |
|   └─Selection_15            | 1.25    | cop[tikv] |                      | not(isnull(rs.x.a))                                                   |
|     └─IndexRangeScan_14     | 1.25    | cop[tikv] | table:t3, index:a(a) | range: decided by [eq(rs.x.a, rs.x.a)], keep order:true, stats:pseudo |
+-----------------------------+---------+-----------+----------------------+-----------------------------------------------------------------------+

2. What did you expect to see? (Required)

+------+------+
| a    | a    |
+------+------+
| y    | y    |
| b    | b    |
| x    | x    |
| x    | x    |
| x    | x    |
| x    | x    |
+------+------+

3. What did you see instead (Required)

+------+------+
| a    | a    |
+------+------+
| y    | y    |
+------+------+

4. What is your TiDB version? (Required)

Release Version: v4.0.0-beta.2-2807-g289dcfefd-dirty
Edition: Community
Git Commit Hash: 289dcfe
Git Branch: master
UTC Build Time: 2021-05-06 15:11:16
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

@lilinghai lilinghai added type/bug The issue is confirmed as a bug. sig/planner SIG: Planner severity/critical labels May 8, 2021
@lilinghai
Copy link
Contributor Author

merge_join plan also have the same problem

@sylzd
Copy link
Contributor

sylzd commented May 12, 2021

/assign

@sylzd sylzd removed their assignment May 17, 2021
@sylzd
Copy link
Contributor

sylzd commented May 17, 2021

/assign

@wshwsh12
Copy link
Contributor

It seems enum shouldn't use index to participate merge_join/inl_merge_join, because enum index is order by values, not name.
In this case, 'y' have the bigest ascii and the smallest value 1, it leads the result wrong.
For example, if I change the elements in enum struct, using alphabetical order, the result will be different. @winoros PTAL

drop table x;
CREATE TABLE `x` (
  `a` enum('b','x','y','1','0','null') DEFAULT NULL, 
  KEY `a` (`a`)
);
insert into x values("x"),("x"),("b"),("y");
SELECT /*+ inl_merge_join (t2,t3) */ t2.a,t3.a FROM x t2 inner join x t3 on t2.a = t3.a;
+------+------+
| a    | a    |
+------+------+
| b    | b    |
| x    | x    |
| x    | x    |
| x    | x    |
| x    | x    |
| y    | y    |
+------+------+
6 rows in set (0.001 sec)

@sylzd
Copy link
Contributor

sylzd commented May 18, 2021

It seems enum shouldn't use index to participate merge_join/inl_merge_join, because enum index is order by values, not name.
In this case, 'y' have the bigest ascii and the smallest value 1, it leads the result wrong.
For example, if I change the elements in enum struct, using alphabetical order, the result will be different. @winoros PTAL

drop table x;
CREATE TABLE `x` (
  `a` enum('b','x','y','1','0','null') DEFAULT NULL, 
  KEY `a` (`a`)
);
insert into x values("x"),("x"),("b"),("y");
SELECT /*+ inl_merge_join (t2,t3) */ t2.a,t3.a FROM x t2 inner join x t3 on t2.a = t3.a;
+------+------+
| a    | a    |
+------+------+
| b    | b    |
| x    | x    |
| x    | x    |
| x    | x    |
| x    | x    |
| y    | y    |
+------+------+
6 rows in set (0.001 sec)

I agree, unless compareFunc can be added a enum type, or we can make field type as int instead of string to avoid comparing by string?
Anyway, index order & compare rule should be the same. Or let's throw the merge join planner of enum type away.

@sylzd
Copy link
Contributor

sylzd commented May 26, 2021

It seems enum shouldn't use index to participate merge_join/inl_merge_join, because enum index is order by values, not name.
In this case, 'y' have the bigest ascii and the smallest value 1, it leads the result wrong.
For example, if I change the elements in enum struct, using alphabetical order, the result will be different. @winoros PTAL

drop table x;
CREATE TABLE `x` (
  `a` enum('b','x','y','1','0','null') DEFAULT NULL, 
  KEY `a` (`a`)
);
insert into x values("x"),("x"),("b"),("y");
SELECT /*+ inl_merge_join (t2,t3) */ t2.a,t3.a FROM x t2 inner join x t3 on t2.a = t3.a;
+------+------+
| a    | a    |
+------+------+
| b    | b    |
| x    | x    |
| x    | x    |
| x    | x    |
| x    | x    |
| y    | y    |
+------+------+
6 rows in set (0.001 sec)

I agree, unless compareFunc can be added a enum type, or we can make field type as int instead of string to avoid comparing by string?
Anyway, index order & compare rule should be the same. Or let's throw the merge join planner of enum type away.

Logic conflict with order and join in enum type. I think it is resonable to cancel the merge join plan of enum type.

@winoros
Copy link
Member

winoros commented May 28, 2021

You can disable the index merge join for enum at constructIndexMergeJoin by checking the column type.
@sylzd

@winoros
Copy link
Member

winoros commented May 28, 2021

And for merge join. It may be caused by that we just push the order property down to the index but without telling the index which order to follow. So it goes wrong. A quick fix is also banning the enum type for merge join at GetMergeJoin.
@sylzd

@sylzd
Copy link
Contributor

sylzd commented Jun 2, 2021

And for merge join. It may be caused by that we just push the order property down to the index but without telling the index which order to follow. So it goes wrong. A quick fix is also banning the enum type for merge join at GetMergeJoin.
@sylzd

I will try.

@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

@jebter
Copy link

jebter commented Jun 7, 2021

The release-5.0 has the same problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants