-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DDL: fix a bug in column charset and collate when create table and modify column #11300
DDL: fix a bug in column charset and collate when create table and modify column #11300
Conversation
Codecov Report
@@ Coverage Diff @@
## master #11300 +/- ##
===========================================
Coverage 82.1169% 82.1169%
===========================================
Files 424 424
Lines 93636 93636
===========================================
Hits 76891 76891
Misses 11441 11441
Partials 5304 5304 |
ddl/ddl_api.go
Outdated
if err := setCharsetCollationFlenDecimal(colDef.Tp, tblCharset, dbCharset); err != nil { | ||
// The specified collate is in colDef.Options, but the charset is in colDef.Tp. | ||
specifiedCollate := "" | ||
for _, op := range colDef.Options { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The colDef.options is iterated in the https://github.com/pingcap/tidb/blob/master/ddl/ddl_api.go#L2062, Could we eliminate the iteration here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The colDef.options is iterated in the https://github.com/pingcap/tidb/blob/master/ddl/ddl_api.go#L2062, Could we eliminate the iteration here?
I've checked it!
The buildColumnAndConstraint
func is called by two different logic: one is create table, the other is modify column.
For the later there is an option loop indeed before call this func, if we try to make use of loop, we need pass parameter, it's ok. Consequently, for the former that's means we have no option loop neither in the func or before call it, it will cause an error.
So we still need the loop here.
Rest LGTM |
ddl/ddl_api.go
Outdated
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this line.
Roger that!
00d307c
to
7d52034
Compare
/run-all-tests |
/run-common-test |
7d52034
to
a396a72
Compare
ddl/ddl_api.go
Outdated
func extractCollateFromOption(def *ast.ColumnDef) string { | ||
specifiedCollate := "" | ||
collateIndex := -1 | ||
for i, op := range def.Options { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you may see, it is possible that we encounter multiple ast.ColumnOptionCollate
, we should choose the last one as the final result:
For example, in MySQL 8.0:
mysql> create table tm(a varchar(20) collate utf8_unicode_ci collate utf8_general_ci) default charset=utf8mb4;
Query OK, 0 rows affected, 2 warnings (0.08 sec)
mysql> show create table tm;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tm | CREATE TABLE `tm` (
`a` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
Notice that collation of column a
is utf8_general_ci
.
Please also notice that the when we resolve multiple collations, their charsets must be the same:
mysql> create table tm(a varchar(20) collate ascii_bin collate utf8_general_ci) default charset=utf8mb4;
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'ascii'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can follow similar existing logics to handle multiple ColumnOptionCollate
:
Line 109 in b3f7186
case ast.DatabaseOptionCollate: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can follow similar existing logics to handle multiple
ColumnOptionCollate
:Line 109 in b3f7186
case ast.DatabaseOptionCollate:
That's great !!!
From mysql syntax, charset is unique specified, but collate can be multiple.
That's means I need to follow mysql's behavior when handling multiple collate.
And this reference code do me a great favor!!!
a396a72
to
3866f0c
Compare
|
||
// Test charset and multiple collate modification in column | ||
tk.MustExec("drop table if exists modify_column_multiple_collate;") | ||
tk.MustExec("create table modify_column_multiple_collate (a char(1) collate utf8_bin collate utf8_general_ci) charset utf8mb4 collate utf8mb4_bin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some cases for ErrCollationCharsetMismatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// When handle charset and collate of a column, we should take collates(may multiple) in option into | ||
// consideration rather than handling it separately. Then the following option processing logic should | ||
// ignores this cause we will delete them from option here | ||
func extractCollateFromOption(def *ast.ColumnDef) []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the golang comment needs start with the function name, // extractCollateFromOption ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the golang comment needs start with the function name,
// extractCollateFromOption ....
ok. got it.
900cfba
to
a730428
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-all-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-cherry-picker |
cherry pick to release-3.0 failed |
cherry pick to release-2.1 failed |
What problem does this PR solve?
This PR tries to fix the following bugs when creating a table and modify the columns:
Column
a
should use utf8_bin as collate and derive utf8 as it's charset as we expected.The reason for the unexpected error is that: the second sql will use the table's charset and collate, the new col is consist of utf8mb4 charset from the table and utf8_general_ci collate from the specified one before it is assigned to old col.
What is changed and how it works?
When creating tables and modifying columns, the specified collate info is in
colDef.Option
rathercolDef.collate
.So we should take this collate info into consideration when we set them or judge conflict or derive the corresponding charset.
Check List
Tests
Code changes
Add some logic code to handle collate info in
colDef.Option
and remove the option when we done to prevent double check.