Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved support for MySQL Shell dump and load #24515

Open
1 of 6 tasks
dveeden opened this issue May 10, 2021 · 6 comments
Open
1 of 6 tasks

Improved support for MySQL Shell dump and load #24515

dveeden opened this issue May 10, 2021 · 6 comments
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@dveeden
Copy link
Contributor

dveeden commented May 10, 2021

Development Task

MySQL Shell is a relatively new tool from Oracle which is used as the new CLI for various DBA and developer tasks. One of the features of this tool is dumping and loading data. This is somewhat similar to mysqldump and mysqlpump, but more focus is placed on parallel dumping and restoring to get better performance. This also adds Oracle Cloud features for storing and loading umps.

Dumping data from TiDB 5.0

[dvaneeden@dve-carbon ~]$ mysqlsh mysql://root@localhost:4000 -- util dump-schemas employees --output-url=/tmp/mysqlsh_dump_employees
Acquiring global read lock
NOTE: Error acquiring global read lock: MySQL Error 1105 (HY000): FLUSH TABLES WITH READ LOCK is not supported.  Please use @@tidb_snapshot
Global read lock has been released
ERROR: Unable to acquire global read lock: MySQL Error 1105 (HY000): FLUSH TABLES WITH READ LOCK is not supported.  Please use @@tidb_snapshot
[dvaneeden@dve-carbon ~]$ mysqlsh mysql://root@localhost:4000 -- util dump-schemas employees --output-url=/tmp/mysqlsh_dump_employees --consistent=false
WARNING: Failed to fetch value of @@GLOBAL.GTID_EXECUTED.
Gathering information - done
WARNING: The dumped value of gtid_executed is not guaranteed to be consistent
ERROR: User 'root'@'%' is missing the following privilege(s) for table `employees`.`salaries`: TRIGGER.
[dvaneeden@dve-carbon ~]$ mysqlsh mysql://root@localhost:4000 -- util dump-schemas employees --output-url=/tmp/mysqlsh_dump_employees --consistent=false --triggers=false
WARNING: Failed to fetch value of @@GLOBAL.GTID_EXECUTED.
Gathering information - done
WARNING: The dumped value of gtid_executed is not guaranteed to be consistent
Writing global DDL files
Writing DDL for schema `employees`
Writing DDL for view `employees`.`current_dept_emp`
Writing DDL for view `employees`.`dept_emp_latest_date`
Preparing data dump for table `employees`.`salaries`
Data dump for table `employees`.`salaries` will be chunked using column `emp_no`
Writing DDL for table `employees`.`salaries`
Preparing data dump for table `employees`.`titles`
Data dump for table `employees`.`titles` will be chunked using column `emp_no`
Preparing data dump for table `employees`.`dept_emp`
Data dump for table `employees`.`dept_emp` will be chunked using column `emp_no`
Preparing data dump for table `employees`.`dept_manager`
Data dump for table `employees`.`dept_manager` will be chunked using column `emp_no`
Preparing data dump for table `employees`.`departments`
Data dump for table `employees`.`departments` will be chunked using column `dept_no`
Preparing data dump for table `employees`.`employees`
Data dump for table `employees`.`employees` will be chunked using column `emp_no`
Running data dump using 4 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing DDL for table `employees`.`titles`
Writing DDL for table `employees`.`dept_emp`
Writing DDL for table `employees`.`dept_manager`
Writing DDL for table `employees`.`departments`
Writing DDL for table `employees`.`employees`
Data dump for table `employees`.`dept_manager` will be written to 1 file
Data dump for table `employees`.`dept_emp` will be written to 1 file
Data dump for table `employees`.`salaries` will be written to 1 file
Data dump for table `employees`.`titles` will be written to 1 file
Data dump for table `employees`.`departments` will be written to 1 file                                         
Data dump for table `employees`.`employees` will be written to 1 file                                           
1 thds dumping - 100% (3.92M rows / ~3.92M rows), 960.18K rows/s, 34.68 MB/s uncompressed, 9.12 MB/s compressed 
Duration: 00:00:04s                                                                                            
Schemas dumped: 1                                                                                              
Tables dumped: 6                                                                                               
Uncompressed data size: 141.50 MB                                                                              
Compressed data size: 37.19 MB                                                                                 
Compression ratio: 3.8                                                                                         
Rows written: 3919015                                                                                          
Bytes written: 37.19 MB                                                                                        
Average uncompressed throughput: 34.79 MB/s                                                                    
Average compressed throughput: 9.14 MB/s                                                                       

The issues here are:

  • No support for FLUSH TABLES WITH READ LOCK. The workaround is to use --consistent=false.
  • No support for how mysqlsh checks for trigger privileges. The workaround is to use --triggers=false. I assume it only checks for the permission, even when there are no triggers.

Loading data

When loading with mysqlsh mysql://root@localhost:4000 -- util loadDump /tmp/mysqlsh_dump_employees the tool fails without a proper error message. Inspecting the TiDB tools shows that the issue is related to the isolation level support.

[2021/05/10 08:01:01.316 +02:00] [WARN] [session.go:1429] ["run statement failed"] [conn=155] [schemaVersion=138] [error="[variable:8048]The isolation level 'READ-UNCOMMITTED' is not supported. Set tidb_skip_isolation_level_check=1 to skip this error"] [session="{\n  \"currDBName\": \"employees\",\n  \"id\": 155,\n  \"status\": 2,\n  \"strictMode\": false,\n  \"user\": {\n    \"Username\": \"root\",\n    \"Hostname\": \"127.0.0.1\",\n    \"CurrentUser\": false,\n    \"AuthUsername\": \"root\",\n    \"AuthHostname\": \"%\"\n  }\n}"]

Running set global tidb_skip_isolation_level_check=1 makes the tool skip over this issue.

The next issue is with LOAD DATA LOCAL INFILE...REPLACE INTO which isn't supported by TiDB. Unfortunately this incompatibility is not listed on https://docs.pingcap.com/tidb/stable/mysql-compatibility and I don't know of a good workaround.

[dvaneeden@dve-carbon ~]$ mysqlsh mysql://root@localhost:4000 -- util loadDump /tmp/mysqlsh_dump_employees --resetProgress
Loading DDL and Data from '/tmp/mysqlsh_dump_employees' using 4 threads.
Opening dump...
Target is MySQL 5.7.25-TiDB-v5.0.1. Dump was produced from MySQL 5.7.25-TiDB-v5.0.1
NOTE: Load progress file detected for the instance but 'resetProgress' option was enabled. Load progress will be discarded and the whole dump will be reloaded.
Checking for pre-existing objects...
Executing common preamble SQL
Executing DDL script for schema `employees`
[Worker003] Executing DDL script for `employees`.`employees`
[Worker002] Executing DDL script for `employees`.`dept_manager`
[Worker001] Executing DDL script for `employees`.`dept_emp`
[Worker000] Executing DDL script for `employees`.`departments`
[Worker003] Executing DDL script for `employees`.`titles`
[Worker003] Executing DDL script for `employees`.`salaries`
[Worker000] Executing DDL script for `employees`.`current_dept_emp` (placeholder for view)
[Worker001] Executing DDL script for `employees`.`dept_emp_latest_date` (placeholder for view)
Executing DDL script for view `employees`.`dept_emp_latest_date`
Executing DDL script for view `employees`.`current_dept_emp`
ERROR: [Worker003] employees@titles@@0.tsv.zst: MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@titles@@0.tsv.zst' REPLACE INTO TABLE `employees`.`titles` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `title`, `from_date`, `to_date`)
ERROR: [Worker001] employees@employees@@0.tsv.zst: MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@employees@@0.tsv.zst' REPLACE INTO TABLE `employees`.`employees` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `birth_date`, `first_name`, `last_name`, `gender`, `hire_date`)
ERROR: [Worker000] employees@salaries@@0.tsv.zst: MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@salaries@@0.tsv.zst' REPLACE INTO TABLE `employees`.`salaries` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `salary`, `from_date`, `to_date`)
ERROR: Aborting load...                                               
ERROR: [Worker002] employees@dept_emp@@0.tsv.zst: MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@dept_emp@@0.tsv.zst' REPLACE INTO TABLE `employees`.`dept_emp` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `dept_no`, `from_date`, `to_date`)
                                                                      
No data loaded.
4 errors and 0 warnings messages were reported during the load.
ERROR: Error loading dump

Other compatibility issues

The TiDB logs show that the tool tries to run SELECT @@SESSION.sql_generate_invisible_primary_key which is not supported by TiDB. However as this variable was recently introduced in MySQL 8.0 the tool ignores this error.

Why compatibility is needed

  • This may be an alternative to using dumpling/DM for smaller setups
  • MySQL Shell might be a tool that is already installed when doing migrations from MySQL to TiDB
  • We should allow users to pick the tools they like, trust, etc.
  • There are probably other tools that have similar requirements

Tasks

Important:

  • Support for LOAD DATA LOCAL INFILE ... REPLACE INTO
  • Document LOAD DATA LOCAL INFILE ... REPLACE INTO MySQL compatibility

Minor and/or nice to have:

@dveeden dveeden added the type/enhancement The issue or PR belongs to an enhancement. label May 10, 2021
@dveeden
Copy link
Contributor Author

dveeden commented May 10, 2021

For the TRIGGER permission:

mysql 8.0.22 > CREATE USER foo;
Query OK, 0 rows affected (0.01 sec)

mysql 8.0.22 > GRANT TRIGGER ON *.* TO foo;
Query OK, 0 rows affected (0.01 sec)

mysql 8.0.22 > SHOW GRANTS FOR foo\G
*************************** 1. row ***************************
Grants for foo@%: GRANT TRIGGER ON *.* TO `foo`@`%`
1 row in set (0.00 sec)
tidb 5.7.25-TiDB-v5.0.1 > CREATE USER foo;
Query OK, 0 rows affected (0.03 sec)

tidb 5.7.25-TiDB-v5.0.1 > GRANT TRIGGER ON *.* TO foo;
Query OK, 0 rows affected (0.03 sec)

tidb 5.7.25-TiDB-v5.0.1 > SHOW GRANTS FOR foo\G
*************************** 1. row ***************************
Grants for foo@%: GRANT Trigger ON *.* TO 'foo'@'%'
1 row in set (0.00 sec)

It may be comparing TRIGGER against Trigger, but I'm not sure yet

dveeden added a commit to dveeden/docs that referenced this issue May 10, 2021
- See pingcap/tidb#24515

```
MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@titles@@0.tsv.zst' REPLACE INTO TABLE `employees`.`titles` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `title`, `from_date`, `to_date`)
```
dveeden added a commit to dveeden/docs that referenced this issue May 10, 2021
- See pingcap/tidb#24515

```
MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@titles@@0.tsv.zst' REPLACE INTO TABLE `employees`.`titles` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `title`, `from_date`, `to_date`)
```
dveeden added a commit to dveeden/docs that referenced this issue May 10, 2021
- See pingcap/tidb#24515

```
MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@titles@@0.tsv.zst' REPLACE INTO TABLE `employees`.`titles` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `title`, `from_date`, `to_date`)
```
TomShawn pushed a commit to pingcap/docs that referenced this issue May 12, 2021
- See pingcap/tidb#24515

```
MySQL Error 1105 (HY000): Load Data: don't support load data with replace field: LOAD DATA LOCAL INFILE '/tmp/mysqlsh_dump_employees/employees@titles@@0.tsv.zst' REPLACE INTO TABLE `employees`.`titles` CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY '	' ESCAPED BY '\\' LINES STARTING BY '' TERMINATED BY '\n' (`emp_no`, `title`, `from_date`, `to_date`)
```
@aioas
Copy link

aioas commented Dec 31, 2021

When will LOAD DATA LOCAL INFILE ... REPLACE INTO be supported? This is very important to us!!!

@dveeden
Copy link
Contributor Author

dveeden commented Dec 31, 2021

When will LOAD DATA LOCAL INFILE ... REPLACE INTO be supported? This is very important to us!!!

@aioas How are you using or planning to use LOAD DATA LOCAL INFILE ... REPLACE INTO? any specific usecase? With MySQL Shell dump and load or something else? Information like this can help with prioritization and also to make sure we fully cover that usecase.

@zn9f
Copy link

zn9f commented Dec 31, 2021

When will LOAD DATA LOCAL INFILE ... REPLACE INTO be supported? This is very important to us!!!

@aioas How are you using or planning to use LOAD DATA LOCAL INFILE ... REPLACE INTO? any specific usecase? With MySQL Shell dump and load or something else? Information like this can help with prioritization and also to make sure we fully cover that usecase.

For example, in a scenario, I use load data to import 10,000 pieces of data into TiDB, but it fails when the 1000th item is imported. In order to ensure the consistency of the data, I need to import these 10,000 pieces of data from the beginning. I don’t want to delete the imported data through programs or manually. At this time, I need to use load data...replace... to ensure the correctness of the repeated imported data.

@aioas
Copy link

aioas commented Jan 4, 2022

When will LOAD DATA LOCAL INFILE ... REPLACE INTO be supported? This is very important to us!!!

@aioas How are you using or planning to use LOAD DATA LOCAL INFILE ... REPLACE INTO? any specific usecase? With MySQL Shell dump and load or something else? Information like this can help with prioritization and also to make sure we fully cover that usecase.

We have a large number of offline computing scenarios of big data, and regularly update user data to the latest, new users are inserted, and old users are updated. The most suitable use is LOAD DATA LOCAL INFILE ... REPLACE INTO. Currently, there is no better alternative, so we really hope to support it.

@lance6716
Copy link
Contributor

LOAD DATA ... REPLACE INTO should be supported in #41947 . Welcome to test it in the coming release

@dveeden dveeden mentioned this issue Mar 20, 2024
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

4 participants