Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-3890: Native Partitioning for TokuDB and RocksDB for 8.0 #2551

Conversation

vlad-lesin
Copy link
Contributor

@vlad-lesin vlad-lesin commented Sep 13, 2018

partition_base.* are copied from ha_partition.* in 5.7 with corresponding fixes.

5.7 required some additional code to read partition info from .frm files, in
8.0 this functionality is not required, the corresponding code was removed.

Currently "inline alter partition" and "exchange partition" are not implemented.

As "inline alter partition" is not implemented, all "alter partition" operations
involve full table copying. There can be such errors as "no partition found"
duriring copying, which must be ignored, what is also implemented in the server
code.

Testing: https://ps.cd.percona.com/view/8.0/job/percona-server-8.0-param/40/
Testing with "--big-test": https://ps.cd.percona.com/view/8.0/job/percona-server-8.0-param/46/

@@ -1,3 +1,3 @@
rocksdb_update_cf_options_basic : native partitioning
rocksdb_update_cf_options_basic : enable when https://bugs.mysql.com/bug.php?id=91763 is fixed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather that we didn't disable this as upstream has acknowledged the issue and a fix coming. We will likely forget to re-enable and verify the result.

unique_sec : native partitioning
add_index_inplace : inplace alter partitioning has not yet been implemented
add_index_inplace_crash : inplace alter partitioning has not yet been implemented
rocksdb_cf_per_partition : enable when https://bugs.mysql.com/bug.php?id=91763 is fixed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, let's leave it enabled and accept the failure until upstream fix is merged.

@@ -1,5 +1,7 @@
--source include/have_debug_sync.inc
--source include/have_rocksdb.inc
# Remove the restriction when https://jira.percona.com/browse/PS-4705 is fixed
--source ../include/have_write_committed.inc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's again accept the failure until we have merged the upstream fix.

@@ -1,5 +1,7 @@
--source include/have_debug_sync.inc
--source include/have_rocksdb.inc
# Remove when https://jira.percona.com/browse/PS-4705 is fixed
--source ../include/have_write_committed.inc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's again accept the failure until we have merged the upstream fix.

@@ -1,6 +1,8 @@
--source include/have_rocksdb.inc
--source include/have_binlog_format_row.inc
--source include/master-slave.inc
# Remove when https://jira.percona.com/browse/PS-4705 is fixed
--source ../include/have_write_committed.inc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's again accept the failure until we have merged the upstream fix.

CLUSTERING KEY `y` (`y`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (x)
KEY `y` (`y`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs rebased to PS 8.0 HEAD and re-recording as clustering keys are now properly supported.

sql/handler.h Outdated
@@ -4164,6 +4164,10 @@ class handler {
*/

virtual bool is_ignorable_error(int error);
virtual bool continue_partition_copying_on_error(int error) {
(void)(error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this and not attribute unused?

@george-lorch
Copy link
Contributor

george-lorch commented Sep 18, 2018

Mock merge to 8.0 HEAD and test, some failure:

tokudb.type_temporal_fractional [ fail ]
Test ended at 2018-09-18 21:14:19

CURRENT_TEST: tokudb.type_temporal_fractional
mysqltest: At line 6769: query 'CREATE TABLE t1 (
col_time_5_not_null TIME(5) NOT NULL,
col_timestamp_4_not_null_key TIMESTAMP(4) NOT NULL DEFAULT '1971-01-01 00:00:00.0000',
col_timestamp_1 TIMESTAMP(1) NOT NULL DEFAULT '1971-01-01 00:00:00.0',
pk TIMESTAMP NOT NULL DEFAULT '1971-01-01 00:00:00',
col_date_key date DEFAULT NULL,
col_time TIME DEFAULT NULL,
col_timestamp TIMESTAMP NOT NULL DEFAULT '1971-01-01 00:00:00',
col_time_3_not_null_key TIME(3) NOT NULL,
PRIMARY KEY (pk),
KEY col_timestamp_4_not_null_key (col_timestamp_4_not_null_key),
KEY col_date_key (col_date_key),
KEY col_time_3_not_null_key (col_time_3_not_null_key)
) ENGINE=MEMORY /*!50100 PARTITION BY KEY (pk)PARTITIONS 2 */' failed: 1178: The storage engine for the table doesn't support native partitioning

@george-lorch
Copy link
Contributor

Mock merge to 8.0 HEAD and test, some failure:
tokudb.parts.partition_mgm_lc1_tokudb [ fail ]
Test ended at 2018-09-18 21:18:21

Server log is:
2018-09-18T21:18:20.015452Z 0 [Warning] [MY-010099] [Server] Insecure configuration for --secure-file-priv: Data directory is accessible through --secure-file-priv. Consider choosing a different directory.
2018-09-18T21:18:20.015498Z 0 [Warning] [MY-010101] [Server] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2018-09-18T21:18:20.015553Z 0 [System] [MY-010116] [Server] /home/glorch/dev/8.0/percona-server-install-8.0/bin/mysqld (mysqld 8.0.12-1-debug) starting as process 27110
2018-09-18T21:18:20.302128Z 1 [ERROR] [MY-011087] [Server] Different lower_case_table_names settings for server ('1') and data dictionary ('0').
2018-09-18T21:18:20.302624Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-09-18T21:18:20.302651Z 0 [ERROR] [MY-010119] [Server] Aborting
2018-09-18T21:18:21.539132Z 0 [System] [MY-010910] [Server] /home/glorch/dev/8.0/percona-server-install-8.0/bin/mysqld: Shutdown complete (mysqld 8.0.12-1-debug) Source distribution.

I believe you need to add --bootstrap --lower-case-table-names=1 to the head of partition_mgm_lc1_tokudb-master.opt

@george-lorch
Copy link
Contributor

Overall it looks OK, but, I have concerns over the massive differences in result content from tokudb.parts.part_supported_sql_func_tokudb and the difference in rocksdb.partition. It almost looks like drop partition isn't working or something.

@laurynas-biveinis
Copy link
Contributor

@georgelorchpercona , I disagree with having multiple test failures for a few months while waiting for the upstream fix.

@george-lorch
Copy link
Contributor

With any luck, it is not months but weeks and hopefully before 8.0 GA. We did not disable failing tests in 5.7 when the issue initially arose, so why disable on the port. There are other tests outside of parts that are also failing due to same issue, we did not disable those either. Leave them disabled if you wish then, but I prefer to have them fail in my face knowing that they need to be re-evaluated when the fix comes. Tests that get disabled tend to stay disabled forever.

@laurynas-biveinis
Copy link
Contributor

I didn't realize we have intentionally-failing tests on 5.7, they should be disabled too, our MTR results are already extremely poor as is. The re-enablement should be handled by JIRA - I am not even sure having them around helps there, because we are already desensitized to them

@laurynas-biveinis
Copy link
Contributor

Discussed offline, OK to leave enabled

@vlad-lesin
Copy link
Contributor Author

tokudb.type_temporal_fractional [ fail ]
...
) ENGINE=MEMORY /*!50100 PARTITION BY KEY (pk)PARTITIONS 2 */' failed: 1178: The storage engine for the table doesn't support native partitioning

Storage engine here is "MEMORY". I didn't touch any storage engines except TokuDB and RocksDB.

@vlad-lesin
Copy link
Contributor Author

Overall it looks OK, but, I have concerns over the massive differences in result content from tokudb.parts.part_supported_sql_func_tokudb and the difference in rocksdb.partition. It almost looks like drop partition isn't working or something.

The difference in rocksdb.partition result file is only in warnings.

About tokudb.parts.part_supported_sql_func_tokudb - yes, there is a difference between inplace and non-inplace drop partition algorithm. For inplace drop partition just corresponding files are removed. For non-inplace - all rows are copied from the old table to the new one. And if any row from dropped partition suits for any partition in the table, the row will be copied to the new(altered) table, not removed.

This is common algorithm which is implemented in the server code, not engine-specific feature.

@laurynas-biveinis
Copy link
Contributor

Should I review it now or wait until #2493 comments are addressed?

@vlad-lesin vlad-lesin force-pushed the native-partitioning-toku-rocks-non-inplace-alter-partition-PR-8.0.12 branch 3 times, most recently from 59077b9 to 9ca5028 Compare October 2, 2018 09:02

--echo # Remove copied files
--remove_file $DATA_ARCH_PATH
--exec rm -rf $MYSQLD_DATADIR1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace with force-rmdir

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced.

#
#insert into t1 values (1, repeat('0123456789', 9));
#
#drop table t1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not comment out code like this. If this is temporary, add reason and track somewhere in JIRA, otherwise remove

--source include/have_rocksdb.inc
--source ../include/have_write_committed.inc

#Zipped data directory was created with default 16K page size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space after #

ha_tokupart *file = new (mem_root) ha_tokupart(hton, table);
if (file && file->init_partitioning(mem_root)) {
destroy(file);
return (NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nullptr

@vlad-lesin vlad-lesin force-pushed the native-partitioning-toku-rocks-non-inplace-alter-partition-PR-8.0.12 branch 5 times, most recently from fd3eb9c to 41079ef Compare October 9, 2018 14:53
@vlad-lesin
Copy link
Contributor Author

@laurynas-biveinis , ready for the next round.

rename_error:
to_name = to_names.begin();
for (const std::string &from_name : from_names) {
// TODO: fix nullptr here. Currently tokudb does not process dd::Table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log a JIRA issue if not done already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need in Jira issue. Fixed the comment as it was discussed 1on1.

@@ -0,0 +1,118 @@
/* Copyright (C) 2018 Percona
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this file be listed in CMakeLists.txt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already listed. Please see storage/rocksdb/CMakeLists.txt

@@ -120,6 +120,7 @@ INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR}/${TOKU_FT_DIR_NAME}/buildheader)
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR}/${TOKU_FT_DIR_NAME}/portability)



Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spurious diff. But where is ha_tokupart.cc included?

@vlad-lesin vlad-lesin force-pushed the native-partitioning-toku-rocks-non-inplace-alter-partition-PR-8.0.12 branch from 41079ef to 1473861 Compare October 22, 2018 09:22
partition_base.* are copied from ha_partition.* in 5.7 with corresponding fixes.

5.7 required some additional code to read partition info from .frm files, in
8.0 this functionality is not required, the corresponding code was removed.

Currently "inline alter partition" and "exchange partition" are not implemented.

As "inline alter partition" is not implemented, all "alter partition" operations
involve full table copying. There can be such errors as "no partition found"
duriring copying, which must be ignored, what is also implemented in the server
code.

The warning is added on alter partitioning to notify that inplace partition
altering is not supported.

The following test results were rewritten due to new warning:

rocksdb      partition
tokudb.bugs  alter_part_tokudb_bug_155
tokudb.parts partition_mgm_lc10_tokudb partition_special_tokudb
             partition_alter3_tokudb part_supported_sql_func_tokudb

Testcase for http://bugs.mysql.com/bug.php?id=71270 was ported from 5.7. See
bug71270.* files.
@vlad-lesin vlad-lesin force-pushed the native-partitioning-toku-rocks-non-inplace-alter-partition-PR-8.0.12 branch from 1473861 to c5dec6e Compare October 22, 2018 11:28
@vlad-lesin vlad-lesin merged commit 3be6cd7 into percona:8.0 Oct 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants