-
Notifications
You must be signed in to change notification settings - Fork 484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PS-3889: Native Partitioning for TokuDB and RocksDB for 5.7 #2493
PS-3889: Native Partitioning for TokuDB and RocksDB for 5.7 #2493
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot to review and understand here. Fortunately, the great bulk of the working code is literally directly copied from the existing generic partition handler. The most interesting thing is how most of the existing partitioning tests pass without any modifications (except for the deprecation warning). Ttat is a very good sign. I like the reuse/factoring out of the common code to be shared by the two engines and how minimal the actual engine changes are. I am a little concerned about the changes needed in the server code, but, they are not terribly invasive and seem needed in order to use the .frm files and not the .par files. Other than the one 'dev' test, nothing really jumps out at me as a potential problem. I will be looking at this again a few times to see if anything else comes out, but, for now I wouldn't hesitate to approve this pending a review from @laurynas-biveinis
@@ -0,0 +1,244 @@ | |||
# This test is for debugging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not understanding the purpose of this test. Is it purely for development purposes and should be removed or does it contain useful test cases that cover cases that are not covered in other tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is only for debugging, forgot to remove.
One thing I realized is missing that we talked about a while ago is the ability to enable/disable the use of native parts in 5.7. I don't know how difficult this might be. Ideally we want new options tokudb_enable_native_partition and rocksdb_enable_native_partition with defaults to off. This is specifically because we are releasing this mid-series 5.7 and this changes the default behavior of partitioning. We need try to maintain the existing behavior as well as offer a sort of 'emergency exit' just in case a user runs into some critical issue with the native partitioning. These options do not need to be dynamic, they can be global read-only. @vlad-lesin do you think this would be difficult or take too much time? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(partial review as discussed)
--exec echo $MYSQL_TMP_DIR/bootstrap.log | ||
|
||
--echo # Restart the server | ||
--exec echo "restart: --loose-skip-log-bin --skip-log-slave-updates --datadir=$MYSQLD_DATADIR1 $ADDITIONAL_OPTS" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use restart_mysqld.inc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
--file_exists $DATA_ARCH_PATH | ||
--exec unzip -qo $DATA_ARCH_PATH -d $MYSQL_TMP_DIR | ||
|
||
--echo # Restart the server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -0,0 +1,244 @@ | |||
# This test is for debugging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More descriptive comment please,for exampple "RocksDB native partitioning development tests"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is only for debugging, forgot to remove.
CREATE TABLE t1 (i INT, j INT, k INT, PRIMARY KEY (i)) ENGINE = ROCKSDB PARTITION BY KEY(i) PARTITIONS 4; | ||
ALTER TABLE t1 CHECK PARTITION p1; | ||
DROP TABLE t1; | ||
--exit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either enable all of the test, either truncate here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is only for debugging, forgot to remove.
--source ../include/have_write_committed.inc | ||
|
||
# For data directory created in previous major version branch. | ||
--source include/not_valgrind.inc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just copied from somewhere. Removed.
sql/partitioning/partition_base.h
Outdated
virtual int index_end() { return Partition_helper::ph_index_end(); } | ||
|
||
/** | ||
@breif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brief
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the original comment. I was not so bored to fix the original typos :) Fixed.
sql/partitioning/partition_base.h
Outdated
private: | ||
static const uint NO_CURRENT_PART_ID; | ||
int loop_extra(enum ha_extra_function operation); | ||
void late_extra_cache(uint partition_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO remove? Seems MyISAM-related
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, removed MyISAM-related stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The testing will be latter.
sql/partitioning/partition_base.cc
Outdated
if (!my_stat(path, &frm_stat_info, MYF(0))) | ||
{ | ||
// if .frm file does not exist don't treat it as error | ||
if (my_errno() == ENOENT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (cond) return true; return false; -> return cond;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
sql/partitioning/partition_base.cc
Outdated
if (head[0] != (uchar)254 || head[1] != 1 || | ||
!(head[2] == FRM_VER || head[2] == FRM_VER + 1 || | ||
(head[2] >= FRM_VER + 3 && head[2] <= FRM_VER + 4))) | ||
// Upon failure, return NULL, but here, we have to close the file first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment differs from code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the comment. The file will be closed on scope exit.
sql/partitioning/partition_base.cc
Outdated
part_name_length+= subpart_name_length; | ||
} | ||
|
||
create_partition_name(out_buf, path, part_name, NORMAL_PART_NAME, FALSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/FALSE/false/g
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sql/partitioning/partition_base.cc
Outdated
create_partition_name(out_buf, path, part_name, NORMAL_PART_NAME, FALSE); | ||
} | ||
|
||
Parts_share_refs::Parts_share_refs() : num_parts(0), ha_shares(NULL) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/NULL/nullptr/g
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sql/partitioning/partition_base.cc
Outdated
|
||
Parts_share_refs::~Parts_share_refs() | ||
{ | ||
uint i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare at use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sql/partitioning/partition_base.h
Outdated
for this since the MySQL Server sometimes allocating the handler object | ||
without freeing them. | ||
*/ | ||
ulong m_low_byte_first; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
sql/partitioning/partition_base.cc
Outdated
DATA DIRECTORY and INDEX DIRECTORY are never applied to the whole | ||
partitioned table, only its parts. | ||
*/ | ||
my_bool from_alter= (create_info->data_file_name == (const char *)-1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/my_bool/bool/g
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
sql/partitioning/partition_base.cc
Outdated
file= m_is_clone_of->m_file; | ||
|
||
/* | ||
The following code can be refactored. The same pattern of iteration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the comment :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
sql/partitioning/partition_base.cc
Outdated
implicit_emptied= m_file[0]->implicit_emptied; | ||
/* | ||
Add 2 bytes for partition id in position ref length. | ||
ref_length=max_in_all_par |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done?
sql/table.cc
Outdated
{ | ||
uchar *pos, *buf; | ||
uint names, length; | ||
uint names, length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spurious change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is autoformatting with clang-format-diff. Removed as discussed.
sql/table.h
Outdated
bool update_generated_read_fields(uchar *buf, TABLE *table, | ||
uint active_index= MAX_KEY); | ||
|
||
bool update_generated_write_fields(const MY_BITMAP *bitmap, TABLE *table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace-only changes above the added declaration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same, fixed.
storage/rocksdb/ha_rocksdb.cc
Outdated
DBUG_ASSERT(!mysqld_embedded); | ||
|
||
if (rocksdb_db_options->max_open_files > (long) open_files_limit) { | ||
if (rocksdb_db_options->max_open_files > (long)open_files_limit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace-only change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same, fixed.
storage/rocksdb/ha_rocksdb.h
Outdated
@@ -865,6 +871,13 @@ class ha_rocksdb : public my_core::handler { | |||
return m_store_row_debug_checksums && (rand() % 100 < m_checksums_pct); | |||
} | |||
|
|||
int rename_partitioned_table(const char *const from, const char *const to, | |||
const std::string &partition_string) | |||
MY_ATTRIBUTE((__warn_unused_result__)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/.../MY_NODISCARD/g
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fixed for the changed code.
storage/tokudb/ha_tokudb.h
Outdated
MY_NODISCARD int send_upsert_message(List<Item> &update_fields, | ||
List<Item> &update_values, | ||
DB_TXN *txn); | ||
MY_NODISCARD int send_upsert_message(List<Item>& update_fields, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting-only change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completed a review pass in individual comments
8256425
to
0f89dbe
Compare
764df7f
to
d5edfcf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you show how the MTR run log runs with your added combinations?
The MTR combination feature looks good, but I am not sure whether its application here is as good - it adds half a million .result file lines, which are impossible to review manually. The way this ideally should be done that the tests run in combinations, but using a single output file, by disabling warnings etc. @georgelorchpercona what do you think?
@@ -9,5 +9,7 @@ There should be *no* long test name listed below: | |||
select variable_name as `There should be *no* variables listed below:` from t2 | |||
left join t1 on variable_name=test_name where test_name is null ORDER BY variable_name; | |||
There should be *no* variables listed below: | |||
rocksdb_enable_native_partition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a sys_vars test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added both for MyRocks and TokuDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With tests added, this all_vars diff should have disappeared
@@ -9,5 +9,7 @@ There should be *no* long test name listed below: | |||
select variable_name as `There should be *no* variables listed below:` from t2 | |||
left join t1 on variable_name=test_name where test_name is null ORDER BY variable_name; | |||
There should be *no* variables listed below: | |||
rocksdb_enable_native_partition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With tests added, this all_vars diff should have disappeared
sql/CMakeLists.txt
Outdated
@@ -475,6 +475,16 @@ ELSE() | |||
SET(MYSQLD_SOURCE main.cc ${DTRACE_PROBES_ALL}) | |||
ENDIF() | |||
|
|||
CHECK_CXX_COMPILER_FLAG(-std=gnu++11 CXX11_OPTION_SUPPORTED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment not addressed
sql/handler.cc
Outdated
lock_type == TL_READ_NO_INSERT || | ||
(lock_type != TL_IGNORE && thd->lex->sql_command != SQLCOM_SELECT)) && | ||
thd->lex->sql_command != SQLCOM_ALTER_TABLE && | ||
thd->lex->sql_command != SQLCOM_CHECK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vlad-lesin @georgelorchpercona what's the state of this?
sql/partitioning/partition_base.cc
Outdated
})) | ||
DBUG_RETURN(0); | ||
|
||
auto part_name_i= part_name_collection.begin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cbegin
storage/rocksdb/ha_rockspart.cc
Outdated
|
||
err: | ||
delete new_handler; | ||
DBUG_RETURN(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullptr
storage/tokudb/ha_tokupart.cc
Outdated
/* If this->table == NULL, then the current handler has been created but not | ||
opened. Prohibit cloning such handler. */ | ||
if (!table) | ||
DBUG_RETURN(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullptr
storage/tokudb/ha_tokupart.cc
Outdated
new_handler = new (mem_root) | ||
ha_tokupart(ht, table_share, m_part_info, this, mem_root); | ||
if (!new_handler) | ||
DBUG_RETURN(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullptr
storage/tokudb/ha_tokupart.cc
Outdated
|
||
err: | ||
delete new_handler; | ||
DBUG_RETURN(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullptr
storage/tokudb/hatoku_hton.cc
Outdated
ha_tokupart* file = new (mem_root) ha_tokupart(hton, table); | ||
if (file && file->init_partitioning(mem_root)) { | ||
delete file; | ||
return (NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullptr
e6ab001
to
f4d409c
Compare
The comment about this #2493 (comment) note. If we remove the change there will be test fails like the following:
See also https://jira.percona.com/browse/PS-4762. |
558581d
to
54ac323
Compare
Also comments were added for #2493 (comment) and #2493 (comment), and removed for #2493 (comment). |
54ac323
to
1940d6d
Compare
#2493 (comment) - sys_vars testing: https://jenkins.percona.com/view/5.7/job/mysql-5.7-param/1925/ . |
Ready for the next round. |
I don't see any added comments for e.g. sql/cmakelists.txt comment |
sql/CMakeLists.txt is fixed, see build tests here: https://jenkins.percona.com/view/5.7/job/mysql-5.7-param/1926/ . |
14b0d42
to
6cf118e
Compare
Ready for the next round. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if again some of the comments appear to be replies to hidden/missing comments
Is @georgelorchpercona OK with https://github.com/percona/percona-server/pull/2493/files#r210890922 ?
sql/partitioning/partition_base.cc
Outdated
implicit_emptied= m_file[0]->implicit_emptied; | ||
/* | ||
Add 2 bytes for partition id in position ref length. | ||
ref_length=max_in_all_par |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done?
sql/partitioning/partition_base.h
Outdated
}; | ||
enum_handler_status m_handler_status; | ||
|
||
uint m_num_locks; // For engines like ha_blackhole, which needs no locks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done?
308624c
to
8f5299c
Compare
All the above fixes are done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please address @georgelorchpercona comment and merge
aafc858
to
a250c78
Compare
TokuDB and MyRocks native partitioning is implemented as a copy of partition storage engine in 5.7 with corresponding changes. The base class - Partition_base does the most of the work. The general differences from ha_partition class are the following: 1) all code which works with .par files was removed; 2) the code to read partition information from .frm files was added; 3) the responsibility for creating new handlers for separate partitions is moved to Partition_base class descendants; 4) the initialization sequence for native and non-native partitioning handlers is different, that is why create() and open() functions are changed 5) delete and rename table functions are changed both for Partition_base and it's descendants(ha_tokupart and ha_rockspart) because it's necessary to determine if the table is partitioned before calling the corresponding functionality. The changes in the server code are the following: 1) remove .frm file after a table has been removed to let handler the ability to read partition info from .frm file during "delete table" execution; 2) remove "static" keyword from some functions signature to use them outside of their module. Two new system variables were added: tokudb_enable_native_patition - enable native partitioning for TokuDB, read-only, default value is 'off', rocksdb_enable_native_patition - enable native partitioning for RocksDB, read-only, default value is 'off'. The ability to have custom result directory for the certain options combination was added to MTR. To use it just add option "mtr-result-dir" in the certain options group in 'combinations' file. The ability to add only the certain list of tests for the certain option combination is added to MTR. For this purpose the new "mtr-tests-list" option is introduced. This option contains comma-separated list of tests in the current suite, which will be executed for the current options combination. If the option is missed then all tests from the suite will be executed for the combination. As an example of the new options see changed and newly created "combination" files of this commit. The above features are used to test TokuDB and RocksDB native and non-native partitioning, that is why the corresponding test suites have two directories for result files for one directory of tests.
a250c78
to
826fa7c
Compare
TokuDB and MyRocks native partitioning is implemented as a copy of partition
storage engine in 5.7 with corresponding changes.
The base class - Partition_base does the most of the work.
The general differences from ha_partition class are the following:
to Partition_base class descendants;
is different, that is why create() and open() functions are changed
it's descendants(ha_tokupart and ha_rockspart) because it's necessary to
determine if the table is partitioned before calling the corresponding
functionality.
The changes in the server code are the following:
to read partition info from .frm file during "delete table" execution;
their module.
To see the difference between the original ha_partition and Partition_base classes use the following command: diff -u -w ./storage/partition/ha_partition.cc ./sql/partitioning/partition_base.cc
Testing: https://jenkins.percona.com/view/5.7/job/mysql-5.7-param/1908/