-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data corruption in binary and varbinary columns #157
Comments
Can you post an example dataset and query so we can add an integration test? |
Does the InlineVerifier catch this corruption? It should, right? |
should be reproducable with literally any update of a BINARY column type that does uses the full length of the column but happens to have 0-bytes at the end. I can work on writing an integration test in the project - I have a patch that should fix it and can commit it together with the change Unfortunately it looks like the replication library cannot know of the truncation (unless I'm misreading the docs, the mysql replication protocol doesn't transmit the max-size of a column update), so we'll need to extend the schema extraction and handle the padding in ghostferry (but it seems really simple).
I am not overly familiar with this part yet, but from what I have learned so far, I would assume that it does |
There seems to be a bug in the propagation of data from the binlog streamer to the binlog writer when using fixed-length BINARY columns. This commit adds an integration test that currently works only under very specific conditions, and prepares the code for the fix to the data propagation logic. Change-Id: Icef7ae558a38af4088280bb633f331711b835502
THIS PULL REQUEST IS NOT INTENDED TO BE MERGED, JUST FOR INTERNAL DISCUSSION. WE WILL NEED TO MODIFY THE UPSTREAM VERSION OF THE VENDOR'ED MODULE INSTEAD OF MODIFYING THE IMPORTED VERSION! This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. Change-Id: I7cd01c953ef2027e4f4fa71a62c5d7b30caa83c5
ok, sent you 2 pull requests - the first one is to illustrate the problem, the second one contains a potential fix - but the second one should probably not be merged as-is, as I'm actually changing the vendor'ed module to show what I'd propose changing let's discuss the details there |
There seems to be a bug in the propagation of data from the binlog streamer to the binlog writer when using fixed-length BINARY columns. This commit adds an integration test that currently works only under very specific conditions, and prepares the code for the fix to the data propagation logic. Change-Id: I0e8ca7bc3aa97b3a1a266b5e3bc5972abc6c39b4
There seems to be a bug in the propagation of data from the binlog streamer to the binlog writer when using fixed-length BINARY columns. This commit adds an integration test that currently works only under very specific conditions, and prepares the code for the fix to the data propagation logic. Change-Id: I0e8ca7bc3aa97b3a1a266b5e3bc5972abc6c39b4
THIS PULL REQUEST IS NOT INTENDED TO BE MERGED, JUST FOR INTERNAL DISCUSSION. WE WILL NEED TO MODIFY THE UPSTREAM VERSION OF THE VENDOR'ED MODULE INSTEAD OF MODIFYING THE IMPORTED VERSION! This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. Change-Id: I7cd01c953ef2027e4f4fa71a62c5d7b30caa83c5
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
FYI: tracking the upstream vendor module change here: let's see what's the feedback we get there NOTE: To increase chances to get it into upstream, I've made the change a bit more generic (added not only the |
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a
When retrieving updates via binlog , the replication module is stripping trailing 0-bytes from the data in data updates. This affects the data inserted via insert/update statements as well as the "where" clause for update/delete statements.
As a result, a target DB starts to diverge in the data that it stores.
IMO this is a bug in the upstream library we use, and I have opened a ticket with them:
go-mysql-org/go-mysql#477
However, existing users of the library may break if the library is changed, which is why I think we may need to work around the bug in ghostferry directly.
To do this, we need to update go-mysql to parse BINARY/VAR_BINARY table column definition (it currently assumes they are simple string columns) and extend input array before using the data in calls like BinlogUpdateEvent. AsSQLString
The text was updated successfully, but these errors were encountered: