Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary(16) field with trailing 0 in bytes is truncated #169

Open
criccomini opened this issue May 16, 2017 · 8 comments
Open

Binary(16) field with trailing 0 in bytes is truncated #169

criccomini opened this issue May 16, 2017 · 8 comments

Comments

@criccomini
Copy link

criccomini commented May 16, 2017

We are observing that binary(16) fields appear to be truncated when they have trailing 0 bytes (i.e. null terminated) in them. A detailed write up is here:

https://issues.jboss.org/browse/DBZ-254

We are on Debezium 0.4.0, which seems to use 0.9.0 of the mysql-binlog-connector-java library.

@shyiko
Copy link
Owner

shyiko commented May 16, 2017

@shyiko
Copy link
Owner

shyiko commented May 16, 2017

@criccomini it looks like I'm not gonna be able to fix this on the mysql-binlog-connector-java side.

BINARY data is zero-padded (see https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html). As you probably know, binary log does not contain information about the types (there is no way to distinguish between BINARY and CHAR for example) and so there is just not enough information to zero-pad it automatically. Debezium (considering that it knows the precise column types) will have to zero-pad the values on its own (whenever the value is shorter then expected).

I'll keep ticket open until I update the readme.md (quirks section).

@criccomini
Copy link
Author

criccomini commented May 16, 2017

Got it. Thanks for looking into this!

@criccomini
Copy link
Author

Debezium (considering that it knows the precise column types) will have to zero-pad the values on its own (whenever the value is shorter then expected).

Will this work, though? If I INSERT a\0 and INSERT a\0\0 into a BINARY(3), I don't think DBZ can tell the difference no matter what, right? This doesn't seem fixable to me.

@rhauch
Copy link

rhauch commented May 16, 2017

@criccomini I guess it depends on what comes back from the MySQL binlog event. For example, given a column of type BINARY(3) and a value of a, is the length as read by the binlog connector really 1 or 3? If 1 then here couldn't we detect the difference in actual vs column length and right-pad the byte[]? It's not ideal, but it should work. Any chance you have debugged the code?

@criccomini
Copy link
Author

If 1 then here couldn't we detect the difference in actual vs column length and right-pad the byte[]?

But if it's 1, wouldn't you not know whether a was inserted or a\0 or a\0\0?

@shyiko
Copy link
Owner

shyiko commented May 16, 2017

It's going to be 1 in both cases (a\0 and a\0\0) (right-padding is stripped completely). Personally I think the right thing to do here would be to use proper data type - varbinary(16) instead of binary(16) (\0s are NOT stripped away in case of var*s).

@criccomini
Copy link
Author

criccomini commented May 16, 2017

@shyiko if that's the case, I think it's actually a pretty fair argument to make DBZ assume that a fixed-length BINARY column should include the right padded 0's. Basically, my take away is if you're using fixed-length BINARY cols in MySQL, you must be writing binary values that are ONLY that length (or are following a VERY predictable write pattern). Otherwise, you just can't use this data type predictably.

@rhauch what do you think about making DBZ assume, and force right padded 0s?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants