Data corruption in ArraySegment using unsafe deserializer #114

Closed
BaldMan82 opened this issue Dec 14, 2015 · 5 comments

@BaldMan82

I have a strange issue with the unsafe deserializer. I'm using Bond in a project to serialize about 2000 million objects to disk.

While running an integrity check on the data files, I noticed that, very occasionally, Guid properties (stored using ArraySegment) come back corrupt. When I deserialize that same byte array with a safe deserializer, the data is correct. So luckily it is not a problem in the serializer.

I have created a small test project to demonstrate this and included a piece of problematic data as a resource byte array. It would be great if you could take a look at it.

https://drive.google.com/file/d/0B9HTQM8Ned3eMGVpYnlsVm12WDA/view?usp=sharing

@sapek
Contributor

sapek commented Dec 14, 2015

The bug is on line 45 of TrackingReport.cs. You can't assume that ArraySegment has offset zero.

BTW, there is a cleaner, simpler and more performant way to use Guid in your Bond generated types: custom alias type mapping.
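
A rough sketch of what such a mapping could look like for Guid. It assumes the schema aliases a blob to System.Guid and that Bond discovers a static BondTypeAliasConverter class with paired Convert overloads in the same assembly and namespace as the generated types, as described in the Bond C# manual. The schema alias and compiler options are not shown, and the method bodies are illustrative, not taken from the reporter's project. The read path honors Offset and Count instead of assuming the segment starts at index zero.

```csharp
using System;

public static class BondTypeAliasConverter
{
    // blob -> Guid: copy exactly the 16 bytes that the segment covers.
    public static Guid Convert(ArraySegment<byte> value, Guid unused)
    {
        // Assumption for this sketch: an empty blob maps to Guid.Empty.
        if (value.Count == 0)
            return Guid.Empty;

        if (value.Count != 16)
            throw new ArgumentException("Expected a 16-byte segment.", "value");

        var bytes = new byte[16];
        // Start at value.Offset, not at index 0 of the backing array.
        Buffer.BlockCopy(value.Array, value.Offset, bytes, 0, 16);
        return new Guid(bytes);
    }

    // Guid -> blob: the fresh array is wrapped whole, so Offset is 0 here.
    public static ArraySegment<byte> Convert(Guid value, ArraySegment<byte> unused)
    {
        return new ArraySegment<byte>(value.ToByteArray());
    }
}
```

With a converter along these lines, the generated class would expose the field as a Guid directly, so no manual ArraySegment handling is needed in property getters.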

@sapek
Contributor

sapek commented Dec 14, 2015

Another comment, unrelated to your question, about something I noticed in your schema definitions: you have declared many nullable fields. Often a better approach is to declare such fields with default nothing (we need to refactor the docs so that this info is not hidden in the C++ manual, #115).

@BaldMan82
Author

Hi sapek, thanks for the quick reply.

I have checked the code, but I'm using the ToArray() extension on the ArraySegment struct, which takes the offset and count into account. An offset bug in my code would also not explain why the unsafe deserializer works in 99% of the cases while the safe deserializer is 100% correct on the same data.

In the attached demo project, only one of the 318 items in TrackingReport_Container.TrackingReports is corrupt with the unsafe deserializer, while with the safe deserializer they are all correct. That's strange, right?

I've tested the ToArray() function and it works as expected as far as I can see...

            // Requires: using System; using System.Linq;
            var arr = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

            // Three segments over the same backing array, at offsets 0, 3 and 6.
            var arrSegs = new ArraySegment<int>[3];
            arrSegs[0] = new ArraySegment<int>(arr, 0, 3);
            arrSegs[1] = new ArraySegment<int>(arr, 3, 3);
            arrSegs[2] = new ArraySegment<int>(arr, 6, 3);

            Console.WriteLine("Array 1: " + string.Join(",", arrSegs[0].ToArray()));
            Console.WriteLine("Array 2: " + string.Join(",", arrSegs[1].ToArray()));
            Console.WriteLine("Array 3: " + string.Join(",", arrSegs[2].ToArray()));

            Console.ReadLine();

            // Outputs:
            // Array 1: 0,1,2
            // Array 2: 3,4,5
            // Array 3: 6,7,8

@sapek added the bug label Dec 15, 2015
@sapek
Contributor

sapek commented Dec 15, 2015

You are right. I only glanced at your code and misread it as Array.
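
For readers following along, the difference between the two members can be sketched as follows. The class and method names are hypothetical and exist only for this illustration; they are not from the reporter's TrackingReport.cs or from Bond. Reading through the Array property is only correct when the segment happens to start at index zero, which is why an offset assumption can go unnoticed on most inputs.

```csharp
using System;
using System.Linq;

public static class SegmentDemo
{
    // Offset-aware: ToArray() copies only the Count bytes starting at Offset.
    public static Guid ReadGuid(ArraySegment<byte> segment)
    {
        return new Guid(segment.ToArray());
    }

    // Offset-blind: Array is the whole backing buffer, so this reads the
    // first 16 bytes of the buffer regardless of where the segment starts.
    public static Guid ReadGuidIgnoringOffset(ArraySegment<byte> segment)
    {
        var bytes = new byte[16];
        Array.Copy(segment.Array, 0, bytes, 0, 16);
        return new Guid(bytes);
    }
}
```

Both methods return the same value for a segment with Offset 0 and differ once the segment is a window into a larger buffer.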

This is a bug in InputStream. Thanks a lot for reporting it. I'll push a fix soon.

@sapek closed this as completed in 9e59cd5 Dec 15, 2015
@BaldMan82
Author

Can confirm this is fixed. Thanks!
