Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alignment issues in serializers #305

Closed
codeandroid opened this issue Jan 18, 2017 · 4 comments
Closed

Alignment issues in serializers #305

codeandroid opened this issue Jan 18, 2017 · 4 comments
Assignees
Labels

Comments

@codeandroid
Copy link
Contributor

We began using Bond in a project compiled with emscripten. [1]

Running deserializers for CompactBinary, FastBinary and SimpleJson always resulted in an "alignment fault" when executing the resulting asm.js code.

The following exemplary piece of code demonstrates the issue and a possible fix. (There are other similar pieces of code here and there.)

It is located in cpp/inc/bond/stream/input_buffer.h:

template <typename T>
void Read(T& value)
{
    if (sizeof(T) > _blob.length() - _pointer)
    {
        EofException(sizeof(T));
    }
    
    // ORIGINAL
    //value = *reinterpret_cast<const T*>(_blob.content() + _pointer);

    // FIX
    std::memcpy(&value, _blob.content() + _pointer, sizeof(T));

    _pointer += sizeof(T);
}

Emscripten is a platform requiring loads and stores to be aligned (like MIPS, PowerPC, etc. but not like x86 which allows unaligned access though not without impact).

Also, I think, the original code violates the C++ strict aliasing rules...

Using std::memcpy should be safe and most compilers should detect the use of std::memcpy and generate properly optimized code.

I do not understand this part of C++ deep enough to be a 100% sure that this is the right approach as I'm developing for x86 and x86_64 where unaligned access just works. Otherwise I would've prepared a PR already. :)

I also tested SimpleJson (when the others failed) and also ran into alignment issues but this time most likely due to RapidJson itself. A fix for RapidJson ( Tencent/rapidjson#353 ) has been merged into master and should be included in the next release ( http://rapidjson.org/md__c_h_a_n_g_e_l_o_g.html ).

->

If you could provide some guidance on how to deal with this issue, whether that's the right direction or whether there's a different solution... that'd be much appreciated.

[1] This is part of an experimental web app sharing code and data between our desktop applications and existing services talking Bond.Comm.

@chwarr
Copy link
Member

chwarr commented Jan 18, 2017

std::memcpy is the C++ standards compliant way to do these sort of type punning conversions. We've made similar fixes elsewhere in the code when we've found them. In particular, I in commit bc4f7d8 I made a very similar change in random_protocol.h.

std::memcpy can only be used on trivially copyable items, so we'd need to static assert or constrain the Read overload. Unfortunately, std::is_trivially_copyable is only available in C++11, and we still support C++03. I couldn't find a Boost equivalent for this. (Most implementations use a compiler trait, so it is hard to support this on compilers with out such a trait.) We may be able to use the same std::is_arithmetic assertion in these cases like I did in random_protocol.h.

We'll need to do some research to check whether std::memcpy transparently handles alignment or not.

@chwarr chwarr added the bug label Jan 18, 2017
@codeandroid
Copy link
Contributor Author

I should've searched for existing memcpy occurrences.

Please, let me know if I can be of help.

@jasonzio
Copy link

Consider using boost::has_trivial_copy as a proxy in C++03 and std::is_trivially_copyable in C++11 and beyond. It's not perfect, but it gets you mostly there.

chwarr added a commit to chwarr/bond that referenced this issue Jan 25, 2017
This is currently broken.

Initial attempt at fixing microsoft#305
@chwarr
Copy link
Member

chwarr commented Jan 25, 2017

is_arithmetic || is_enum looks to be enough for Read. Though, I broke some other things in my attempt to fix this and haven't had a chance to figure out what yet... :-)

@chwarr chwarr self-assigned this Mar 1, 2017
chwarr added a commit to chwarr/bond that referenced this issue Mar 10, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305
chwarr added a commit to chwarr/bond that referenced this issue Mar 10, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305

TODO:

* [ ] Check how performance has changed, if at all
chwarr added a commit to chwarr/bond that referenced this issue Mar 21, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305

TODO:

* [ ] Check how performance has changed, if at all
chwarr added a commit to chwarr/bond that referenced this issue Mar 22, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305
chwarr added a commit to chwarr/bond that referenced this issue Mar 22, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305
chwarr added a commit to chwarr/bond that referenced this issue Mar 23, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305
@chwarr chwarr closed this as completed in 04ad85f Mar 24, 2017
fuzhouch pushed a commit to fuzhouch/bond that referenced this issue Apr 9, 2017
We now use memcpy to pun from one type to another, avoiding aliasing
violations as well as alignment errors. x86-like chips never failed due
to the alignment errors, but others, like some ARM chips, did. This
should also fix emscripten cross-compilation.

Fixes microsoft#305
Closes microsoft#375
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants