-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document endian-dependent behavior, if any #393
Comments
So I've added two tests that should break, and they did indeed fail on powerpc and powerpc64, while they pass on powerpc64el: https://travis-ci.org/gnzlbg/stdsimd/builds/356837387 This is good news, since it means that we can at least use the powerpc and powerpc64 travis build bots to test big-endian hardware. It is also good news, because the only way to do a bit-cast per the RFC is to use |
I'm not really familiar with this beyond:
It's quite possible that whatever behavior LLVM's ARM-specific built-ins exhibit in the big-endian mode is preferable in terms of portability between little-endian and big-endian ARM compared to how ARM itself had defined them. I don't know. According to Wikipedia, ARM was little-endian before it had a big-endian option. It's really hard to find information of any actual uses of big-endian ARM. It seems that people doing embedded systems in Rust are far more concerned about support for non-ARM architectures than about enabling big-endian ARM. So testing the ARM vendor intrinsics in practice in the big-endian case to find out if they behave portably may prove hard. But if it's too hard to find a way to test them, maybe it's an indication that they don't need to be tested in that configuration. Also, the behavior of ARM intrinsics is moot to the extent this issue is about portable SIMD. |
Thanks @hsivonen . FWIW there is a PR documenting endian-dependent behavior that has some more tests: #394 and the travis powerpc buildbots allow us to test this just fine, so that problem is solved I guess. For the RFC, the only way to trigger this behavior is by using unsafe code ( An API for safe bit-casts should in my opinion provide endian-independent behavior by default. We could and should experiment with doing something like what ARM headers do in the |
Has been documented in https://github.com/rust-lang-nursery/stdsimd/blob/master/crates/coresimd/tests/endian_tests.rs Basically, transmuting between arrays of same element type and length as vectors works in an endian-independent way, but the behavior of transmuting to vectors of the same size but different numbers of lanes is endian dependent. Transmuting vectors to/from tuples of the same element type and number of elements happens to work but is undefined behavior, and transmuting to other tuples produces garbage since the elements of the tuple can be reordered and is therefore undefined behavior. |
@sunfish raised the point that we should specify how portable packed SIMD vector types behave with respect to the endianness of the architecture.
The order of the elements within a vector depends on the endianness of the machine. This could not only affect the results one gets from the indexed operations (
extract
,store
,replace
) and memory load and stores, but also the results of bit casts, for example, when bit-casting i8x16to
i16x8the
i8s might be ordered differently inside each
i16` lanes on big endian machines, affecting the results.The weird thing is that the portable vector type tests are running on PowerPC64 and PowerPC64el and all 3000 tests currently pass. These tests are pretty generic, so maybe we are unlucky and these tests are not hitting any endianness issues.
Or maybe LLVM handles this for us already somehow? We should add a couple of tests that we expect to break in either big-endian or little-endian and see what happens.
If LLVM handles this for us, we should investigate what exactly LLVM does and document that.
If LLVM does not handle this for us, there are many ways about how to proceed:
make endianness machine specific: this would make the vector types not
portable anymore, because asserts like those above would fail or pass dependin
on endianness.
renumber indices: if
extract
,store
, andreplace
would take compile-timeindices, we could easily re-number them in big-endian machines to make the
asserts in the code above pass.
but IMO we should figure out if LLVM handles this for us first before spending time thinking about any hypothetical solutions. If someone has access to real big-endian hardware, it would be nice if that person could run
stdsimd
tests and report if whatever tests we come up with here pass or break.IIUC, the following asserts might pass/fail depending on endianness:
cc @hsivonen who is more familiar with these issues than I am
The text was updated successfully, but these errors were encountered: