-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move vector combine intrisics to arm/neon.rs #1363
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) soon. |
4c4cb8d
to
5d75d5f
Compare
The "v7" feature should be under a |
Did that. The test results don't seem very promising:
Looking on godbolt to the output of #include <arm_neon.h>
int test() {
return (int) vcombine_u32;
} For Aarch64: test: // @test
adrp x0, vcombine_u32
add x0, x0, :lo12:vcombine_u32
ret
vcombine_u32: // @vcombine_u32
mov v0.d[1], v1.d[0]
ret For ARMv7: test:
movw r0, :lower16:vcombine_u32
movt r0, :upper16:vcombine_u32
bx lr
vcombine_u32:
bx lr I feel like |
The code on arm is actually correct: the Just change all the |
I was just starting to realize the same; I just wrote out the assembly manually, and it started to dawn on me. Thanks for making it explicit to me, because it would've taken another hour or two! If you and the CI are happy with what's here, I'll squash and force-push. |
LGTM! You can actually change the instruction to |
You can also remove the vcombine intrinsics from this list so they get tested against the C intrinsics: https://github.com/rust-lang/stdarch/blob/master/crates/intrinsic-test/missing_arm.txt#L166 |
WASM CI seems to have been broken by rust-lang/rust#105395. In the meantime you can squash the commits, I will merge it when CI is fixed. |
dbe1df7
to
51730b1
Compare
Done! Thanks for the guidance! :-) |
I could not find evidence of the
vcombine
-family of instructions being Aarch64-only. This moves the instructions intoarm_shared/neon/mod.rs
such that they are usable on ARMv7 platforms. The ARM website seems to agree.This code was originally introduced in #546, together with some table lookup instructions; they might also be affected. Of these, I'm only using
vqtbx1q_u8
, and that's A64-only, so I did not dig deeper.This is the first time I'm submitting, so please advice if I should change something!