Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move vector combine intrisics to arm/neon.rs #1363

Merged
merged 3 commits into from
Dec 11, 2022

Conversation

rubdos
Copy link
Contributor

@rubdos rubdos commented Dec 5, 2022

I could not find evidence of the vcombine-family of instructions being Aarch64-only. This moves the instructions into arm_shared/neon/mod.rs such that they are usable on ARMv7 platforms. The ARM website seems to agree.

This code was originally introduced in #546, together with some table lookup instructions; they might also be affected. Of these, I'm only using vqtbx1q_u8, and that's A64-only, so I did not dig deeper.

This is the first time I'm submitting, so please advice if I should change something!

@rustbot
Copy link
Collaborator

rustbot commented Dec 5, 2022

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) soon.

@rubdos rubdos force-pushed the some-armv7-neon-intrinsics branch 2 times, most recently from 4c4cb8d to 5d75d5f Compare December 5, 2022 14:33
@Amanieu
Copy link
Member

Amanieu commented Dec 7, 2022

The "v7" feature should be under a cfg_attr(target_arch = "arm").

@rubdos
Copy link
Contributor Author

rubdos commented Dec 7, 2022

Did that. The test results don't seem very promising:

---- core_arch::arm_shared::neon::assert_vcombine_p64_mov stdout ----
disassembly for stdarch_test_shim_vcombine_p64_mov: 
	 0: bx lr
thread 'core_arch::arm_shared::neon::assert_vcombine_p64_mov' panicked at 'failed to find instruction `mov` in the disassembly', crates/stdarch-test/src/lib.rs:174:9

bx lr does not sound like it does what it should do, or am I missing something?

Looking on godbolt to the output of

#include <arm_neon.h>
int test() {
  return (int) vcombine_u32;                
}                                     

For Aarch64:

test:                                   // @test
        adrp    x0, vcombine_u32
        add     x0, x0, :lo12:vcombine_u32
        ret
vcombine_u32:                           // @vcombine_u32
        mov     v0.d[1], v1.d[0]
        ret

For ARMv7:

test:
        movw    r0, :lower16:vcombine_u32
        movt    r0, :upper16:vcombine_u32
        bx      lr
vcombine_u32:
        bx      lr

I feel like vcombine_u32 (or any of the others) is actually unimplemented for ARMv7 in LLVM. I'm not confident in how to proceed here, would you have a suggestion?

@Amanieu
Copy link
Member

Amanieu commented Dec 7, 2022

The code on arm is actually correct: the d0 and d1 registers alias with the q0 register, so no movement needs to be performed to combine the values.

Just change all the assert_instr to nop instead of mov to disable the instruction tests for these intrinsics.

@rubdos
Copy link
Contributor Author

rubdos commented Dec 7, 2022

The code on arm is actually correct: the d0 and d1 registers alias with the q0 register, so no movement needs to be performed to combine the values.

I was just starting to realize the same; I just wrote out the assembly manually, and it started to dawn on me. Thanks for making it explicit to me, because it would've taken another hour or two!

If you and the CI are happy with what's here, I'll squash and force-push.

@Amanieu
Copy link
Member

Amanieu commented Dec 8, 2022

LGTM! You can actually change the instruction to nop on aarch64 as well, there's not much point doing instruction tests for intrinsics that only perform data movement since the compiler has a lot of flexibility on how to implement them.

@Amanieu
Copy link
Member

Amanieu commented Dec 8, 2022

You can also remove the vcombine intrinsics from this list so they get tested against the C intrinsics: https://github.com/rust-lang/stdarch/blob/master/crates/intrinsic-test/missing_arm.txt#L166

@Amanieu
Copy link
Member

Amanieu commented Dec 8, 2022

WASM CI seems to have been broken by rust-lang/rust#105395.

In the meantime you can squash the commits, I will merge it when CI is fixed.

@rubdos rubdos force-pushed the some-armv7-neon-intrinsics branch from dbe1df7 to 51730b1 Compare December 8, 2022 18:11
@rubdos
Copy link
Contributor Author

rubdos commented Dec 8, 2022

In the meantime you can squash the commits, I will merge it when CI is fixed.

Done! Thanks for the guidance! :-)

@Amanieu Amanieu merged commit a0c30f3 into rust-lang:master Dec 11, 2022
@rubdos rubdos deleted the some-armv7-neon-intrinsics branch February 8, 2023 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants