Reimplement some x86 intrinsics without arch-specific LLVM intrinsics #1463

Merged (13 commits), Aug 30, 2023

eduardosm
Contributor

Reimplements some x86 intrinsics without using arch-specific LLVM intrinsics:

  • Store unaligned (_mm*_storeu_*): Use <*mut _>::write_unaligned instead of llvm.x86.*.storeu.*.
  • Shift by immediate (_mm*_s{ll,rl,ra}i_epi*): Use if (srl, sll) or min (sra) to simulate the behaviour when the RHS is out of range. RHS is constant, so the if/min will be optimized away.
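
The out-of-range handling described above can be sketched in portable Rust. This is a minimal model, not the stdarch code: the function names are hypothetical, lanes are a plain `[i32; 4]` array instead of `__m128i`, and the count is an unsigned const parameter for simplicity (the real intrinsics take `const IMM8: i32` and statically assert that it fits in 8 bits).

```rust
/// Logical right shift of four 32-bit lanes by a compile-time count.
/// Hypothetical model of the `if` approach used for srl/sll.
pub fn srli_epi32_sketch<const IMM8: u32>(a: [i32; 4]) -> [i32; 4] {
    // A count of 32 or more shifts every bit out, so the result is zero.
    // IMM8 is a constant, so only one branch survives optimization.
    if IMM8 >= 32 {
        [0; 4]
    } else {
        // Cast through u32 so the shift fills with zeros, not sign bits.
        a.map(|lane| ((lane as u32) >> IMM8) as i32)
    }
}

/// Arithmetic right shift of four 32-bit lanes by a compile-time count.
/// Hypothetical model of the `min` approach used for sra.
pub fn srai_epi32_sketch<const IMM8: u32>(a: [i32; 4]) -> [i32; 4] {
    // Shifting by 31 already fills the lane with copies of the sign bit,
    // so any larger count can be clamped to 31 without changing the result.
    let count = IMM8.min(31);
    a.map(|lane| lane >> count)
}
```

For example, `srai_epi32_sketch::<40>([16, -16, 0, -1])` yields `[0, -1, 0, -1]`, the same as a shift by 31.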

The advantages are:

  • codegen will not have to handle those LLVM intrinsics.
  • miri will be able to emulate them without specific shims.

@rustbot
Collaborator

rustbot commented Aug 29, 2023

r? @Amanieu

(rustbot has picked a reviewer for you, use r? to override)

@@ -732,7 +752,11 @@ pub unsafe fn _mm_srl_epi32(a: __m128i, count: __m128i) -> __m128i {
 #[stable(feature = "simd_x86", since = "1.27.0")]
 pub unsafe fn _mm_srli_epi64<const IMM8: i32>(a: __m128i) -> __m128i {
     static_assert_uimm_bits!(IMM8, 8);
-    transmute(psrliq(a.as_i64x2(), IMM8))
+    if IMM8 >= 32 {
Member

Shouldn't this be 64?

Contributor Author

Fixed
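
The cutoff matters because `_mm_srli_epi64` shifts 64-bit lanes, so counts from 32 to 63 are still in range and must not be zeroed. A minimal sketch, assuming lanes modeled as a plain `[i64; 2]` array; the function name and the `THRESHOLD` parameter are hypothetical and exist only to compare the two cutoffs:

```rust
/// Logical right shift of two 64-bit lanes by a compile-time count.
/// Hypothetical model, not the stdarch implementation: `THRESHOLD` is the
/// cutoff at or above which the result is forced to zero. It must not
/// exceed 64, the lane width.
pub fn srli_epi64_sketch<const IMM8: u32, const THRESHOLD: u32>(a: [i64; 2]) -> [i64; 2] {
    if IMM8 >= THRESHOLD {
        [0; 2]
    } else {
        // Cast through u64 so the shift fills with zeros, not sign bits.
        a.map(|lane| ((lane as u64) >> IMM8) as i64)
    }
}
```

With the input `[1 << 40, -1]` and a count of 40, a cutoff of 32 returns `[0, 0]`, while a cutoff of 64 returns `[1, 0xFF_FFFF]`.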

Amanieu merged commit ff07f35 into rust-lang:master Aug 30, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 6, 2023
…dtwco,bjorn3

Update stdarch submodule and remove special handling in cranelift codegen for some AVX and SSE2 LLVM intrinsics

rust-lang/stdarch#1463 reimplemented some x86 intrinsics to avoid using some x86-specific LLVM intrinsics:

* Store unaligned (`_mm*_storeu_*`): use `<*mut _>::write_unaligned` instead of `llvm.x86.*.storeu.*`.
* Shift by immediate (`_mm*_s{ll,rl,ra}i_epi*`): use `if` (srl, sll) or `min` (sra) to simulate the behaviour when the RHS is out of range. The RHS is constant, so the `if`/`min` will be optimized away.

This PR updates the stdarch submodule to pull these changes and removes special handling for those LLVM intrinsics from cranelift codegen. I left gcc codegen untouched because there are some autogenerated lists.
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Sep 7, 2023
Those were removed from stdarch in rust-lang/stdarch#1463 (`<*mut _>::write_unaligned` is used instead)
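
As a rough illustration of the `write_unaligned` approach, here is a minimal sketch in plain Rust. The `Vec128` type and both function names are hypothetical stand-ins (the real intrinsics operate on `__m128i` and friends); only `<*mut T>::write_unaligned` is the actual standard-library call.

```rust
/// Hypothetical stand-in for a 128-bit vector of four 32-bit lanes.
#[derive(Clone, Copy)]
#[repr(C, align(16))]
pub struct Vec128(pub [i32; 4]);

/// Stores `v` at `mem_addr`, which does not need to be 16-byte aligned.
///
/// # Safety
/// `mem_addr` must be valid for writing 16 bytes.
pub unsafe fn storeu_sketch(mem_addr: *mut Vec128, v: Vec128) {
    // write_unaligned copies the bytes without assuming any alignment,
    // so no target-specific store intrinsic is needed.
    unsafe { mem_addr.write_unaligned(v) }
}

/// Safe wrapper that stores `v` at an arbitrary byte offset into `buf`.
pub fn store_at_offset(buf: &mut [u8; 32], offset: usize, v: Vec128) {
    // The 16-byte store must fit inside the 32-byte buffer.
    assert!(offset <= 16);
    unsafe { storeu_sketch(buf.as_mut_ptr().add(offset).cast::<Vec128>(), v) }
}
```

Because this is ordinary Rust with no LLVM-specific intrinsic, an interpreter or an alternative codegen backend needs no special case for it.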
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Sep 7, 2023
…ediate intrinsics

Those were removed from stdarch in rust-lang/stdarch#1463 (`simd_shl` and `simd_shr` are used instead)
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Sep 7, 2023
Update stdarch submodule and remove special handling in cranelift codegen for some AVX and SSE2 LLVM intrinsics

eduardosm deleted the x86-intrinsics branch September 13, 2023 16:43