-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inlining is broken in ARMv7+NEON #427
Comments
My guess is that this has to do with rustc's feature list containg |
But you can reproduce with rustc right? @hsivonen has a PR with targets for armv7neon-unknown-linux-gnu and friends, so those might solve this problem. |
Yes I can reproduce this with rustc. The LLVM IR emitted for this is: ; ModuleID = 'wut0-50fb5e8e8d375368727bf982ba5dd3c0.rs'
source_filename = "wut0-50fb5e8e8d375368727bf982ba5dd3c0.rs"
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-linux-gnueabihf"
; wut::bar
; Function Attrs: inlinehint norecurse nounwind
define internal fastcc void @_ZN3wut3bar17h1815328d92c8a501E(<8 x i8>* noalias nocapture dereferenceable(8), <8 x i8> %x.val) unnamed_addr #0 {
start:
%1 = shl <8 x i8> %x.val, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
store <8 x i8> %1, <8 x i8>* %0, align 8
ret void
}
; Function Attrs: nounwind
define <8 x i8> @foo(<8 x i8> %x) unnamed_addr #1 {
start:
%tmp_ret = alloca <8 x i8>, align 8
%0 = getelementptr inbounds <8 x i8>, <8 x i8>* %tmp_ret, i32 0, i32 0
; call wut::bar
call fastcc void @_ZN3wut3bar17h1815328d92c8a501E(<8 x i8>* noalias nocapture nonnull dereferenceable(8) %tmp_ret, <8 x i8> %x)
%1 = load <8 x i8>, <8 x i8>* %tmp_ret, align 8
ret <8 x i8> %1
}
attributes #0 = { inlinehint norecurse nounwind "target-features"="+neon,+v7" }
attributes #1 = { nounwind }
attributes #2 = { argmemonly nounwind }
attributes #3 = { norecurse nounwind readnone "no-frame-pointer-elim"="true" } Notably |
If when compiling with llc works, maybe we are communicating this incorrectly? |
This works for me with the example above: opt -mattr=+neon,+v7 test.ll -inline -S -o - Does rustc use setFunctionAttributes(CPUStr, FeaturesStr, *M) or does it just pass the target-features in when creating the TargetMachine? |
@gnzlbg I'm not sure. I can't get @tstellar currently we pass |
@alexcrichton Ok, rustc will need to manually add the -C target-feature-specified features to each function as well. The TargetMachine does not do this automatically. |
Alternatively, maybe we should filter the function attributes we set to just those that are different than the target machine? |
I also should mention that when you specify the target-features attribute for a function, the LLVM backends are supposed to ignore the FeatureString that you passed to the TargetMachine when doing codegen for that function, so @cuviper 's suggestion wouldn't work. |
Then we really do need to set complete features for all functions. Does clang do that too? |
Yes it does:
|
@tstellar thanks, I think we are not currently doing that anywhere. |
AFAIR, LLVM will not inline if the caller doesn't have a superset of features, so isn't this expected behaviour? |
@parched note that I am compiling with |
@gnzlbg, ah yes I misread, ignore my comment then. |
I've opened a PR for this at rust-lang/rust#50188 |
Add `-C target-feature` to all functions Previously the features specified to LLVM via `-C target-feature` were only reflected in the `TargetMachine` but this change *also* reflects these and the base features inside each function itself. This change matches clang and... Closes rust-lang/stdarch#427
MWE:
produces:
built using
RUSTFLAGS="-C target-feature=+neon,+v7 -C lto=fat -C codegen-units=1 --emit asm -C debuginfo=0" CARGO_INCREMENTAL=0 cargo build --release --target=armv7-unknown-linux-gnueabihf
.The text was updated successfully, but these errors were encountered: