-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Overlapping registers allocated to input and lateout constraints #106380
Comments
@llvm/issue-subscribers-backend-aarch64 Author: Nikita Popov (nikic)
https://llvm.godbolt.org/z/xz6PYTaW3
```llvm
target triple = "aarch64-unknown-linux-gnu"
define i1 @test(ptr %ptr) "target-features"="+lse" {
<inline asm>:3:6: error: unpredictable STXP instruction, status is also a source
|
The fact that one is a subreg doesn't seem to be relevant (same behavior if using i64 result and dropping the The fact that the operand is Maybe there is an (incorrect) assumption that clobbering does not matter for undef operands? |
This sounds like it might be similar to the issue #77770 was addressing, IIRC. I don't know much more about it than that though (like why the register allocator is doing it or why it isn't better to fix as part of the register allocator). |
Maybe @qcolombet has some insight into the regalloc behavior here. |
It perturbs the allocation because it says that we do not care about the content of the register. Hence, we end up with an empty live-range which doesn't interfere with anything else and we end up reusing something that is already used. This is normally fine, but breaks the verifier here. Technically I would say you get what you asked for: undef stored value ==> undef behavior for STXP. I don't see an easy fix in the allocator though. |
I don't think this interpretation is quite right in this case, because an undef operand here should behave as-if it were any arbitrary value, while STXP with two equal registers is "constrained unpredictable" in a way that extends beyond that. That is, it may exhibit behavior that would not be observable for any specific choice of the undef value. Independently of the STXP case, my expectation would also be that an inline asm block can only observe a single value for an input register, even if it has an undef value. Though given IR undef semantics, I can see how a reasonable person might disagree with that expectation... |
I agree the current behavior is wrong; ensuring no overlap should be more intuitive for people writing inline asm. (Granted, I doubt this has much practical effect either way... if you end up with an inline asm with an undef input, you're probably doing something wrong.) This is essentially the same as #77770. |
It seems like making InitUndef work on AArch64 is the right way forward here. I've put up #106744 to make it easier to extend the pass to additional targets. |
This is a variant of #106380 without inline assembly.
#108353 enables the pass for all targets and this does fix the equivalent issue when using intrinsics. Just enabling the pass doesn't help the inline asm case though, so something more is missing... |
The missing bit turned out to be that the pass iterates over defs() instead of all_defs(). I'll fix that after the main change lands. |
InitUndef should also handle early-clobber / undef conflicts in inline asm operands. Do this by iterating over all_defs() instead of defs(). The newly added test was generating an "unpredictable STXP instruction, status is also a source" error prior to this change. Fixes llvm#106380.
InitUndef should also handle early-clobber / undef conflicts in inline asm operands. Do this by iterating over all_defs() instead of defs(). The newly added ARM test was generating an "unpredictable STXP instruction, status is also a source" error prior to this change. Fixes llvm#106380.
https://llvm.godbolt.org/z/xz6PYTaW3
Results in:
Note that
${0:w}
and${2}
are allocated to overlapping registers, even though a lateout=&
constraint is used.The text was updated successfully, but these errors were encountered: