-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64][GlobalISel] Fix miscompile on carry-in selection #68840
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-llvm-globalisel Author: Tobias Stadler (tobias-stadler) ChangesEliding the carry-in setting instruction for G_UADDE/... is illegal if it causes the carry generating instruction to become dead because ISel will just remove the instruction. The optimization is more of a band-aid to cover the most common case, so I'd also be willing to just remove it right now. I'll eventually add some proper NZCV optimizations to cover these cases (and stuff like when the CarryOut is directly fed into a branch) to PostSelectOptimize. Please let me know if we want to keep this as a cheap check for O0 even though it is a little hacky. Full diff: https://github.com/llvm/llvm-project/pull/68840.diff 5 Files Affected:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index 1c7a09696e853e2..f523d0e3740ede3 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4749,11 +4749,17 @@ MachineInstr *AArch64InstructionSelector::emitCarryIn(MachineInstr &I,
// emit a carry generating instruction. E.g. for G_UADDE/G_USUBE sequences
// generated during legalization of wide add/sub. This optimization depends on
// these sequences not being interrupted by other instructions.
+ // We can only safely omit the instruction if it doesn't cause the previous
+ // instruction to become dead, otherwise the instruction selector will delete
+ // it.
MachineInstr *SrcMI = MRI->getVRegDef(CarryReg);
if (SrcMI == I.getPrevNode()) {
if (auto *CarrySrcMI = dyn_cast<GAddSubCarryOut>(SrcMI)) {
bool ProducesNegatedCarry = CarrySrcMI->isSub();
- if (NeedsNegatedCarry == ProducesNegatedCarry && CarrySrcMI->isUnsigned())
+ if (NeedsNegatedCarry == ProducesNegatedCarry &&
+ CarrySrcMI->isUnsigned() &&
+ CarrySrcMI->getCarryOutReg() == CarryReg &&
+ !MRI->use_nodbg_empty(CarrySrcMI->getDstReg()))
return nullptr;
}
}
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
index 85625ced4ba6922..6af4b1e2ad179aa 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
@@ -175,3 +175,36 @@ body: |
$x2 = COPY %9(s64)
RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
...
+...
+---
+name: sadde_opt_bail_dead
+alignment: 4
+legalized: true
+regBankSelected: true
+tracksRegLiveness: true
+body: |
+ bb.1:
+ liveins: $x0, $x1, $x2, $x3
+ ; CHECK-LABEL: name: sadde_opt_bail_dead
+ ; CHECK: liveins: $x0, $x1, $x2, $x3
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+ ; CHECK-NEXT: [[ADDSXrr:%[0-9]+]]:gpr64 = ADDSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+ ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32common = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+ ; CHECK-NEXT: [[SUBSWri:%[0-9]+]]:gpr32 = SUBSWri [[CSINCWr]], 1, 0, implicit-def $nzcv
+ ; CHECK-NEXT: [[ADCSXr:%[0-9]+]]:gpr64 = ADCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+ ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 7, implicit $nzcv
+ ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:gpr(s64) = COPY $x0
+ %1:gpr(s64) = COPY $x1
+ %2:gpr(s64) = COPY $x2
+ %3:gpr(s64) = COPY $x3
+ %4:gpr(s64), %5:gpr(s32) = G_UADDO %0, %2
+ %6:gpr(s64), %7:gpr(s32) = G_SADDE %1, %3, %5
+ $w0 = COPY %7(s32)
+ RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
index 00bd26cc0220d2b..64643502b60801e 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
@@ -175,3 +175,36 @@ body: |
$x2 = COPY %9(s64)
RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
...
+...
+---
+name: ssube_opt_bail_dead
+alignment: 4
+legalized: true
+regBankSelected: true
+tracksRegLiveness: true
+body: |
+ bb.1:
+ liveins: $x0, $x1, $x2, $x3
+ ; CHECK-LABEL: name: ssube_opt_bail_dead
+ ; CHECK: liveins: $x0, $x1, $x2, $x3
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+ ; CHECK-NEXT: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+ ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+ ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr $wzr, [[CSINCWr]], implicit-def $nzcv
+ ; CHECK-NEXT: [[SBCSXr:%[0-9]+]]:gpr64 = SBCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+ ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 7, implicit $nzcv
+ ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:gpr(s64) = COPY $x0
+ %1:gpr(s64) = COPY $x1
+ %2:gpr(s64) = COPY $x2
+ %3:gpr(s64) = COPY $x3
+ %4:gpr(s64), %5:gpr(s32) = G_USUBO %0, %2
+ %6:gpr(s64), %7:gpr(s32) = G_SSUBE %1, %3, %5
+ $w0 = COPY %7(s32)
+ RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
index dc80d0c9abc252e..7f58ecf6b1c6aa4 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
@@ -175,3 +175,36 @@ body: |
$x2 = COPY %9(s64)
RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
...
+...
+---
+name: uadde_opt_bail_dead
+alignment: 4
+legalized: true
+regBankSelected: true
+tracksRegLiveness: true
+body: |
+ bb.1:
+ liveins: $x0, $x1, $x2, $x3
+ ; CHECK-LABEL: name: uadde_opt_bail_dead
+ ; CHECK: liveins: $x0, $x1, $x2, $x3
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+ ; CHECK-NEXT: [[ADDSXrr:%[0-9]+]]:gpr64 = ADDSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+ ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32common = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+ ; CHECK-NEXT: [[SUBSWri:%[0-9]+]]:gpr32 = SUBSWri [[CSINCWr]], 1, 0, implicit-def $nzcv
+ ; CHECK-NEXT: [[ADCSXr:%[0-9]+]]:gpr64 = ADCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+ ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+ ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:gpr(s64) = COPY $x0
+ %1:gpr(s64) = COPY $x1
+ %2:gpr(s64) = COPY $x2
+ %3:gpr(s64) = COPY $x3
+ %4:gpr(s64), %5:gpr(s32) = G_UADDO %0, %2
+ %6:gpr(s64), %7:gpr(s32) = G_UADDE %1, %3, %5
+ $w0 = COPY %7(s32)
+ RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
index c532474fc67b4f7..97e77f61137080c 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
@@ -175,3 +175,36 @@ body: |
$x2 = COPY %9(s64)
RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
...
+...
+---
+name: usube_opt_bail_dead
+alignment: 4
+legalized: true
+regBankSelected: true
+tracksRegLiveness: true
+body: |
+ bb.1:
+ liveins: $x0, $x1, $x2, $x3
+ ; CHECK-LABEL: name: usube_opt_bail_dead
+ ; CHECK: liveins: $x0, $x1, $x2, $x3
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+ ; CHECK-NEXT: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+ ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+ ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr $wzr, [[CSINCWr]], implicit-def $nzcv
+ ; CHECK-NEXT: [[SBCSXr:%[0-9]+]]:gpr64 = SBCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+ ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+ ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:gpr(s64) = COPY $x0
+ %1:gpr(s64) = COPY $x1
+ %2:gpr(s64) = COPY $x2
+ %3:gpr(s64) = COPY $x3
+ %4:gpr(s64), %5:gpr(s32) = G_USUBO %0, %2
+ %6:gpr(s64), %7:gpr(s32) = G_USUBE %1, %3, %5
+ $w0 = COPY %7(s32)
+ RET_ReallyLR implicit $w0
+...
|
Would it be legal to just add NZCV as implicit-def to the previous G_UADDE/.. to make the dependency explicit? addOperand would place it at the end of the operand array, so I believe it should be ok. |
What if we just call |
3f7fd50
to
3c413ff
Compare
Eliding the NZCV setting instruction for G_UADDE/... is illegal if it causes the instruction that defines the carry vReg to become dead. We fix this by recursively selecting this instruction before continuing to select the current instruction.
bool AArch64InstructionSelector::selectAndRestoreState(MachineInstr &I) { | ||
MachineIRBuilderState OldMIBState = MIB.getState(); | ||
bool Success = select(I); | ||
MIB.getState() = std::move(OldMIBState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just add a setState()
instead? The state can be copied by value, it's just some pointers and an iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the DebugLoc? It has a TrackingMDRef, which does seem to benefit from moving, because it can call MetadataTracking::retrack(). Is it negligible? I don't understand the details of MetadataTracking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not familiar with that, but we already copy DebugLocs everywhere in llvm so I doubt think it's an issue?
Reland 3686a0b after fixing an exposed miscompile in #68840 Differential Revision: https://reviews.llvm.org/D159140
Eliding the carry-in setting instruction for G_UADDE/... is illegal if it causes the carry generating instruction to become dead because ISel will just remove the instruction.
I accidentally introduced this here: https://reviews.llvm.org/D153164.
As far as I can tell, this is not exposed on the default clang settings, because on O0 there is always a G_AND between boolean defs and uses, so the optimization doesn't apply. Thus, when I tried to commit https://reviews.llvm.org/D159140, which removes these G_ANDs on O0, I broke some UBSan tests.
The optimization is more of a band-aid to cover the most common case, so I'd also be willing to just remove it right now. I'll eventually add some proper NZCV optimizations to cover these cases (and stuff like when the CarryOut is directly fed into a branch) to PostSelectOptimize. Please let me know if we want to keep this as a cheap check for O0 even though it is a little hacky.