Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][GlobalISel] Fix miscompile on carry-in selection #68840

Merged
merged 2 commits into from
Oct 19, 2023

Conversation

tobias-stadler
Copy link
Contributor

Eliding the carry-in setting instruction for G_UADDE/... is illegal if it causes the carry generating instruction to become dead because ISel will just remove the instruction.
I accidentally introduced this here: https://reviews.llvm.org/D153164.
As far as I can tell, this is not exposed on the default clang settings, because on O0 there is always a G_AND between boolean defs and uses, so the optimization doesn't apply. Thus, when I tried to commit https://reviews.llvm.org/D159140, which removes these G_ANDs on O0, I broke some UBSan tests.

The optimization is more of a band-aid to cover the most common case, so I'd also be willing to just remove it right now. I'll eventually add some proper NZCV optimizations to cover these cases (and stuff like when the CarryOut is directly fed into a branch) to PostSelectOptimize. Please let me know if we want to keep this as a cheap check for O0 even though it is a little hacky.

@llvmbot
Copy link
Member

llvmbot commented Oct 11, 2023

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-globalisel

Author: Tobias Stadler (tobias-stadler)

Changes

Eliding the carry-in setting instruction for G_UADDE/... is illegal if it causes the carry generating instruction to become dead because ISel will just remove the instruction.
I accidentally introduced this here: https://reviews.llvm.org/D153164.
As far as I can tell, this is not exposed on the default clang settings, because on O0 there is always a G_AND between boolean defs and uses, so the optimization doesn't apply. Thus, when I tried to commit https://reviews.llvm.org/D159140, which removes these G_ANDs on O0, I broke some UBSan tests.

The optimization is more of a band-aid to cover the most common case, so I'd also be willing to just remove it right now. I'll eventually add some proper NZCV optimizations to cover these cases (and stuff like when the CarryOut is directly fed into a branch) to PostSelectOptimize. Please let me know if we want to keep this as a cheap check for O0 even though it is a little hacky.


Full diff: https://github.com/llvm/llvm-project/pull/68840.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+7-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir (+33)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir (+33)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir (+33)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir (+33)
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index 1c7a09696e853e2..f523d0e3740ede3 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4749,11 +4749,17 @@ MachineInstr *AArch64InstructionSelector::emitCarryIn(MachineInstr &I,
   // emit a carry generating instruction. E.g. for G_UADDE/G_USUBE sequences
   // generated during legalization of wide add/sub. This optimization depends on
   // these sequences not being interrupted by other instructions.
+  // We can only safely omit the instruction if it doesn't cause the previous
+  // instruction to become dead, otherwise the instruction selector will delete
+  // it.
   MachineInstr *SrcMI = MRI->getVRegDef(CarryReg);
   if (SrcMI == I.getPrevNode()) {
     if (auto *CarrySrcMI = dyn_cast<GAddSubCarryOut>(SrcMI)) {
       bool ProducesNegatedCarry = CarrySrcMI->isSub();
-      if (NeedsNegatedCarry == ProducesNegatedCarry && CarrySrcMI->isUnsigned())
+      if (NeedsNegatedCarry == ProducesNegatedCarry &&
+          CarrySrcMI->isUnsigned() &&
+          CarrySrcMI->getCarryOutReg() == CarryReg &&
+          !MRI->use_nodbg_empty(CarrySrcMI->getDstReg()))
         return nullptr;
     }
   }
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
index 85625ced4ba6922..6af4b1e2ad179aa 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-sadde.mir
@@ -175,3 +175,36 @@ body:             |
     $x2 = COPY %9(s64)
     RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
 ...
+...
+---
+name:            sadde_opt_bail_dead
+alignment:       4
+legalized:       true
+regBankSelected: true
+tracksRegLiveness: true
+body:             |
+  bb.1:
+    liveins: $x0, $x1, $x2, $x3
+    ; CHECK-LABEL: name: sadde_opt_bail_dead
+    ; CHECK: liveins: $x0, $x1, $x2, $x3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+    ; CHECK-NEXT: [[ADDSXrr:%[0-9]+]]:gpr64 = ADDSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+    ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32common = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+    ; CHECK-NEXT: [[SUBSWri:%[0-9]+]]:gpr32 = SUBSWri [[CSINCWr]], 1, 0, implicit-def $nzcv
+    ; CHECK-NEXT: [[ADCSXr:%[0-9]+]]:gpr64 = ADCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+    ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 7, implicit $nzcv
+    ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:gpr(s64) = COPY $x0
+    %1:gpr(s64) = COPY $x1
+    %2:gpr(s64) = COPY $x2
+    %3:gpr(s64) = COPY $x3
+    %4:gpr(s64), %5:gpr(s32) = G_UADDO %0, %2
+    %6:gpr(s64), %7:gpr(s32) = G_SADDE %1, %3, %5
+    $w0 = COPY %7(s32)
+    RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
index 00bd26cc0220d2b..64643502b60801e 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-ssube.mir
@@ -175,3 +175,36 @@ body:             |
     $x2 = COPY %9(s64)
     RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
 ...
+...
+---
+name:            ssube_opt_bail_dead
+alignment:       4
+legalized:       true
+regBankSelected: true
+tracksRegLiveness: true
+body:             |
+  bb.1:
+    liveins: $x0, $x1, $x2, $x3
+    ; CHECK-LABEL: name: ssube_opt_bail_dead
+    ; CHECK: liveins: $x0, $x1, $x2, $x3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+    ; CHECK-NEXT: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+    ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+    ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr $wzr, [[CSINCWr]], implicit-def $nzcv
+    ; CHECK-NEXT: [[SBCSXr:%[0-9]+]]:gpr64 = SBCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+    ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 7, implicit $nzcv
+    ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:gpr(s64) = COPY $x0
+    %1:gpr(s64) = COPY $x1
+    %2:gpr(s64) = COPY $x2
+    %3:gpr(s64) = COPY $x3
+    %4:gpr(s64), %5:gpr(s32) = G_USUBO %0, %2
+    %6:gpr(s64), %7:gpr(s32) = G_SSUBE %1, %3, %5
+    $w0 = COPY %7(s32)
+    RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
index dc80d0c9abc252e..7f58ecf6b1c6aa4 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-uadde.mir
@@ -175,3 +175,36 @@ body:             |
     $x2 = COPY %9(s64)
     RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
 ...
+...
+---
+name:            uadde_opt_bail_dead
+alignment:       4
+legalized:       true
+regBankSelected: true
+tracksRegLiveness: true
+body:             |
+  bb.1:
+    liveins: $x0, $x1, $x2, $x3
+    ; CHECK-LABEL: name: uadde_opt_bail_dead
+    ; CHECK: liveins: $x0, $x1, $x2, $x3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+    ; CHECK-NEXT: [[ADDSXrr:%[0-9]+]]:gpr64 = ADDSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+    ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32common = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+    ; CHECK-NEXT: [[SUBSWri:%[0-9]+]]:gpr32 = SUBSWri [[CSINCWr]], 1, 0, implicit-def $nzcv
+    ; CHECK-NEXT: [[ADCSXr:%[0-9]+]]:gpr64 = ADCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+    ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 3, implicit $nzcv
+    ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:gpr(s64) = COPY $x0
+    %1:gpr(s64) = COPY $x1
+    %2:gpr(s64) = COPY $x2
+    %3:gpr(s64) = COPY $x3
+    %4:gpr(s64), %5:gpr(s32) = G_UADDO %0, %2
+    %6:gpr(s64), %7:gpr(s32) = G_UADDE %1, %3, %5
+    $w0 = COPY %7(s32)
+    RET_ReallyLR implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
index c532474fc67b4f7..97e77f61137080c 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-usube.mir
@@ -175,3 +175,36 @@ body:             |
     $x2 = COPY %9(s64)
     RET_ReallyLR implicit $x0, implicit $x1, implicit $x2
 ...
+...
+---
+name:            usube_opt_bail_dead
+alignment:       4
+legalized:       true
+regBankSelected: true
+tracksRegLiveness: true
+body:             |
+  bb.1:
+    liveins: $x0, $x1, $x2, $x3
+    ; CHECK-LABEL: name: usube_opt_bail_dead
+    ; CHECK: liveins: $x0, $x1, $x2, $x3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr64 = COPY $x2
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr64 = COPY $x3
+    ; CHECK-NEXT: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[COPY2]], implicit-def $nzcv
+    ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+    ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr $wzr, [[CSINCWr]], implicit-def $nzcv
+    ; CHECK-NEXT: [[SBCSXr:%[0-9]+]]:gpr64 = SBCSXr [[COPY1]], [[COPY3]], implicit-def $nzcv, implicit $nzcv
+    ; CHECK-NEXT: [[CSINCWr1:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 2, implicit $nzcv
+    ; CHECK-NEXT: $w0 = COPY [[CSINCWr1]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:gpr(s64) = COPY $x0
+    %1:gpr(s64) = COPY $x1
+    %2:gpr(s64) = COPY $x2
+    %3:gpr(s64) = COPY $x3
+    %4:gpr(s64), %5:gpr(s32) = G_USUBO %0, %2
+    %6:gpr(s64), %7:gpr(s32) = G_USUBE %1, %3, %5
+    $w0 = COPY %7(s32)
+    RET_ReallyLR implicit $w0
+...

@tobias-stadler
Copy link
Contributor Author

Would it be legal to just add NZCV as implicit-def to the previous G_UADDE/.. to make the dependency explicit? addOperand would place it at the end of the operand array, so I believe it should be ok.

@aemerson
Copy link
Contributor

What if we just call select() on the now-dead instruction? If that works it seems the most reliable way, if it doesn't then adding NZCV seems like an ok stop-gap.

Eliding the NZCV setting instruction for G_UADDE/... is illegal if it causes the
instruction that defines the carry vReg to become dead. We fix this by recursively
selecting this instruction before continuing to select the current instruction.
bool AArch64InstructionSelector::selectAndRestoreState(MachineInstr &I) {
MachineIRBuilderState OldMIBState = MIB.getState();
bool Success = select(I);
MIB.getState() = std::move(OldMIBState);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add a setState() instead? The state can be copied by value, it's just some pointers and an iterator.

Copy link
Contributor Author

@tobias-stadler tobias-stadler Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the DebugLoc? It has a TrackingMDRef, which does seem to benefit from moving, because it can call MetadataTracking::retrack(). Is it negligible? I don't understand the details of MetadataTracking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not familiar with that, but we already copy DebugLocs everywhere in llvm so I doubt think it's an issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants