[AMDGPU] Add legality check when folding short 64-bit literals #69391
Conversation
We can only fold it if it fits into 32 bits. I believe it did not trigger yet because we do not generally select 64-bit literals.
@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

We can only fold it if it fits into 32 bits. I believe it did not trigger yet because we do not generally select 64-bit literals.

Full diff: https://github.com/llvm/llvm-project/pull/69391.diff

2 Files Affected:
- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+10)
- llvm/test/CodeGen/AMDGPU/fold-short-64-bit-literals.mir (+101)
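For context, the new guard only rejects an INT64/FP64 immediate operand when the value is neither encodable as a 32-bit literal nor an inlinable 64-bit constant. Below is a minimal standalone sketch of the 32-bit-encodability rule; `fitsIn32BitLiteral` is a hypothetical stand-in for `AMDGPU::isValid32BitLiteral`, its exact semantics here are an assumption, and the inline-constant path (`AMDGPU::isInlinableLiteral64`) is deliberately left out.

```cpp
// Hypothetical stand-in for AMDGPU::isValid32BitLiteral (assumed semantics):
// - INT64 operands: the value must fit in an unsigned 32-bit literal.
// - FP64 operands: the literal supplies the high 32 bits of the value, so the
//   low half must be zero (e.g. 100.0 == 0x4059000000000000).
#include <cstdint>
#include <cstdio>

static bool fitsIn32BitLiteral(uint64_t Imm, bool IsFP64) {
  return Imm <= UINT32_MAX || (IsFP64 && (Imm & 0xffffffffULL) == 0);
}

int main() {
  struct { uint64_t Imm; bool IsFP64; } Cases[] = {
      {1311768467750121200ULL, true},  // rejected: low half is nonzero
      {4636737291354636288ULL, true},  // folds: 100.0, low half is zero
      {1311768467750121200ULL, false}, // rejected: does not fit in 32 bits
      {4294967295ULL, false},          // folds: fits in 32 bits
  };
  for (const auto &C : Cases)
    printf("0x%016llx %s -> %s\n", (unsigned long long)C.Imm,
           C.IsFP64 ? "fp64" : "int64",
           fitsIn32BitLiteral(C.Imm, C.IsFP64) ? "foldable" : "not foldable");
  return 0;
}
```

With a rule like that, the added check in isOperandLegal simply returns false for a 64-bit operand whose immediate fails both this test and the inline-constant test, which is what the MIR cases in the diff below exercise.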
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 2ad07550c763954..e01ca73c135c53b 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5490,6 +5490,16 @@ bool SIInstrInfo::isOperandLegal(const MachineInstr &MI, unsigned OpIdx,
return true;
}
+ if (MO->isImm()) {
+ uint64_t Imm = MO->getImm();
+ bool Is64BitFPOp = OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_FP64;
+ bool Is64BitOp = Is64BitFPOp ||
+ OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_INT64;
+ if (Is64BitOp && !AMDGPU::isValid32BitLiteral(Imm, Is64BitFPOp) &&
+ !AMDGPU::isInlinableLiteral64(Imm, ST.hasInv2PiInlineImm()))
+ return false;
+ }
+
// Handle non-register types that are treated like immediates.
assert(MO->isImm() || MO->isTargetIndex() || MO->isFI() || MO->isGlobal());
diff --git a/llvm/test/CodeGen/AMDGPU/fold-short-64-bit-literals.mir b/llvm/test/CodeGen/AMDGPU/fold-short-64-bit-literals.mir
new file mode 100644
index 000000000000000..eb74412b18b3a22
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fold-short-64-bit-literals.mir
@@ -0,0 +1,101 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -march=amdgcn -mcpu=gfx1010 -verify-machineinstrs -run-pass=si-fold-operands -o - %s | FileCheck --check-prefix=GCN %s
+
+---
+name: no_fold_fp_64bit_literal_sgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: no_fold_fp_64bit_literal_sgpr
+ ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 1311768467750121200
+ ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, [[S_MOV_B64_]], 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
+ %0:vreg_64 = IMPLICIT_DEF
+ %1:sreg_64 = S_MOV_B64 1311768467750121200
+ %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
+ SI_RETURN_TO_EPILOG %2
+...
+
+---
+name: no_fold_fp_64bit_literal_vgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: no_fold_fp_64bit_literal_vgpr
+ ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
+ ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, [[V_MOV_B]], 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
+ %0:vreg_64 = IMPLICIT_DEF
+ %1:vreg_64 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
+ %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
+ SI_RETURN_TO_EPILOG %2
+...
+
+---
+name: fold_fp_32bit_literal_sgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: fold_fp_32bit_literal_sgpr
+ ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, 4636737291354636288, 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
+ %0:vreg_64 = IMPLICIT_DEF
+ %1:sreg_64 = S_MOV_B64 4636737291354636288
+ %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
+ SI_RETURN_TO_EPILOG %2
+...
+
+---
+name: no_fold_int_64bit_literal_sgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: no_fold_int_64bit_literal_sgpr
+ ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 1311768467750121200
+ ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], [[S_MOV_B64_]], implicit-def $scc
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
+ %0:sreg_64 = IMPLICIT_DEF
+ %1:sreg_64 = S_MOV_B64 1311768467750121200
+ %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
+ SI_RETURN_TO_EPILOG %2
+...
+
+---
+name: fold_int_32bit_literal_sgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: fold_int_32bit_literal_sgpr
+ ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], 2147483647, implicit-def $scc
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
+ %0:sreg_64 = IMPLICIT_DEF
+ %1:sreg_64 = S_MOV_B64 2147483647
+ %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
+ SI_RETURN_TO_EPILOG %2
+...
+
+---
+name: fold_uint_32bit_literal_sgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: fold_uint_32bit_literal_sgpr
+ ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
+ ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], 4294967295, implicit-def $scc
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
+ %0:sreg_64 = IMPLICIT_DEF
+ %1:sreg_64 = S_MOV_B64 4294967295
+ %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
+ SI_RETURN_TO_EPILOG %2
+...
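To make the constants in the new test easier to read, here is a small hypothetical helper (not part of the patch) that prints each literal's hex pattern, its interpretation as a double, and its low 32 bits; the cases that stay foldable are the ones whose low half is zero (FP64 operands) or that fit in 32 bits (INT64 operands).

```cpp
// Inspect the literals used in fold-short-64-bit-literals.mir.
#include <cstdint>
#include <cstdio>
#include <cstring>

static void dump(uint64_t Imm) {
  double D;
  std::memcpy(&D, &Imm, sizeof(D)); // reinterpret the bits as an IEEE double
  printf("%llu = 0x%016llx (as double: %g), low half 0x%08llx\n",
         (unsigned long long)Imm, (unsigned long long)Imm, D,
         (unsigned long long)(Imm & 0xffffffffULL));
}

int main() {
  dump(1311768467750121200ULL); // no_fold_* cases: needs all 64 bits
  dump(4636737291354636288ULL); // fold_fp_32bit_literal_sgpr: 100.0, low half zero
  dump(2147483647ULL);          // fold_int_32bit_literal_sgpr: fits in 32 bits
  dump(4294967295ULL);          // fold_uint_32bit_literal_sgpr: fits in 32 bits
  return 0;
}
```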
✅ With the latest revision this PR passed the C/C++ code formatter.
I like to move it, move it!
I tend to believe this is not exploitable because we do not select these moves. But better safe than sorry, especially given that we will ultimately select such literals into a single move.