[AMDGPU] Add legality check when folding short 64-bit literals #69391

Merged
merged 3 commits into from
Oct 18, 2023
12 changes: 12 additions & 0 deletions llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5490,6 +5490,18 @@ bool SIInstrInfo::isOperandLegal(const MachineInstr &MI, unsigned OpIdx,
    return true;
  }

  if (MO->isImm()) {
    uint64_t Imm = MO->getImm();
    bool Is64BitFPOp = OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_FP64;
    bool Is64BitOp = Is64BitFPOp ||
                     OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_INT64 ||
                     OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_V2INT32 ||
                     OpInfo.OperandType == AMDGPU::OPERAND_REG_IMM_V2FP32;
    if (Is64BitOp && !AMDGPU::isValid32BitLiteral(Imm, Is64BitFPOp) &&
        !AMDGPU::isInlinableLiteral64(Imm, ST.hasInv2PiInlineImm()))
      return false;
  }

  // Handle non-register types that are treated like immediates.
  assert(MO->isImm() || MO->isTargetIndex() || MO->isFI() || MO->isGlobal());
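
Reviewer note: the check added above rejects a 64-bit immediate unless it is either encodable as a 32-bit literal or is an inline constant. The standalone sketch below models that decision so the test inputs in this PR can be checked by hand. It is not the LLVM implementation: `fitsIn32BitLiteral`, `isInlineConstant64` and `canFold64BitImm` are hypothetical names, and the rules they encode (an FP64 literal supplies only the high 32 bits, an integer literal must fit in 32 bits signed or unsigned, inline constants are the integers -16..64 plus a small set of FP values) are assumptions about what `AMDGPU::isValid32BitLiteral` and `AMDGPU::isInlinableLiteral64` do.

```cpp
#include <cstdint>

// Hypothetical stand-in for AMDGPU::isValid32BitLiteral (assumed behaviour).
// For FP64 operands the 32-bit literal is assumed to fill the high half of
// the 64-bit value, so the low 32 bits must be zero. Otherwise the value
// must fit in 32 bits, either unsigned or sign-extended.
constexpr bool fitsIn32BitLiteral(uint64_t Imm, bool IsFP64) {
  if (IsFP64)
    return (Imm & 0xFFFFFFFFull) == 0;
  const int64_t S = static_cast<int64_t>(Imm);
  return Imm <= UINT32_MAX || (S >= INT32_MIN && S <= INT32_MAX);
}

// Hypothetical stand-in for AMDGPU::isInlinableLiteral64 (assumed behaviour).
constexpr bool isInlineConstant64(uint64_t Imm, bool HasInv2Pi) {
  const int64_t S = static_cast<int64_t>(Imm);
  if (S >= -16 && S <= 64) // inline integer range (also covers +0.0)
    return true;
  switch (Imm) {
  case 0x3FE0000000000000ull: // 0.5
  case 0xBFE0000000000000ull: // -0.5
  case 0x3FF0000000000000ull: // 1.0
  case 0xBFF0000000000000ull: // -1.0
  case 0x4000000000000000ull: // 2.0
  case 0xC000000000000000ull: // -2.0
  case 0x4010000000000000ull: // 4.0
  case 0xC010000000000000ull: // -4.0
    return true;
  case 0x3FC45F306DC9C882ull: // 1/(2*pi), only on subtargets that support it
    return HasInv2Pi;
  default:
    return false;
  }
}

// Mirrors the shape of the new condition: folding is allowed only if the
// immediate is a valid 32-bit literal or an inline constant.
constexpr bool canFold64BitImm(uint64_t Imm, bool IsFP64, bool HasInv2Pi) {
  return fitsIn32BitLiteral(Imm, IsFP64) || isInlineConstant64(Imm, HasInv2Pi);
}

// Immediates taken from fold-short-64-bit-literals.mir.
static_assert(!canFold64BitImm(1311768467750121200ull, true, true),
              "FP64 operand with non-zero low half is not foldable");
static_assert(canFold64BitImm(4636737291354636288ull, true, true),
              "FP64 operand with zero low half is foldable");
static_assert(!canFold64BitImm(1311768467750121200ull, false, true),
              "integer operand wider than 32 bits is not foldable");
static_assert(canFold64BitImm(2147483647ull, false, true),
              "INT32_MAX is foldable");
static_assert(canFold64BitImm(4294967295ull, false, true),
              "UINT32_MAX is foldable");
```

Under these assumptions the packed-FP32 inputs behave like integers, because `Is64BitFPOp` is true only for `OPERAND_REG_IMM_FP64`: 4629700418019000320 needs more than 32 bits and is rejected, while 1065353216 fits and is accepted.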

125 changes: 125 additions & 0 deletions llvm/test/CodeGen/AMDGPU/fold-short-64-bit-literals.mir
@@ -0,0 +1,125 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
# RUN: llc -march=amdgcn -mcpu=gfx1010 -verify-machineinstrs -run-pass=si-fold-operands -o - %s | FileCheck --check-prefix=GCN %s

---
name: no_fold_fp_64bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: no_fold_fp_64bit_literal_sgpr
    ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 1311768467750121200
    ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, [[S_MOV_B64_]], 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
    %0:vreg_64 = IMPLICIT_DEF
    %1:sreg_64 = S_MOV_B64 1311768467750121200
    %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
    SI_RETURN_TO_EPILOG %2
...

---
name: no_fold_fp_64bit_literal_vgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: no_fold_fp_64bit_literal_vgpr
    ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
    ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, [[V_MOV_B]], 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
    %0:vreg_64 = IMPLICIT_DEF
    %1:vreg_64 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
    %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
    SI_RETURN_TO_EPILOG %2
...

---
name: fold_fp_32bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: fold_fp_32bit_literal_sgpr
    ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[V_ADD_F64_e64_:%[0-9]+]]:vreg_64 = V_ADD_F64_e64 0, 4636737291354636288, 0, [[DEF]], 0, 0, implicit $mode, implicit $exec
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[V_ADD_F64_e64_]]
    %0:vreg_64 = IMPLICIT_DEF
    %1:sreg_64 = S_MOV_B64 4636737291354636288
    %2:vreg_64 = V_ADD_F64_e64 0, %1, 0, %0, 0, 0, implicit $mode, implicit $exec
    SI_RETURN_TO_EPILOG %2
...

---
name: no_fold_int_64bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: no_fold_int_64bit_literal_sgpr
    ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 1311768467750121200
    ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], [[S_MOV_B64_]], implicit-def $scc
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
    %0:sreg_64 = IMPLICIT_DEF
    %1:sreg_64 = S_MOV_B64 1311768467750121200
    %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
    SI_RETURN_TO_EPILOG %2
...

---
name: fold_int_32bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: fold_int_32bit_literal_sgpr
    ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], 2147483647, implicit-def $scc
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
    %0:sreg_64 = IMPLICIT_DEF
    %1:sreg_64 = S_MOV_B64 2147483647
    %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
    SI_RETURN_TO_EPILOG %2
...

---
name: fold_uint_32bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    ; GCN-LABEL: name: fold_uint_32bit_literal_sgpr
    ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF
    ; GCN-NEXT: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[DEF]], 4294967295, implicit-def $scc
    ; GCN-NEXT: SI_RETURN_TO_EPILOG [[S_AND_B64_]]
    %0:sreg_64 = IMPLICIT_DEF
    %1:sreg_64 = S_MOV_B64 4294967295
    %2:sreg_64 = S_AND_B64 %0, %1, implicit-def $scc
    SI_RETURN_TO_EPILOG %2
...

---
name: no_fold_v2fp_64bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    %0:vreg_64 = IMPLICIT_DEF
    %1:vreg_64 = V_MOV_B64_PSEUDO 4629700418019000320, implicit $exec
    %2:vreg_64 = V_PK_ADD_F32 0, %0, 0, %1, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
    SI_RETURN_TO_EPILOG %2
...

---
name: fold_v2fp_32bit_literal_sgpr
tracksRegLiveness: true
body: |
  bb.0:

    %0:vreg_64 = IMPLICIT_DEF
    %1:vreg_64 = V_MOV_B64_PSEUDO 1065353216, implicit $exec
    %2:vreg_64 = V_PK_ADD_F32 0, %0, 0, %1, 0, 0, 0, 0, 0, implicit $mode, implicit $exec
    SI_RETURN_TO_EPILOG %2
...