AMDGPU: Fix inst-selection of large scratch offsets with sgpr base #110256
@llvm/pr-subscribers-backend-amdgpu
Author: Petar Avramovic (petar-avramovic)
Changes
Use i32 for the offset instead of i16, so that it does not get interpreted as a negative 16-bit offset.
Full diff: https://github.com/llvm/llvm-project/pull/110256.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index d3d5bc924525fc..48971a6840c779 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -1911,7 +1911,7 @@ bool AMDGPUDAGToDAGISel::SelectScratchSAddr(SDNode *Parent, SDValue Addr,
0);
}
- Offset = CurDAG->getTargetConstant(COffsetVal, DL, MVT::i16);
+ Offset = CurDAG->getTargetConstant(COffsetVal, DL, MVT::i32);
return true;
}
diff --git a/llvm/test/CodeGen/AMDGPU/flat-scratch.ll b/llvm/test/CodeGen/AMDGPU/flat-scratch.ll
index 667a8a38c62ecc..496ac80a3dfbcf 100644
--- a/llvm/test/CodeGen/AMDGPU/flat-scratch.ll
+++ b/llvm/test/CodeGen/AMDGPU/flat-scratch.ll
@@ -4926,7 +4926,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; GFX12-LABEL: sgpr_base_large_offset:
; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: scratch_load_b32 v2, off, s0 offset:-24
+; GFX12-NEXT: scratch_load_b32 v2, off, s0 offset:65512
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-NEXT: s_nop 0
@@ -4985,7 +4985,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; GFX12-PAL-LABEL: sgpr_base_large_offset:
; GFX12-PAL: ; %bb.0: ; %entry
-; GFX12-PAL-NEXT: scratch_load_b32 v2, off, s0 offset:-24
+; GFX12-PAL-NEXT: scratch_load_b32 v2, off, s0 offset:65512
; GFX12-PAL-NEXT: s_wait_loadcnt 0x0
; GFX12-PAL-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-PAL-NEXT: s_nop 0
LGTM, thanks! Please also backport to release/19.x.
Use i32 for the offset instead of i16, so that it does not get interpreted as a negative 16-bit offset.
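For readers wondering where the `-24` in the old test output came from, here is a minimal sketch of the arithmetic. It assumes the i16 constant was read back as a two's-complement 16-bit value; `signExtend` is a hypothetical helper written for this illustration, not an LLVM API, and it is only meant for widths between 1 and 62 bits.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): reinterpret the low `bits` bits of
// `value` as a two's-complement signed integer. Intended for 1 <= bits <= 62.
constexpr std::int64_t signExtend(std::int64_t value, unsigned bits) {
    const std::int64_t mask = (std::int64_t{1} << bits) - 1;     // low `bits` bits set
    const std::int64_t signBit = std::int64_t{1} << (bits - 1);  // top bit of the field
    // Keep the low bits, then map the upper half of the range to negatives.
    return ((value & mask) ^ signBit) - signBit;
}

// The offset from the sgpr_base_large_offset test in this PR.
constexpr std::int64_t kOffset = 65512;  // 0xFFE8

// Viewed as a 16-bit value, the sign bit is set, so the offset turns negative.
static_assert(signExtend(kOffset, 16) == -24, "i16 view wraps to -24");

// Viewed as a 32-bit value, the offset keeps its intended magnitude.
static_assert(signExtend(kOffset, 32) == 65512, "i32 view is unchanged");
```

Both values match the `offset:` fields in the GFX12 check lines before and after the change.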
…lvm#110256) Use i32 for offset instead of i16, this way it does not get interpreted as negative 16 bit offset. (cherry picked from commit 83fe851)
/pull-request #110470