[AMD] rc/3.2.x cherry picks #5347

jataylo · 2024-12-05T12:12:06Z

Reverts #5191 due to some MLIR errors in PyTorch unit tests.

Smaller set of cherry picks:

- #5308 (and previous LLVM upgrades)
- #5281
- #4925
- #5053
- #5019
- #4998

This reverts commit 2d8093c.

Enable new arch target since backend support has been added. (cherry picked from commit 134b3eb)

Fixes triton-lang#4769 (cherry picked from commit f484cb8)

triton-lang#5064) Bumping llvm to include a loop unroller fix: llvm/llvm-project#114573. This is needed for subsequent loop unroller upstreaming work. (cherry picked from commit 3c296ab)

This pulls in llvm/llvm-project@bd9145c8c213 to enable ASan on AMD backend. (cherry picked from commit 0bd30a2)

This includes llvm/llvm-project#115627 (cherry picked from commit 6404fbb)

This pulls in the AMDGPU backend support for the gfx950 target. We need to fix the rewrites in `Combine.td` given that llvm/llvm-project#112700 adds a new attribute for denorm mode for `arith.addf`.

---------

Co-authored-by: Lei Zhang

(cherry picked from commit 1d5e9a2)

For 16-bit float operands of tt::AtomicRMWOp, construct a single LLVM::AtomicRMWOp that operates on a vector of elements. This approach allows packed intrinsics to be generated, so 2 elements are processed at once. Adds a lit test for the f16 vectorized case. (cherry picked from commit 78c8054)
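As a rough illustration of the packed idea in the commit message above, the sketch below models one read-modify-write on a 32-bit word that holds two f16 values. It is a host-side Python model only, not the lowering code from the commit; the helper names `pack_f16x2`, `unpack_f16x2`, and `packed_atomic_fadd` are hypothetical.

```python
import struct


def pack_f16x2(lo, hi):
    """Pack two half-precision values into one 32-bit word (little-endian)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_f16x2(word):
    """Split a 32-bit word back into its two half-precision values."""
    return struct.unpack("<2e", struct.pack("<I", word))


def packed_atomic_fadd(memory, index, lo, hi):
    """Model a single read-modify-write that updates two f16 lanes at once.

    Assumption: this only mimics the data layout and the "one operation,
    two elements" shape; real atomicity is provided by the hardware.
    Returns the old pair, as an atomic RMW returns the previous value.
    """
    old = unpack_f16x2(memory[index])
    memory[index] = pack_f16x2(old[0] + lo, old[1] + hi)
    return old


# One operation covers both elements instead of two scalar updates.
memory = [pack_f16x2(1.0, 2.0)]
previous = packed_atomic_fadd(memory, 0, 0.5, 0.25)
```

The point of the packed form is the operation count: two adjacent f16 elements share one update instead of needing one update each.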

TritonAMDGPUTransforms now depends on it. (cherry picked from commit 0b443ce)

This commit adds support for warp-level reduction with DPP instructions, which can improve performance. See https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/ (cherry picked from commit 21119e3)

(cherry picked from commit 86a2ac7)

This reverts commit 7e401df.

This PR brings in required LLVM bumps and additional targets for gfx950 support.

- #5040
- #5064
- #5180
- #5242
- #5392

Note this PR reverts the last two PRs to only focus on the LLVM upgrade:

- #5347
- #5191

---------

Co-authored-by: peterbell10
Co-authored-by: Hongtao Yu
Co-authored-by: Lei Zhang
Co-authored-by: Jungwook Park

This PR brings in required LLVM bumps and additional targets for gfx950 support.

- triton-lang#5040
- triton-lang#5064
- triton-lang#5180
- triton-lang#5242
- triton-lang#5392

Note this PR reverts the last two PRs to only focus on the LLVM upgrade:

- triton-lang#5347
- triton-lang#5191

---------

Co-authored-by: peterbell10
Co-authored-by: Hongtao Yu
Co-authored-by: Lei Zhang
Co-authored-by: Jungwook Park

(cherry picked from commit f11c5ba)

Reverts triton-lang#5191 due to some MLIR errors in PyTorch unit tests.

Smaller set of cherry picks:

- triton-lang#5308 (and previous LLVM upgrades)
- triton-lang#5281
- triton-lang#4925
- triton-lang#5053
- triton-lang#5019
- triton-lang#4998

---------

Co-authored-by: Jungwook Park
Co-authored-by: peterbell10
Co-authored-by: Hongtao Yu
Co-authored-by: Lei Zhang
Co-authored-by: Ilya V
Co-authored-by: Kyle Wang

(cherry picked from commit 7e401df)

This PR brings in required LLVM bumps and additional targets for gfx950 support.

- #5040
- #5064
- #5180
- #5242
- #5392

Reverts:

- #5347
- #5191

Reverts #5191 due to some MLIR errors in PyTorch unit tests.

Smaller set of cherry picks:

- #5308 (and previous LLVM upgrades)
- #5281
- #4925
- #5053
- #5019
- #4998

---------

Co-authored-by: Jungwook Park
Co-authored-by: peterbell10
Co-authored-by: Hongtao Yu
Co-authored-by: Lei Zhang
Co-authored-by: Ilya V
Co-authored-by: Kyle Wang

jataylo and others added 11 commits December 5, 2024 12:05

Revert "[AMD] release/3.2.x AMD perf cherry picks (triton-lang#5191)"

ed5dc78

This reverts commit 2d8093c.

[AMD][BACKEND] Add gfx950 target definitions. (triton-lang#5281)

3ce364e

Enable new arch target since backend support has been added. (cherry picked from commit 134b3eb)

[BACKEND] Update LLVM hash (triton-lang#5040)

aa78bbe

Fixes triton-lang#4769 (cherry picked from commit f484cb8)

[BACKEND] Update llvm to llvm/llvm-project@fa57c7a6a5f594a9e3ae2dbe35… (

f112660

triton-lang#5064) Bumping llvm to include a loop unroller fix: llvm/llvm-project#114573. This is needed for subsequent loop unroller upstreaming work. (cherry picked from commit 3c296ab)

Update to llvm/llvm-project@bd9145c8c213 (triton-lang#5180)

0906635

This pulls in llvm/llvm-project@bd9145c8c213 to enable ASan on AMD backend. (cherry picked from commit 0bd30a2)

[LLVM] Update to llvm-project@86b69c3 (triton-lang#5242)

80159fb

This includes llvm/llvm-project#115627 (cherry picked from commit 6404fbb)

[AMD] Add missing dependency to TritonAMDGPUIR (triton-lang#5053)

383054b

TritonAMDGPUTransforms now depends on it. (cherry picked from commit 0b443ce)

[AMD] Support warp-level reduction with DPP (triton-lang#5019)

5faf1c7

This commit adds support for warp-level reduction with DPP instructions, which can improve performance. See https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/ (cherry picked from commit 21119e3)
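To make the cross-lane idea concrete, here is a minimal host-side sketch of a warp-level reduction in which lanes exchange values with partners in log2(N) steps. It uses a generic butterfly (XOR) exchange pattern as a stand-in; it is not the DPP instruction sequence emitted by this commit, and the function name `warp_reduce` is hypothetical.

```python
def warp_reduce(values, combine):
    """Reduce one value per lane using a butterfly (XOR) exchange pattern.

    Assumption: `combine` is associative and commutative (e.g. add, max).
    After log2(n) steps every lane holds the full reduction.
    """
    lanes = list(values)
    n = len(lanes)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a non-zero power of two")
    offset = n // 2
    while offset >= 1:
        # All lanes read their partner "simultaneously": the new register
        # file is built from the old one, so no lane sees a partial update.
        lanes = [combine(lanes[i], lanes[i ^ offset]) for i in range(n)]
        offset //= 2
    return lanes


# 64 lanes, as in an AMD wavefront; 6 exchange steps instead of 63 serial adds.
totals = warp_reduce(range(64), lambda a, b: a + b)
```

Cross-lane instructions such as DPP let each exchange step read a neighbouring lane's register directly, which is why a reduction built this way can avoid round trips through shared memory.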

[AMD] Restructure ReorderInstructions pass (triton-lang#4998)

218d8b0

(cherry picked from commit 86a2ac7)

jataylo requested review from antiagainst, zhanglx13, Jokeren and ptillet as code owners December 5, 2024 12:12

antiagainst approved these changes Dec 5, 2024

antiagainst merged commit 7e401df into triton-lang:rc/3.2.x Dec 5, 2024
6 of 7 checks passed

jataylo added a commit to jataylo/triton that referenced this pull request Dec 11, 2024

Revert "[AMD] rc/3.2.x cherry picks (triton-lang#5347)"

ec30446

This reverts commit 7e401df.

jataylo added a commit to jataylo/triton that referenced this pull request Dec 12, 2024

Revert "[AMD] rc/3.2.x cherry picks (triton-lang#5347)"

a971b59

This reverts commit 7e401df.

jataylo mentioned this pull request Dec 12, 2024

[rc/3.2.x] LLVM bump for gfx950 target support #5417

Merged

jataylo mentioned this pull request Dec 18, 2024

[release/3.2.x] [CHERRY PICK] Add gfx950 target definition #5452

Merged

atalman pushed a commit that referenced this pull request Dec 19, 2024

[release/3.2.x] [CHERRY PICK] Add gfx950 target definition (#5452)

aba0fbf

This PR brings in required LLVM bumps and additional targets for gfx950 support.

- #5040
- #5064
- #5180
- #5242
- #5392

Reverts:

- #5347
- #5191

bertmaher mentioned this pull request Dec 19, 2024

Release cherry picks: #5191 #5347 #5417 #5084 #3731 #5464

Closed
