[AMDGPU][SplitModule] Fix unintentional integer division #117586

frasercrmck · 2024-11-25T18:02:41Z

A static analysis tool warned that a division was always performed as
integer division, so the result was always either 0.0 or 1.0.

This doesn't seem intentional, so it has been fixed to return a true ratio
using floating-point division. This in turn exposed a bug where a
comparison against this ratio was inverted.
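
To illustrate the truncation described above, here is a minimal standalone sketch. `CostType` is assumed here to be an unsigned 64-bit integer, and `ratioTruncated` and `ratioExact` are hypothetical names used only for this example; neither is taken from AMDGPUSplitModule.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration: the cost type is an unsigned integer.
using CostType = std::uint64_t;

// Both operands are integers, so the division truncates before the result
// is converted to double. With Part <= Whole this can only yield 0.0 or 1.0.
double ratioTruncated(CostType Part, CostType Whole) {
  assert(Whole != 0 && Part <= Whole);
  return static_cast<double>(Part / Whole);
}

// Converting one operand first makes the division happen in floating point,
// so the fractional part of the ratio is preserved.
double ratioExact(CostType Part, CostType Whole) {
  assert(Whole != 0 && Part <= Whole);
  return static_cast<double>(Part) / Whole;
}
```

With these definitions, `ratioTruncated(3, 4)` yields 0.0 while `ratioExact(3, 4)` yields 0.75.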

llvmbot · 2024-11-25T18:03:16Z

@llvm/pr-subscribers-backend-amdgpu

Author: Fraser Cormack (frasercrmck)

Changes

A static analysis tool found that ModuleCost could be zero, which would cause a division by zero when it is printed. This may be unreachable in practice, but the fix is straightforward and unlikely to affect performance.

The same tool warned that a division was always performed as integer division, so the result was always either 0.0 or 1.0. This doesn't seem intentional, so it has been fixed to return a true ratio using floating-point division. This has a knock-on effect on how a test splits modules.
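
As a rough sketch of the zero-denominator guard mentioned above, the following standalone function mirrors the `DemOr1` pattern but uses `std::snprintf` instead of LLVM's `format`, so that it compiles without LLVM headers. `formatPercent` is a hypothetical name, and `CostType` is assumed to be an unsigned 64-bit integer.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption for illustration: the cost type is an unsigned integer.
using CostType = std::uint64_t;

// Formats Num/Dem as a percentage with two decimals. A zero denominator is
// replaced by 1 so the division is always well defined.
std::string formatPercent(CostType Num, CostType Dem) {
  const CostType DemOr1 = Dem ? Dem : 1;
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "%0.2f",
                (static_cast<double>(Num) / DemOr1) * 100);
  return Buf;
}
```

With a zero denominator the printed value is simply `Num * 100`, which is acceptable for debug output but should not be read as a meaningful ratio.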

Full diff: https://github.com/llvm/llvm-project/pull/117586.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp (+3-2)
(modified) llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll (+7-6)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 5d7aff1c5092cc..a7db7cdb890051 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -149,7 +149,8 @@ static constexpr unsigned InvalidPID = -1;
 /// \param Dem denominator
 /// \returns a printable object to print (Num/Dem) using "%0.2f".
 static auto formatRatioOf(CostType Num, CostType Dem) {
-  return format("%0.2f", (static_cast<double>(Num) / Dem) * 100);
+  CostType DemOr1 = Dem ? Dem : 1;
+  return format("%0.2f", (static_cast<double>(Num) / DemOr1) * 100);
 }
 
 /// Checks whether a given function is non-copyable.
@@ -1101,7 +1102,7 @@ void RecursiveSearchSplitting::pickPartition(unsigned Depth, unsigned Idx,
         // Check if the amount of code in common makes it worth it.
         assert(SimilarDepsCost && Entry.CostExcludingGraphEntryPoints);
         const double Ratio =
-            SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
+            (double)SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
         assert(Ratio >= 0.0 && Ratio <= 1.0);
         if (LargeFnOverlapForMerge > Ratio) {
           // For debug, just print "L", so we'll see "L3=P3" for instance, which
diff --git a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
index 807fb2e5f33cea..2810e9853bebe3 100644
--- a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
+++ b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
@@ -15,19 +15,20 @@
 ; Also check w/o large kernels processing to verify they are indeed handled
 ; differently.
 
-; P0 is empty
-; CHECK0: declare
+; CHECK0: define internal void @HelperC()
+; CHECK0: define amdgpu_kernel void @C
 
-; CHECK1: define internal void @HelperC()
-; CHECK1: define amdgpu_kernel void @C
+; CHECK1: define internal void @large2()
+; CHECK1: define internal void @large1()
+; CHECK1: define internal void @large0()
+; CHECK1: define internal void @HelperB()
+; CHECK1: define amdgpu_kernel void @B
 
 ; CHECK2: define internal void @large2()
 ; CHECK2: define internal void @large1()
 ; CHECK2: define internal void @large0()
 ; CHECK2: define internal void @HelperA()
-; CHECK2: define internal void @HelperB()
 ; CHECK2: define amdgpu_kernel void @A
-; CHECK2: define amdgpu_kernel void @B
 
 ; NOLARGEKERNELS-CHECK0: define internal void @HelperC()
 ; NOLARGEKERNELS-CHECK0: define amdgpu_kernel void @C

arsenm · 2024-11-25T18:08:15Z

Description should be more specific

frasercrmck · 2024-11-25T18:19:05Z

Description should be more specific

Yep, you're right, sorry. I've just split it into two PRs since we can't merge individual commits in the same PR

Pierre-vh · 2024-11-26T07:12:24Z

llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp

@@ -1101,7 +1101,7 @@ void RecursiveSearchSplitting::pickPartition(unsigned Depth, unsigned Idx,
        // Check if the amount of code in common makes it worth it.
        assert(SimilarDepsCost && Entry.CostExcludingGraphEntryPoints);
        const double Ratio =
-            SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
+            (double)SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
        assert(Ratio >= 0.0 && Ratio <= 1.0);
        if (LargeFnOverlapForMerge > Ratio) {


I think this needs to be reversed, so Ratio > LargeFnOverlapForMerge

Yep good spot, thanks

I can't reply inline with your other comment for some reason, but it does indeed reverse the changes to the test.
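
For reference, the comparison direction suggested in the review can be sketched in isolation as below. This assumes the intent is that a large overlap justifies merging; `overlapJustifiesMerge` and `MinOverlap` are hypothetical names (the pass uses the `LargeFnOverlapForMerge` option inline), and `CostType` is assumed to be an unsigned 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration: the cost type is an unsigned integer.
using CostType = std::uint64_t;

// Returns true when the share of code in common exceeds the threshold.
// The ratio is computed in floating point so values between 0 and 1 survive.
bool overlapJustifiesMerge(CostType SimilarDepsCost, CostType EntryCost,
                           double MinOverlap) {
  assert(SimilarDepsCost && EntryCost);
  const double Ratio = static_cast<double>(SimilarDepsCost) / EntryCost;
  assert(Ratio >= 0.0 && Ratio <= 1.0);
  return Ratio > MinOverlap;
}
```

Note that when the ratio was truncated to 0.0, the original `LargeFnOverlapForMerge > Ratio` check held for any positive threshold, which may be why the reversed comparison went unnoticed until the division was fixed.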

A static analysis tool warned that a division was always being performed in integer division, so was either 0.0 or 1.0. This doesn't seem intentional, so has been fixed to return a true ratio using floating-point division. This in turn showed a bug where a comparison against this ratio was incorrect.

Change-Id: Ia8b755339a7ddc2d3bdf5ba701bd5724b00488d2

frasercrmck requested a review from Pierre-vh November 25, 2024 18:02

llvmbot added the backend:AMDGPU label Nov 25, 2024

frasercrmck force-pushed the amdgpu-fix-split-module-ratio branch from 9f271dd to 64efad2 Compare November 25, 2024 18:15

frasercrmck changed the title from "[AMDGPU][SplitModule] Fix a couple of issues" to "[AMDGPU][SplitModule] Fix unintentional integer division" Nov 25, 2024

frasercrmck force-pushed the amdgpu-fix-split-module-ratio branch from 64efad2 to 36267e1 Compare November 25, 2024 18:17

Pierre-vh reviewed Nov 26, 2024

frasercrmck force-pushed the amdgpu-fix-split-module-ratio branch from 36267e1 to 70ce91d Compare November 26, 2024 10:30

arsenm reviewed Nov 26, 2024

Commit cdecbbc: static_cast

Pierre-vh approved these changes Nov 27, 2024

frasercrmck merged commit 345b331 into llvm:main Nov 27, 2024
8 checks passed

frasercrmck deleted the amdgpu-fix-split-module-ratio branch November 27, 2024 10:18
