-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU][SplitModule] Fix unintentional integer division #117586
[AMDGPU][SplitModule] Fix unintentional integer division #117586
Conversation
@llvm/pr-subscribers-backend-amdgpu Author: Fraser Cormack (frasercrmck) ChangesA static analysis tool found that ModuleCost could be zero, so would perform divide by zero when being printed. Perhaps this is unreachable in practice, but the fix is straightforward enough and unlikely to be a performance concern. The same tool warned that a division was always being performed in integer division, so was either 0.0 or 1.0. This doesn't seem intentional, so has been fixed to return a true ratio using floating-point division. This has a knock-on effect on how a test was splitting modules. Full diff: https://github.com/llvm/llvm-project/pull/117586.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 5d7aff1c5092cc..a7db7cdb890051 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -149,7 +149,8 @@ static constexpr unsigned InvalidPID = -1;
/// \param Dem denominator
/// \returns a printable object to print (Num/Dem) using "%0.2f".
static auto formatRatioOf(CostType Num, CostType Dem) {
- return format("%0.2f", (static_cast<double>(Num) / Dem) * 100);
+ CostType DemOr1 = Dem ? Dem : 1;
+ return format("%0.2f", (static_cast<double>(Num) / DemOr1) * 100);
}
/// Checks whether a given function is non-copyable.
@@ -1101,7 +1102,7 @@ void RecursiveSearchSplitting::pickPartition(unsigned Depth, unsigned Idx,
// Check if the amount of code in common makes it worth it.
assert(SimilarDepsCost && Entry.CostExcludingGraphEntryPoints);
const double Ratio =
- SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
+ (double)SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
assert(Ratio >= 0.0 && Ratio <= 1.0);
if (LargeFnOverlapForMerge > Ratio) {
// For debug, just print "L", so we'll see "L3=P3" for instance, which
diff --git a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
index 807fb2e5f33cea..2810e9853bebe3 100644
--- a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
+++ b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
@@ -15,19 +15,20 @@
; Also check w/o large kernels processing to verify they are indeed handled
; differently.
-; P0 is empty
-; CHECK0: declare
+; CHECK0: define internal void @HelperC()
+; CHECK0: define amdgpu_kernel void @C
-; CHECK1: define internal void @HelperC()
-; CHECK1: define amdgpu_kernel void @C
+; CHECK1: define internal void @large2()
+; CHECK1: define internal void @large1()
+; CHECK1: define internal void @large0()
+; CHECK1: define internal void @HelperB()
+; CHECK1: define amdgpu_kernel void @B
; CHECK2: define internal void @large2()
; CHECK2: define internal void @large1()
; CHECK2: define internal void @large0()
; CHECK2: define internal void @HelperA()
-; CHECK2: define internal void @HelperB()
; CHECK2: define amdgpu_kernel void @A
-; CHECK2: define amdgpu_kernel void @B
; NOLARGEKERNELS-CHECK0: define internal void @HelperC()
; NOLARGEKERNELS-CHECK0: define amdgpu_kernel void @C
|
Description should be more specific |
9f271dd
to
64efad2
Compare
64efad2
to
36267e1
Compare
Yep, you're right, sorry. I've just split it into two PRs since we can't merge individual commits in the same PR |
@@ -1101,7 +1101,7 @@ void RecursiveSearchSplitting::pickPartition(unsigned Depth, unsigned Idx, | |||
// Check if the amount of code in common makes it worth it. | |||
assert(SimilarDepsCost && Entry.CostExcludingGraphEntryPoints); | |||
const double Ratio = | |||
SimilarDepsCost / Entry.CostExcludingGraphEntryPoints; | |||
(double)SimilarDepsCost / Entry.CostExcludingGraphEntryPoints; | |||
assert(Ratio >= 0.0 && Ratio <= 1.0); | |||
if (LargeFnOverlapForMerge > Ratio) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be reversed, so Ratio > LargeFnOverlapForMerge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep good spot, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't reply inline with your other comment for some reason, but it does indeed reverse the changes to the test.
A static analysis tool warned that a division was always being performed in integer division, so was either 0.0 or 1.0. This doesn't seem intentional, so has been fixed to return a true ratio using floating-point division. This in turn showed a bug where a comparison against this ratio was incorrect.
36267e1
to
70ce91d
Compare
A static analysis tool warned that a division was always being performed in integer division, so was either 0.0 or 1.0. This doesn't seem intentional, so has been fixed to return a true ratio using floating-point division. This in turn showed a bug where a comparison against this ratio was incorrect. Change-Id: Ia8b755339a7ddc2d3bdf5ba701bd5724b00488d2
A static analysis tool warned that a division was always being performed
in integer division, so was either 0.0 or 1.0.
This doesn't seem intentional, so has been fixed to return a true ratio
using floating-point division. This in turn showed a bug where a
comparison against this ratio was incorrect.