Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeLayout] CDSortImpl: remove HotChains and remove linear-time erase_value from mergeChains #69276

Merged
merged 3 commits into from
Oct 18, 2023

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Oct 17, 2023

After mergeChainPairs initializes a priority queue, HotChains is unused
except a HotChains.size() use in LLVM_DEBUG. Optimize it out.

@llvmbot
Copy link
Member

llvmbot commented Oct 17, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Fangrui Song (MaskRay)

Changes

After mergeChainPairs initializes a priority queue, HotChains is unused
except a HotChains.size() use in LLVM_DEBUG. Replace it with a variable.


Full diff: https://github.com/llvm/llvm-project/pull/69276.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/CodeLayout.cpp (+4-4)
diff --git a/llvm/lib/Transforms/Utils/CodeLayout.cpp b/llvm/lib/Transforms/Utils/CodeLayout.cpp
index d9e302d8b4fa54d..d2f39b0284c6799 100644
--- a/llvm/lib/Transforms/Utils/CodeLayout.cpp
+++ b/llvm/lib/Transforms/Utils/CodeLayout.cpp
@@ -1027,7 +1027,7 @@ class CDSortImpl {
 
     LLVM_DEBUG(dbgs() << "Cache-directed function sorting reduced the number"
                       << " of chains from " << NumNodes << " to "
-                      << HotChains.size() << "\n");
+                      << NumHotChains << "\n");
 
     // Collect nodes from all the chains.
     return concatChains();
@@ -1085,6 +1085,7 @@ class CDSortImpl {
       if (Node.ExecutionCount > 0)
         HotChains.push_back(&AllChains.back());
     }
+    NumHotChains = HotChains.size();
 
     // Initialize chain edges.
     AllEdges.reserve(AllJumps.size());
@@ -1152,6 +1153,7 @@ class CDSortImpl {
       MergeGainT BestGain = BestEdge->getMergeGain();
       mergeChains(BestSrcChain, BestDstChain, BestGain.mergeOffset(),
                   BestGain.mergeType());
+      --NumHotChains;
 
       // Insert newly created edges into the queue.
       for (const auto &[_, Edge] : BestSrcChain->Edges) {
@@ -1301,9 +1303,6 @@ class CDSortImpl {
     // Merge the edges.
     Into->mergeEdges(From);
     From->clear();
-
-    // Remove the chain from the list of active chains.
-    llvm::erase_value(HotChains, From);
   }
 
   /// Concatenate all chains into the final order.
@@ -1372,6 +1371,7 @@ class CDSortImpl {
 
   /// Active chains. The vector gets updated at runtime when chains are merged.
   std::vector<ChainT *> HotChains;
+  size_t NumHotChains = 0;
 
   /// The total number of samples in the graph.
   uint64_t TotalSamples{0};

@@ -1152,6 +1153,7 @@ class CDSortImpl {
MergeGainT BestGain = BestEdge->getMergeGain();
mergeChains(BestSrcChain, BestDstChain, BestGain.mergeOffset(),
BestGain.mergeType());
--NumHotChains;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the list, HotChains, is not used in the algorithm. Can we get rid of it altogether?

Instead of for (ChainT *ChainPred : HotChains) {... we'd have something like

for (NodeT &Node : AllNodes) {
  if (Node.ExecutionCount > 0) {
    ChainT *ChainPred = Node.CurChain;
    ...
  }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I'll optimize out HotChains and rename NumHotChains to NumActiveChains.

@@ -1027,7 +1027,7 @@ class CDSortImpl {

LLVM_DEBUG(dbgs() << "Cache-directed function sorting reduced the number"
<< " of chains from " << NumNodes << " to "
<< HotChains.size() << "\n");
<< NumHotChains << "\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can move the debug statement to concatChains() where we have access to the number of remaining chains, SortedChains.size(). So this variable, NumHotChains, won't be needed

@MaskRay MaskRay force-pushed the codelayout-numhotchains branch from 6f0afe8 to 299bbc6 Compare October 17, 2023 22:27
@MaskRay MaskRay changed the title [CodeLayout] CDSortImpl: remove linear-time erase_value from mergeChains [CodeLayout] CDSortImpl: remove HotChains and remove linear-time erase_value from mergeChains Oct 17, 2023
@github-actions
Copy link

github-actions bot commented Oct 17, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

…e_value from mergeChains

After mergeChainPairs initializes a priority queue, HotChains is unused
except a HotChains.size() use in LLVM_DEBUG. Optimize it out.
@MaskRay MaskRay force-pushed the codelayout-numhotchains branch from 7ceeecd to a0686cf Compare October 18, 2023 00:14
@MaskRay
Copy link
Member Author

MaskRay commented Oct 18, 2023

Thanks for the review!

This is not a bottleneck.

  Time (mean ± σ):     534.928 s ±  1.463 s    [User: 545.663 s, System: 2.177 s]
  Range (min … max):   533.894 s … 535.963 s    2 runs

Benchmark 2: numactl -C 20-27 /t/lld1 -flavor gnu @response.txt --call-graph-profile-sort=cdsort --threads=8
  Time (mean ± σ):     535.081 s ±  0.563 s    [User: 545.860 s, System: 2.193 s]
  Range (min … max):   534.683 s … 535.479 s    2 runs

Summary
  'numactl -C 20-27 /t/lld0 -flavor gnu @response.txt --call-graph-profile-sort=cdsort --threads=8' ran
    1.00 ± 0.00 times faster than 'numactl -C 20-27 /t/lld1 -flavor gnu @response.txt --call-graph-profile-sort=cdsort --threads=8'

@MaskRay MaskRay merged commit beffc82 into llvm:main Oct 18, 2023
2 checks passed
@MaskRay MaskRay deleted the codelayout-numhotchains branch October 18, 2023 01:37
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Oct 20, 2023
Local branch amd-gfx 010f535 Merged main:6eee238975e4 into amd-gfx:4d4ec95d174b
Remote branch main beffc82 [CodeLayout] CDSortImpl: remove HotChains and remove linear-time erase_value from mergeChains (llvm#69276)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants