Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-profdata] Do not create numerical strings for MD5 function names read from a Sample Profile. #66164

Merged
merged 1 commit into from
Oct 17, 2023

Conversation

huangjd
Copy link
Contributor

@huangjd huangjd commented Sep 13, 2023

This is phase 2 of the MD5 refactoring on Sample Profile following https://reviews.llvm.org/D147740

In previous implementation, when a MD5 Sample Profile is read, the reader first converts the MD5 values to strings, and then create a StringRef as if the numerical strings are regular function names, and later on IPO transformation passes perform string comparison over these numerical strings for profile matching. This is inefficient since it causes many small heap allocations.
In this patch I created a class ProfileFuncRef that is similar to StringRef but it can represent a hash value directly without any conversion, and it will be more efficient (I will attach some benchmark results later) when being used in associative containers.

ProfileFuncRef guarantees the same function name in string form or in MD5 form has the same hash value, which also fix a few issue in IPO passes where function matching/lookup only check for function name string, while returns a no-match if the profile is MD5.

When testing on an internal large profile (> 1 GB, with more than 10 million functions), the full profile load time is reduced from 28 sec to 25 sec in average, and reading function offset table from 0.78s to 0.7s

@huangjd huangjd requested review from a team as code owners September 13, 2023 01:24
@llvmbot llvmbot added backend:X86 PGO Profile Guided Optimizations llvm:transforms labels Sep 13, 2023
@huangjd huangjd changed the title md5phase2 [llvm-profdata] Do not create numerical strings for MD5 function names read from a Sample Profile. Sep 13, 2023
@huangjd huangjd self-assigned this Sep 13, 2023
llvm/include/llvm/ADT/StringRef.h Outdated Show resolved Hide resolved
llvm/include/llvm/ProfileData/ProfileFuncName.h Outdated Show resolved Hide resolved
/// Default constructor. Use memset because there may be padding bytes between
/// Length and Placeholder containing undefined values on certain
/// architectures, which breaks comparison methods.
ProfileFuncName() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this safe when the type becomes polymorphic? It sets the vptr to null

llvm/include/llvm/ProfileData/ProfileFuncName.h Outdated Show resolved Hide resolved
@WenleiHe
Copy link
Member

This is a big change, and a somewhat intrusive one. Can you explain the motivation, the change itself and the results in more details?

@huangjd
Copy link
Contributor Author

huangjd commented Sep 15, 2023

ProfileFuncRef no longer depends on StringRef,

@huangjd
Copy link
Contributor Author

huangjd commented Sep 18, 2023

Please review, code is rebased and it's quite different from the original diff

@aeubanks
Copy link
Contributor

a comment on the github workflow, you're not supposed to force push, just create commits on top of the branch, and no need to close the PR unless you're completely abandoning the patch

@huangjd
Copy link
Contributor Author

huangjd commented Sep 18, 2023

a comment on the github workflow, you're not supposed to force push, just create commits on top of the branch, and no need to close the PR unless you're completely abandoning the patch

When I first created the PR the main branch has changed too much and I had to resolve conflict and update the code in quite a few places, and when I push the rebased branch to my fork repo, this PR now contain several commits that is already checked in, and I don't know how to exclude them (like picking specific commits as diff for review in phab). What is the correct workflow for this kind of operation?

@aeubanks
Copy link
Contributor

ah I think yeah you are supposed to force push for rebases, my bad

/// converting a MD5 sample profile to a format that does not support MD5, and
/// if so, convert the numerical string to a hash code first. We assume that
/// no function name (from a profile) can be a pure number.
explicit ProfileFuncRef(const std::string &Str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to avoid creating std::string from MD5 in the first place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Maybe extend SampleContext to allow an integer variant in addition to string name and SampleContextFrames?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this class is to allow representation of both String name and MD5. This function is here just in case. It's only used in auxiliary tool or test case. When reading from an MD5 profile, ProfileFuncRef is directly created over the integer MD5 values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many such use cases are there in tests? If not many, it might be better to push the conversion to there to make the interfaces here simpler.

@huangjd
Copy link
Contributor Author

huangjd commented Sep 21, 2023

Added performance measurement. This patch does significantly reduce profile load time, which is its motivation

@huangjd
Copy link
Contributor Author

huangjd commented Sep 21, 2023

I will hold on renaming variables to a different patch because this will change files everywhere, may cause confusion to review in this patch.

@huangjd
Copy link
Contributor Author

huangjd commented Sep 21, 2023

Full profile load is further reduced to 24s after optimizing lazy loading of name table, which was a bottleneck

@github-actions
Copy link

github-actions bot commented Sep 25, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 618a22144db5e45da8c95dc22064103e1b5e5b71 92dac9f101a8d90e79d71df2f83bb47334972619 -- llvm/include/llvm/ProfileData/FunctionId.h llvm/include/llvm/ProfileData/HashKeyMap.h llvm/include/llvm/ProfileData/SampleProf.h llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h llvm/include/llvm/Transforms/IPO/SampleContextTracker.h llvm/lib/ProfileData/SampleProf.cpp llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/lib/Target/X86/X86InsertPrefetch.cpp llvm/lib/Transforms/IPO/SampleContextTracker.cpp llvm/lib/Transforms/IPO/SampleProfile.cpp llvm/tools/llvm-profdata/llvm-profdata.cpp llvm/tools/llvm-profgen/CSPreInliner.cpp llvm/tools/llvm-profgen/CSPreInliner.h llvm/tools/llvm-profgen/CallContext.h llvm/tools/llvm-profgen/ProfileGenerator.cpp llvm/tools/llvm-profgen/ProfileGenerator.h llvm/tools/llvm-profgen/ProfiledBinary.cpp llvm/tools/llvm-profgen/ProfiledBinary.h llvm/unittests/ProfileData/SampleProfTest.cpp
View the diff from clang-format here.
diff --git a/llvm/include/llvm/ProfileData/FunctionId.h b/llvm/include/llvm/ProfileData/FunctionId.h
index 0076cdc09045..a889796cfcb2 100644
--- a/llvm/include/llvm/ProfileData/FunctionId.h
+++ b/llvm/include/llvm/ProfileData/FunctionId.h
@@ -60,12 +60,10 @@ public:
 
   /// Constructor from a StringRef.
   explicit FunctionId(StringRef Str)
-      : Data(Str.data()), LengthOrHashCode(Str.size()) {
-  }
+      : Data(Str.data()), LengthOrHashCode(Str.size()) {}
 
   /// Constructor from a hash code.
-  explicit FunctionId(uint64_t HashCode)
-      : LengthOrHashCode(HashCode) {
+  explicit FunctionId(uint64_t HashCode) : LengthOrHashCode(HashCode) {
     assert(HashCode != 0);
   }
 
@@ -164,13 +162,9 @@ inline raw_ostream &operator<<(raw_ostream &OS, const FunctionId &Obj) {
   return OS;
 }
 
-inline uint64_t MD5Hash(const FunctionId &Obj) {
-  return Obj.getHashCode();
-}
+inline uint64_t MD5Hash(const FunctionId &Obj) { return Obj.getHashCode(); }
 
-inline uint64_t hash_value(const FunctionId &Obj) {
-  return Obj.getHashCode();
-}
+inline uint64_t hash_value(const FunctionId &Obj) { return Obj.getHashCode(); }
 
 } // end namespace sampleprof
 
diff --git a/llvm/include/llvm/ProfileData/HashKeyMap.h b/llvm/include/llvm/ProfileData/HashKeyMap.h
index b2f1bf222157..9800cc371b4a 100644
--- a/llvm/include/llvm/ProfileData/HashKeyMap.h
+++ b/llvm/include/llvm/ProfileData/HashKeyMap.h
@@ -49,8 +49,8 @@ namespace sampleprof {
 ///   container.
 template <template <typename, typename, typename...> typename MapT,
           typename KeyT, typename ValueT, typename... MapTArgs>
-class HashKeyMap :
-    public MapT<decltype(hash_value(KeyT())), ValueT, MapTArgs...> {
+class HashKeyMap
+    : public MapT<decltype(hash_value(KeyT())), ValueT, MapTArgs...> {
 public:
   using base_type = MapT<decltype(hash_value(KeyT())), ValueT, MapTArgs...>;
   using key_type = decltype(hash_value(KeyT()));
@@ -117,13 +117,11 @@ public:
     return 0;
   }
 
-  iterator erase(const_iterator It) {
-    return base_type::erase(It);
-  }
+  iterator erase(const_iterator It) { return base_type::erase(It); }
 };
 
-}
+} // namespace sampleprof
 
-}
+} // namespace llvm
 
 #endif // LLVM_PROFILEDATA_HASHKEYMAP_H
diff --git a/llvm/include/llvm/ProfileData/SampleProf.h b/llvm/include/llvm/ProfileData/SampleProf.h
index 57ea144532a3..bede36832e67 100644
--- a/llvm/include/llvm/ProfileData/SampleProf.h
+++ b/llvm/include/llvm/ProfileData/SampleProf.h
@@ -22,11 +22,11 @@
 #include "llvm/IR/Function.h"
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/ProfileData/FunctionId.h"
+#include "llvm/ProfileData/HashKeyMap.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorOr.h"
 #include "llvm/Support/MathExtras.h"
-#include "llvm/ProfileData/HashKeyMap.h"
 #include <algorithm>
 #include <cstdint>
 #include <list>
@@ -298,7 +298,7 @@ struct LineLocation {
   }
 
   uint64_t getHashCode() const {
-    return ((uint64_t) Discriminator << 32) | LineOffset;
+    return ((uint64_t)Discriminator << 32) | LineOffset;
   }
 
   uint32_t LineOffset;
@@ -467,7 +467,7 @@ struct SampleContextFrame {
   LineLocation Location;
 
   SampleContextFrame() : Location(0, 0) {}
-  
+
   SampleContextFrame(FunctionId Func, LineLocation Location)
       : Func(Func), Location(Location) {}
 
@@ -526,9 +526,9 @@ public:
 
   SampleContext(StringRef Name)
       : Func(Name), State(UnknownContext), Attributes(ContextNone) {
-        assert(!Name.empty() && "Name is empty");
-      }
-  
+    assert(!Name.empty() && "Name is empty");
+  }
+
   SampleContext(FunctionId Func)
       : Func(Func), State(UnknownContext), Attributes(ContextNone) {}
 
@@ -582,8 +582,7 @@ public:
 
   // Decode context string for a frame to get function name and location.
   // `ContextStr` is in the form of `FuncName:StartLine.Discriminator`.
-  static void decodeContextString(StringRef ContextStr,
-                                  FunctionId &Func,
+  static void decodeContextString(StringRef ContextStr, FunctionId &Func,
                                   LineLocation &LineLoc) {
     // Get function name
     auto EntrySplit = ContextStr.split(':');
@@ -784,8 +783,7 @@ public:
 
   sampleprof_error addCalledTargetSamples(uint32_t LineOffset,
                                           uint32_t Discriminator,
-                                          FunctionId Func,
-                                          uint64_t Num,
+                                          FunctionId Func, uint64_t Num,
                                           uint64_t Weight = 1) {
     return BodySamples[LineLocation(LineOffset, Discriminator)].addCalledTarget(
         Func, Num, Weight);
@@ -1034,10 +1032,10 @@ public:
   /// corresponding function is no less than \p Threshold, add its corresponding
   /// GUID to \p S. Also traverse the BodySamples to add hot CallTarget's GUID
   /// to \p S.
-  void findInlinedFunctions(DenseSet<GlobalValue::GUID> &S,
-                            const HashKeyMap<std::unordered_map, FunctionId,
-                                             Function *>  &SymbolMap,
-                            uint64_t Threshold) const {
+  void findInlinedFunctions(
+      DenseSet<GlobalValue::GUID> &S,
+      const HashKeyMap<std::unordered_map, FunctionId, Function *> &SymbolMap,
+      uint64_t Threshold) const {
     if (TotalSamples <= Threshold)
       return;
     auto isDeclaration = [](const Function *F) {
@@ -1062,9 +1060,7 @@ public:
   }
 
   /// Set the name of the function.
-  void setFunction(FunctionId newFunction) {
-    Context.setFunction(newFunction);
-  }
+  void setFunction(FunctionId newFunction) { Context.setFunction(newFunction); }
 
   /// Return the function name.
   FunctionId getFunction() const { return Context.getFunction(); }
@@ -1202,9 +1198,7 @@ public:
 
   /// Return the GUID of the context's name. If the context is already using
   /// MD5, don't hash it again.
-  uint64_t getGUID() const {
-    return getFunction().getHashCode();
-  }
+  uint64_t getGUID() const { return getFunction().getHashCode(); }
 
   // Find all the names in the current FunctionSamples including names in
   // all the inline instances and names of call targets.
@@ -1472,10 +1466,10 @@ private:
         Profile.addBodySamples(I.first.LineOffset, I.first.Discriminator,
                                CalleeProfile.getHeadSamplesEstimate());
         // Add callsite sample.
-        Profile.addCalledTargetSamples(
-            I.first.LineOffset, I.first.Discriminator,
-            CalleeProfile.getFunction(),
-            CalleeProfile.getHeadSamplesEstimate());
+        Profile.addCalledTargetSamples(I.first.LineOffset,
+                                       I.first.Discriminator,
+                                       CalleeProfile.getFunction(),
+                                       CalleeProfile.getHeadSamplesEstimate());
         // Update total samples.
         TotalSamples = TotalSamples >= CalleeProfile.getTotalSamples()
                            ? TotalSamples - CalleeProfile.getTotalSamples()
diff --git a/llvm/include/llvm/ProfileData/SampleProfReader.h b/llvm/include/llvm/ProfileData/SampleProfReader.h
index 9e8f543909cd..410ac3cdb191 100644
--- a/llvm/include/llvm/ProfileData/SampleProfReader.h
+++ b/llvm/include/llvm/ProfileData/SampleProfReader.h
@@ -591,9 +591,7 @@ public:
 
   /// It includes all the names that have samples either in outline instance
   /// or inline instance.
-  std::vector<FunctionId> *getNameTable() override {
-    return &NameTable;
-  }
+  std::vector<FunctionId> *getNameTable() override { return &NameTable; }
 
 protected:
   /// Read a numeric value of type T from the profile.
diff --git a/llvm/include/llvm/ProfileData/SampleProfWriter.h b/llvm/include/llvm/ProfileData/SampleProfWriter.h
index 963a4d4918e5..54cf5eaaa738 100644
--- a/llvm/include/llvm/ProfileData/SampleProfWriter.h
+++ b/llvm/include/llvm/ProfileData/SampleProfWriter.h
@@ -206,9 +206,9 @@ protected:
   std::error_code writeBody(const FunctionSamples &S);
   inline void stablizeNameTable(MapVector<FunctionId, uint32_t> &NameTable,
                                 std::set<FunctionId> &V);
-  
+
   MapVector<FunctionId, uint32_t> NameTable;
-  
+
   void addName(FunctionId FName);
   virtual void addContext(const SampleContext &Context);
   void addNames(const FunctionSamples &S);
diff --git a/llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h b/llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h
index 9436e7b41a98..d4b26899d18a 100644
--- a/llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h
+++ b/llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h
@@ -52,10 +52,9 @@ struct ProfiledCallGraphNode {
   using edges = std::set<edge, ProfiledCallGraphEdgeComparer>;
   using iterator = edges::iterator;
   using const_iterator = edges::const_iterator;
-  
-  ProfiledCallGraphNode(FunctionId FName = FunctionId()) : Name(FName)
-  {}
-  
+
+  ProfiledCallGraphNode(FunctionId FName = FunctionId()) : Name(FName) {}
+
   FunctionId Name;
   edges Edges;
 };
@@ -136,7 +135,7 @@ public:
   iterator begin() { return Root.Edges.begin(); }
   iterator end() { return Root.Edges.end(); }
   ProfiledCallGraphNode *getEntryNode() { return &Root; }
-  
+
   void addProfiledFunction(FunctionId Name) {
     if (!ProfiledFunctions.count(Name)) {
       // Link to synthetic root to make sure every node is reachable
diff --git a/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h b/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
index f4c999ab93d8..bf622148691c 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
@@ -118,8 +118,7 @@ public:
   FunctionSamples *getBaseSamplesFor(const Function &Func,
                                      bool MergeContext = true);
   // Query base profile for a given function by name.
-  FunctionSamples *getBaseSamplesFor(FunctionId Name,
-                                     bool MergeContext = true);
+  FunctionSamples *getBaseSamplesFor(FunctionId Name, bool MergeContext = true);
   // Retrieve the context trie node for given profile context
   ContextTrieNode *getContextFor(const SampleContext &Context);
   // Get real function name for a given trie node.
@@ -141,8 +140,8 @@ public:
       return nullptr;
     return I->second;
   }
-  HashKeyMap<std::unordered_map, FunctionId, ContextSamplesTy>
-      &getFuncToCtxtProfiles() {
+  HashKeyMap<std::unordered_map, FunctionId, ContextSamplesTy> &
+  getFuncToCtxtProfiles() {
     return FuncToCtxtProfiles;
   }
 
diff --git a/llvm/lib/ProfileData/SampleProfReader.cpp b/llvm/lib/ProfileData/SampleProfReader.cpp
index a69e9d584904..04017e408a44 100644
--- a/llvm/lib/ProfileData/SampleProfReader.cpp
+++ b/llvm/lib/ProfileData/SampleProfReader.cpp
@@ -405,10 +405,10 @@ std::error_code SampleProfileReaderText::readImpl() {
         }
         FunctionSamples &FProfile = *InlineStack.back();
         for (const auto &name_count : TargetCountMap) {
-          MergeResult(Result, FProfile.addCalledTargetSamples(
-                                  LineOffset, Discriminator,
-                                  FunctionId(name_count.first),
-                                  name_count.second));
+          MergeResult(Result,
+                      FProfile.addCalledTargetSamples(
+                          LineOffset, Discriminator,
+                          FunctionId(name_count.first), name_count.second));
         }
         MergeResult(Result, FProfile.addBodySamples(LineOffset, Discriminator,
                                                     NumSamples));
@@ -1231,8 +1231,7 @@ SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute,
         if (FProfile) {
           CalleeProfile = const_cast<FunctionSamples *>(
               &FProfile->functionSamplesAt(LineLocation(
-                  *LineOffset,
-                  *Discriminator))[FContext.getFunction()]);
+                  *LineOffset, *Discriminator))[FContext.getFunction()]);
         }
         if (std::error_code EC =
                 readFuncMetadata(ProfileHasAttribute, CalleeProfile))
@@ -1712,8 +1711,7 @@ std::error_code SampleProfileReaderGCC::readOneFunctionProfile(
 
       if (Update)
         FProfile->addCalledTargetSamples(LineOffset, Discriminator,
-                                         FunctionId(TargetName),
-                                         TargetCount);
+                                         FunctionId(TargetName), TargetCount);
     }
   }
 
diff --git a/llvm/lib/Transforms/IPO/SampleContextTracker.cpp b/llvm/lib/Transforms/IPO/SampleContextTracker.cpp
index 09dd5c11b309..91be5c252a42 100644
--- a/llvm/lib/Transforms/IPO/SampleContextTracker.cpp
+++ b/llvm/lib/Transforms/IPO/SampleContextTracker.cpp
@@ -232,7 +232,7 @@ SampleContextTracker::getCalleeContextSamplesFor(const CallBase &Inst,
     return nullptr;
 
   CalleeName = FunctionSamples::getCanonicalFnName(CalleeName);
-  
+
   FunctionId FName = getRepInFormat(CalleeName);
 
   // For indirect call, CalleeName will be empty, in which case the context
@@ -487,9 +487,8 @@ ContextTrieNode *SampleContextTracker::getContextFor(const DILocation *DIL) {
     StringRef Name = PrevDIL->getScope()->getSubprogram()->getLinkageName();
     if (Name.empty())
       Name = PrevDIL->getScope()->getSubprogram()->getName();
-    S.push_back(
-        std::make_pair(FunctionSamples::getCallSiteIdentifier(DIL),
-                       getRepInFormat(Name)));
+    S.push_back(std::make_pair(FunctionSamples::getCallSiteIdentifier(DIL),
+                               getRepInFormat(Name)));
     PrevDIL = DIL;
   }
 
@@ -498,8 +497,7 @@ ContextTrieNode *SampleContextTracker::getContextFor(const DILocation *DIL) {
   StringRef RootName = PrevDIL->getScope()->getSubprogram()->getLinkageName();
   if (RootName.empty())
     RootName = PrevDIL->getScope()->getSubprogram()->getName();
-  S.push_back(std::make_pair(LineLocation(0, 0),
-                             getRepInFormat(RootName)));
+  S.push_back(std::make_pair(LineLocation(0, 0), getRepInFormat(RootName)));
 
   ContextTrieNode *ContextNode = &RootContext;
   int I = S.size();
@@ -527,8 +525,7 @@ SampleContextTracker::getOrCreateContextPath(const SampleContext &Context,
       ContextNode =
           ContextNode->getOrCreateChildContext(CallSiteLoc, Callsite.Func);
     } else {
-      ContextNode =
-          ContextNode->getChildContext(CallSiteLoc, Callsite.Func);
+      ContextNode = ContextNode->getChildContext(CallSiteLoc, Callsite.Func);
     }
     CallSiteLoc = Callsite.Location;
   }
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index a7773737ef16..9030f63b30a2 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -477,8 +477,7 @@ private:
                      std::map<LineLocation, StringRef> &IRAnchors);
   void findProfileAnchors(
       const FunctionSamples &FS,
-      std::map<LineLocation, std::unordered_set<FunctionId>>
-          &ProfileAnchors);
+      std::map<LineLocation, std::unordered_set<FunctionId>> &ProfileAnchors);
   void countMismatchedSamples(const FunctionSamples &FS);
   void countProfileMismatches(
       const Function &F, const FunctionSamples &FS,
@@ -995,16 +994,16 @@ bool SampleProfileLoader::tryPromoteAndInlineCandidate(
   // recursive. As llvm does not inline recursive calls, we will
   // simply ignore it instead of handling it explicitly.
   if (!R->second->isDeclaration() && R->second->getSubprogram() &&
-      R->second->hasFnAttribute("use-sample-profile") &&
-      R->second != &F && isLegalToPromote(CI, R->second, &Reason)) {
+      R->second->hasFnAttribute("use-sample-profile") && R->second != &F &&
+      isLegalToPromote(CI, R->second, &Reason)) {
     // For promoted target, set its value with NOMORE_ICP_MAGICNUM count
     // in the value profile metadata so the target won't be promoted again.
     SmallVector<InstrProfValueData, 1> SortedCallTargets = {InstrProfValueData{
         Function::getGUID(R->second->getName()), NOMORE_ICP_MAGICNUM}};
     updateIDTMetaData(CI, SortedCallTargets, 0);
 
-    auto *DI = &pgo::promoteIndirectCall(
-        CI, R->second, Candidate.CallsiteCount, Sum, false, ORE);
+    auto *DI = &pgo::promoteIndirectCall(CI, R->second, Candidate.CallsiteCount,
+                                         Sum, false, ORE);
     if (DI) {
       Sum -= Candidate.CallsiteCount;
       // Do not prorate the indirect callsite distribution since the original
@@ -1034,8 +1033,8 @@ bool SampleProfileLoader::tryPromoteAndInlineCandidate(
   } else {
     LLVM_DEBUG(dbgs() << "\nFailed to promote indirect call to "
                       << FunctionSamples::getCanonicalFnName(
-                             Candidate.CallInstr->getName())<< " because "
-                      << Reason << "\n");
+                             Candidate.CallInstr->getName())
+                      << " because " << Reason << "\n");
   }
   return false;
 }
@@ -1129,7 +1128,7 @@ void SampleProfileLoader::findExternalInlineCandidate(
         CalleeSample->getContext().hasAttribute(ContextShouldBeInlined);
     if (!PreInline && CalleeSample->getHeadSamplesEstimate() < Threshold)
       continue;
-    
+
     Function *Func = SymbolMap.lookup(CalleeSample->getFunction());
     // Add to the import list only when it's defined out of module.
     if (!Func || Func->isDeclaration())
@@ -1648,8 +1647,7 @@ static SmallVector<InstrProfValueData, 2>
 GetSortedValueDataFromCallTargets(const SampleRecord::CallTargetMap &M) {
   SmallVector<InstrProfValueData, 2> R;
   for (const auto &I : SampleRecord::SortCallTargets(M)) {
-    R.emplace_back(
-        InstrProfValueData{I.first.getHashCode(), I.second});
+    R.emplace_back(InstrProfValueData{I.first.getHashCode(), I.second});
   }
   return R;
 }
@@ -1877,7 +1875,7 @@ SampleProfileLoader::buildProfiledCallGraph(Module &M) {
     if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
       continue;
     ProfiledCG->addProfiledFunction(
-          getRepInFormat(FunctionSamples::getCanonicalFnName(F)));
+        getRepInFormat(FunctionSamples::getCanonicalFnName(F)));
   }
 
   return ProfiledCG;
@@ -2291,8 +2289,9 @@ void SampleProfileMatcher::countProfileCallsiteMismatches(
   }
 }
 
-void SampleProfileMatcher::findProfileAnchors(const FunctionSamples &FS,
-                                              std::map<LineLocation, std::unordered_set<FunctionId>> &ProfileAnchors) {
+void SampleProfileMatcher::findProfileAnchors(
+    const FunctionSamples &FS,
+    std::map<LineLocation, std::unordered_set<FunctionId>> &ProfileAnchors) {
   auto isInvalidLineOffset = [](uint32_t LineOffset) {
     return LineOffset & 0x8000;
   };
@@ -2302,8 +2301,8 @@ void SampleProfileMatcher::findProfileAnchors(const FunctionSamples &FS,
     if (isInvalidLineOffset(Loc.LineOffset))
       continue;
     for (const auto &I : I.second.getCallTargets()) {
-      auto Ret = ProfileAnchors.try_emplace(Loc,
-                                            std::unordered_set<FunctionId>());
+      auto Ret =
+          ProfileAnchors.try_emplace(Loc, std::unordered_set<FunctionId>());
       Ret.first->second.insert(I.first);
     }
   }
@@ -2314,8 +2313,8 @@ void SampleProfileMatcher::findProfileAnchors(const FunctionSamples &FS,
       continue;
     const auto &CalleeMap = I.second;
     for (const auto &I : CalleeMap) {
-      auto Ret = ProfileAnchors.try_emplace(Loc,
-                                            std::unordered_set<FunctionId>());
+      auto Ret =
+          ProfileAnchors.try_emplace(Loc, std::unordered_set<FunctionId>());
       Ret.first->second.insert(I.first);
     }
   }
@@ -2339,8 +2338,7 @@ void SampleProfileMatcher::findProfileAnchors(const FunctionSamples &FS,
 //   [1, 2, 3(foo), 4,  7,  8(bar), 9]
 // The output mapping: [2->3, 3->4, 5->7, 6->8, 7->9].
 void SampleProfileMatcher::runStaleProfileMatching(
-    const Function &F,
-    const std::map<LineLocation, StringRef> &IRAnchors,
+    const Function &F, const std::map<LineLocation, StringRef> &IRAnchors,
     const std::map<LineLocation, std::unordered_set<FunctionId>>
         &ProfileAnchors,
     LocToLocMap &IRToProfileLocationMap) {
@@ -2349,8 +2347,7 @@ void SampleProfileMatcher::runStaleProfileMatching(
   assert(IRToProfileLocationMap.empty() &&
          "Run stale profile matching only once per function");
 
-  std::unordered_map<FunctionId, std::set<LineLocation>>
-      CalleeToCallsitesMap;
+  std::unordered_map<FunctionId, std::set<LineLocation>> CalleeToCallsitesMap;
   for (const auto &I : ProfileAnchors) {
     const auto &Loc = I.first;
     const auto &Callees = I.second;
@@ -2379,8 +2376,8 @@ void SampleProfileMatcher::runStaleProfileMatching(
     bool IsMatchedAnchor = false;
     // Match the anchor location in lexical order.
     if (!CalleeName.empty()) {
-      auto CandidateAnchors = CalleeToCallsitesMap.find(
-          getRepInFormat(CalleeName));
+      auto CandidateAnchors =
+          CalleeToCallsitesMap.find(getRepInFormat(CalleeName));
       if (CandidateAnchors != CalleeToCallsitesMap.end() &&
           !CandidateAnchors->second.empty()) {
         auto CI = CandidateAnchors->second.begin();
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 7d665a8005b0..f6211b7264a3 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -905,8 +905,8 @@ remapSamples(const sampleprof::FunctionSamples &Samples,
                           BodySample.second.getSamples());
     for (const auto &Target : BodySample.second.getCallTargets()) {
       Result.addCalledTargetSamples(BodySample.first.LineOffset,
-                                    MaskedDiscriminator,
-                                    Remapper(Target.first), Target.second);
+                                    MaskedDiscriminator, Remapper(Target.first),
+                                    Target.second);
     }
   }
   for (const auto &CallsiteSamples : Samples.getCallsiteSamples()) {
diff --git a/llvm/tools/llvm-profgen/ProfileGenerator.cpp b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
index c4028e6b1328..39295d7ec2dd 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.cpp
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
@@ -585,8 +585,8 @@ void ProfileGenerator::populateBoundarySamplesWithProbesForAllFunctions(
           getLeafProfileAndAddTotalSamples(FrameVec, 0);
       FunctionProfile.addCalledTargetSamples(
           FrameVec.back().Location.LineOffset,
-          FrameVec.back().Location.Discriminator,
-          FunctionId(CalleeName), Count);
+          FrameVec.back().Location.Discriminator, FunctionId(CalleeName),
+          Count);
     }
   }
 }
@@ -609,8 +609,7 @@ FunctionSamples &ProfileGenerator::getLeafProfileAndAddTotalSamples(
         getBaseDiscriminator(FrameVec[I - 1].Location.Discriminator));
     FunctionSamplesMap &SamplesMap =
         FunctionProfile->functionSamplesAt(Callsite);
-    auto Ret =
-        SamplesMap.emplace(FrameVec[I].Func, FunctionSamples());
+    auto Ret = SamplesMap.emplace(FrameVec[I].Func, FunctionSamples());
     if (Ret.second) {
       SampleContext Context(FrameVec[I].Func);
       Ret.first->second.setContext(Context);
@@ -901,16 +900,14 @@ void CSProfileGenerator::populateBoundarySamplesForFunction(
         CallerNode->getFunctionSamples()->addCalledTargetSamples(
             LeafLoc->Location.LineOffset,
             getBaseDiscriminator(LeafLoc->Location.Discriminator),
-            FunctionId(CalleeName),
-            Count);
+            FunctionId(CalleeName), Count);
         // Record head sample for called target(callee)
         CalleeCallSite = LeafLoc->Location;
       }
     }
 
-    ContextTrieNode *CalleeNode =
-        CallerNode->getOrCreateChildContext(CalleeCallSite,
-                                            FunctionId(CalleeName));
+    ContextTrieNode *CalleeNode = CallerNode->getOrCreateChildContext(
+        CalleeCallSite, FunctionId(CalleeName));
     FunctionSamples *CalleeProfile = getOrCreateFunctionSamples(CalleeNode);
     CalleeProfile->addHeadSamples(Count);
   }
diff --git a/llvm/tools/llvm-profgen/ProfiledBinary.cpp b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
index 6db25b5541b4..b1d690e16593 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.cpp
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
@@ -144,9 +144,8 @@ void BinarySizeContextTracker::trackInlineesOptimizedAway(
     for (auto &ProbeFrame : reverse(ProbeContext)) {
       StringRef CallerName = ProbeFrame.first;
       LineLocation CallsiteLoc(ProbeFrame.second, 0);
-      SizeContext =
-          SizeContext->getOrCreateChildContext(CallsiteLoc,
-                                               FunctionId(CallerName));
+      SizeContext = SizeContext->getOrCreateChildContext(
+          CallsiteLoc, FunctionId(CallerName));
     }
     // Add 0 size to make known.
     SizeContext->addFunctionSize(0);
@@ -840,7 +839,6 @@ void ProfiledBinary::loadSymbolsFromDWARF(ObjectFile &Obj) {
     WithColor::warning() << "Loading of DWARF info completed, but no binary "
                             "functions have been retrieved.\n";
 
-
   // Populate the hash binary function map for MD5 function name lookup. This
   // is done after BinaryFunctions are finalized.
   for (auto &BinaryFunction : BinaryFunctions) {
diff --git a/llvm/tools/llvm-profgen/ProfiledBinary.h b/llvm/tools/llvm-profgen/ProfiledBinary.h
index 8c607d665dee..e3c0f399ecbf 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.h
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.h
@@ -480,7 +480,7 @@ public:
   void setProfiledFunctions(std::unordered_set<const BinaryFunction *> &Funcs) {
     ProfiledFunctions = Funcs;
   }
-  
+
   BinaryFunction *getBinaryFunction(FunctionId FName) {
     if (FName.isStringRef()) {
       auto I = BinaryFunctions.find(FName.str());
diff --git a/llvm/unittests/ProfileData/SampleProfTest.cpp b/llvm/unittests/ProfileData/SampleProfTest.cpp
index 3abba4743093..fb2be411373b 100644
--- a/llvm/unittests/ProfileData/SampleProfTest.cpp
+++ b/llvm/unittests/ProfileData/SampleProfTest.cpp
@@ -254,9 +254,8 @@ struct SampleProfTest : ::testing::Test {
     // inline instance for GooName. Test the correct FunctionSamples can be
     // found with Remapper support.
     const FunctionSamples *ReadGooSamples =
-        ReadFooSamples->findFunctionSamplesAt(LineLocation(7, 0),
-                                              GooName.stringRef(),
-                                              Reader->getRemapper());
+        ReadFooSamples->findFunctionSamplesAt(
+            LineLocation(7, 0), GooName.stringRef(), Reader->getRemapper());
     ASSERT_TRUE(ReadGooSamples != nullptr);
     ASSERT_EQ(502u, ReadGooSamples->getTotalSamples());
 
@@ -264,9 +263,8 @@ struct SampleProfTest : ::testing::Test {
     // no inline instance for GooName. Test no FunctionSamples will be
     // found with Remapper support.
     const FunctionSamples *ReadGooSamplesAgain =
-        ReadFooSamples->findFunctionSamplesAt(LineLocation(9, 0),
-                                              GooName.stringRef(),
-                                              Reader->getRemapper());
+        ReadFooSamples->findFunctionSamplesAt(
+            LineLocation(9, 0), GooName.stringRef(), Reader->getRemapper());
     ASSERT_TRUE(ReadGooSamplesAgain == nullptr);
 
     // The inline instance of Hoo is inside of the inline instance of Goo.
@@ -274,9 +272,8 @@ struct SampleProfTest : ::testing::Test {
     // inline instance for HooName. Test the correct FunctionSamples can be
     // found with Remapper support.
     const FunctionSamples *ReadHooSamples =
-        ReadGooSamples->findFunctionSamplesAt(LineLocation(9, 0),
-                                              HooName.stringRef(),
-                                              Reader->getRemapper());
+        ReadGooSamples->findFunctionSamplesAt(
+            LineLocation(9, 0), HooName.stringRef(), Reader->getRemapper());
     ASSERT_TRUE(ReadHooSamples != nullptr);
     ASSERT_EQ(317u, ReadHooSamples->getTotalSamples());
 
@@ -307,7 +304,7 @@ struct SampleProfTest : ::testing::Test {
     FunctionSamples *ReadBooSamples = Reader->getSamplesFor(BooName);
     ASSERT_TRUE(ReadBooSamples != nullptr);
     ASSERT_EQ(1232u, ReadBooSamples->getTotalSamples());
-    
+
     FunctionId MconstructRep = getRepInFormat(MconstructName);
     FunctionId StringviewRep = getRepInFormat(StringviewName);
     ASSERT_EQ(1000u, CTMap.get()[MconstructRep]);

@david-xl
Copy link
Contributor

The change looks good to me, but wait for other reviewers to chime in as well. Also a good idea to run tests with sanitizers on.

llvm/include/llvm/ProfileData/SampleProf.h Outdated Show resolved Hide resolved
/// and can be pointer-casted as one.
/// We disallow implicit cast to StringRef because there are too many instances
/// that it may cause break the code, such as using it in a StringMap.
class ProfileFuncRef {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name ProfileFuncRef can be misleading. It's not a reference to profile function which should include the actual profile.

Maybe something like ProfileFuncName.

Copy link
Contributor

@MatzeB MatzeB Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe FunctionIdentifier or FunctionId? (The Profile prefix seems superfluous as this is already in a SampleProf namespace)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FunctionId or FuncId sounds good to me @huangjd

Comment on lines 40 to 43
const char *Data = nullptr;

/// Use uint64_t instead of size_t so that it can also hold a MD5 value.
uint64_t LengthOrHashCode = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the indicator for string vs hash is implicit: Data == null means it's hash. This isn't good for readability.

Can we represent them with union and an explicit flag?

union {
    StringRef Name;
    uint64_t Hash;
};
bool UseStringName;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is on critical path so I am making it as efficient as possible. Adding a bool member will increase the total object size by 8 bytes (50% increase), which has significant implication as this class is used various containers (for example the name table)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly I would tend for the current solution to save the extra bool. This is class-private and has accessor functions so the readability problems are limited to this file...

}
for (const auto &CS : CallsiteSamples)
for (const auto &NameFS : CS.second)
NameFS.second.findInlinedFunctions(S, SymbolMap, Threshold);
}

/// Set the name of the function.
void setName(StringRef FunctionName) { Context.setName(FunctionName); }
void setName(ProfileFuncRef FunctionName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ProfileFuncRef that is misleading. It's really just a wrapper around name, not wrapper for function. It should be ProfileFuncName.

So I think it makes sense to keep Name, instead of using Function. Same goes for other places.

/// by converting a MD5 sample profile to a format that does not support MD5,
/// and if so, convert the numerical string to a hash code first. We assume
/// that no function name (from a profile) can be a pure number.
explicit ProfileFuncRef(StringRef Str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary for this to be explicit? Allow implicit conversion and remove some of the changes later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question is unanswered/unaddressed.

explicit ProfileFuncRef(StringRef Str)
: Data(Str.data()), LengthOrHashCode(Str.size()) {
// Only need to check for base 10 digits, fail faster if otherwise.
if (Str.size() > 0 && isdigit(Str[0]) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Str.size() > 0 && isdigit(Str[0]) is going to speed up things meaningfully. They are the first two things that getAsInteger checks anyways. Suggest remove them and just do getAsInteger for simplicity and readability.

llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h Outdated Show resolved Hide resolved
@@ -836,7 +840,7 @@ std::error_code SampleProfileWriterBinary::writeBody(const FunctionSamples &S) {
encodeULEB128(Sample.getSamples(), OS);
encodeULEB128(Sample.getCallTargets().size(), OS);
for (const auto &J : Sample.getSortedCallTargets()) {
StringRef Callee = J.first;
auto Callee = J.first;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WenleiHe
Copy link
Member

When testing on an internal large profile (> 1 GB, with more than 10 million functions), the full profile load time is reduced from 28 sec to 25 sec in average, and reading function offset table from 0.78s to 0.7s

What is the E2E compile time change for the case you measured? Is there any memory footprint change?

@WenleiHe WenleiHe requested review from MatzeB and removed request for a team October 4, 2023 17:34
Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added several comments for the first part of the patch but did not make it through the whole change yet.

Comment on lines 9 to 11
// This file defines the StringRefOrHashCode class. It is to represent function
// names in a sample profile, which can be in one of two forms - either a
// regular string, or a 64-bit hash code.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Should use /// @file for documentation about a file so doxygen can pick it up.
  • This comment seems wrong ("StringRegOrHashCode" was renamed I guess?). It also seems to mainly repeat what the one class in here is about.

I would suggest to simply remove the file comment (and instead make sure the comment on the class is good).

/// This class represents a function name that is read from a sample profile. It
/// comes with two forms: a string or a hash code. For efficient storage, a
/// sample profile may store function names as 64-bit MD5 values, so when
/// reading the profile, this class can represnet them without converting it to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*represent

/// and can be pointer-casted as one.
/// We disallow implicit cast to StringRef because there are too many instances
/// that it may cause break the code, such as using it in a StringMap.
class ProfileFuncRef {
Copy link
Contributor

@MatzeB MatzeB Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe FunctionIdentifier or FunctionId? (The Profile prefix seems superfluous as this is already in a SampleProf namespace)

Comment on lines 40 to 43
const char *Data = nullptr;

/// Use uint64_t instead of size_t so that it can also hold a MD5 value.
uint64_t LengthOrHashCode = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly I would tend for the current solution to save the extra bool. This is class-private and has accessor functions so the readability problems are limited to this file...

llvm/include/llvm/ProfileData/ProfileFuncRef.h Outdated Show resolved Hide resolved
return OS;
}

inline uint64_t MD5Hash(const sampleprof::ProfileFuncRef &Obj) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for explicit sampleprof::.

llvm/include/llvm/ProfileData/ProfileFuncRef.h Outdated Show resolved Hide resolved
llvm/include/llvm/ProfileData/SampleProf.h Outdated Show resolved Hide resolved
@@ -618,7 +623,7 @@ class SampleContext {
void clearState(ContextStateMask S) { State &= (uint32_t)~S; }
bool hasContext() const { return State != UnknownContext; }
bool isBaseContext() const { return FullContext.size() == 1; }
StringRef getName() const { return Name; }
ProfileFuncRef getName() const { return Name; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be getId() now (or however we end up naming the ProfileFuncRef now).

// For non-context function name, use its MD5 as hash value, so that it is
// consistent with the profile map's key.
return hashFuncName(getName());
return getName().getHashCode();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return getName().getHashCode();
return hash_value(Name);

@huangjd
Copy link
Contributor Author

huangjd commented Oct 5, 2023

To clarify the implementation of ProfileFuncRef and how it guarantees correctness.

There are two ways ProfileFuncRef can originate (ignoring copy constructions):
1 - From reading the profile, in which all ProfileFuncRef should have the same representation (either StringRef or MD5) from the source. A profile mixing both is currently considered malformed so we don't care what happens next. I am going to address this issue in a next patch to make the reader more robust.
2 - From the module's function (Func) names. Function names are always StringRef.

To ensure correctness, the following conventions need to be enforced:
A - Whenever ProfileFuncRef is used as key in a map, all keys should have the same representation. This excludes the scenario that both representations of the same function are inserted as different entries.
B - Whenever we lookup a ProflieFuncRef-keyed map, we need to make sure the key we query has the same representation as the keys in the map.

The way this work is to use the function getRepInFormat(), which checks bool FunctionSamples::UseMD5. This flag is set immediately after reading the profile, and before any ProflieFuncRef-keyed map (outside of the sample profile reader) is created. Whenever we perform the action in A or B, if the key originates from 2 then it is converted to the representation matching those originates from 1 using getRepInFormat().

Exception - If we are using HashKeyMap, then we don't need these requirements because both StringRef and MD5 representation of the same function have the same hash value, so a lookup using either form will get the match. (Assume we ignore real hash collision)

It is known that the two representations of the same function are not compared equal, but there shouldn't be any case that we run into this situation, as long as A and B are enforced.

Note that there is no correctness change compared to the original implementation where we use the numerical string as key, because when querying functions we still need to use getRepInFormat to convert the raw StringRef to the correct representation, so anything that will break in this patch was already broken previously

Comment on lines 63 to 64
if (!Str.getAsInteger(10, LengthOrHashCode))
Data = nullptr;
Copy link
Contributor

@MatzeB MatzeB Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we keep the constructor simple and just initialize the ProfileFuncRef with a string and keep the "try parsing as an integer" logic to outside callers?

This also seems to rely on function names not being a string of digits, which is not true on the object file level (so maybe some language or language feature I am not aware of could produce such names). I can also create such functions with the asm label extension:

$ cat x.cpp
void james_bond() __asm__("007");
void james_bond() {}
$ clang++ -S -o - -emit-llvm x.cpp
...
define dso_local void @"007"() #0 {
...

@MatzeB
Copy link
Contributor

MatzeB commented Oct 5, 2023

There are two ways ProfileFuncRef can originate (ignoring copy constructions):
...

So you are saying there can be hashcode ProfileFuncRefs coming from the profile, but an llvm::Function would always produce a string ProfileFuncRef?

B - Whenever we lookup a ProflieFuncRef-keyed map, we need to make sure the key we query has the same representation as the keys in the map.

But doesn't SampleProfileReader::getSamplesFor(const Function &F) create a string ProfileFuncRef and uses that for a lookup?

@huangjd
Copy link
Contributor Author

huangjd commented Oct 5, 2023

There are two ways ProfileFuncRef can originate (ignoring copy constructions):
...

So you are saying there can be hashcode ProfileFuncRefs coming from the profile, but an llvm::Function would always produce a string ProfileFuncRef?

B - Whenever we lookup a ProflieFuncRef-keyed map, we need to make sure the key we query has the same representation as the keys in the map.

But doesn't SampleProfileReader::getSamplesFor(const Function &F) create a string ProfileFuncRef and uses that for a lookup?

See "exception" in my comment, the profile map is a HashKeyMap, which was specifically made to work when looking up with either String or MD5 of the same function name, because both produce the same hash as key

@huangjd
Copy link
Contributor Author

huangjd commented Oct 5, 2023

It looks like after using HashKeyMap, the purpose of SymbolMap and GUIDToFuncNameMap can be merged. SymbolMap maps StringRef to Function*, GUIDToFuncNameMap maps ProfileFuncRef's hash code to StringRef, and it's almost exclusively cascaded to SymbolMap lookup, so we can actually combine two maps into one, saving a lot of time.

Working on it now

llvm/include/llvm/ProfileData/HashKeyMap.h Outdated Show resolved Hide resolved
llvm/lib/ProfileData/SampleProfReader.cpp Outdated Show resolved Hide resolved
llvm/lib/ProfileData/SampleProfReader.cpp Outdated Show resolved Hide resolved
llvm/lib/ProfileData/SampleProfReader.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants