
[Exegesis] Add support to serialize/deserialize object files into benchmarks #121993

Open
mshockwave wants to merge 4 commits into main from patch/exegesis/serialize-benchmarks

Conversation

mshockwave
Member

This patch adds support for serializing the assembled object files into the benchmark results, so that they can be deserialized later to run the measurements. This is useful when the overhead of end-to-end execution (snippet generation + benchmark measurement) is too high and we want to split the process into two stages.
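As a rough sketch of the intended two-stage workflow (adapted from the new serialize-obj-file.test; the benchmarks file name is arbitrary, and the test-only flags such as --dry-run-measurement, --use-dummy-perf-counters, and --dump-object-to-disk are omitted):

# Stage 1: generate and assemble the snippet; the serialized object file is
# attached to the resulting benchmark record.
llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency \
    --benchmark-phase=assemble-measured-code --benchmarks-file=benchmarks.yaml

# Stage 2: deserialize the benchmarks file and resume with the measurement phase.
llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --mode=latency \
    --run-measurement=benchmarks.yaml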

The object file is compressed and serialized into a base64 string. The compression ratio is excellent because the file contains lots of (nearly) identical instructions.
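To illustrate, a serialized benchmark record would carry an entry roughly like the one below. The field names (object_file, compression, original_size, compressed_bytes) come from the YAML mapping added in this patch; the size value and payload shown here are placeholders:

object_file:
  compression:      zstd
  original_size:    9216
  compressed_bytes: '<base64 string of the zstd-compressed object file>'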

Currently this patch can only resume right before the measure phase. It also does not support repetition modes that require more than one snippet (i.e. min and middle-half-loop/duplicate).


This PR stacks on top of #121991

@llvmbot
Member

llvmbot commented Jan 7, 2025

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-llvm-binary-utilities

Author: Min-Yih Hsu (mshockwave)

Changes

This patch adds support for serializing the assembled object files into the benchmark results, so that they can be deserialized later to run the measurements. This is useful when the overhead of end-to-end execution (snippet generation + benchmark measurement) is too high and we want to split the process into two stages.

The object file is compressed and serialized into a base64 string. The compression ratio is excellent because the file contains lots of (nearly) identical instructions.

Currently this patch can only resume right before the measure phase. It also does not support repetition modes that require more than one snippet (i.e. min and middle-half-loop/duplicate).


This PR stacks on top of #121991


Patch is 31.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121993.diff

8 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-exegesis.rst (+15-1)
  • (added) llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test (+33)
  • (added) llvm/test/tools/llvm-exegesis/dry-run-measurement.test (+11)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp (+93-2)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkResult.h (+20)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (+62-5)
  • (modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h (+9-2)
  • (modified) llvm/tools/llvm-exegesis/llvm-exegesis.cpp (+159-97)
diff --git a/llvm/docs/CommandGuide/llvm-exegesis.rst b/llvm/docs/CommandGuide/llvm-exegesis.rst
index 8266d891a5e6b1..c3580cdecab7b9 100644
--- a/llvm/docs/CommandGuide/llvm-exegesis.rst
+++ b/llvm/docs/CommandGuide/llvm-exegesis.rst
@@ -299,9 +299,18 @@ OPTIONS
   However, it is possible to stop at some stage before measuring. Choices are:
   * ``prepare-snippet``: Only generate the minimal instruction sequence.
   * ``prepare-and-assemble-snippet``: Same as ``prepare-snippet``, but also dumps an excerpt of the sequence (hex encoded).
-  * ``assemble-measured-code``: Same as ``prepare-and-assemble-snippet``. but also creates the full sequence that can be dumped to a file using ``--dump-object-to-disk``.
+  * ``assemble-measured-code``: Same as ``prepare-and-assemble-snippet``. but
+    also creates the full sequence that can be dumped to a file using ``--dump-object-to-disk``.
+    If either zlib or zstd is available and we're using either duplicate or
+    loop repetition mode, this phase generates benchmarks with a serialized
+    snippet object file attached to it.
   * ``measure``: Same as ``assemble-measured-code``, but also runs the measurement.
 
+.. option:: --run-measurement=<benchmarks file>
+
+  Given a benchmarks file generated after the ``assembly-measured-code`` phase,
+  resume the measurement phase from it.
+
 .. option:: --x86-lbr-sample-period=<nBranches/sample>
 
   Specify the LBR sampling period - how many branches before we take a sample.
@@ -449,6 +458,11 @@ OPTIONS
  crash when hardware performance counters are unavailable and for
  debugging :program:`llvm-exegesis` itself.
 
+.. option:: --dry-run-measurement
+  If set, llvm-exegesis runs everything except the actual snippet execution.
+  This is useful if we want to test some part of the code without actually
+  running on native platforms.
+
 .. option:: --execution-mode=[inprocess,subprocess]
 
   This option specifies what execution mode to use. The `inprocess` execution
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test b/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test
new file mode 100644
index 00000000000000..befd16699bef1a
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test
@@ -0,0 +1,33 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --benchmark-phase=assemble-measured-code --mode=latency --benchmarks-file=%t.yaml
+# RUN: FileCheck --input-file=%t.yaml %s --check-prefixes=CHECK,SERIALIZE
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --run-measurement=%t.yaml --mode=latency --dry-run-measurement --use-dummy-perf-counters \
+# RUN:    --dump-object-to-disk=%t.o | FileCheck %s --check-prefixes=CHECK,DESERIALIZE
+# RUN: llvm-objdump -d %t.o | FileCheck %s --check-prefix=OBJDUMP
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency --dry-run-measurement --use-dummy-perf-counters | \
+# RUN:    FileCheck %s --check-prefix=NO-SERIALIZE
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency --benchmark-phase=assemble-measured-code --repetition-mode=min | \
+# RUN:    FileCheck %s --check-prefix=NO-SERIALIZE
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency --benchmark-phase=assemble-measured-code --repetition-mode=middle-half-loop | \
+# RUN:    FileCheck %s --check-prefix=NO-SERIALIZE
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency --benchmark-phase=assemble-measured-code --repetition-mode=middle-half-duplicate | \
+# RUN:    FileCheck %s --check-prefix=NO-SERIALIZE
+# REQUIRES: zlib || zstd
+
+# A round-trip test for serialize/deserialize benchmarks.
+
+# CHECK: mode: latency
+# CHECK:  instructions:
+# CHECK-NEXT: - 'SH3ADD X{{.*}} X{{.*}} X{{.*}}'
+# CHECK: cpu_name:        sifive-p470
+# CHECK-NEXT: llvm_triple:     riscv64
+# CHECK-NEXT: min_instructions: 10000
+# CHECK-NEXT: measurements:    []
+# SERIALIZE: error: actual measurements skipped.
+# DESERIALIZE: error:           ''
+# CHECK: info:            Repeating a single explicitly serial instruction
+
+# OBJDUMP: sh3add
+
+# Negative tests: we shouldn't serialize object files in some scenarios.
+
+# NO-SERIALIZE-NOT: object_file:
diff --git a/llvm/test/tools/llvm-exegesis/dry-run-measurement.test b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
new file mode 100644
index 00000000000000..82857e7998b5e6
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
@@ -0,0 +1,11 @@
+# RUN: llvm-exegesis --mtriple=riscv64 --mcpu=sifive-p470 --mode=latency --opcode-name=ADD --use-dummy-perf-counters --dry-run-measurement | FileCheck %s
+# REQUIRES: riscv-registered-target
+
+# This test makes sure that llvm-exegesis doesn't execute "cross-compiled" snippets in the presence of
+# --dry-run-measurement. RISC-V was chosen simply because most of the time we run tests on X86 machines.
+
+# Should not contain misleading results.
+# CHECK: measurements:    []
+
+# Should not contain error messages like "snippet crashed while running: Segmentation fault".
+# CHECK: error:           ''
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp
index 84dc23b343c6c0..eff5a6d547cbda 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp
@@ -15,10 +15,13 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/bit.h"
 #include "llvm/ObjectYAML/YAML.h"
+#include "llvm/Support/Base64.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Format.h"
+#include "llvm/Support/Timer.h"
 #include "llvm/Support/raw_ostream.h"
 
 static constexpr const char kIntegerPrefix[] = "i_0x";
@@ -27,6 +30,12 @@ static constexpr const char kInvalidOperand[] = "INVALID";
 
 namespace llvm {
 
+static cl::opt<compression::Format> ForceObjectFileCompressionFormat(
+    "exegesis-force-obj-compress-format", cl::Hidden,
+    cl::desc("Force to use this compression format for object files."),
+    cl::values(clEnumValN(compression::Format::Zstd, "zstd", "Using Zstandard"),
+               clEnumValN(compression::Format::Zlib, "zlib", "Using LibZ")));
+
 namespace {
 
 // A mutable struct holding an LLVMState that can be passed through the
@@ -278,6 +287,13 @@ template <> struct ScalarTraits<exegesis::RegisterValue> {
   static const bool flow = true;
 };
 
+template <> struct ScalarEnumerationTraits<compression::Format> {
+  static void enumeration(IO &Io, compression::Format &Format) {
+    Io.enumCase(Format, "zstd", compression::Format::Zstd);
+    Io.enumCase(Format, "zlib", compression::Format::Zlib);
+  }
+};
+
 template <> struct MappingContextTraits<exegesis::BenchmarkKey, YamlContext> {
   static void mapping(IO &Io, exegesis::BenchmarkKey &Obj,
                       YamlContext &Context) {
@@ -288,6 +304,33 @@ template <> struct MappingContextTraits<exegesis::BenchmarkKey, YamlContext> {
   }
 };
 
+template <> struct MappingTraits<exegesis::Benchmark::ObjectFile> {
+  struct NormalizedBase64Binary {
+    std::string Base64Str;
+
+    NormalizedBase64Binary(IO &) {}
+    NormalizedBase64Binary(IO &, const std::vector<uint8_t> &Data)
+        : Base64Str(llvm::encodeBase64(Data)) {}
+
+    std::vector<uint8_t> denormalize(IO &) {
+      std::vector<char> Buffer;
+      if (Error E = llvm::decodeBase64(Base64Str, Buffer))
+        report_fatal_error(std::move(E));
+
+      StringRef Data(Buffer.data(), Buffer.size());
+      return std::vector<uint8_t>(Data.bytes_begin(), Data.bytes_end());
+    }
+  };
+
+  static void mapping(IO &Io, exegesis::Benchmark::ObjectFile &Obj) {
+    Io.mapRequired("compression", Obj.CompressionFormat);
+    Io.mapRequired("original_size", Obj.UncompressedSize);
+    MappingNormalization<NormalizedBase64Binary, std::vector<uint8_t>>
+        ObjFileString(Io, Obj.CompressedBytes);
+    Io.mapRequired("compressed_bytes", ObjFileString->Base64Str);
+  }
+};
+
 template <> struct MappingContextTraits<exegesis::Benchmark, YamlContext> {
   struct NormalizedBinary {
     NormalizedBinary(IO &io) {}
@@ -325,9 +368,11 @@ template <> struct MappingContextTraits<exegesis::Benchmark, YamlContext> {
     Io.mapRequired("error", Obj.Error);
     Io.mapOptional("info", Obj.Info);
     // AssembledSnippet
-    MappingNormalization<NormalizedBinary, std::vector<uint8_t>> BinaryString(
+    MappingNormalization<NormalizedBinary, std::vector<uint8_t>> SnippetString(
         Io, Obj.AssembledSnippet);
-    Io.mapOptional("assembled_snippet", BinaryString->Binary);
+    Io.mapOptional("assembled_snippet", SnippetString->Binary);
+    // ObjectFile
+    Io.mapOptional("object_file", Obj.ObjFile);
   }
 };
 
@@ -364,6 +409,52 @@ Benchmark::readTriplesAndCpusFromYamls(MemoryBufferRef Buffer) {
   return Result;
 }
 
+Error Benchmark::setObjectFile(StringRef RawBytes) {
+  SmallVector<uint8_t> CompressedBytes;
+  llvm::compression::Format CompressionFormat;
+
+  auto isFormatAvailable = [](llvm::compression::Format F) -> bool {
+    switch (F) {
+    case compression::Format::Zstd:
+      return compression::zstd::isAvailable();
+    case compression::Format::Zlib:
+      return compression::zlib::isAvailable();
+    }
+  };
+  if (ForceObjectFileCompressionFormat.getNumOccurrences() > 0) {
+    CompressionFormat = ForceObjectFileCompressionFormat;
+    if (!isFormatAvailable(CompressionFormat))
+      return make_error<StringError>(
+          "The designated compression format is not available.",
+          inconvertibleErrorCode());
+  } else if (isFormatAvailable(compression::Format::Zstd)) {
+    // Try newer compression algorithm first.
+    CompressionFormat = compression::Format::Zstd;
+  } else if (isFormatAvailable(compression::Format::Zlib)) {
+    CompressionFormat = compression::Format::Zlib;
+  } else {
+    return make_error<StringError>(
+        "None of the compression methods is available.",
+        inconvertibleErrorCode());
+  }
+
+  switch (CompressionFormat) {
+  case compression::Format::Zstd:
+    compression::zstd::compress({RawBytes.bytes_begin(), RawBytes.bytes_end()},
+                                CompressedBytes);
+    break;
+  case compression::Format::Zlib:
+    compression::zlib::compress({RawBytes.bytes_begin(), RawBytes.bytes_end()},
+                                CompressedBytes);
+    break;
+  }
+
+  ObjFile = {CompressionFormat,
+             RawBytes.size(),
+             {CompressedBytes.begin(), CompressedBytes.end()}};
+  return Error::success();
+}
+
 Expected<Benchmark> Benchmark::readYaml(const LLVMState &State,
                                         MemoryBufferRef Buffer) {
   yaml::Input Yin(Buffer);
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
index 3c09a8380146e5..a5217566204a14 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
@@ -21,6 +21,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/MC/MCInst.h"
 #include "llvm/MC/MCInstBuilder.h"
+#include "llvm/Support/Compression.h"
 #include "llvm/Support/YAMLTraits.h"
 #include <limits>
 #include <set>
@@ -76,6 +77,11 @@ struct BenchmarkKey {
   uintptr_t SnippetAddress = 0;
   // The register that should be used to hold the loop counter.
   unsigned LoopRegister;
+
+  bool operator==(const BenchmarkKey &RHS) const {
+    return Config == RHS.Config &&
+           Instructions[0].getOpcode() == RHS.Instructions[0].getOpcode();
+  }
 };
 
 struct BenchmarkMeasure {
@@ -122,6 +128,16 @@ struct Benchmark {
   std::string Error;
   std::string Info;
   std::vector<uint8_t> AssembledSnippet;
+
+  struct ObjectFile {
+    llvm::compression::Format CompressionFormat;
+    size_t UncompressedSize = 0;
+    std::vector<uint8_t> CompressedBytes;
+
+    bool isValid() const { return UncompressedSize && CompressedBytes.size(); }
+  };
+  std::optional<ObjectFile> ObjFile;
+
   // How to aggregate measurements.
   enum ResultAggregationModeE { Min, Max, Mean, MinVariance };
 
@@ -132,6 +148,10 @@ struct Benchmark {
   Benchmark &operator=(const Benchmark &) = delete;
   Benchmark &operator=(Benchmark &&) = delete;
 
+  // Compress raw object file bytes and assign the result and compression type
+  // to CompressedObjectFile and ObjFileCompression, respectively.
+  class Error setObjectFile(StringRef RawBytes);
+
   // Read functions.
   static Expected<Benchmark> readYaml(const LLVMState &State,
                                                  MemoryBufferRef Buffer);
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index a7771b99e97b1a..3bca6ed13d8fc8 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -53,6 +53,12 @@
 namespace llvm {
 namespace exegesis {
 
+static cl::opt<bool>
+    DryRunMeasurement("dry-run-measurement",
+                      cl::desc("Run every steps in the measurement phase "
+                               "except executing the snippet."),
+                      cl::init(false), cl::Hidden);
+
 BenchmarkRunner::BenchmarkRunner(const LLVMState &State, Benchmark::ModeE Mode,
                                  BenchmarkPhaseSelectorE BenchmarkPhaseSelector,
                                  ExecutionModeE ExecutionMode,
@@ -140,13 +146,21 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
     Scratch->clear();
     {
       auto PS = ET.withSavedState();
+      // We can't directly capture DryRunMeasurement in the lambda below.
+      bool DryRun = DryRunMeasurement;
       CrashRecoveryContext CRC;
       CrashRecoveryContext::Enable();
-      const bool Crashed = !CRC.RunSafely([this, Counter, ScratchPtr]() {
-        Counter->start();
-        this->Function(ScratchPtr);
-        Counter->stop();
-      });
+      const bool Crashed =
+          !CRC.RunSafely([this, Counter, ScratchPtr, DryRun]() {
+            if (DryRun) {
+              Counter->start();
+              Counter->stop();
+            } else {
+              Counter->start();
+              this->Function(ScratchPtr);
+              Counter->stop();
+            }
+          });
       CrashRecoveryContext::Disable();
       PS.reset();
       if (Crashed) {
@@ -610,6 +624,7 @@ Expected<SmallString<0>> BenchmarkRunner::assembleSnippet(
 Expected<BenchmarkRunner::RunnableConfiguration>
 BenchmarkRunner::getRunnableConfiguration(
     const BenchmarkCode &BC, unsigned MinInstructions, unsigned LoopBodySize,
+    Benchmark::RepetitionModeE RepetitionMode,
     const SnippetRepetitor &Repetitor) const {
   RunnableConfiguration RC;
 
@@ -654,12 +669,54 @@ BenchmarkRunner::getRunnableConfiguration(
                         LoopBodySize, GenerateMemoryInstructions);
     if (Error E = Snippet.takeError())
       return std::move(E);
+    // There is no need to serialize/deserialize the object file if we're
+    // simply running end-to-end measurements.
+    // Same goes for any repetition mode that requires more than a single
+    // snippet.
+    if (BenchmarkPhaseSelector < BenchmarkPhaseSelectorE::Measure &&
+        (RepetitionMode == Benchmark::Loop ||
+         RepetitionMode == Benchmark::Duplicate)) {
+      if (Error E = BenchmarkResult.setObjectFile(*Snippet))
+        return std::move(E);
+    }
     RC.ObjectFile = getObjectFromBuffer(*Snippet);
   }
 
   return std::move(RC);
 }
 
+Expected<BenchmarkRunner::RunnableConfiguration>
+BenchmarkRunner::getRunnableConfiguration(Benchmark &&B) const {
+  assert(B.ObjFile.has_value() && B.ObjFile->isValid() &&
+         "No serialized obejct file is attached?");
+  const Benchmark::ObjectFile &ObjFile = *B.ObjFile;
+  SmallVector<uint8_t> DecompressedObjFile;
+  switch (ObjFile.CompressionFormat) {
+  case compression::Format::Zstd:
+    if (!compression::zstd::isAvailable())
+      return make_error<StringError>("zstd is not available for decompression.",
+                                     inconvertibleErrorCode());
+    if (Error E = compression::zstd::decompress(ObjFile.CompressedBytes,
+                                                DecompressedObjFile,
+                                                ObjFile.UncompressedSize))
+      return std::move(E);
+    break;
+  case compression::Format::Zlib:
+    if (!compression::zlib::isAvailable())
+      return make_error<StringError>("zlib is not available for decompression.",
+                                     inconvertibleErrorCode());
+    if (Error E = compression::zlib::decompress(ObjFile.CompressedBytes,
+                                                DecompressedObjFile,
+                                                ObjFile.UncompressedSize))
+      return std::move(E);
+    break;
+  }
+
+  StringRef Buffer(reinterpret_cast<const char *>(DecompressedObjFile.begin()),
+                   DecompressedObjFile.size());
+  return RunnableConfiguration{std::move(B), getObjectFromBuffer(Buffer)};
+}
+
 Expected<std::unique_ptr<BenchmarkRunner::FunctionExecutor>>
 BenchmarkRunner::createFunctionExecutor(
     object::OwningBinary<object::ObjectFile> ObjectFile,
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
index e688b814d1c83d..ef9446bdd5bbe8 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
@@ -54,18 +54,25 @@ class BenchmarkRunner {
     RunnableConfiguration &operator=(RunnableConfiguration &&) = delete;
     RunnableConfiguration &operator=(const RunnableConfiguration &) = delete;
 
+    Benchmark BenchmarkResult;
+    object::OwningBinary<object::ObjectFile> ObjectFile;
+
   private:
     RunnableConfiguration() = default;
 
-    Benchmark BenchmarkResult;
-    object::OwningBinary<object::ObjectFile> ObjectFile;
+    RunnableConfiguration(Benchmark &&B,
+                          object::OwningBinary<object::ObjectFile> &&OF)
+        : BenchmarkResult(std::move(B)), ObjectFile(std::move(OF)) {}
   };
 
   Expected<RunnableConfiguration>
   getRunnableConfiguration(const BenchmarkCode &Configuration,
                            unsigned MinInstructions, unsigned LoopUnrollFactor,
+                           Benchmark::RepetitionModeE RepetitionMode,
                            const SnippetRepetitor &Repetitor) const;
 
+  Expected<RunnableConfiguration> getRunnableConfiguration(Benchmark &&B) const;
+
   std::pair<Error, Benchmark>
   runConfiguration(RunnableConfiguration &&RC,
                    const std::optional<StringRef> &DumpFile,
diff --git a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
index fa37e05956be8c..a21f3bdb5fba5f 100644
--- a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
+++ b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
@@ -114,8 +114,7 @@ static cl::opt<bool> BenchmarkMeasurementsPrintProgress(
 
 static cl::opt<BenchmarkPhaseSelectorE> BenchmarkPhaseSelector(
     "benchmark-phase",
-    cl::desc(
-        "it is possible to stop the benchmarking process after some phase"),
+    cl::desc("Stop the benchmarking process after some phase"),
     cl::cat(BenchmarkOptions),
     cl::values(
         clEnumValN(BenchmarkPhaseSelectorE::PrepareSnippet, "prepare-snippet",
@@ -135,6 +134,13 @@ static cl::opt<BenchmarkPhaseSelectorE> BenchmarkPhaseSelector(
             "(default)")),
     cl::init(BenchmarkPhaseSelectorE::Measure));
 
+static cl::opt<std::string> RunMeasurement(
+    "run-measurement",
+    cl::desc(
+        "Run measurement phase with a benchmarks file generated previously"),
+    cl::cat(BenchmarkOptions), cl::value_desc("<benchmarks file>"),
+    cl::init(""));
+
 static cl::opt<bool>
     UseDummyPerfCounters("use-dummy-perf-counters",
                          cl::desc("Do not read real performance counters, use "
@@ -397,11 +403,55 @@ generateSnippets(const LLVMState &State, unsigned Opcode,
   return Benchmarks;
 }
 
-static void runBenchmarkConfigurations(
-    const LLVMState &State, ArrayRef<BenchmarkCode> Configurations,
+static void deserializeRunnableConfigurations(
+    std::vector<Benchmark> &Benchmarks, const BenchmarkRunner &Runner,
+    std::ve...
[truncated]

@llvmbot
Member

llvmbot commented Jan 7, 2025

@llvm/pr-subscribers-tools-llvm-exegesis


@boomanaiden154 (Contributor) left a comment

Some initial comments. I still need to go through the changes in llvm-exegesis.cpp.

@@ -278,6 +287,13 @@ template <> struct ScalarTraits<exegesis::RegisterValue> {
static const bool flow = true;
};

template <> struct ScalarEnumerationTraits<compression::Format> {
Contributor

I don't think we're supposed to create these for types that aren't owned by us (or at least that's what I was told when I attempted something similar). This might need to go where the type is defined.

Member Author

Fixed.

// Same goes for any repetition mode that requires more than a single
// snippet.
if (BenchmarkPhaseSelector < BenchmarkPhaseSelectorE::Measure &&
(RepetitionMode == Benchmark::Loop ||
Contributor

Why are you only selecting for these two repetition modes here?

Is it significantly more complicated to support the other repetition modes?

Member Author

mshockwave commented Jan 13, 2025

Why are you only selecting for these two repetition modes here?

loop and duplicate are the two that contain only a single snippet / object file per benchmark. Others need at least two different snippets / object files.

Is it significantly more complicated to support the other repetition modes?

Sort of. Take --repetition-mode=min, which requires two snippets, as an example; there are two ways to do it:

  1. Store both the loop and duplicate snippets in the same benchmark YAML record. In this case we also have to store the repetition mode, but more importantly, we either have to "re-split" each benchmark record into separate benchmark instances, or teach both the benchmark runner and the result aggregator about this.
  2. Store the loop and duplicate snippets in separate benchmark YAML records. In this case we might need to figure out which of these benchmarks belong to the same "group" (same opcode and same configuration).

Option (2) would probably be easier, but even that would, I think, make this patch too big, so I'd rather add this feature incrementally.


# Negative tests: we shouldn't serialize object files in some scenarios.

# NO-SERIALIZE-NOT: object_file:
Contributor

Should we be throwing an error here instead of just not emitting the object file snippet? This seems like it could be a bit confusing.

Member Author

mshockwave commented Jan 13, 2025

I now throw an error when users try to use --serialize-benchmarks with unsupported repetition modes.

@@ -0,0 +1,33 @@
# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --benchmark-phase=assemble-measured-code --mode=latency --benchmarks-file=%t.yaml
# RUN: FileCheck --input-file=%t.yaml %s --check-prefixes=CHECK,SERIALIZE
Contributor

Can you add some more comments in the test on what exactly you're testing for in the different sections?

Member Author

Done

@@ -0,0 +1,33 @@
# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --benchmark-phase=assemble-measured-code --mode=latency --benchmarks-file=%t.yaml
Contributor

I'm thinking we might want to add a new flag that controls the emission of object_file? I would prefer not to have it when messing around with exegesis.

I've been wanting to do the same thing with the assembled_snippet output for a while given I don't find it particularly useful and it reduces performance (another round trip through the MCJIT/assembler), but that's a separate thing.

Member Author

I'm thinking we might want to add a new flag that controls the emission of object_file? I would prefer not to have it when messing around with exegesis.

Previously I turned the serialization off if the user was running an end-to-end measurement, but I agree that we should make the serialization optional. A new flag has been added.
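Assuming the flag behaves as the cl::opt quoted later in this thread suggests, the serializing stage would presumably be invoked with something like this (hypothetical command line; the file name is arbitrary):

llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 --opcode-name=SH3ADD --mode=latency \
    --benchmark-phase=assemble-measured-code --serialize-benchmarks --benchmarks-file=benchmarks.yaml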

I've been wanting to do the same thing with the assembled_snippet output for a while given I don't find it particularly useful

It's useful for checking whether I generated the desired shape of snippets when they show up in the inconsistency reports. But maybe that's just for RISC-V, where ill-formed snippets are more common.

mshockwave force-pushed the patch/exegesis/serialize-benchmarks branch from 74da08e to 619a4c3 on January 13, 2025 at 22:38
# RUN: --dump-object-to-disk=%t.o | FileCheck %s --check-prefixes=CHECK,DESERIALIZE
# RUN: llvm-objdump -d %t.o | FileCheck %s --check-prefix=OBJDUMP

# We should not serialie benchmarks by default.
Collaborator

serialize*

CompressionFormat = compression::Format::Zlib;
} else {
return make_error<StringError>(
"None of the compression methods is available.",
Collaborator

is -> are

SerializeBenchmarks("serialize-benchmarks",
cl::desc("Generate fully-serialized benchmarks "
"that can later be deserialized and "
"resuming the measurement."),
Collaborator

resuming -> resume

Configurations = ExitOnErr(readSnippets(State, SnippetsFile));
for (const auto &Configuration : Configurations) {
if (ExecutionMode != BenchmarkRunner::ExecutionModeE::SubProcess &&
(Configuration.Key.MemoryMappings.size() != 0 ||
Collaborator

!empty()?
