[Codegen] Add pass to materialize tuning specs #19337

kuhar · 2024-11-28T20:51:41Z

... and update 'Materialize User Configs' to pick up those tuning specs.

The overall flow is as follows:

* We pick up any user-specified tuning specs in `materialize tuning specs` and link them into a single transform dialect library module.
* We serialize that linked tuning spec as MLIR bytecode.
* We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally `walk` or otherwise modify it.
* In `materialize user configs`, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present.
* We deserialize the tuning spec attribute into a transform dialect library module and execute it.
* We remove the serialized tuning spec from the module, as it's no longer needed.
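As a rough mental model of the flow above, here is a toy sketch in plain C++. It does not use MLIR at all: `ToyModule`, `linkAndSerialize`, `materializeTuningSpec`, `selectStrategy`, and the attribute name `toy.serialized_tuning_spec` are hypothetical stand-ins invented for illustration, and the "bytecode" is just the spec strings joined by newlines.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a module op: a bag of named, opaque attributes.
struct ToyModule {
  std::map<std::string, std::string> attrs;
};

// Hypothetical attribute name; the real name is not given in this thread.
static const char *const kSpecAttr = "toy.serialized_tuning_spec";

// Steps 1-2: "link" the specs in order and "serialize" them into one blob.
// The real pass builds a transform dialect library and emits MLIR bytecode;
// here the blob is only the spec strings joined by newlines.
std::string linkAndSerialize(const std::vector<std::string> &specs) {
  std::string blob;
  for (const std::string &spec : specs) {
    blob += spec;
    blob += '\n';
  }
  return blob;
}

// Step 3: embed the blob as an opaque attribute, so nothing walks into it.
void materializeTuningSpec(ToyModule &module,
                           const std::vector<std::string> &specs) {
  if (specs.empty())
    return;
  module.attrs[kSpecAttr] = linkAndSerialize(specs);
}

// Steps 4-6: an explicitly provided transform library wins. Otherwise take
// the embedded spec and drop the attribute, since it is no longer needed.
// ASSUMPTION: when a library is provided, this sketch leaves the attribute
// in place; the thread does not say what the real pass does in that case.
std::optional<std::string>
selectStrategy(ToyModule &module,
               const std::optional<std::string> &userLibrary) {
  if (userLibrary)
    return userLibrary;
  auto it = module.attrs.find(kSpecAttr);
  if (it == module.attrs.end())
    return std::nullopt;
  std::string spec = it->second;
  module.attrs.erase(it);
  return spec;
}
```

Calling `materializeTuningSpec` and then `selectStrategy(module, std::nullopt)` returns the joined blob once and leaves the module without the attribute, which mirrors the "deserialize, execute, remove" tail of the list.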

I also modified `getOrLoadTransformLibraryModule` so that it doesn't use the `transform::detail::assembleTransformLibraryFromPaths` function, because it has some logic to perform library merging that would overwrite module symbol names. There's no need to call it anyway, since we are loading a single library at a time.

This is not added to any codegen pipeline yet -- I will do that in a future PR.

Issue: #19214

compiler/src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp

compiler/src/iree/compiler/Codegen/Common/test/materialize_tuning_specs.mlir

Groverkss · 2024-11-29T15:28:36Z

> We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally walk or otherwise modify it.

This will happen per dispatch, right? Can we not use a symbol to refer to it? It sounds like something terrible to roundtrip. For each dispatch, you will read the same tuning spec again and again. I would prefer it just being in a separate module.

bjacob · 2024-11-29T16:39:33Z

@kuhar, @Groverkss, and I had a video chat about this PR this morning. I think I understand and agree with everything this PR is doing at a high level! I'm out of my depth on the low-level details here, particularly where the transform dialect is involved, so I'm deferring to @Groverkss for the actual code review.

kuhar · 2024-11-29T20:32:42Z

> > We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally walk or otherwise modify it.
>
> This will happen per dispatch, right? Can we not use a symbol to refer to it? It sounds like something terrible to roundtrip. For each dispatch, you will read the same tuning spec again and again. I would prefer it just being in a separate module.

I added a pass description and some comments to clarify this.

This is wasteful/redundant in the sense that each dispatch will (most likely) be annotated with the same tuning spec attribute, but the underlying storage will be uniqued by the context. In the future, we can avoid duplicating work during linking by looking up already-linked libraries in the dialect (keyed on the paths of all the inputs). I plan to add this once we plumb through default tuning specs; the current implementation does the minimum to make it work, and we can optimize later.
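To illustrate why repeated identical attributes need not mean repeated storage, here is a minimal sketch of interning in plain C++. `ToyContext` and `intern` are hypothetical names, and this is only the general idea of uniquing, not how the MLIR context is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Toy "context" that interns payloads: equal payloads share one stored copy.
class ToyContext {
public:
  // Returns a pointer to the unique stored copy of `payload`. Pointers to
  // elements of std::unordered_set stay valid when the set grows.
  const std::string *intern(const std::string &payload) {
    return &*storage.insert(payload).first;
  }

  // Number of distinct payloads stored so far.
  std::size_t numStored() const { return storage.size(); }

private:
  std::unordered_set<std::string> storage;
};
```

In this model, annotating many dispatches with the same serialized spec costs one stored blob plus one pointer per dispatch. The linking and serialization work that produces the blob is still repeated, which is the part the comment above proposes to optimize later.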

kuhar · 2024-11-30T03:35:17Z

I added support for looking up the tuning spec in any of the parent ops like @bjacob suggested (with an accompanying test).
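The parent lookup can be pictured with a small sketch in plain C++. `ToyOp` and `findAttrInSelfOrParents` are hypothetical names, not the code in this PR, and starting the search at the op itself (rather than at its first parent) is an assumption.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy stand-in for an operation: attributes plus a pointer to the parent op.
struct ToyOp {
  const ToyOp *parent = nullptr;
  std::map<std::string, std::string> attrs;
};

// Walks from `op` up through its ancestors and returns the first value found
// for `name`, or std::nullopt if no op in the chain carries the attribute.
// ASSUMPTION: the starting op itself is checked first.
std::optional<std::string> findAttrInSelfOrParents(const ToyOp &op,
                                                   const std::string &name) {
  for (const ToyOp *cur = &op; cur != nullptr; cur = cur->parent) {
    auto it = cur->attrs.find(name);
    if (it != cur->attrs.end())
      return it->second;
  }
  return std::nullopt;
}
```

With this shape, the attribute can sit on any enclosing op and still be found from a nested one, and the innermost annotation wins if several ancestors carry it.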

compiler/src/iree/compiler/Codegen/Common/MaterializeUserConfigs.cpp

compiler/src/iree/compiler/Codegen/Common/Passes.h

... and update 'Materialize User Configs' to pick up those tuning specs.

The overall flow is as follows:

* We pick up any user-specified tuning specs in `materialize tuning specs` and link them into a single transform dialect library module.
* We serialize that linked tuning spec as MLIR bytecode.
* We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally `walk` or otherwise modify it.
* In `materialize user configs`, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present.
* We deserialize the tuning spec attribute into a transform dialect library module and execute it.
* We remove the serialized tuning spec from the module, as it's no longer needed.

Signed-off-by: Jakub Kuderski <[email protected]>

MaheshRavishankar

Thanks @kuhar. Left a few more comments.

MaheshRavishankar · 2024-12-03T23:22:46Z

compiler/src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp

+    return success();
+  }
+
+  llvm::sys::fs::create_directories(dir);


Should check what happens on Windows. Last time I used llvm::sys::fs it caused issues on Windows.

kuhar · 2024-12-04

I copied this code from the `--iree-hal-dump-executable-files-to` logic, so I'd expect it to work in both places, although I haven't checked on Windows. If we learn that it doesn't work, I can fix it.

    ➜ ag fs::create_directories
    src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
    60:  llvm::sys::fs::create_directories(dir);

    src/iree/compiler/Dialect/Util/Transforms/DumpModule.cpp
    29:  llvm::sys::fs::create_directories(llvm::sys::path::parent_path(path));

    src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp
    72:  llvm::sys::fs::create_directories(dumpIntermediatesPath);
    75:  llvm::sys::fs::create_directories(dumpBinariesPath);

    src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableSources.cpp
    50:  llvm::sys::fs::create_directories(path);

    src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableBenchmarks.cpp
    514:  llvm::sys::fs::create_directories(path);
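One way to make a platform-specific failure visible is to check the result of directory creation rather than discard it. `llvm::sys::fs::create_directories` returns a `std::error_code`, which the call sites listed above ignore. The sketch below shows the same pattern with the standard library's `std::filesystem::create_directories`, used here only as an analog. `ensureDirectory` is a hypothetical helper, not code from this PR.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Creates `dir` and any missing parents. Returns true if the directory exists
// afterwards (newly created or already present). On failure, returns false
// and stores a human-readable reason in `errorMessage`.
bool ensureDirectory(const std::filesystem::path &dir,
                     std::string &errorMessage) {
  std::error_code ec;
  // The error_code overload does not throw; it reports failures through `ec`.
  std::filesystem::create_directories(dir, ec);
  if (ec) {
    errorMessage = ec.message();
    return false;
  }
  return true;
}
```

A pass could then emit a diagnostic and fail instead of silently continuing to a file write that cannot succeed.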

MaheshRavishankar · 2024-12-03T23:24:44Z

compiler/src/iree/compiler/Codegen/Common/LinkTuningSpecsPass.cpp

-    if (tuningSpecs.empty()) {
-      LDBG("No tuning specs found, exiting without linking");
-      return;
+  for (ModuleOp nested : findNestedModulesWithNamedSequences(module)) {


We should document more carefully what happens when there are conflicts, i.e., how the specs are ordered. From the looks of this, the default spec gets higher precedence than the user-defined one. I would make it the other way round.

kuhar · 2024-12-04

I noted in the pass description that the order in the module is preserved, i.e., top-down:

iree/compiler/src/iree/compiler/Codegen/Common/Passes.td

Lines 409 to 420 in 29229df

    def LinkTuningSpecsPass : Pass<"iree-codegen-link-tuning-specs", "ModuleOp"> {
      let summary =
          "Link nested transform dialect tuning specs named sequences into a single entry point";
      let description = [{
        Given a module with multiple nested tuning specs, introduce a new named sequence
        that includes all the other tuning spec entry points. The order of inclusion is the same
        as the in which these nested tuning specs appear in the IR.

        A tuning spec entry point is a `transform.named_sequence` op annotated with the
        `iree_codegen.tuning_spec` unit attribute. We require it to perform in-place op
        modification and not consume the handle.
      }];

> From the looks of this the default gets higher precedence than the user defined. I would make it the other way round.

Currently we don't have any default specs, so there's nothing to test. But the intention is to have default specs apply at the very end. Once we have that, I'll make sure to add a test that checks the order.
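To make the ordering question concrete, here is a toy sketch in plain C++ of applying specs in a fixed order. `ToySpec` and `applyInOrder` are hypothetical names. The conflict rule used here (an op keeps the first configuration it receives, so later specs skip it) is an assumption made purely for illustration; the thread only states that inclusion follows IR order and that default specs are meant to apply last.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// A toy "tuning spec": given an op name, optionally proposes a config string.
using ToySpec =
    std::function<std::optional<std::string>(const std::string &)>;

// Applies `specs` in the given order to every op in `ops`.
// ASSUMPTION: an op that already has a config is skipped by later specs.
std::map<std::string, std::string>
applyInOrder(const std::vector<ToySpec> &specs,
             const std::vector<std::string> &ops) {
  std::map<std::string, std::string> configs;
  for (const ToySpec &spec : specs) {
    for (const std::string &op : ops) {
      if (configs.count(op))
        continue; // Already configured by an earlier spec.
      if (std::optional<std::string> config = spec(op))
        configs[op] = *config;
    }
  }
  return configs;
}
```

Under that assumed rule, placing user specs before a catch-all default spec gives the user spec precedence on the ops it matches, while the default still covers everything else. A test of the real ordering would need actual default specs, as noted above.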

... and update 'Materialize User Configs' to pick up those tuning specs.

The overall flow is as follows:

* We pick up any user-specified tuning specs in `materialize tuning specs` and link them into a single transform dialect library module.
* We serialize that linked tuning spec as MLIR bytecode.
* We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally `walk` or otherwise modify it.
* In `materialize user configs`, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present.
* We deserialize the tuning spec attribute into a transform dialect library module and execute it.
* We remove the serialized tuning spec from the module, as it's no longer needed.

I also modified `getOrLoadTransformLibraryModule` so that it doesn't use the `transform::detail::assembleTransformLibraryFromPaths` function, because it has some logic to perform library merging that would overwrite module symbol names. There's no need to call it anyway, since we are loading a single library at a time.

This is not added to any codegen pipeline yet -- I will do that in a future PR.

Issue: iree-org#19214

Signed-off-by: Giacomo Serafini <[email protected]>


kuhar requested review from bjacob, bangtianliu, Max191 and Groverkss November 28, 2024 20:51

kuhar requested review from hanhanW and MaheshRavishankar as code owners November 28, 2024 20:51

kuhar force-pushed the materialize-tuning-config branch from 75783f1 to bb212be Compare November 29, 2024 01:11

Groverkss reviewed Nov 29, 2024


kuhar force-pushed the materialize-tuning-config branch from bb212be to 9637295 Compare November 29, 2024 20:29

kuhar force-pushed the materialize-tuning-config branch from 2ca1571 to d3cd5b2 Compare November 30, 2024 03:33

Groverkss approved these changes Dec 1, 2024


compiler/src/iree/compiler/Codegen/Common/MaterializeUserConfigs.cpp (outdated, resolved)

compiler/src/iree/compiler/Codegen/Common/Passes.h (outdated, resolved)

kuhar force-pushed the materialize-tuning-config branch from 590e8de to 7875df9 Compare December 1, 2024 22:10

kuhar enabled auto-merge (squash) December 1, 2024 22:11

kuhar merged commit ecd87d8 into iree-org:main Dec 1, 2024
34 of 38 checks passed

MaheshRavishankar reviewed Dec 3, 2024


kuhar mentioned this pull request Dec 4, 2024

[Codegen][Tuner] Clarify tuning spec linking order. NFC. #19370

Merged

kuhar added a commit that referenced this pull request Dec 4, 2024

[Codegen][Tuner] Clarify tuning spec linking order. NFC. (#19370)

8894f5a

This clarification was suggested in #19337 (comment).

kuhar mentioned this pull request Dec 4, 2024

Include default tuning specs with the compiler #19214

Closed

8 tasks


