
[Codegen] Add pass to materialize tuning specs #19337

Merged: 1 commit into iree-org:main on Dec 1, 2024

Conversation

@kuhar (Member) commented Nov 28, 2024

... and update 'Materialize User Configs' to pick up those tuning specs.

The overall flow is as follows:

  • We pick up any user-specified tuning specs in materialize tuning specs and link them into a single transform dialect library module.
  • We serialize that linked tuning spec as MLIR bytecode.
  • We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally walk or otherwise modify it.
  • In materialize user configs, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present.
  • We deserialize the tuning spec attribute into a transform dialect library module and execute it.
  • We remove the serialized tuning spec from the module, as it's no longer needed.

I also modified getOrLoadTransformLibraryModule so that it doesn't use the transform::detail::assembleTransformLibraryFromPaths function, because it has some logic to perform library merging that would overwrite module symbol names. There's no need to call it anyway, since we are loading a single library at a time.

This is not added to any codegen pipeline yet -- I will do that in a future PR.
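The flow above, with the MLIR specifics abstracted away, can be sketched in self-contained C++. This is a hypothetical model, not IREE code: `Module`, the attribute map, and the string payloads all stand in for their MLIR counterparts (`ModuleOp`, discardable attributes, and serialized bytecode).

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Stand-in for an MLIR module: a name plus a discardable attribute dictionary.
struct Module {
  std::string name;
  std::map<std::string, std::string> attrs; // attr name -> serialized payload
};

// Steps 1-3: link the user-specified specs into one blob and embed it as a
// module attribute, so subsequent passes never walk its ops.
void materializeTuningSpecs(std::vector<Module> &dispatches,
                            const std::vector<std::string> &userSpecs) {
  if (userSpecs.empty())
    return;
  std::string linked; // stands in for the linked transform library bytecode
  for (const std::string &spec : userSpecs)
    linked += spec; // order of inclusion follows the order of the inputs
  for (Module &m : dispatches)
    m.attrs["iree_codegen.tuning_spec"] = linked;
}

// Steps 4-6: prefer explicitly provided transform libraries; otherwise
// deserialize the embedded spec, return it for execution, and drop the
// attribute since it is no longer needed.
std::optional<std::string>
materializeUserConfigs(Module &m, const std::vector<std::string> &libraries) {
  if (!libraries.empty())
    return libraries.front();
  auto it = m.attrs.find("iree_codegen.tuning_spec");
  if (it == m.attrs.end())
    return std::nullopt;
  std::string spec = it->second; // stands in for bytecode deserialization
  m.attrs.erase(it);             // remove the serialized spec after use
  return spec;
}
```

The key property of the real pass is mirrored here: once `materializeUserConfigs` consumes the attribute, no later stage can see or modify the embedded payload.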

Issue: #19214

@Groverkss (Contributor) commented:

> We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally walk or otherwise modify it.

This will happen per dispatch, right? Can we not use a symbol to refer to it? It sounds like something terrible to round-trip. For each dispatch, you will read the same tuning spec again and again. I would prefer it just being in a separate module.

@bjacob (Contributor) commented Nov 29, 2024

@kuhar @Groverkss and I had a video chat about this PR this morning. I think I understand and agree with all what this PR is doing on a high level! I'm out of my depth on low-level details here, particularly where transform dialect is involved, so deferring to @Groverkss for the actual code review.

@kuhar kuhar force-pushed the materialize-tuning-config branch from bb212be to 9637295 Compare November 29, 2024 20:29
@kuhar (Member, Author) commented Nov 29, 2024

> We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally walk or otherwise modify it.
>
> This will happen per dispatch, right? Can we not use a symbol to refer to it? It sounds like something terrible to round-trip. For each dispatch, you will read the same tuning spec again and again. I would prefer it just being in a separate module.

I added a pass description and comments to clarify this.

This is wasteful/redundant in the sense that each dispatch will (most likely) be annotated with the same tuning spec attribute, but the underlying storage will be uniqued by the context. In the future, we can avoid the duplication of work during linking by allowing linked libraries to be looked up in the dialect (keyed by the paths to all of the inputs). I plan to add this once we plumb through default tuning specs; the current implementation does the minimum to make it work, and we can optimize later.
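The uniquing behavior described above can be modeled with a small intern pool. This is illustrative only: `FakeContext` is a hypothetical stand-in for the `MLIRContext` attribute uniquer, which makes attributes with equal contents share one storage allocation.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical model of context-based attribute uniquing: inserting the same
// payload twice returns a pointer to the same stored bytes, so annotating N
// dispatches with an identical tuning spec costs one copy of the storage.
class FakeContext {
public:
  const std::string *intern(const std::string &payload) {
    // unordered_set is node-based, so element addresses stay stable.
    return &*pool.insert(payload).first;
  }

private:
  std::unordered_set<std::string> pool;
};
```

Each dispatch would then hold only a pointer-sized handle to the shared payload, which is why the per-dispatch annotation is redundant but not a storage blow-up.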

@kuhar kuhar force-pushed the materialize-tuning-config branch from 2ca1571 to d3cd5b2 Compare November 30, 2024 03:33
@kuhar (Member, Author) commented Nov 30, 2024

I added support for looking up the tuning spec in any of the parent ops like @bjacob suggested (with an accompanying test).

... and update 'Materialize User Configs' to pick up those tuning specs.

The overall flow is as follows:
* We pick up any user-specified tuning specs in `materialize tuning
  specs` and link them into a single transform dialect library module.
* We serialize that linked tuning spec as MLIR bytecode.
* We embed this MLIR bytecode as a module attribute. This is so that
  none of the subsequent passes will accidentally `walk` or otherwise
  modify it.
* In `materialize user configs`, we first check if there are any
  transform libraries provided. If not, then we check if the tuning spec
  is present.
* We deserialize the tuning spec attribute into a transform dialect
  library module and execute it.
* We remove the serialized tuning spec from the module, as it's no
  longer needed.

Signed-off-by: Jakub Kuderski <[email protected]>
@kuhar kuhar force-pushed the materialize-tuning-config branch from 590e8de to 7875df9 Compare December 1, 2024 22:10
@kuhar kuhar enabled auto-merge (squash) December 1, 2024 22:11
@kuhar kuhar merged commit ecd87d8 into iree-org:main Dec 1, 2024
34 of 38 checks passed
@MaheshRavishankar (Contributor) left a comment:

Thanks @kuhar. Left a few more comments.

Review on `llvm::sys::fs::create_directories(dir);` in MaterializeTuningSpecsPass.cpp:

Should check what happens on Windows. Last time I used llvm::sys::fs it caused issues on Windows.

@kuhar (Member, Author) replied:

I copied this code from the --iree-hal-dump-executable-files-to logic, so I'd expect it to work in both places, although I haven't checked on Windows. If we learn it doesn't work, I can fix it.

ag fs::create_directories                  
src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
60:  llvm::sys::fs::create_directories(dir);

src/iree/compiler/Dialect/Util/Transforms/DumpModule.cpp
29:    llvm::sys::fs::create_directories(llvm::sys::path::parent_path(path));

src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp
72:      llvm::sys::fs::create_directories(dumpIntermediatesPath);
75:      llvm::sys::fs::create_directories(dumpBinariesPath);

src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableSources.cpp
50:      llvm::sys::fs::create_directories(path);

src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableBenchmarks.cpp
514:      llvm::sys::fs::create_directories(path);
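For comparison, here is a self-contained analogue of the directory-creation behavior using `std::filesystem`. This is not the IREE code (which uses `llvm::sys::fs::create_directories`); both are recursive and, to my understanding, tolerate already-existing directories, which is the property the dump-to-directory logic relies on.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: recursively create a dump directory, succeeding even
// when it already exists. ensureDumpDir is an illustrative name, not IREE API.
bool ensureDumpDir(const fs::path &dir) {
  std::error_code ec;
  fs::create_directories(dir, ec); // no-op (no error) if dir already exists
  return !ec && fs::is_directory(dir);
}
```

Using the error-code overload keeps the helper exception-free, which matches how compiler passes typically report dump failures as warnings rather than aborting.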

Review on this hunk:

  if (tuningSpecs.empty()) {
    LDBG("No tuning specs found, exiting without linking");
    return;
  }
  ...
  for (ModuleOp nested : findNestedModulesWithNamedSequences(module)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document more carefully what happens when there are conflicts, i.e., some ordering of the specs. From the looks of this, the default gets higher precedence than the user-defined one. I would make it the other way round.

@kuhar (Member, Author) commented Dec 4, 2024

I noted in the pass description that the order in the module is preserved, i.e., top-down:

def LinkTuningSpecsPass : Pass<"iree-codegen-link-tuning-specs", "ModuleOp"> {
  let summary =
      "Link nested transform dialect tuning specs named sequences into a single entry point";
  let description = [{
    Given a module with multiple nested tuning specs, introduce a new named sequence
    that includes all the other tuning spec entry points. The order of inclusion is the
    same as the order in which these nested tuning specs appear in the IR.
    A tuning spec entry point is a `transform.named_sequence` op annotated with the
    `iree_codegen.tuning_spec` unit attribute. We require it to perform in-place op
    modification and not consume the handle.
  }];

From the looks of this the default gets higher precedence than the user defined. I would make it the other way round.

Currently we don't have any default specs, so there's nothing to test. But the intention is to have default specs apply at the very end. Once we have that, I'll make sure to add a test that checks the order.
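As an illustration of the top-down ordering, a linked spec could look roughly like this. The names, elided bodies, and exact syntax here are hypothetical, not the actual output of the pass:

```mlir
// Illustrative sketch only.
module @linked_spec attributes { transform.with_named_sequence } {
  transform.named_sequence @user_spec(%op: !transform.any_op {transform.readonly})
      attributes { iree_codegen.tuning_spec } { ... }
  transform.named_sequence @default_spec(%op: !transform.any_op {transform.readonly})
      attributes { iree_codegen.tuning_spec } { ... }

  // New entry point: inclusion follows the order the nested specs appear in
  // the IR, so @user_spec runs before @default_spec here.
  transform.named_sequence @__kernel_config(%op: !transform.any_op {transform.readonly}) {
    transform.include @user_spec failures(propagate) (%op) : (!transform.any_op) -> ()
    transform.include @default_spec failures(propagate) (%op) : (!transform.any_op) -> ()
    transform.yield
  }
}
```

Under this scheme, putting default specs last in the IR would give user-defined specs the chance to attach configurations first, which matches the intent stated above.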

giacs-epic pushed a commit to giacs-epic/iree that referenced this pull request Dec 4, 2024
kuhar added a commit that referenced this pull request Dec 4, 2024