-
Notifications
You must be signed in to change notification settings - Fork 650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Codegen] Add pass to materialize tuning specs #19337
Conversation
75783f1
to
bb212be
Compare
compiler/src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
Show resolved
Hide resolved
compiler/src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
Show resolved
Hide resolved
compiler/src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
Show resolved
Hide resolved
compiler/src/iree/compiler/Codegen/Common/test/materialize_tuning_specs.mlir
Show resolved
Hide resolved
This will happen per dispatch right? Can we not use a symbol to refer to it? It sounds like something terrible to roundtrip. For each dispatch, you will read the same tuning spec again and again. I would prefer it just being in a seperate module. |
@kuhar @Groverkss and I had a video chat about this PR this morning. I think I understand and agree with all what this PR is doing on a high level! I'm out of my depth on low-level details here, particularly where transform dialect is involved, so deferring to @Groverkss for the actual code review. |
bb212be
to
9637295
Compare
I added some pass description and comments in the pass description to clarify this. This is wasteful/redundant in the sense that each dispatch will be (most likely) annotated with the same tuning spec attribute, but the underlying storage will be unique by the context. In the future, we can avoid the duplication of work during linking by allowing us to look up linked libraries in the dialect (based on the paths to all of the inputs). I plan to add this once we plumb through default tuning specs; the current implementation does the minimum to make it work, we can optimize later. |
2ca1571
to
d3cd5b2
Compare
I added support for looking up the tuning spec in any of the parent ops like @bjacob suggested (with an accompanying test). |
compiler/src/iree/compiler/Codegen/Common/MaterializeUserConfigs.cpp
Outdated
Show resolved
Hide resolved
... and update 'Materialize User Configs' to pick up those tuning specs. The overall flow is as follows: * We pick up any user-specified tuning specs in `materialize tuning specs` and link them into a single transform dialect library module. * We serialize that linked tuning spec as MLIR bytecode. * We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally `walk` or otherwise modify it. * In `materilize user configs`, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present. * We deserialize the tuning spec attribute into a transform dialect library module and execute it. * We remove the serialized tuning spec from the module, as it's no longer needed. Signed-off-by: Jakub Kuderski <[email protected]>
590e8de
to
7875df9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kuhar . Left a few more comments.
return success(); | ||
} | ||
|
||
llvm::sys::fs::create_directories(dir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check what happens on Windows. Last time I used llvm::sys::fs it caused issues on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this code from the --iree-hal-dump-executable-files-to
logic, so I'd think it should work in both places, although I haven't checked on windows. If we learn it doesn't work, I can fix it.
➜ ag fs::create_directories
src/iree/compiler/Codegen/Common/MaterializeTuningSpecsPass.cpp
60: llvm::sys::fs::create_directories(dir);
src/iree/compiler/Dialect/Util/Transforms/DumpModule.cpp
29: llvm::sys::fs::create_directories(llvm::sys::path::parent_path(path));
src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp
72: llvm::sys::fs::create_directories(dumpIntermediatesPath);
75: llvm::sys::fs::create_directories(dumpBinariesPath);
src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableSources.cpp
50: llvm::sys::fs::create_directories(path);
src/iree/compiler/Dialect/HAL/Transforms/DumpExecutableBenchmarks.cpp
514: llvm::sys::fs::create_directories(path);
if (tuningSpecs.empty()) { | ||
LDBG("No tuning specs found, exiting without linking"); | ||
return; | ||
for (ModuleOp nested : findNestedModulesWithNamedSequences(module)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document more carefully what happens when there are conflicts, i.e some ordering of the spec. From the looks of this the default gets higher precedence than the user defined. I would make it the other way round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented in the pass description that the order in the module is preserved, IE, top down:
iree/compiler/src/iree/compiler/Codegen/Common/Passes.td
Lines 409 to 420 in 29229df
def LinkTuningSpecsPass : Pass<"iree-codegen-link-tuning-specs", "ModuleOp"> { | |
let summary = | |
"Link nested transform dialect tuning specs named sequences into a single entry point"; | |
let description = [{ | |
Given a module with multiple nested tuning specs, introduce a new named sequence | |
that includes all the other tuning spec entry points. The order of inclusion is the same | |
as the in which these nested tuning specs appear in the IR. | |
A tuning spec entry point is a `transform.named_sequence` op annotated with the | |
`iree_codegen.tuning_spec` unit attribute. We require it to perform in-place op | |
modification and not consume the handle. | |
}]; |
From the looks of this the default gets higher precedence than the user defined. I would make it the other way round.
Currently we don't have any default specs, so there's nothing to test. But the intention is to have default specs apply at the very end. Once we have that, I'll make sure to add a test that checks the order.
... and update 'Materialize User Configs' to pick up those tuning specs. The overall flow is as follows: * We pick up any user-specified tuning specs in `materialize tuning specs` and link them into a single transform dialect library module. * We serialize that linked tuning spec as MLIR bytecode. * We embed this MLIR bytecode as a module attribute. This is so that none of the subsequent passes will accidentally `walk` or otherwise modify it. * In `materialize user configs`, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present. * We deserialize the tuning spec attribute into a transform dialect library module and execute it. * We remove the serialized tuning spec from the module, as it's no longer needed. I also modified `getOrLoadTransformLibraryModule` so that it doesn't use the `transform::detail::assembleTransformLibraryFromPaths` function, because it has some logic to perform library merging that would overwrite module symbol names. There's no need to call it anyway, since we are loading a single library at a time. This is not added to any codegen pipeline yet -- I will do that in a future PR. Issue: iree-org#19214 Signed-off-by: Giacomo Serafini <[email protected]>
This clarification was suggested in #19337 (comment).
... and update 'Materialize User Configs' to pick up those tuning specs.
The overall flow is as follows:
materialize tuning specs
and link them into a single transform dialect library module.walk
or otherwise modify it.materialize user configs
, we first check if there are any transform libraries provided. If not, then we check if the tuning spec is present.I also modified
getOrLoadTransformLibraryModule
so that it doesn't use thetransform::detail::assembleTransformLibraryFromPaths
function, because it has some logic to perform library merging that would overwrite module symbol names. There's no need to call it anyway, since we are loading a single library at a time.This is not added to any codegen pipeline yet -- I will do that in a future PR.
Issue: #19214