[Driver] Single-module lowering flow in driver_api.cc #14985
The functionality tested in this commit was added across several recent PRs, each of which tested their features in isolation. This PR adds unit tests to validate the end-to-end behavior of TIR subroutine calls.

PRs building up to this point:

- TVMScript
  - apache#14889
  - apache#14915
  - apache#14919
  - apache#14941
- Functionality improvements of existing TIR passes
  - apache#14913
  - apache#14914
  - apache#14918
  - apache#14951
- Changes to the TIR lowering flow
  - apache#14942
  - apache#14985
- Codegen updates
  - apache#14958
  - apache#14901
- Compatibility updates/fixes
  - apache#14892
  - apache#14950
  - apache#14943
  - apache#14944
  - apache#14945
  - apache#14952
  - apache#14982
  - apache#14949
Rebased onto main now that #14944 has landed, marked as ready for review.
Prior to this commit, if device initialization is required, the AOT main function produced a `call_extern()` that included the device context as input. This commit updates the AOT main function to provide the device context only if the function being called accepts a device context as input.

If an extra device context argument is included at the call site, the C codegen would produce a function signature that includes the device context for the caller's compilation unit, but a signature without the device context for the callee's compilation unit. While this can compile and run in some cases, it is undefined behavior for the signature to vary between compilation units, and should be avoided.

This was initially discovered while debugging apache#14985, in which changes to the lowering flow resulted in the caller and callee being within the same compilation unit.
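The call-site rule described above can be sketched in a few lines of plain Python. This is only an illustration, not the TVM implementation: the `Callee` class, its `accepts_device_context` flag, and `build_call_args` are hypothetical names introduced here, and how the real AOT lowering detects whether a callee takes a device context is not shown in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Callee:
    """Hypothetical stand-in for a lowered function's signature."""

    name: str
    accepts_device_context: bool


def build_call_args(callee: Callee, args: List[str], device_context: str) -> List[str]:
    """Return the argument list for a call to `callee`.

    The device context is appended only when the callee declares a
    parameter for it, so the caller and callee agree on one signature.
    """
    call_args = list(args)
    if callee.accepts_device_context:
        call_args.append(device_context)
    return call_args


with_ctx = Callee("operator_with_device", accepts_device_context=True)
without_ctx = Callee("operator_plain", accepts_device_context=False)

print(build_call_args(with_ctx, ["inputs", "outputs"], "device_context"))
print(build_call_args(without_ctx, ["inputs", "outputs"], "device_context"))
```

With this rule the argument count at the call site always matches the callee's parameter count, which is the property the commit message is after.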
Analogous to #14901, treat GlobalVar callees as internal function calls in CodeGenC. This specific PR doesn't provide new end-to-end functionality, as the target="c" backend isn't compiled. It does lead into allowing subroutines in any target whose codegen derives from CodeGenC, which will depend on the single-module lowering flow in #14985.

* [CodeGenC] Added unit tests for desired behavior
* [CodeGenC] Handle GlobalVar callee as internal function call
* Update CodeGenC subclasses for updated interface
  - Call `DeclareFunction` for each `PrimFunc`, prior to any `AddFunction` calls
  - Provide both `GlobalVar` and `PrimFunc` to `AddFunction` calls.
* Updated CRT test to expect forward declaration
* Provide forward declarations for call_extern in cmsis
* Avoid duplicate forward declaration

  C's automatic pointer cast (e.g. `void*` to `int*`) means that use of the arguments to infer the function signature may be incorrect. If a `call_extern` refers to a function within the same module, only output a single forward declaration based on the PrimFunc's parameters, not based on the CallNode's arguments.
* Updated expected ptx cuda
* Cast the AOT pools to the arg type
* Improved tvm::GetType for tvm_access_ptr and address_of

  These `Call` instances can return a `PointerType(PrimType(pointee_dtype))` rather than a `PrimType(DataType::Handle())`.
* [ARM][Topi] Update micro kernels to use same argument type as caller

  Previously, the micro kernels for gemm, avg_pool, max_pool, and tensordot relied on C's implicit type conversions for the arguments, when the caller's argument types differ from the signature's parameter types. This works, except when the codegen has auto-generated a forward declaration based on the caller's argument types, such as during AOT, which then causes a conflicting definition.

  Since the codegen cannot determine the function names from the `"pragma_import_c"` in order to suppress these forward declarations, this conflict can be more easily resolved by updating the micro kernel signatures. The three types of mismatches are below.

  - Use of `int` or `long` parameters, whose width may vary by compiler, instead of fixed-width types.
  - TIR expecting the data array's integer type to also be used as an error code's return type, rather than the micro kernels' `int32_t` error code.
  - Pointer conversion done during argument conversion.

  Type conversions are done at the start of each micro kernel, to avoid changing types that are used within the computational sections of each micro kernel.
* Updated unit tests with private=True

  Required for internal functions after PR #15214
* Docstring updates from review
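The "avoid duplicate forward declaration" item can be illustrated with a small, self-contained Python sketch. It does not use or reproduce CodeGenC; `forward_declarations` and its input shapes are hypothetical, chosen only to show the rule that a function defined in the module is declared once from its own parameters, while call-site argument types are used only for callees the module does not define.

```python
from typing import Dict, List, Tuple

# name -> (return type, parameter types), taken from the function definition
Definitions = Dict[str, Tuple[str, List[str]]]
# (callee name, return type, argument types as seen at the call site)
CallSite = Tuple[str, str, List[str]]


def forward_declarations(definitions: Definitions, call_sites: List[CallSite]) -> List[str]:
    """Emit at most one C forward declaration per function name."""
    decls: Dict[str, str] = {}

    # Functions defined in this module: declare from their own parameters.
    for name, (ret, params) in definitions.items():
        decls[name] = f"{ret} {name}({', '.join(params)});"

    # External callees: fall back to the argument types at the call site,
    # but never replace a declaration derived from a definition.
    for name, ret, arg_types in call_sites:
        if name not in decls:
            decls[name] = f"{ret} {name}({', '.join(arg_types)});"

    return list(decls.values())


definitions = {"gemm": ("int32_t", ["int8_t*", "int8_t*"])}
call_sites = [
    ("gemm", "int32_t", ["void*", "void*"]),  # pointer types differ at the call
    ("external_op", "int32_t", ["int32_t"]),
]
for line in forward_declarations(definitions, call_sites):
    print(line)
```

Because `gemm` is declared only from its definition, the `void*` argument types at the call site cannot produce a second, conflicting declaration.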
With #16184 landed, this PR should (hopefully) be able to land without requiring all low-level codegen to directly support function calls. For codegens that do not yet support function calls, device-side kernels can have subroutines inlined to avoid a recurrence of #16033. Currently, this PR is rebased on top of main, but does not use of
Prior to this commit, a build that used multiple targets needed to provide `tvm::build` with a `Map<Target, IRModule>` specifying which target should be used to compile each `IRModule`. As a result, lowering passes could not introduce new targets based on a PrimFunc's content (e.g. a `with T.target()` frame to delegate out to another device), nor simplify based on cross-device subroutines (e.g. simplify a host-side conditional based on the known output of a device-side internal subroutine).

This commit makes the `tvm::attr::kTarget` attribute (`"target"`) be the single source of truth for where a `PrimFunc` will be executed. Other existing methods for specifying the target (the `target` parameter for `tvm.build`, the keys in a `Map<Target,IRModule>`, the parameter to the pass `tir::transform::BindTarget`) are still accepted as inputs, and may provide a default value for `tvm::attr::kTarget` if the attribute is missing, but may not overwrite the target attribute.

This is part of a series of commits to simplify the handling of multi-target builds.
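The "default, but never overwrite" rule for the target attribute can be modelled without TVM. The sketch below uses plain dictionaries in place of `PrimFunc` attributes; `bind_default_target` is a hypothetical helper written for this illustration and is not the `BindTarget` pass or any other TVM API.

```python
from typing import Dict, Optional

# function name -> attribute dictionary (a stand-in for PrimFunc attrs)
FuncAttrs = Dict[str, Dict[str, str]]


def bind_default_target(funcs: FuncAttrs, default_target: Optional[str]) -> FuncAttrs:
    """Fill in a missing "target" attribute from an externally supplied default.

    A function that already carries a "target" attribute keeps it, so the
    attribute stays the single source of truth.
    """
    result: FuncAttrs = {}
    for name, attrs in funcs.items():
        new_attrs = dict(attrs)  # copy so the input is left unmodified
        if "target" not in new_attrs and default_target is not None:
            new_attrs["target"] = default_target
        result[name] = new_attrs
    return result


module = {
    "main": {},                      # no target yet: receives the default
    "kernel": {"target": "cuda"},    # explicit target: must be preserved
}
bound = bind_default_target(module, "llvm")
print(bound)
```

Here `main` picks up the default while `kernel` keeps the target it was annotated with, which mirrors the behavior the description gives for the `target` parameter of `tvm.build` and the other externally supplied targets.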
The failing unit tests here are blocked by #15757. The
If a flattened buffer is produced for use in `BufferLoad` and `BufferStore` statements, generate a `DeclBuffer`. This is a subset of the changes made in apache#14778, broken out for ease of testing and review.
When producing a flattened buffer for use in `BufferLoad` and `BufferStore` nodes, generate a `DeclBuffer` for the flattened buffer. This is a subset of the changes made in apache#14778, broken out for ease of testing and review.