-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Allow generated functions to return a CodeInstance
#56650
Conversation
This PR allows generated functions to return a `CodeInstance` containing optimized IR, allowing them to bypass inference and directly adding inferred code into the ordinary course of execution. This is an enabling capability for various external compiler implementations that may want to provide compilation results to the Julia runtime. As a minimal demonstrator of this capability, this adds a Cassette-like `with_new_compiler` higher-order function, which will compile/execute its arguments with the currently loaded `Compiler` package. Unlike `@activate Compiler[:codegen]`, this change is not global and the cache is fully partitioned. This by itself is a very useful feature when developing Compiler code to be able to test the full end-to-end codegen behavior before the changes are capable of fully self-hosting. A key enabler for this was the recent merging of #54899. This PR includes a hacky version of the second TODO left at the end of that PR, just to make everthing work end-to-end. This PR is working end-to-end, but all three parts of it (the CodeInstance return from generated functions, the `with_new_compiler` feature, and the interpreter integration) need some additional cleanup. This PR is mostly intended as a discussion point for what that additional work needs to be.
This sounds awesome - Does that include LLVM codegen, or just Julia IR? |
The ability to provide a In #52964 I provided an intrinsic Of course, there are two different "interfaces" here, my notion of a compiler plugin was based on the abstract interpreter interface and less on the ability to load a second copy of the compiler. I think what is dissatisfying for me with this approach is that we can't execute CodeInstances from a different owner, and instead have to transform it to The key idea in #52964 is to not have to use a Cassette like transform for propagation of the compiler and instead handle generic function calls and tasks consistently. I can rebase #52964, but I without any feedback, I didn't want to spend more energy down a path that has little chance of being adopted. |
The idea is that you can return a code instance either with inferred Julia IR set, in which case the runtime will compile it for you or with the full set of invoke pointers set, in which case it'll just become active immediately. The latter case is for people who are completely writing their own compilers in Julia and just need the entry point. That said, there's additional semantics that need to be made to work in both cases, and while I would like to support the second case, I'm not planning to actually put together anything end-to-end there for the time being. |
Why not?
We can, see the implementation. The key advantage of #52964 over this approach is that it does not force the existence of a concrete signature for the entry dispatcher and that it fully participates in the ordinary compiler cycle detection. However, there's a disadvantage as well, in that the interface is much broader, because compiler data structures become part of the ABI. There's still value to something like #52964 - it's just harder to know what it should look like, since it's a more complicated interface. Extending generated functions is quite natural, since we already know the semantics. That said, I think getting this PR fully working would actually make implementing #52964 properly easier, since it could then be implemented an an optimization over the semantics from this PR. To be concrete, rather than make the invoke_within from #52964 a builtin, make it a compiler generic generated function like |
(To preempt the complaint that |
Yeah, I don't like that additional work xD. I think that's what leaves me a bit unsatisfied, as greedy as I am. We already have a tagged CodeInstance in the System, (tagged on SplitCache here), which we untag by copying it into a new CodeInstance and then additionally we need to run a (linear) cache transformation to modify the IR.
Once the code hits the compiler, we lose track of the source and everything get's treated as one. Of course, we could go with unconditional LLVM plugins and pseudo-intrinsics like the Clang plugin interface. |
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. This PR includes just enough compiler support to make the compiler optimize this to the same code that #56650 produced: ``` julia> @code_typed with_new_compiler(sin, 1.0) CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64 │ %2 = builtin Core.getfield(args, 1)::Float64 │ %3 = invoke sin(%2::Float64)::Float64 └── return %3 ) => Float64 ``` However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).
Something I'm a little confused about, with this PR would the ban on querying type inference form within a generated function body be lifted? |
This PR does not technically call inference within a generated function - it calls a second copy of the same code, but regardless. A lot of the issues with generated functions and calling into inference have been solved over the years by letting generated functions getting passed the world age as an argument and by allowing them to declare world bounds and edges. The primary problem remaining with recursing into inference in generated functions is that you could end up in a loop. For this PR, because the |
Closing in favor of #56660 |
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. That said, this PR does not yet include the compiler optimization that implements the semantics of the optimized generic, which will be in a follow up PR.
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. That said, this PR does not yet include the compiler optimization that implements the semantics of the optimized generic, which will be in a follow up PR.
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. That said, this PR does not yet include the compiler optimization that implements the semantics of the optimized generic, which will be in a follow up PR.
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. This PR includes just enough compiler support to make the compiler optimize this to the same code that #56650 produced: ``` julia> @code_typed with_new_compiler(sin, 1.0) CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64 │ %2 = builtin Core.getfield(args, 1)::Float64 │ %3 = invoke sin(%2::Float64)::Float64 └── return %3 ) => Float64 ``` However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. This PR includes just enough compiler support to make the compiler optimize this to the same code that #56650 produced: ``` julia> @code_typed with_new_compiler(sin, 1.0) CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64 │ %2 = builtin Core.getfield(args, 1)::Float64 │ %3 = invoke sin(%2::Float64)::Float64 └── return %3 ) => Float64 ``` However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).
This PR allows generated functions to return a
CodeInstance
containing optimized IR, allowing them to bypass inference and directly adding inferred code into the ordinary course of execution. This is an enabling capability for various external compiler implementations that may want to provide compilation results to the Julia runtime.As a minimal demonstrator of this capability, this adds a Cassette-like
with_new_compiler
higher-order function, which will compile/execute its arguments with the currently loadedCompiler
package. Unlike@activate Compiler[:codegen]
, this change is not global and the cache is fully partitioned. This by itself is a very useful feature when developing Compiler code to be able to test the full end-to-end codegen behavior before the changes are capable of fully self-hosting.A key enabler for this was the recent merging of #54899. This PR includes a hacky version of the second TODO left at the end of that PR, just to make everthing work end-to-end.
This PR is working end-to-end, but all three parts of it (the CodeInstance return from generated functions, the
with_new_compiler
feature, and the interpreter integration) need some additional cleanup. This PR is mostly intended as a discussion point for what that additional work needs to be.