-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mir_transform: implement #[rustc_force_inline]
#134082
Conversation
Some changes occurred in compiler/rustc_codegen_gcc Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
This comment was marked as resolved.
This comment was marked as resolved.
3ddee57
to
ef59df9
Compare
This should have zero perf impact because the attribute isn't used, but let's just double check. |
This comment has been minimized.
This comment has been minimized.
mir_transform: implement `#[rustc_force_inline]` Adds `#[rustc_force_inline]` which is similar to always inlining but reports an error if the inlining was not possible. - `#[rustc_force_inline]` can only be applied to free functions to guarantee that the MIR inliner will be able to resolve calls. - `rustc_mir_transform::inline::Inline` is refactored into two passes (`Inline` and `ForceInline`), sharing the vast majority of the implementation. - `rustc_mir_transform::inline::ForceInline` can't be disabled so annotated items are always inlined. - `rustc_mir_transform::inline::ForceInline` runs regardless of optimisation level. - `#[rustc_force_inline]` won't inline unless target features match, as with normal inlining. - MIR validation will ICE if a `#[rustc_force_inline]` isn't inlined, to guarantee that it will never be codegened independently. As a further guarantee, monomorphisation collection will always decide that `#[rustc_force_inline]` functions cannot be codegened locally. - Like other intrinsics, `#[rustc_force_inline]` annotated functions cannot be cast to function pointers. - As with other rustc attrs, this cannot be used by users, just within the compiler and standard library. - This is only implemented within rustc, so should avoid any limitations of LLVM's inlining. It is intended that this attribute be used with intrinsics that must be inlined for security reasons. For example, pointer authentication intrinsics would allow Rust users to make use of pointer authentication instructions, but if these intrinsic functions were in the binary then they could be used as gadgets with ROP attacks, defeating the point of introducing them. We don't have any intrinsics like this today, but I expect to upstream some once a force inlining mechanism such as this is available. cc rust-lang#131687 rust-lang/rfcs#3711 - this approach should resolve the concerns from these previous attempts r? `@saethlin`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (4ce6fc9): comparison URL. Overall result: ❌ regressions - please read the text belowBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 0.5%, secondary -0.8%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 1.4%, secondary 2.6%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 767.262s -> 765.996s (-0.17%) |
As far as I can tell, perf impact is just from the overhead from the force inline pass running to check if there are any annotated callees. |
ef59df9
to
7a5ff95
Compare
According to pull request description the motivating use case are intrinsics that should be generated inline. The most obvious implementation strategy would be to do exactly that, in the contrast to the usual practice of creating a function wrapper that contains a call to an intrinsic. Can you elaborate on the advantages of the proposed implementation strategy? Any public function marked with |
In this particular case, the "intrinsics" I have in mind for this aren't just wrapper functions. They use a bunch of declarative macros to generate inline asm (using inline asm to avoid identical calls being optimised away, values being spilled, and to have more codegen guarantees, necessary to maintain the security properties). I looked into implementing these as MIR intrinsics prior to revisiting forced inlining, but it looked like it would end up being quite complex. These are also very architecture-specific and it felt strange to be implementing something like that into the compiler rather than being able to implement it normally. I also considered using a macro rather than a function, but having a proper function signature with const generics was just a lot better - it also allow things like
These are all limitations - I think a lot of these can be mitigated by being very careful in what we apply this attribute to. |
I think a 4% compile time hit on real code is much too high for the implementation of a compiler feature that's not even being used. So I'd like to see that number brought down quite a lot. I think there are a number of ways to improve the compile time hit, at a glance most of the overhead comes from the |
7a5ff95
to
08f365c
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
mir_transform: implement `#[rustc_force_inline]` Adds `#[rustc_force_inline]` which is similar to always inlining but reports an error if the inlining was not possible. - `#[rustc_force_inline]` can only be applied to free functions to guarantee that the MIR inliner will be able to resolve calls. - `rustc_mir_transform::inline::Inline` is refactored into two passes (`Inline` and `ForceInline`), sharing the vast majority of the implementation. - `rustc_mir_transform::inline::ForceInline` can't be disabled so annotated items are always inlined. - `rustc_mir_transform::inline::ForceInline` runs regardless of optimisation level. - `#[rustc_force_inline]` won't inline unless target features match, as with normal inlining. - MIR validation will ICE if a `#[rustc_force_inline]` isn't inlined, to guarantee that it will never be codegened independently. As a further guarantee, monomorphisation collection will always decide that `#[rustc_force_inline]` functions cannot be codegened locally. - Like other intrinsics, `#[rustc_force_inline]` annotated functions cannot be cast to function pointers. - As with other rustc attrs, this cannot be used by users, just within the compiler and standard library. - This is only implemented within rustc, so should avoid any limitations of LLVM's inlining. It is intended that this attribute be used with intrinsics that must be inlined for security reasons. For example, pointer authentication intrinsics would allow Rust users to make use of pointer authentication instructions, but if these intrinsic functions were in the binary then they could be used as gadgets with ROP attacks, defeating the point of introducing them. We don't have any intrinsics like this today, but I expect to upstream some once a force inlining mechanism such as this is available. cc rust-lang#131687 rust-lang/rfcs#3711 - this approach should resolve the concerns from these previous attempts r? `@saethlin`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Adds `#[rustc_force_inline]` which is similar to always inlining but reports an error if the inlining was not possible, and which always attempts to inline annotated items, regardless of optimisation levels. It can only be applied to free functions to guarantee that the MIR inliner will be able to resolve calls.
6bef877
to
cc9a9ec
Compare
If you misspell @bors r=saethlin |
☀️ Test successful - checks-actions |
Finished benchmarking commit (b1a7dfb): comparison URL. Overall result: ❌ regressions - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 1.9%, secondary 3.9%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 763.192s -> 764.797s (0.21%) |
@davidtwco looks like some regressions snuck back in since the last run - I'm unsure if something happened in a rebase to reintroduce them. The actual regressions are much smaller than the original perf run, and I don't think they're big enough to demand an investigation so I'll mark this as triaged. @rustbot label: +perf-regression-triaged. |
Adds
#[rustc_force_inline]
which is similar to always inlining but reports an error if the inlining was not possible.#[rustc_force_inline]
can only be applied to free functions to guarantee that the MIR inliner will be able to resolve calls.rustc_mir_transform::inline::Inline
is refactored into two passes (Inline
andForceInline
), sharing the vast majority of the implementation.rustc_mir_transform::inline::ForceInline
can't be disabled so annotated items are always inlined.rustc_mir_transform::inline::ForceInline
runs regardless of optimisation level.#[rustc_force_inline]
won't inline unless target features match, as with normal inlining.#[rustc_force_inline]
isn't inlined, to guarantee that it will never be codegened independently. As a further guarantee, monomorphisation collection will always decide that#[rustc_force_inline]
functions cannot be codegened locally.#[rustc_force_inline]
annotated functions cannot be cast to function pointers.It is intended that this attribute be used with intrinsics that must be inlined for security reasons. For example, pointer authentication intrinsics would allow Rust users to make use of pointer authentication instructions, but if these intrinsic functions were in the binary then they could be used as gadgets with ROP attacks, defeating the point of introducing them. We don't have any intrinsics like this today, but I expect to upstream some once a force inlining mechanism such as this is available.
cc #131687 rust-lang/rfcs#3711 - this approach should resolve the concerns from these previous attempts
r? @saethlin