Rustc fails to inline trivial functions #37538
Comments
This function cannot be inlined cross-crate because it is not emitted in the metadata section. You should mark it Using LTO also works around this issue.
/// Left shift that does not panic when shifting by the integer width.
fn shift_left(x: u8, shift: usize) -> u8 {
    debug_assert!(shift <= 8);
    // Rust panics when shifting by the integer width, so we have to treat
    // that case separately.
    if shift >= 8 { 0 } else { x << shift }
}
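One way to make a small helper like this available for inlining outside the compilation unit that defines it is the standard `#[inline]` attribute. The following is a minimal sketch that reuses the function from the comment above; that `#[inline]` is the exact attribute the commenter had in mind is an assumption, and `pub` is added here only so the function is reachable from other crates.

```rust
/// Left shift that does not panic when shifting by the integer width.
///
/// `#[inline]` asks rustc to make the function body available to other
/// crates (and codegen units) so that callers there can inline it.
/// It is a hint, not a guarantee.
#[inline]
pub fn shift_left(x: u8, shift: usize) -> u8 {
    debug_assert!(shift <= 8);
    // Shifting a u8 by 8 or more overflows, so that case is handled
    // separately and yields 0.
    if shift >= 8 { 0 } else { x << shift }
}
```

Alternatively, link-time optimization can be enabled for release builds with `lto = true` under `[profile.release]` in `Cargo.toml`, which lets LLVM inline across crate boundaries without any attribute.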
This is not about cross-crate inlining. The code that calls the function and the function itself are in the same crate.
Rustc turns it into a cross-crate inlining problem. I agree that this is a bug and that it should be fixed. The unit of compilation is a different crate from the one where shift_left is defined.
Ah, I think I understand. The code that calls Is that correct?
Yep, that's right.
So isn't it the problem that I just ran into this issue because I have a crate whose public API exposes mostly one-line functions, and if I don't mark any of them inline, my users don't get them inlined in their code. I would like an option to automatically make all public items of my crate available for inlining in other crates. Would that be a solution?
@gnzlbg While marking
I see. So probably generic code should be #[inline] by default, and anything it calls should be transitively inline as well?
If I understand correctly (but I am not very familiar with compiler internals), generic code is eligible for inlining, because other crates need to be able to monomorphise the code, and both inlining and monomorphisation rely on the same data. A possible solution for this issue would indeed be to transitively mark things called by public generic functions as inline. That could have unintended side effects though, such as dramatically increasing the size of rlibs.
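The situation described above can be sketched as a public generic function that calls a private, non-generic helper. The generic function is instantiated in whichever crate uses it, while the helper is compiled once in the defining crate unless it is made available for inlining. The names `low_bits` and `sum_low_bits` are hypothetical and exist only for illustration; marking the helper `#[inline]` by hand is the manual version of the transitive marking proposed in the comment.

```rust
/// Private, non-generic helper (hypothetical). Without `#[inline]` its body
/// may only exist as compiled code in the defining crate, so a downstream
/// instantiation of `sum_low_bits` would have to call it out of line.
#[inline]
fn low_bits(x: u8, n: u32) -> u8 {
    // Keep the lowest `n` bits; `n >= 8` keeps the whole byte and avoids
    // an overflowing shift.
    if n >= 8 { x } else { x & ((1u8 << n) - 1) }
}

/// Public generic function (hypothetical). Because it is generic, it is
/// monomorphised in the crate that calls it.
pub fn sum_low_bits<I>(bytes: I, n: u32) -> u32
where
    I: IntoIterator<Item = u8>,
{
    bytes.into_iter().map(|b| u32::from(low_bits(b, n))).sum()
}
```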
Triage: no changes
@saethlin Has this been addressed?
To be clear, I agree with this statement. I don't think it's relevant though, given that there is no reference C++ code. I suspect that a C++ compiler would have also failed to inline the equivalent function, given that Rust and C++ have basically the same compilation model. Putting these little functions in a header or making them The specific failed inlinings reported here were indeed addressed by #116505; these functions are inlined away by default but if you set
@gnzlbg Fixing this is the objective of the PR I linked above. Functions are made cross-crate-inlinable based on heuristics, and one of them is that the optimized MIR must not contain any calls. We do have a MIR inliner so some functions with calls get made inlinable anyway, but sometimes the MIR inliner falls over.
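To illustrate the "no calls in the optimized MIR" heuristic mentioned above, here is a hedged sketch contrasting a leaf function with one that still contains a call. All names are hypothetical, and which functions rustc actually treats as cross-crate-inlinable depends on the compiler version and optimization settings; the comments describe the heuristic as stated in the comment, not verified compiler output.

```rust
/// Leaf function: the body consists only of primitive comparisons and
/// contains no calls, so under the heuristic described above it is a
/// candidate for being made cross-crate-inlinable automatically.
pub fn is_digit(b: u8) -> bool {
    b >= b'0' && b <= b'9'
}

/// Deliberately kept out of line so that the caller below retains a call.
#[inline(never)]
fn slow_path(_b: u8) -> u8 {
    0
}

/// This body keeps a call to `slow_path`, so under the same heuristic it
/// would not qualify automatically. An explicit `#[inline]` opts in anyway.
#[inline]
pub fn digit_value(b: u8) -> u8 {
    if is_digit(b) { b - b'0' } else { slow_path(b) }
}
```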
FWIW we have
I read in multiple places that rustc generating worse code than a C++ compiler would do for an equivalent C++ program is a bug. So here we go:
Summary
Rustc fails to inline trivial functions that compile down to just a few instructions, to the point where calling convention overhead is much worse than the actual function itself.
Steps to reproduce
I tried to write a minimal example at play.rust-lang.org, but everything short that I can come up with does not suffer from this issue. So instead I am going to link the project that caused me to discover this issue:
Actual and expected behavior
I’ll outline some of the disassembly below:
Note that this is not dead code, there are calls to these functions in very hot loops:
I would expect that functions like these would be inlined automatically, but they were not. Note that all of this code is in the same crate.
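For reference, a small sketch of a tiny function called from a hot loop and annotated to request inlining regardless of the compiler's own cost estimate. The functions `zigzag_decode` and `decode_all` are hypothetical stand-ins and are not taken from the linked project. `#[inline(always)]` is a strong hint rather than a guarantee.

```rust
/// Hypothetical tiny helper: maps 0, 1, 2, 3, ... to 0, -1, 1, -2, ...
/// It compiles to a handful of instructions, so the overhead of an
/// out-of-line call could outweigh the work it does.
#[inline(always)]
fn zigzag_decode(v: u32) -> i32 {
    ((v >> 1) as i32) ^ -((v & 1) as i32)
}

/// Hypothetical hot loop that calls the helper once per element.
pub fn decode_all(input: &[u32]) -> Vec<i32> {
    input.iter().map(|&v| zigzag_decode(v)).collect()
}
```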
I encountered about a dozen of these during profiling, where very small functions like the ones above were showing up as hotspots. I’ve been able to speed up my program by as much as 30% just by placing a few #[inline(always)] attributes.
There are also simple getters like Block::len which are not inlined, but these are called from the example program which is a different crate, so that is working as intended I think.
Meta