Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add #[inline] to short public functions #243

Merged
merged 5 commits into from
Nov 20, 2024
Merged

Conversation

schungx
Copy link
Contributor

@schungx schungx commented Nov 19, 2024

This should help those users who don't build with LTO turned on.

@schungx schungx changed the title Add #[inline] to public functions Add #[inline] to short public functions Nov 19, 2024
fn gen_random_64() -> Vec<u8> {
let buf: &mut [u8] = &mut [0; 64];
getrandom::getrandom(buf);
buf.to_vec()
}

// Create a buffer of an integer encoded as a uint32le
#[inline]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure all of these are more suitable to be inline? Did we check what impact this has on binary size?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair I haven't done deep research on using inline. Most of my reading is from this article https://matklad.github.io/2021/07/09/inline-in-rust.html and I'm just wondering if this incurs more compile times for all users even if they would have preferred faster compiles with possibly slightly slower code.

Copy link
Contributor Author

@schungx schungx Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting #[inline] on functions enable them to be optimized across crates. This is a library crate, so if you don't have that and LTO is not on, the Rust compiler will never inline even very simple function such as getters.

Also, inlining usually enables more optimizations in the call site. For example, the following code:

pub fn flip(x: bool) -> bool {
    if x { false } else { true }
}

// use it in another crate:
if flip(true) { ... }

If you don't compile with LTO, everything stays, including the call to flip because the Rust compiler will not inline across crate boundaries even in release builds. If you put #[inline] on fn flip, the entire code is optimized away to zero.

That's why if you look at Rust standard library code, it is littered with #[inline] all over the place. In fact, it is better to err on the side of having too many #[inline] because the compiler knows when to stop inlining if it thinks it is not worth it. The default of inlining of any function for LLVM is "always inline".

On to your question: inlining reduces binary size for short functions, because you lose all those code for setting up the function call and the tear-down, as well as the code overhead of setting up a call frame in the function. It does not affect compilation time in debug builds because Rust does not inline in debug. With release builds, my experience is that the added compilation time is immaterial compared to all the other optimizations Rust is doing. And I suppose if a user builds for release, he/she would want execution speed, not compilation speed.

Finally, these attributes is one of those things that are very easy to add when you first create the API, but extremely tedious and time-consuming to add back later on, so it pays to put them in on day 1.

Copy link

@c-git c-git Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a cost in terms of compile time by doing that optimization. All I was saying is that by doing this we remove that choice from the end user. Whereas with LTO they can opt in if they want to. If you say it doesn't affect debug then I guess that should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK Rust doesn't inline at all in debug mode. That's why performance is way slower than release, because all those as_ref and as_mut etc are actually function calls where with inlining they turn to the original reference.

Copy link
Contributor Author

@schungx schungx Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fourth, when building libraries, proactively add #[inline] to small non-generic functions. Pay special attention to impls: Deref, AsRef and the like often benefit from inlining. A library can’t anticipate all usages upfront, it makes sense to not prematurely pessimize future users. Note that #[inline] is not transitive: if a trivial public function calls a trivial private function, you need to #[inline] both.

This section is actually my exact point. A library should proactively #[inline] wherever it makes sense.

@MathNya
Copy link
Owner

MathNya commented Nov 20, 2024

@schungx
We will merge this information as well.
I did not know the details about #[inline] either.
I have learned about the above comments as well.

@MathNya MathNya merged commit 9185815 into MathNya:master Nov 20, 2024
1 check passed
@schungx
Copy link
Contributor Author

schungx commented Dec 4, 2024

Just a follow up with observations.

I find that once using the new version with #[inline], linking time is reduced to less than 1/3 of the time it took before.

I suspect that since Rust is now doing the inlining at HIR stage, it avoids having to do it with LTO. Looks like LTO takes a huge amount of time if there are many functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants