Tracking Issue for `optimize_for_size` standard library feature #125612
Comments
Small aside, but do you think that code using Obviously, to get the best benefits, you're going to want optimizations anyway, but I'm thinking that we probably do want to ensure that code actually isn't compiled in all cases using conditional compilation, rather than just conditions. This also might help building on memory-constrained environments since the code won't have to be kept in memory before it's removed. I also think that longer-term, there should be an easier flag to enable this feature that also takes into account optimization options (making sure loops aren't unrolled, etc.) but at least for now, while it's just a Cargo feature, it makes sense to at least figure out what cases we should optimize for. |
I would be extremely surprised if any cfg-ed out code survived even in debug mode without optimizations. Even something like this:

    fn main() {
        #[cfg(feature = "foo")]
        println!("hellorust");
    }

Doesn't contain the |
in practice we've been using the More generally, the point of this flag is really I say that in part because using proper conditional compilation would make the code harder to maintain, to the point where we may require completely separate functions for when the flag is enabled or not. We've been trying to prevent completely separate implementations from happening so far: that won't always work but when we can I think the result is better. |
Right, note that this is different from:

    fn main() {
        if cfg!(feature = "foo") {
            println!("hellorust");
        }
    }

since
I agree here, mostly just wanted to ask because options do exist (like the currently unstable Basically agree with the current implementation, but think it's important to properly clarify so folks are aware of the best way to implement things. |
Ah, sorry, I didn't realize that with |
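To make the distinction discussed above concrete, here is a minimal sketch of the two forms side by side. The feature name `small` is a hypothetical crate-local Cargo feature used only for illustration; it is not the standard library's `optimize_for_size` feature, and both function names are invented.

```rust
/// Attribute form: the `let` that does not match the active configuration
/// is stripped before type checking, so it never reaches code generation.
/// (`small` is a hypothetical feature name.)
pub fn strategy_attr() -> &'static str {
    #[cfg(feature = "small")]
    let name = "compact";
    #[cfg(not(feature = "small"))]
    let name = "fast";
    name
}

/// Macro form: `cfg!` expands to a literal `true` or `false`, so both
/// branches must type-check and the untaken one is left for the compiler
/// to eliminate as dead code.
pub fn strategy_macro() -> &'static str {
    if cfg!(feature = "small") {
        "compact"
    } else {
        "fast"
    }
}
```

The macro form keeps a single function body, which is the maintainability argument made earlier in the thread; the attribute form guarantees the unused code is never compiled at all.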
Hi! I tested this option on my embedded project using embassy on an STM32F401 and my code size went up from 335kB to 360kB. I can do some debugging/investigation into why it happens if given some instructions on what to test for. |
Did it go up vs just using |
Ah, I think I found the issue, adding
without:
Not a lot but it is something. Thanks! |
Hmm, that's weird, on CI we enable it with just I think that $ cargo +nightly build -Zbuild-std -Zbuild-std-features="optimize_for_size,panic_immediate_abort" --target x86_64-unknown-linux-gnu |
With what's currently merged, the max you can get is ~1.5kb savings on opt-level z. |
Oh, right, it works both ways; it doesn't matter whether it starts with std/ or not. I assumed from the first posts that it needs to start with std/. Thanks! |
Sorry, that was an error on my part, fixed the code example. Thanks! |
This definitely feels like something that could use a proper guide mentioning these sorts of caveats-- the unstable book would probably be a good place to put it for now. That way, we could update it as more cases like these pop up, and it would also help with the bit I mentioned about being able to have a general flag for "use size-optimised libstd" that makes sure the correct optimisation settings, etc. are applied to make it work. (Having a full list of the things people have to do to make it work, means we can take that into account when making a flag that "just works" for this case.) Perhaps in the distant future, we could even see official builds of "size-optimised" libstd for certain architectures on rustup! And we can also link that in the original comment so that folks can see the up-to-date version. |
This has been added to |
…ate, r=Amanieu

make `ptr::rotate` smaller when using `optimize_for_size`

code to reproduce https://github.com/folkertdev/optimize_for_size-slice-rotate

In the example the size of `.text` goes down from 1624 to 276 bytes.

```
> cargo size --release --features "left-std" -- -A
slice-rotate  :
section            size         addr
.vector_table      1024          0x0
.text              1624        0x400
.rodata               0        0xa58
.data                 0   0x20000000
.gnu.sgstubs          0        0xa60
.bss                  0   0x20000000
.uninit               0   0x20000000
.debug_loc          591          0x0
.debug_abbrev      1452          0x0
.debug_info       10634          0x0
.debug_aranges      480          0x0
.debug_ranges      1504          0x0
.debug_str        11716          0x0
.comment             72          0x0
.ARM.attributes      56          0x0
.debug_frame       1036          0x0
.debug_line        5837          0x0
Total             36026

> cargo size --release --features "left-size" -- -A
slice-rotate  :
section            size         addr
.vector_table      1024          0x0
.text               276        0x400
.rodata               0        0x514
.data                 0   0x20000000
.gnu.sgstubs          0        0x520
.bss                  0   0x20000000
.uninit               0   0x20000000
.debug_loc          347          0x0
.debug_abbrev       965          0x0
.debug_info        4216          0x0
.debug_aranges      168          0x0
.debug_ranges       216          0x0
.debug_str         3615          0x0
.comment             72          0x0
.ARM.attributes      56          0x0
.debug_frame        232          0x0
.debug_line         723          0x0
Total             11910
```

tracking issue: rust-lang#125612
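As a rough illustration of the kind of trade-off this commit describes, the sketch below swaps in a compact rotation built from three in-place reversals when a size feature is enabled. This is not the code from the PR, which is not shown in this thread; the three-reversal technique, the function names, and the crate-local feature `optimize_for_size` are assumptions made for the example.

```rust
/// Compact rotation: reverse both halves, then reverse the whole slice.
/// It makes more element moves than a tuned rotate but is very small.
pub fn rotate_left_small<T>(v: &mut [T], mid: usize) {
    assert!(mid <= v.len());
    v[..mid].reverse();
    v[mid..].reverse();
    v.reverse();
}

/// Dispatch on a hypothetical crate-local `optimize_for_size` feature,
/// keeping one public entry point for both variants.
pub fn rotate_left_sketch<T>(v: &mut [T], mid: usize) {
    if cfg!(feature = "optimize_for_size") {
        rotate_left_small(v, mid);
    } else {
        // Default path: defer to the standard library's rotate.
        v.rotate_left(mid);
    }
}
```

Both paths produce the same result for any `mid <= v.len()`; only the generated code size and the number of element moves differ.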
Rollup merge of rust-lang#126271 - diondokter:dec2flt-skip-fast-path, r=tgross35

Skip fast path for dec2flt when optimize_for_size

Tracking issue: rust-lang#125612

Skip the fast algorithm when optimizing for size. When compiling for https://github.com/quartiq/stabilizer I get these numbers:

Before
```
   text    data     bss     dec     hex filename
 192192       8   49424  241624   3afd8 dual-iir
```
After
```
   text    data     bss     dec     hex filename
 191632       8   49424  241064   3ada8 dual-iir
```
This saves 560 bytes.
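The idea of skipping a fast path can be sketched as follows. This is an illustrative stand-in, not the standard library's `dec2flt` code: the function names, the crate-local feature `optimize_for_size`, and the use of `str::parse` as the "general" algorithm are all assumptions. The fast path shown is the classic one where the mantissa fits in 53 bits and the power of ten is exactly representable, so a single multiplication or division is correctly rounded.

```rust
/// Powers of ten that are exactly representable as f64: 10^0 ..= 10^22.
const POW10: [f64; 23] = [
    1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11,
    1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22,
];

/// Fast path: applies only when both operands are exact in f64.
pub fn fast_path(mantissa: u64, exp10: i32) -> Option<f64> {
    if mantissa > (1u64 << 53) || exp10.unsigned_abs() > 22 {
        return None;
    }
    let m = mantissa as f64;
    let p = POW10[exp10.unsigned_abs() as usize];
    Some(if exp10 < 0 { m / p } else { m * p })
}

/// Stand-in for the general (slower, always applicable) algorithm.
pub fn slow_path(mantissa: u64, exp10: i32) -> f64 {
    format!("{mantissa}e{exp10}").parse().unwrap()
}

/// With the hypothetical feature enabled, the fast path is never
/// referenced, so its code and table can be dropped from the binary.
pub fn to_f64(mantissa: u64, exp10: i32) -> f64 {
    if !cfg!(feature = "optimize_for_size") {
        if let Some(x) = fast_path(mantissa, exp10) {
            return x;
        }
    }
    slow_path(mantissa, exp10)
}
```

Because the general algorithm handles every input the fast path does, removing the fast path changes speed and size but not results.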
…sort-impls, r=cuviper

Add `optimize_for_size` variants for stable and unstable sort as well as select_nth_unstable

- Stable sort uses a simple merge-sort that re-uses the existing - rather gnarly - merge function.
- Unstable sort jumps directly to the branchless heapsort fallback.
- select_nth_unstable jumps directly to the median_of_medians fallback, which is augmented with a custom tiny smallsort and partition impl.

Some code is duplicated, but de-duplication would bring its own problems. For example, `swap_if_less` is critical for performance: if the sorting networks don't inline it, perf drops drastically. However, `#[inline(always)]` is also a poor fit; if the provided comparison function is huge, it gives the compiler an out to only instantiate `swap_if_less` once and call it. Another aspect that would suffer when making `swap_if_less` pub is having to cfg out dozens of functions in the smallsort module.

Part of rust-lang#125612

r? `@Kobzol`
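The "jump directly to the heapsort fallback" idea for unstable sort might look roughly like the sketch below. It is a plain textbook heapsort, not the branchless implementation the commit message refers to, and the function names and the crate-local feature `optimize_for_size` are hypothetical.

```rust
/// Restore the max-heap property for the subtree rooted at `node`,
/// considering only the first `end` elements of `v`.
fn sift_down<T: Ord>(v: &mut [T], mut node: usize, end: usize) {
    loop {
        let mut child = 2 * node + 1;
        if child >= end {
            break;
        }
        // Pick the larger of the two children.
        if child + 1 < end && v[child] < v[child + 1] {
            child += 1;
        }
        if v[node] >= v[child] {
            break;
        }
        v.swap(node, child);
        node = child;
    }
}

/// In-place heapsort: O(n log n) worst case with a very small code footprint.
pub fn heapsort<T: Ord>(v: &mut [T]) {
    let len = v.len();
    // Build the max-heap bottom-up.
    for i in (0..len / 2).rev() {
        sift_down(v, i, len);
    }
    // Repeatedly move the maximum to the end and shrink the heap.
    for end in (1..len).rev() {
        v.swap(0, end);
        sift_down(v, 0, end);
    }
}

/// Single entry point that selects the small variant under the
/// hypothetical feature and the standard library sort otherwise.
pub fn sort_unstable_sketch<T: Ord>(v: &mut [T]) {
    if cfg!(feature = "optimize_for_size") {
        heapsort(v);
    } else {
        v.sort_unstable();
    }
}
```

Heapsort is a natural choice here because it needs no auxiliary buffer and no specialised small-slice routines, which is where much of the code size of a tuned sort tends to come from.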
#125011 has added an `optimize_for_size` feature to the standard library, which is designed for special casing some algorithms in `core`/`alloc`/`std` to provide an implementation that tries to minimize its effect on binary size.

To use this feature, the standard library has to be compiled using `-Zbuild-std`:

    $ cargo +nightly build -Z build-std -Z build-std-features="optimize_for_size" ...

Optimizations that leverage this flag:
- … `optimize_for_size` #125609
- make `ptr::rotate` smaller when using `optimize_for_size` #125720
- Add `optimize_for_size` variants for stable and unstable sort as well as select_nth_unstable #129587