-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build rustc
with a single CGU on x64 Linux
#115554
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 1bffa88 with merge 2d8544092e7b47541698b5a6425f9f1aa131b994... |
Why only on the Linux builder? This should be done on every builder. |
Sure, but we don't have to do that in a single step. We can easily gauge the CI build time and compile performance effects of this on Linux, thanks to try builds and perf.RLO, but it's not that easy for other targets. It's possible that on Windows/macOS this might be a hit to CI times, so we should first investigate that (along with checking how it affects their performance, although that should hopefully be an improvement). It's also possible that this might cause some issues for nightly users, so I would like to test it for a few weeks on nightly on Linux first, before we enable it globally across the board. |
Are we expecting LVM17 to be different from our previous measurements of CI times and performance? |
Well, some things might have changed in the meantime. Apple CI is now much faster (so that's a good thing). And performance has changed a lot on Linux (but again, towards improvements, so hopefully this will be true on other platforms too). It looks like things are improving in general, but I'd still like to do this on Linux first to see if there are no unexpected issues with 1 CGU. |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
I'm only talking about the comment about other builders, not this PR or the landing strategy: do we need to re-run the 1 CGU PRs on the other builders to gather up-to-date timings on LLVM17 and the new apple builder? |
I think that it would be good, yes. It also depends how we want to enable it for other builders. We can allowlist it for Windows and macOS and test these two separately, or just enable it across the board for everything and then take a look at the total CI time. |
Finished benchmarking commit (2d8544092e7b47541698b5a6425f9f1aa131b994): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 627.535s -> 623.723s (-0.61%) |
@bors r+ rollup=never |
Build `rustc` with a single CGU on x64 Linux This PR adds the `rust.codegen-units=1` setting when compiling the 64-bit Linux `rustc` artifact (the one used for try builds and Linux rustup distribution). This had mixed results in the past, however after the bump to LLVM 17, the results now seem pretty [incredible](rust-lang#115554 (comment)). Instruction counts, cycles, wall time, max RSS and even artifact sizes see large improvements. The last [try build](https://github.com/rust-lang-ci/rust/actions/runs/6077686494/job/16487768049) with this setting took 1h 8m, which is basically the same duration for try builds that we have seen recently. So there shouldn't be any large hit to CI/build time. I hope that this could potentially also reduce codegen noise of `rustc` a little bit, since small changes within a single `rustc` crate should no longer perturn optimizations because of CGU movement. We still do cross-crate LTO, so it won't eliminate it though. r? `@Mark-Simulacrum`
This affects cargo too https://perf.rust-lang.org/compare.html?start=626a6ab93fafd01b37b1e26c96cb6eec0d39f3eb&end=2d8544092e7b47541698b5a6425f9f1aa131b994&stat=instructions%3Au&tab=artifact-size, so not only rustc? Probably other toolchain components, but they didn't measured here. So this also set Line 460 in 7418413
|
But at the same time, libstd.so size didn't changed at all, strange. |
This comment has been minimized.
This comment has been minimized.
☀️ Test successful - checks-actions |
Finished benchmarking commit (871407a): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 630.664s -> 627.128s (-0.56%) |
Cycle count, Max RSS and artifact size is overwhelmingly positive. @rustbot label: +perf-regression-triaged |
Summary of perf results:
Fantastic results! The bootstrap time also went down a tiny bit (-0.56%) but that is misleading. The bootstrap benchmark doesn't use the same configuration as the shipped compiler: no |
23: Fix divergence from upstream `master` r=tshepang a=pvdrz * rust-lang/rust#116381 * rust-lang/rust#116360 * rust-lang/rust#116353 * rust-lang/rust#116406 * rust-lang/rust#116408 * rust-lang/rust#116395 * rust-lang/rust#116393 * rust-lang/rust#116388 * rust-lang/rust#116365 * rust-lang/rust#116363 * rust-lang/rust#116146 * rust-lang/rust#115961 * rust-lang/rust#116386 * rust-lang/rust#116367 * rust-lang/rust#105394 * rust-lang/rust#115301 * rust-lang/rust#116384 * rust-lang/rust#116379 * rust-lang/rust#116328 * rust-lang/rust#116282 * rust-lang/rust#116261 * rust-lang/rust#114654 * rust-lang/rust#116376 * rust-lang/rust#116374 * rust-lang/rust#116371 * rust-lang/rust#116358 * rust-lang/rust#116210 * rust-lang/rust#115863 * rust-lang/rust#115025 * rust-lang/rust#116372 * rust-lang/rust#116361 * rust-lang/rust#116355 * rust-lang/rust#116351 * rust-lang/rust#116158 * rust-lang/rust#115726 * rust-lang/rust#113053 * rust-lang/rust#116083 * rust-lang/rust#102099 * rust-lang/rust#116356 * rust-lang/rust#116350 * rust-lang/rust#116349 * rust-lang/rust#116289 * rust-lang/rust#114454 * rust-lang/rust#114453 * rust-lang/rust#116331 * rust-lang/rust#116346 * rust-lang/rust#116340 * rust-lang/rust#116326 * rust-lang/rust#116313 * rust-lang/rust#116276 * rust-lang/rust#115898 * rust-lang/rust#116325 * rust-lang/rust#116317 * rust-lang/rust#116207 * rust-lang/rust#116281 * rust-lang/rust#116304 * rust-lang/rust#116259 * rust-lang/rust#116228 * rust-lang/rust#116224 * rust-lang/rust#115554 * rust-lang/rust#116311 * rust-lang/rust#116299 * rust-lang/rust#116295 * rust-lang/rust#116292 * rust-lang/rust#116307 * rust-lang/rust#115670 * rust-lang/rust#116225 * rust-lang/rust#116302 * rust-lang/rust#116108 * rust-lang/rust#116160 * rust-lang/rust#116157 * rust-lang/rust#116127 * rust-lang/rust#116286 * rust-lang/rust#116254 * rust-lang/rust#116195 * rust-lang/rust#116280 * rust-lang/rust#115933 * rust-lang/rust#115546 * rust-lang/rust#115368 * rust-lang/rust#116275 * rust-lang/rust#116263 * rust-lang/rust#116241 * rust-lang/rust#116216 * rust-lang/rust#116030 * rust-lang/rust#116024 * rust-lang/rust#112123 * rust-lang/rust#113301 * rust-lang/rust#113797 * rust-lang/rust#115759 * rust-lang/rust#116260 * rust-lang/rust#116253 * rust-lang/rust#116245 * rust-lang/rust#116239 * rust-lang/rust#116234 * rust-lang/rust#116231 * rust-lang/rust#116201 * rust-lang/rust#116133 * rust-lang/rust#116176 * rust-lang/rust#116089 * rust-lang/rust#115986 Co-authored-by: ouz-a <[email protected]> Co-authored-by: Jakub Beránek <[email protected]> Co-authored-by: Federico Stra <[email protected]> Co-authored-by: bohan <[email protected]> Co-authored-by: Jason Newcomb <[email protected]> Co-authored-by: Ralf Jung <[email protected]> Co-authored-by: bors <[email protected]> Co-authored-by: Oli Scherer <[email protected]>
Build `rustc` with a single CGU on x64 Linux This PR adds the `rust.codegen-units=1` setting when compiling the 64-bit Linux `rustc` artifact (the one used for try builds and Linux rustup distribution). This had mixed results in the past, however after the bump to LLVM 17, the results now seem pretty [incredible](rust-lang/rust#115554 (comment)). Instruction counts, cycles, wall time, max RSS and even artifact sizes see large improvements. The last [try build](https://github.com/rust-lang-ci/rust/actions/runs/6077686494/job/16487768049) with this setting took 1h 8m, which is basically the same duration for try builds that we have seen recently. So there shouldn't be any large hit to CI/build time. I hope that this could potentially also reduce codegen noise of `rustc` a little bit, since small changes within a single `rustc` crate should no longer perturb optimizations because of CGU movement. We still do cross-crate LTO, so it won't eliminate it though. r? `@Mark-Simulacrum`
23: Fix divergence from upstream `master` r=tshepang a=pvdrz * rust-lang/rust#116483 * rust-lang/rust#116475 * rust-lang/rust#116329 * rust-lang/rust#116198 * rust-lang/rust#115588 * rust-lang/rust#115522 * rust-lang/rust#115454 * rust-lang/rust#111595 * rust-lang/rust#116018 * rust-lang/rust#116472 * rust-lang/rust#116469 * rust-lang/rust#116421 * rust-lang/rust#116463 * rust-lang/rust#101150 * rust-lang/rust#116269 * rust-lang/rust#116417 * rust-lang/rust#116455 * rust-lang/rust#116452 * rust-lang/rust#116428 * rust-lang/rust#116415 * rust-lang/rust#116288 * rust-lang/rust#116220 * rust-lang/rust#103046 * rust-lang/rust#114042 * rust-lang/rust#104153 * rust-lang/rust#116427 * rust-lang/rust#116443 * rust-lang/rust#116432 * rust-lang/rust#116431 * rust-lang/rust#116429 * rust-lang/rust#116296 * rust-lang/rust#116223 * rust-lang/rust#116273 * rust-lang/rust#116184 * rust-lang/rust#116370 * rust-lang/rust#114417 * rust-lang/rust#115200 * rust-lang/rust#116413 * rust-lang/rust#116381 * rust-lang/rust#116360 * rust-lang/rust#116353 * rust-lang/rust#116406 * rust-lang/rust#116408 * rust-lang/rust#116395 * rust-lang/rust#116393 * rust-lang/rust#116388 * rust-lang/rust#116365 * rust-lang/rust#116363 * rust-lang/rust#116146 * rust-lang/rust#115961 * rust-lang/rust#116386 * rust-lang/rust#116367 * rust-lang/rust#105394 * rust-lang/rust#115301 * rust-lang/rust#116384 * rust-lang/rust#116379 * rust-lang/rust#116328 * rust-lang/rust#116282 * rust-lang/rust#116261 * rust-lang/rust#114654 * rust-lang/rust#116376 * rust-lang/rust#116374 * rust-lang/rust#116371 * rust-lang/rust#116358 * rust-lang/rust#116210 * rust-lang/rust#115863 * rust-lang/rust#115025 * rust-lang/rust#116372 * rust-lang/rust#116361 * rust-lang/rust#116355 * rust-lang/rust#116351 * rust-lang/rust#116158 * rust-lang/rust#115726 * rust-lang/rust#113053 * rust-lang/rust#116083 * rust-lang/rust#102099 * rust-lang/rust#116356 * rust-lang/rust#116350 * rust-lang/rust#116349 * rust-lang/rust#116289 * rust-lang/rust#114454 * rust-lang/rust#114453 * rust-lang/rust#116331 * rust-lang/rust#116346 * rust-lang/rust#116340 * rust-lang/rust#116326 * rust-lang/rust#116313 * rust-lang/rust#116276 * rust-lang/rust#115898 * rust-lang/rust#116325 * rust-lang/rust#116317 * rust-lang/rust#116207 * rust-lang/rust#116281 * rust-lang/rust#116304 * rust-lang/rust#116259 * rust-lang/rust#116228 * rust-lang/rust#116224 * rust-lang/rust#115554 * rust-lang/rust#116311 * rust-lang/rust#116299 * rust-lang/rust#116295 * rust-lang/rust#116292 * rust-lang/rust#116307 * rust-lang/rust#115670 * rust-lang/rust#116225 * rust-lang/rust#116302 * rust-lang/rust#116108 * rust-lang/rust#116160 * rust-lang/rust#116157 * rust-lang/rust#116127 * rust-lang/rust#116286 * rust-lang/rust#116254 * rust-lang/rust#116195 * rust-lang/rust#116280 * rust-lang/rust#115933 * rust-lang/rust#115546 * rust-lang/rust#115368 * rust-lang/rust#116275 * rust-lang/rust#116263 * rust-lang/rust#116241 * rust-lang/rust#116216 * rust-lang/rust#116030 * rust-lang/rust#116024 * rust-lang/rust#112123 * rust-lang/rust#113301 * rust-lang/rust#113797 * rust-lang/rust#115759 * rust-lang/rust#116260 * rust-lang/rust#116253 * rust-lang/rust#116245 * rust-lang/rust#116239 * rust-lang/rust#116234 * rust-lang/rust#116231 * rust-lang/rust#116201 * rust-lang/rust#116133 * rust-lang/rust#116176 * rust-lang/rust#116089 * rust-lang/rust#115986 Co-authored-by: ouz-a <[email protected]> Co-authored-by: bors <[email protected]> Co-authored-by: Samuel Thibault <[email protected]> Co-authored-by: linkmauve <[email protected]> Co-authored-by: onur-ozkan <[email protected]> Co-authored-by: asquared31415 <[email protected]> Co-authored-by: Emmanuel Ferdman <[email protected]> Co-authored-by: Ralf Jung <[email protected]> Co-authored-by: Nadrieril <[email protected]> Co-authored-by: Raekye <[email protected]> Co-authored-by: Mark Rousskov <[email protected]> Co-authored-by: Zalathar <[email protected]>
23: Fix divergence from upstream `master` r=pietroalbini a=pvdrz * rust-lang/rust#116483 * rust-lang/rust#116475 * rust-lang/rust#116329 * rust-lang/rust#116198 * rust-lang/rust#115588 * rust-lang/rust#115522 * rust-lang/rust#115454 * rust-lang/rust#111595 * rust-lang/rust#116018 * rust-lang/rust#116472 * rust-lang/rust#116469 * rust-lang/rust#116421 * rust-lang/rust#116463 * rust-lang/rust#101150 * rust-lang/rust#116269 * rust-lang/rust#116417 * rust-lang/rust#116455 * rust-lang/rust#116452 * rust-lang/rust#116428 * rust-lang/rust#116415 * rust-lang/rust#116288 * rust-lang/rust#116220 * rust-lang/rust#103046 * rust-lang/rust#114042 * rust-lang/rust#104153 * rust-lang/rust#116427 * rust-lang/rust#116443 * rust-lang/rust#116432 * rust-lang/rust#116431 * rust-lang/rust#116429 * rust-lang/rust#116296 * rust-lang/rust#116223 * rust-lang/rust#116273 * rust-lang/rust#116184 * rust-lang/rust#116370 * rust-lang/rust#114417 * rust-lang/rust#115200 * rust-lang/rust#116413 * rust-lang/rust#116381 * rust-lang/rust#116360 * rust-lang/rust#116353 * rust-lang/rust#116406 * rust-lang/rust#116408 * rust-lang/rust#116395 * rust-lang/rust#116393 * rust-lang/rust#116388 * rust-lang/rust#116365 * rust-lang/rust#116363 * rust-lang/rust#116146 * rust-lang/rust#115961 * rust-lang/rust#116386 * rust-lang/rust#116367 * rust-lang/rust#105394 * rust-lang/rust#115301 * rust-lang/rust#116384 * rust-lang/rust#116379 * rust-lang/rust#116328 * rust-lang/rust#116282 * rust-lang/rust#116261 * rust-lang/rust#114654 * rust-lang/rust#116376 * rust-lang/rust#116374 * rust-lang/rust#116371 * rust-lang/rust#116358 * rust-lang/rust#116210 * rust-lang/rust#115863 * rust-lang/rust#115025 * rust-lang/rust#116372 * rust-lang/rust#116361 * rust-lang/rust#116355 * rust-lang/rust#116351 * rust-lang/rust#116158 * rust-lang/rust#115726 * rust-lang/rust#113053 * rust-lang/rust#116083 * rust-lang/rust#102099 * rust-lang/rust#116356 * rust-lang/rust#116350 * rust-lang/rust#116349 * rust-lang/rust#116289 * rust-lang/rust#114454 * rust-lang/rust#114453 * rust-lang/rust#116331 * rust-lang/rust#116346 * rust-lang/rust#116340 * rust-lang/rust#116326 * rust-lang/rust#116313 * rust-lang/rust#116276 * rust-lang/rust#115898 * rust-lang/rust#116325 * rust-lang/rust#116317 * rust-lang/rust#116207 * rust-lang/rust#116281 * rust-lang/rust#116304 * rust-lang/rust#116259 * rust-lang/rust#116228 * rust-lang/rust#116224 * rust-lang/rust#115554 * rust-lang/rust#116311 * rust-lang/rust#116299 * rust-lang/rust#116295 * rust-lang/rust#116292 * rust-lang/rust#116307 * rust-lang/rust#115670 * rust-lang/rust#116225 * rust-lang/rust#116302 * rust-lang/rust#116108 * rust-lang/rust#116160 * rust-lang/rust#116157 * rust-lang/rust#116127 * rust-lang/rust#116286 * rust-lang/rust#116254 * rust-lang/rust#116195 * rust-lang/rust#116280 * rust-lang/rust#115933 * rust-lang/rust#115546 * rust-lang/rust#115368 * rust-lang/rust#116275 * rust-lang/rust#116263 * rust-lang/rust#116241 * rust-lang/rust#116216 * rust-lang/rust#116030 * rust-lang/rust#116024 * rust-lang/rust#112123 * rust-lang/rust#113301 * rust-lang/rust#113797 * rust-lang/rust#115759 * rust-lang/rust#116260 * rust-lang/rust#116253 * rust-lang/rust#116245 * rust-lang/rust#116239 * rust-lang/rust#116234 * rust-lang/rust#116231 * rust-lang/rust#116201 * rust-lang/rust#116133 * rust-lang/rust#116176 * rust-lang/rust#116089 * rust-lang/rust#115986 Co-authored-by: bors <[email protected]> Co-authored-by: ouz-a <[email protected]> Co-authored-by: Samuel Thibault <[email protected]> Co-authored-by: linkmauve <[email protected]> Co-authored-by: onur-ozkan <[email protected]> Co-authored-by: asquared31415 <[email protected]> Co-authored-by: Emmanuel Ferdman <[email protected]> Co-authored-by: Ralf Jung <[email protected]> Co-authored-by: Nadrieril <[email protected]> Co-authored-by: Raekye <[email protected]> Co-authored-by: Mark Rousskov <[email protected]> Co-authored-by: Zalathar <[email protected]>
23: Fix divergence from upstream `master` r=Dajamante a=pvdrz * rust-lang/rust#116483 * rust-lang/rust#116475 * rust-lang/rust#116329 * rust-lang/rust#116198 * rust-lang/rust#115588 * rust-lang/rust#115522 * rust-lang/rust#115454 * rust-lang/rust#111595 * rust-lang/rust#116018 * rust-lang/rust#116472 * rust-lang/rust#116469 * rust-lang/rust#116421 * rust-lang/rust#116463 * rust-lang/rust#101150 * rust-lang/rust#116269 * rust-lang/rust#116417 * rust-lang/rust#116455 * rust-lang/rust#116452 * rust-lang/rust#116428 * rust-lang/rust#116415 * rust-lang/rust#116288 * rust-lang/rust#116220 * rust-lang/rust#103046 * rust-lang/rust#114042 * rust-lang/rust#104153 * rust-lang/rust#116427 * rust-lang/rust#116443 * rust-lang/rust#116432 * rust-lang/rust#116431 * rust-lang/rust#116429 * rust-lang/rust#116296 * rust-lang/rust#116223 * rust-lang/rust#116273 * rust-lang/rust#116184 * rust-lang/rust#116370 * rust-lang/rust#114417 * rust-lang/rust#115200 * rust-lang/rust#116413 * rust-lang/rust#116381 * rust-lang/rust#116360 * rust-lang/rust#116353 * rust-lang/rust#116406 * rust-lang/rust#116408 * rust-lang/rust#116395 * rust-lang/rust#116393 * rust-lang/rust#116388 * rust-lang/rust#116365 * rust-lang/rust#116363 * rust-lang/rust#116146 * rust-lang/rust#115961 * rust-lang/rust#116386 * rust-lang/rust#116367 * rust-lang/rust#105394 * rust-lang/rust#115301 * rust-lang/rust#116384 * rust-lang/rust#116379 * rust-lang/rust#116328 * rust-lang/rust#116282 * rust-lang/rust#116261 * rust-lang/rust#114654 * rust-lang/rust#116376 * rust-lang/rust#116374 * rust-lang/rust#116371 * rust-lang/rust#116358 * rust-lang/rust#116210 * rust-lang/rust#115863 * rust-lang/rust#115025 * rust-lang/rust#116372 * rust-lang/rust#116361 * rust-lang/rust#116355 * rust-lang/rust#116351 * rust-lang/rust#116158 * rust-lang/rust#115726 * rust-lang/rust#113053 * rust-lang/rust#116083 * rust-lang/rust#102099 * rust-lang/rust#116356 * rust-lang/rust#116350 * rust-lang/rust#116349 * rust-lang/rust#116289 * rust-lang/rust#114454 * rust-lang/rust#114453 * rust-lang/rust#116331 * rust-lang/rust#116346 * rust-lang/rust#116340 * rust-lang/rust#116326 * rust-lang/rust#116313 * rust-lang/rust#116276 * rust-lang/rust#115898 * rust-lang/rust#116325 * rust-lang/rust#116317 * rust-lang/rust#116207 * rust-lang/rust#116281 * rust-lang/rust#116304 * rust-lang/rust#116259 * rust-lang/rust#116228 * rust-lang/rust#116224 * rust-lang/rust#115554 * rust-lang/rust#116311 * rust-lang/rust#116299 * rust-lang/rust#116295 * rust-lang/rust#116292 * rust-lang/rust#116307 * rust-lang/rust#115670 * rust-lang/rust#116225 * rust-lang/rust#116302 * rust-lang/rust#116108 * rust-lang/rust#116160 * rust-lang/rust#116157 * rust-lang/rust#116127 * rust-lang/rust#116286 * rust-lang/rust#116254 * rust-lang/rust#116195 * rust-lang/rust#116280 * rust-lang/rust#115933 * rust-lang/rust#115546 * rust-lang/rust#115368 * rust-lang/rust#116275 * rust-lang/rust#116263 * rust-lang/rust#116241 * rust-lang/rust#116216 * rust-lang/rust#116030 * rust-lang/rust#116024 * rust-lang/rust#112123 * rust-lang/rust#113301 * rust-lang/rust#113797 * rust-lang/rust#115759 * rust-lang/rust#116260 * rust-lang/rust#116253 * rust-lang/rust#116245 * rust-lang/rust#116239 * rust-lang/rust#116234 * rust-lang/rust#116231 * rust-lang/rust#116201 * rust-lang/rust#116133 * rust-lang/rust#116176 * rust-lang/rust#116089 * rust-lang/rust#115986 35: Automated pull from `rust-lang/libc` r=pietroalbini a=github-actions[bot] This PR pulls the following changes from the [`rust-lang/libc`](https://github.com/rust-lang/libc) repository: * rust-lang/libc#3335 * rust-lang/libc#3373 * rust-lang/libc#3360 * rust-lang/libc#3374 * rust-lang/libc#3375 * rust-lang/libc#3376 * rust-lang/libc#3377 Co-authored-by: ouz-a <[email protected]> Co-authored-by: Samuel Thibault <[email protected]> Co-authored-by: bors <[email protected]> Co-authored-by: linkmauve <[email protected]> Co-authored-by: onur-ozkan <[email protected]> Co-authored-by: asquared31415 <[email protected]> Co-authored-by: Emmanuel Ferdman <[email protected]> Co-authored-by: Ralf Jung <[email protected]> Co-authored-by: Nadrieril <[email protected]> Co-authored-by: Raekye <[email protected]> Co-authored-by: Mark Rousskov <[email protected]> Co-authored-by: Zalathar <[email protected]> Co-authored-by: Nikolay Arhipov <[email protected]> Co-authored-by: Brian Cain <[email protected]> Co-authored-by: Steve Lau <[email protected]> Co-authored-by: David CARLIER <[email protected]> Co-authored-by: Louis Dupré Bertoni <[email protected]> Co-authored-by: Taiki Endo <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang#114649). - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB (+53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.
Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang#114649). - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB (+53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.
…ulacrum Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang#114649). - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB (+53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.
Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang/rust#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang/rust#114649). - Note that the size of this file was recently reduced in rust-lang/rust#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB (+53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.
Refactor `opt-dist` to simplify local building This PR refactors the `opt-dist` tool to make it easier to invoke it locally, outside of CI, and thus simplify building PGO/BOLT optimized `rustc` builds e.g. for distro maintainers. It should also make it easier to run the PGO/BOLT workflow locally e.g. to profile performance or debug issues (looking at you, rust-lang/rust#115554).
Refactor `opt-dist` to simplify local building This PR refactors the `opt-dist` tool to make it easier to invoke it locally, outside of CI, and thus simplify building PGO/BOLT optimized `rustc` builds e.g. for distro maintainers. It should also make it easier to run the PGO/BOLT workflow locally e.g. to profile performance or debug issues (looking at you, rust-lang/rust#115554).
This PR adds the
rust.codegen-units=1
setting when compiling the 64-bit Linuxrustc
artifact (the one used for try builds and Linux rustup distribution). This had mixed results in the past, however after the bump to LLVM 17, the results now seem pretty incredible. Instruction counts, cycles, wall time, max RSS and even artifact sizes see large improvements.The last try build with this setting took 1h 8m, which is basically the same duration for try builds that we have seen recently. So there shouldn't be any large hit to CI/build time.
I hope that this could potentially also reduce codegen noise of
rustc
a little bit, since small changes within a singlerustc
crate should no longer perturb optimizations because of CGU movement. We still do cross-crate LTO, so it won't eliminate it though.r? @Mark-Simulacrum