-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: add rustc-perf submodule to src/tools #125166
Conversation
rustbot has assigned @Mark-Simulacrum. Use |
|
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting idea! How long does it take to checkout rustc-perf
and how large is it (in vendored/dist archives)? It contains a lot of code.
@@ -1032,6 +1032,10 @@ impl Step for PlainSourceTarball { | |||
.arg(builder.src.join("./compiler/rustc_codegen_gcc/Cargo.toml")) | |||
.arg("--sync") | |||
.arg(builder.src.join("./src/bootstrap/Cargo.toml")) | |||
.arg("--sync") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we should do this unconditionally, since opt-dist
is only used in two CI dist jobs, most of them don't use it at all 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since opt-dist is only used in two CI dist jobs, most of them don't use it at all
That's true within the Rust project, but my main goal here is to make opt-dist
usable for packagers / end users who need to bootstrap Rust themselves, but still want to benefit from PGO and BOLT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we should do this unconditionally
I wish we had a "src tarball for developers", which would not include any submodules for example, and a "src tarball for distributing" which would be "complete", but that would be a pretty large yak to shave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "developers" do you think are using it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "developers" do you think are using it?
I think few people use it today, but that's likely because of how hard it is to leverage opt-dist
outside the Rust CI. I think, in the future, it would be used:
- Directly by anyone who wants to package Rust, needs to bootstrap it themselves, and wants to build an optimized dist with PGO+BOLT. Namely, large distros and companies would benefit from this.
- Indirectly by anyone consuming Rust from the above distributors. Namely, folks using Rust from their package manager, and people using Rust at large companies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future reference, my argument boils down to:
premises
- the point of the
rustc-dev-src
dist tarball is to allow people to bootstrap Rust - most people bootstrapping Rust want a production-grade toolchain as a result
- the vanilla bootstrap is not sufficient to bootstrap a production-grade rust-toolchain
- it is undesirable for users to be forced to reinvent the wheel and roll their own complex PGO/BOLT bootstrapping pipeline
therefore
- it should be possible to bootstrap a production-grade Rust toolchain from the
rustc-dev-src
tarball usingopt-dist
d32b73b
to
83a847f
Compare
I'd be happy to measure this, but I feel doing so on my machine will not be representative. I could use something like
That's my main concern. I submitted the PR while waiting for the build of
It's not great. I wonder whether it'd be possible to not use all of |
The source tarball here is only really used by distros or other downstream builders, right? I feel like the size of it doesn't matter that much... |
The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging. These commits modify the If this was unintentional then you should revert the changes before this PR is merged. |
Yes, we don’t use a lot from rustc-perf actually: not all the compile time benchmarks, the runtime benchmarks, the website or its dependencies, and basically only use the collector like a script to run rustc over the chosen benchmarks, without doing any DB/network perf collection and these dependencies. |
Nice, while I don't think it should block this PR, I wonder if it would make sense to either:
@Kobzol, thoughts? I don't know how de/coupled the collector/corpus/website are. |
They are very coupled, and amongst other things the website depends on code from I guess that we could create a branch that would only contain the |
Although it was not really an original goal of So I guess that it's fine to just vendor everything. What I was more concerned about was the regular development workflow for people hacking on rustc, that's why I was asking about the checkout speed (something like 50 Mbit/s might be reasonable to simulate?) But we have |
people working on this repo often don't have that checked out though |
As-is, the submodule is only initialized/updated when either:
I don't think developer workflows will be impacted by this change at all, unless you're specifically doing a lot of clean I was careful to not include the |
Oh, I thought that it happens by default. Well, as @lovesegfault said, I'm still not sure if we need to checkout the submodule and (primarily) vendor |
For what it's worth, you can leverage BOLT, on the other hand, really only works on x86_64 for now. |
Gathered data on cloning $ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 50000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 50000 git clone https://github.com/rust-lang/rustc-perf
Time (mean ± σ): 3.027 s ± 0.029 s [User: 3.181 s, System: 2.307 s]
Range (min … max): 2.988 s … 3.079 s 10 runs
$ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 25000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 25000 git clone https://github.com/rust-lang/rustc-perf
Time (mean ± σ): 4.341 s ± 0.022 s [User: 3.668 s, System: 3.521 s]
Range (min … max): 4.312 s … 4.372 s 10 runs
$ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 10000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 10000 git clone https://github.com/rust-lang/rustc-perf
Time (mean ± σ): 8.535 s ± 0.037 s [User: 4.533 s, System: 6.928 s]
Range (min … max): 8.485 s … 8.604 s 10 runs |
Another datapoint: a computer I was using to analyze a |
In the grand scheme of things, cloning the submodule will probably be fine, since it's optional. I'm generally in favour of this change, it resolves a minor maintenance annoyance (keeping the rustc-perf version in sync), and could help more distro maintainers get perf. benefits for their Rust toolchain. Let's see if it works in the current state. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This replaces the hardcoded rustc-perf commit and ad-hoc downloading and unpacking of its zipped source with defaulting to use the new rustc-perf submodule. While it would be nice to make `opt-dist` able to initialize the submodule automatically when pointing to a Rust checkout _other_ than the one opt-dist was built in, that would require a bigger refactor that moved `update_submodule`, from bootstrap, into build_helper. Regardless, I imagine it must be quite rare to use `opt-dist` with a checkout that is neither from a rust-src tarball (which will contain the submodule), nor the checkout opt-dist itself was built (bootstrap will update the submodule when opt-dist is built).
It is now available as a submodule in src/tools/rustc-perf, and is initialized when building opt-dist
61c06f1
to
e253718
Compare
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (60faa27): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary -0.6%, secondary -2.4%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 670.311s -> 675.351s (0.75%) |
An upstream change resulted in a conflict, due to rust-lang/rust#125166.
639: move submodule entry to a place where it will not conflict r=pietroalbini a=tshepang An upstream change resulted in a conflict, due to rust-lang/rust#125166. Co-authored-by: Tshepang Mbambo <[email protected]>
These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) These crates are the default set of packages required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in an sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
bootstrap: vendor crates required by opt-dist to collect profiles These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) These crates are the default set of packages required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in an sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) These crates are the default set of packages required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in an sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) These crates are the default set of packages required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in an sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
…Simulacrum bootstrap: vendor crates required by opt-dist to collect profiles These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
…Simulacrum bootstrap: vendor crates required by opt-dist to collect profiles These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
Rollup merge of rust-lang#125465 - weihanglo:opt-dist-vendor, r=Mark-Simulacrum bootstrap: vendor crates required by opt-dist to collect profiles These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
These are the default package set required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in a sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) These crates are the default set of packages required by opt-dist to correctly work, hence for people wanting to build a production grade of rustc in an sandboxed / air-gapped environment, these need to be vendored. The size of `rustc-src-nightly.tar.xz` before and after this change: * Before: 298M * After: 323M (+8%) Size change might or might not be a concern. See the previous discussion: rust-lang#125166 (comment) Previous efforts on making: * rust-lang#125125 * rust-lang#125166 --- Note that extra works still need to be done to make it fully vendored. * The current pinned rustc-perf uses `tempfile::Tempdir` as the working directory when collecting profiles from some of these packages. This "tmp" working directory usage make it impossible for Cargo to pick up the correct vendor sources setting in `.cargo/config.toml` bundled in the rustc-src tarball. [^1] * opt-dist verifies the final built rustc against a subset of rustc test suite. However it rolls out its own `config.toml` without setting `vendor = true`, and that results in `./vendor/` directory removed. [^2] [^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173 [^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
Currently, it's very challenging to perform a sandboxed
opt-dist
bootstrap because the tool requires
rustc-perf
to be present, butthere is no proper management/tracking of it. Instead, a specific commit
is hardcoded where it is needed, and a non-checksummed zip is fetched
ad-hoc. This happens in two places:
src/ci/docker/host-x86_64/dist-x86_64-linux/Dockerfile
:src/tools/opt-dist/src/main.rs
This causes a few issues:
every time
opt-dist
in a sandbox, you need to provide your ownrustc-perf
(feat(tools/opt-dist): allow local builds to specify a rustc-perf checkout #125125), but tofigure out which commit to provide you need to grep the Dockerfile
rustc-perf
, itsdependencies are not included in the
vendor/
dir created duringdist
, so it will fail to build from the published source tarballsrustc-perf
in use, leading to stalenessFundamentally, this means
rustc-src
tarballs no longer containeverything you need to bootstrap Rust, and packagers hoping to leverage
opt-dist
need to go out of their way to keep track of this "hidden"dependency on
rustc-perf
.This change adds rustc-perf as a git submodule, pinned to the current
PERF_COMMIT
4f313ad. Subsequentcommits ensure the submodule is initialized when necessary, and make use
of it in
opt-dist
.