-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report codegen timings for binary crates #10960
Conversation
This is done by enabling `--emit metadata` for binary crates too despite them not actually supporting crate metadata. Rustc will produce a harmless empty metadata file, but also report exactly when codegen starts just like currently happens for rlibs.
r? @ehuss (rust-highfive has picked a reviewer for you, use r? to override) |
To copy what I have said on zulip. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's a bit confusing that binary crate type also emits an rmeta with the lib
prefix (e.g. libmybin-deadbeef.rmeta
), I feel like this pull request is generally safe in terms of duplicate artifact names, as most executables were generated with rustc --extra-filename
flag. However, I found some corner cases described in the following code comment:
cargo/src/cargo/core/compiler/context/compilation_files.rs
Lines 675 to 705 in 5db8295
// No metadata in these cases: | |
// | |
// - dylibs: | |
// - if any dylib names are encoded in executables, so they can't be renamed. | |
// - TODO: Maybe use `-install-name` on macOS or `-soname` on other UNIX systems | |
// to specify the dylib name to be used by the linker instead of the filename. | |
// - Windows MSVC executables: The path to the PDB is embedded in the | |
// executable, and we don't want the PDB path to include the hash in it. | |
// - wasm32-unknown-emscripten executables: When using emscripten, the path to the | |
// .wasm file is embedded in the .js file, so we don't want the hash in there. | |
// | |
// This is only done for local packages, as we don't expect to export | |
// dependencies. | |
// | |
// The __CARGO_DEFAULT_LIB_METADATA env var is used to override this to | |
// force metadata in the hash. This is only used for building libstd. For | |
// example, if libstd is placed in a common location, we don't want a file | |
// named /usr/lib/libstd.so which could conflict with other rustc | |
// installs. In addition it prevents accidentally loading a libstd of a | |
// different compiler at runtime. | |
// See https://github.com/rust-lang/cargo/issues/3005 | |
let short_name = bcx.target_data.short_name(&unit.kind); | |
if (unit.target.is_dylib() | |
|| unit.target.is_cdylib() | |
|| (unit.target.is_executable() && short_name == "wasm32-unknown-emscripten") | |
|| (unit.target.is_executable() && short_name.contains("msvc"))) | |
&& unit.pkg.package_id().source_id().is_path() | |
&& env::var("__CARGO_DEFAULT_LIB_METADATA").is_err() | |
{ | |
return false; | |
} |
I didn't have Windows at hand. It seems like if there is a workspace containing one binary crate and one library crate under the same name, they might conflict 🤯
BTW we also need to update these lines to reflect the change of codegen in timing doc.
Ping @bjorn3 . Just checking in to see if you are still interested in working on this, or if you could have time to check there is no artifact name conflict on those cases (and maybe others out of our radar). |
I'm still interested in this. A similar problem with respect to filename conflicts exists for split debuginfo and it seems like we are just going to accept that there are conflicts in that case: #11384 (comment) Split debuginfo is not read back by rustc, so it is merely a problem for debuggers, but rmeta files are read by rustc, so it may actually cause issues. One idea would be to force the metadata hash in the filename for rmeta files even if the main linked artifact doesn't have it. This is safe as rmeta files don't contain their own filename, unlike dylibs. WDYT @weihanglo? |
Thank you for the response. To me, the situation of this PR is different from debuginfo one: The latter would introduce new broken stuff, but the former might break existing code. I think emitting rmeta with metadata hash name might work, though I am not sure how much we will pay for the change. Do you have a workable patch to share? |
I don't have a patch yet as I first wanted to know if it would be acceptable. I think using cargo/src/cargo/core/compiler/mod.rs Lines 915 to 924 in 70898e5
|
Hmm… that doesn't look too good to me. Should we track them by Apart from that, below is a report from |
The blue part is everything that happens before codegen as metadata emission is the last thing happening before codegen. For binaries the metadata is empty, but it is still useful to know when codegen starts. |
Yep. I meant shouldn't the codegen start at the same time of the binary unit starts? Below is the current (1.66) graph which shows no codegen (no lavender color) at all. I am confused. Sorry 😞. And it looks like the graph is drawn as "no rmeta -> all blue" regardless of codegen or not. cargo/src/cargo/core/compiler/timings.js Lines 98 to 109 in c7fb756
|
Codegen can't start until macro expansion, typeck, borrowck, ... are done.
Cargo uses the moment that the rmeta file is emitted as proxy for the moment that codegen starts. As such before this PR it wouldn't record any codegen time as no rmeta file is created, even though codegen does actually happen. This PR unconditionally asks rustc to emit metadata precisely to allow metadata emission to be a proxy for the moment codegen starts for all crate types, not just those that don't need linking. |
Make a lot of sense. Thanks for the explanation! Unfortunately, I don't quite sure about the consequence of setting explicit I believe @ehuss could tell a better story than me. |
Me neither. I would expect it to be properly tracked due to the depinfo file produced by rustc containing the right path, but I'm not familiar with the cargo internals, so I don't know if it would have any other side effects.
Possibly |
I think it would be ok to use an explicit path as long as the output-tracking code was kept in sync. Unfortunately my instinct is that might be a little messy. If it can be done without too much trouble, I think it should be fine. I still feel that it would be better to add messages to rustc to report timing information. I imagine knowing link time would be really useful. I understand that's a much bigger project, but I wouldn't expect it to be too difficult, since rustc is already tracking that information. It seems like it would mostly be a discussion around the exact JSON structure, how it is enabled, and maybe making it clear that the structure can change over time. |
☔ The latest upstream changes (presumably #14209) made this pull request unmergeable. Please resolve the merge conflicts. |
As this is over a year stale, I'm going to close this. I went ahead and created #14265 so we don't lose track of the great research done here! |
This is done by enabling
--emit metadata
for binary crates too despite them not actually supporting crate metadata. Rustc will produce a harmless empty metadata file, but also report exactly when codegen starts just like currently happens for rlibs.Zulip stream