Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report codegen timings for binary crates #10960

Closed
wants to merge 1 commit into from

Conversation

bjorn3
Copy link
Member

@bjorn3 bjorn3 commented Aug 9, 2022

This is done by enabling --emit metadata for binary crates too despite them not actually supporting crate metadata. Rustc will produce a harmless empty metadata file, but also report exactly when codegen starts just like currently happens for rlibs.

Zulip stream

This is done by enabling `--emit metadata` for binary crates too despite
them not actually supporting crate metadata. Rustc will produce a
harmless empty metadata file, but also report exactly when codegen
starts just like currently happens for rlibs.
@rust-highfive
Copy link

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 9, 2022
@bjorn3
Copy link
Member Author

bjorn3 commented Aug 9, 2022

To copy what I have said on zulip. cargo clean -p deletes the new .rmeta files. In addition this is also a prerequisite for rust-lang/rust#93945 once I revive it as that PR needs .rmeta files for dylib crates too.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's a bit confusing that binary crate type also emits an rmeta with the lib prefix (e.g. libmybin-deadbeef.rmeta), I feel like this pull request is generally safe in terms of duplicate artifact names, as most executables were generated with rustc --extra-filename flag. However, I found some corner cases described in the following code comment:

// No metadata in these cases:
//
// - dylibs:
// - if any dylib names are encoded in executables, so they can't be renamed.
// - TODO: Maybe use `-install-name` on macOS or `-soname` on other UNIX systems
// to specify the dylib name to be used by the linker instead of the filename.
// - Windows MSVC executables: The path to the PDB is embedded in the
// executable, and we don't want the PDB path to include the hash in it.
// - wasm32-unknown-emscripten executables: When using emscripten, the path to the
// .wasm file is embedded in the .js file, so we don't want the hash in there.
//
// This is only done for local packages, as we don't expect to export
// dependencies.
//
// The __CARGO_DEFAULT_LIB_METADATA env var is used to override this to
// force metadata in the hash. This is only used for building libstd. For
// example, if libstd is placed in a common location, we don't want a file
// named /usr/lib/libstd.so which could conflict with other rustc
// installs. In addition it prevents accidentally loading a libstd of a
// different compiler at runtime.
// See https://github.com/rust-lang/cargo/issues/3005
let short_name = bcx.target_data.short_name(&unit.kind);
if (unit.target.is_dylib()
|| unit.target.is_cdylib()
|| (unit.target.is_executable() && short_name == "wasm32-unknown-emscripten")
|| (unit.target.is_executable() && short_name.contains("msvc")))
&& unit.pkg.package_id().source_id().is_path()
&& env::var("__CARGO_DEFAULT_LIB_METADATA").is_err()
{
return false;
}

I didn't have Windows at hand. It seems like if there is a workspace containing one binary crate and one library crate under the same name, they might conflict 🤯


BTW we also need to update these lines to reflect the change of codegen in timing doc.

@weihanglo weihanglo added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 27, 2022
@weihanglo
Copy link
Member

Ping @bjorn3 . Just checking in to see if you are still interested in working on this, or if you could have time to check there is no artifact name conflict on those cases (and maybe others out of our radar).

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 6, 2022

I'm still interested in this. A similar problem with respect to filename conflicts exists for split debuginfo and it seems like we are just going to accept that there are conflicts in that case: #11384 (comment) Split debuginfo is not read back by rustc, so it is merely a problem for debuggers, but rmeta files are read by rustc, so it may actually cause issues. One idea would be to force the metadata hash in the filename for rmeta files even if the main linked artifact doesn't have it. This is safe as rmeta files don't contain their own filename, unlike dylibs. WDYT @weihanglo?

@weihanglo
Copy link
Member

Thank you for the response. To me, the situation of this PR is different from debuginfo one: The latter would introduce new broken stuff, but the former might break existing code.

I think emitting rmeta with metadata hash name might work, though I am not sure how much we will pay for the change. Do you have a workable patch to share?

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 9, 2022

I think emitting rmeta with metadata hash name might work, though I am not sure how much we will pay for the change. Do you have a workable patch to share?

I don't have a patch yet as I first wanted to know if it would be acceptable. I think using --emit metadata=/path/to/libfoo-hash.rmeta rather than --emit metadata at

if unit.mode.is_check() {
cmd.arg("--emit=dep-info,metadata");
} else if !unit.requires_upstream_objects() {
// Always produce metadata files for rlib outputs. Metadata may be used
// in this session for a pipelined compilation, or it may be used in a
// future Cargo session as part of a pipelined compile.
cmd.arg("--emit=dep-info,metadata,link");
} else {
cmd.arg("--emit=dep-info,link");
}
would be enough for that.

@weihanglo
Copy link
Member

weihanglo commented Dec 17, 2022

I think using --emit metadata=/path/to/libfoo-hash.rmeta rather than --emit metadata at

Hmm… that doesn't look too good to me. Should we track them by calc_rustc_outputs(…) and list them in cx.outputs(…)? I am a bit worried about losing tracked by Cargo's compilation outputs.

Apart from that, below is a report from cargo b --timings upon commit 1e3648b. Does it look correct for a binary unit containing both metadata and codegen parts in this graph? I thought binaries only have to codegen, so it should be all lavender. I am not an expert and might be wrong though.

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 17, 2022

Does it look correct for a binary unit containing both metadata and codegen parts in this graph? I thought binaries only have to codegen, so it should be all lavender. I am not an expert and might be wrong though.

The blue part is everything that happens before codegen as metadata emission is the last thing happening before codegen. For binaries the metadata is empty, but it is still useful to know when codegen starts.

@weihanglo
Copy link
Member

weihanglo commented Dec 17, 2022

For binaries the metadata is empty, but it is still useful to know when codegen starts.

Yep. I meant shouldn't the codegen start at the same time of the binary unit starts? Below is the current (1.66) graph which shows no codegen (no lavender color) at all. I am confused. Sorry 😞.

And it looks like the graph is drawn as "no rmeta -> all blue" regardless of codegen or not.

ctx.beginPath();
ctx.fillStyle = unit.mode == 'run-custom-build' ? '#f0b165' : '#95cce8';
roundedRect(ctx, x, y, width, BOX_HEIGHT, RADIUS);
ctx.fill();
if (unit.rmeta_time != null) {
ctx.beginPath();
ctx.fillStyle = '#aa95e8';
let ctime = unit.duration - unit.rmeta_time;
roundedRect(ctx, rmeta_x, y, px_per_sec * ctime, BOX_HEIGHT, RADIUS);
ctx.fill();
}

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 17, 2022

Yep. I meant shouldn't the codegen start at the same time of the binary unit starts?

Codegen can't start until macro expansion, typeck, borrowck, ... are done.

Below is the current (1.66) graph which shows no codegen (no lavender color) at all.

Cargo uses the moment that the rmeta file is emitted as proxy for the moment that codegen starts. As such before this PR it wouldn't record any codegen time as no rmeta file is created, even though codegen does actually happen. This PR unconditionally asks rustc to emit metadata precisely to allow metadata emission to be a proxy for the moment codegen starts for all crate types, not just those that don't need linking.

@weihanglo
Copy link
Member

Make a lot of sense. Thanks for the explanation!

Unfortunately, I don't quite sure about the consequence of setting explicit --emit path 😞. I still feel like it should be tracked in any case. We may also need a new the approach to always give each .rmeta a hash but no for artifact explicitly excluded (dylib and friends?).

I believe @ehuss could tell a better story than me.

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 18, 2022

Unfortunately, I don't quite sure about the consequence of setting explicit --emit path

Me neither. I would expect it to be properly tracked due to the depinfo file produced by rustc containing the right path, but I'm not familiar with the cargo internals, so I don't know if it would have any other side effects.

We may also need a new the approach to always give each .rmeta a hash but no for artifact explicitly excluded (#10960 (review)).

Possibly

@ehuss
Copy link
Contributor

ehuss commented Jan 6, 2023

I think it would be ok to use an explicit path as long as the output-tracking code was kept in sync. Unfortunately my instinct is that might be a little messy. If it can be done without too much trouble, I think it should be fine.

I still feel that it would be better to add messages to rustc to report timing information. I imagine knowing link time would be really useful. I understand that's a much bigger project, but I wouldn't expect it to be too difficult, since rustc is already tracking that information. It seems like it would mostly be a discussion around the exact JSON structure, how it is enabled, and maybe making it clear that the structure can change over time.

@bors
Copy link
Contributor

bors commented Jul 10, 2024

☔ The latest upstream changes (presumably #14209) made this pull request unmergeable. Please resolve the merge conflicts.

@epage
Copy link
Contributor

epage commented Jul 18, 2024

As this is over a year stale, I'm going to close this. I went ahead and created #14265 so we don't lose track of the great research done here!

@epage epage closed this Jul 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants