-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[experiment] Add -Z no-link
flag
#67195
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
src/librustc_interface/passes.rs
Outdated
@@ -933,6 +933,10 @@ fn encode_and_write_metadata( | |||
}).max().unwrap_or(MetadataKind::None); | |||
|
|||
let metadata = match metadata_kind { | |||
|
|||
// Metadata is always required for two-staged compile and link. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only necessary data is the data passed to https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_utils/codegen_backend/trait.CodegenBackend.html#tymethod.join_codegen_and_link:
- Session
- OutputFilenames
- CodegenResults (Actually OngoingCodegen is passed, but when joining the codegen, CodegenResults is produced)
DepGraph is not necessary (it is used for storing the location of incremental compilation artifacts, https://doc.rust-lang.org/nightly/nightly-rustc/src/rustc_codegen_llvm/lib.rs.html#307 but after joining codegen, the no-link invocation of rustc can do this itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, metadata is for reconstructing CrateInfo (part of CodegenResult), as it need information about dependent libraries (both rlib and native ones).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be possible to just derive Serialize
for CrateInfo
and put it in a .rlink
file together with the other necessary data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, let me look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found CrateInfo
+ "other necessary data" is indeed CodegenResults
. So I updated my the PR to serialize CodegenResults
into a .rlink
file.
@@ -308,7 +308,8 @@ impl CodegenBackend for LlvmCodegenBackend { | |||
|
|||
sess.compile_status()?; | |||
|
|||
if !sess.opts.output_types.keys().any(|&i| i == OutputType::Exe || | |||
if sess.opts.debugging_opts.no_link || | |||
!sess.opts.output_types.keys().any(|&i| i == OutputType::Exe || | |||
i == OutputType::Metadata) { | |||
return Ok(()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that finalize_session_directory
is never called in this case, which means that incremental compilation won't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it have been called for Exe
and Metadata
output types, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current condition says return if neither exe, nor metadata is emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that codegen_backend.join_codegen_and_link
needs to be called to actually emit all object files. Maybe split rename it to join_codegen
and return CodegenResults
from it, leaving the linker part to rustc_driver
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. In fact, I think it is a good refactoring even without this -Z no-link
change.
r? @tmandry |
OutputType::Bitcode => { | ||
modules_config.emit_bc = true; | ||
if sess.opts.debugging_opts.no_link { | ||
metadata_config.emit_bc = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something here, but I don't really see why metadata codegen involves bitcode, LLVM IR, etc. This should only control the output of the rmeta
file read by rustc if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe -Zno-link
should still compile everything, only leaving the linking to a next rustc invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metadata_config isn't for the rmeta
file used later for linking. It is to generate a metadata module linking into a Dylib
or ProcMacro
. Bitcode/LLVM IR/etc. won't be generated for any other crate types.
It.. looks like this change hit a file line length tidy limit (3000 lines) in This seems arbitrary, and it's a bit unfair for a new contributor to have to deal with this. Does anyone who's contributed to this file have suggestions on how to break it up? Or should we override it for now? |
@tmandry The limit is arbitrary, but that's the fundamental nature of limits. :) I believe these tidy checks are necessary for code quality to push us towards reasonably sized files (not like I would suggest one of the following strategies:
|
I'm happy to refactor
On top of that, as @Centril suggested, I also would like to move |
I don't personally love the idea of separating the primary data structure definition of That said, just do what seems most reasonable for now and we can see how it looks in review. I don't want to get bogged down in a refactoring bikeshed 😄 |
It may be worth pointing out that simply the presence of |
|
Thank you for bring this up. Yes, at the moment, |
@0dvictor Yes, definitely, and I think that:
...is not something I agree with. :) In particular, it makes a whole lot of sense to separate data definitions and logic so that things are more pluggable and it's also better in general for incremental. This is precisely the thing we've been trying to do with splitting libsyntax from the parser. So I want to encourage your refactoring proposal. |
As a heads up that's unfortunately a breaking change since |
Maybe have something like |
Good point. Let me think about it and update the changes. BTW, unrelated to |
Good suggestion, and it should work well with your previous suggestion on serializing |
@0dvictor that's possible yeah! It's not as parallel since codegen is still serial (one big object file), but it's probably best to leave all this to a separate issue. |
☔ The latest upstream changes (presumably #67449) made this pull request unmergeable. Please resolve the merge conflicts. |
src/librustc/hir/def_id.rs
Outdated
impl rustc_serialize::UseSpecializedDecodable for CrateNum {} | ||
impl rustc_serialize::UseSpecializedEncodable for CrateNum { | ||
fn default_encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error> { | ||
e.emit_u32(self.as_u32()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct, as the crate numbers may be mapped differently in a different compilation session. Only the Svh
is guaranteed to remain the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for bringing this up. I was actually struggled a lot trying to emit the correct data. I believe we can safely simply emit the crate number here.
As you pointed out, the crate numbers is only meaningful within the same compilation session; therefore, one must have the session info when using crate numbers. For example, one might need include a crate number -> Svh
mapping table in the serialized data, or some other ways to correctly map crate numbers to the actual crates.
In this PR, crate numbers are used within one CodegenResults
, and CodegenResults
only belongs to one specific session; therefore, crate numbers are unique. Then we can use CodegenResults.crate_info.used_crate_source
to find the crate from a crate number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also replaced the Encodable
impl of CrateNum
for other usages. Maybe make a new LinkingCrateNum
newtype for u32
and store that in crate_info
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this Encodable
impl is currently used since the default impl just returns an error.
However, I like your idea of making a new LinkingCrateNum
as it conceptually decouples CrateInfo
from a compilation session. I'll update accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/librustc/hir/def_id.rs
Outdated
fn default_encode<S: Encoder>(&self, s: &mut S) -> Result<(), S::Error> { | ||
let krate = u64::from(self.krate.as_u32()); | ||
let index = u64::from(self.index.as_u32()); | ||
s.emit_u64((krate<<32) | index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, a DefId
is conceptually an interned DefPath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to CrateNum
, one should include a DefId
to DefPath
mapping in the serialized data. However, in this PR, DefId
isn't used by the linker so that the mapping can be safely omitted.
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Rebase again to fix the tidy error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, at the moment,
-Z no-link
would only allow 1 codegen units. I am working on enhancing it to support multiple CGUs.
@0dvictor what's the status of this? It doesn't look like you're changing the output types anymore. Does that mean multiple CGUs are supported now?
There was some discussion about --emit delayed-link
but I'm not sure if it's needed. To my eyes everything looks good.
@@ -160,6 +160,7 @@ pub enum ExternCrateSource { | |||
Path, | |||
} | |||
|
|||
#[derive(RustcEncodable, RustcDecodable)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to save this in a binary file rather than escaped binary in a json string.
That said, I'm okay with doing this later, especially since this is an experimental feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agreed, a binary file is definitely the proper way. I chose JSON in this PR purely because I don't want the file format details hold the feature, especially the proposed approach has not been reviewed yet.
@@ -86,8 +87,16 @@ impl fmt::Display for CrateNum { | |||
} | |||
} | |||
|
|||
impl rustc_serialize::UseSpecializedEncodable for CrateNum {} | |||
impl rustc_serialize::UseSpecializedDecodable for CrateNum {} | |||
impl rustc_serialize::UseSpecializedEncodable for CrateNum { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we decided to stick with this approach (#67516 (comment)), we may want to add a comment (here or on CrateNum
) saying which conditions have to hold when using this.
Correct, multiple CGUs are naturally supported; no extra work is needed.
Yes, though I like the idea of |
It is possible to hide non |
@bors r+ |
📌 Commit be86fb3 has been approved by |
[experiment] Add `-Z no-link` flag Adds a compiler option to allow rustc compile a crate without linking. With this flag, `rustc` serializes codegen_results into a `.rlink` file. Part of Issue rust-lang#64191
⌛ Testing commit be86fb3 with merge 089b0632c6ece90649d7b94596970f458cb9d292... |
Adds a compiler option to allow rustc compile a crate without linking. With this flag, rustc serializes codegen_results into a .rlink file.
@bors r+ |
📌 Commit 6a6ebb4 has been approved by |
[experiment] Add `-Z no-link` flag Adds a compiler option to allow rustc compile a crate without linking. With this flag, `rustc` serializes codegen_results into a `.rlink` file. Part of Issue rust-lang#64191
⌛ Testing commit 6a6ebb4 with merge 26fe0c8fa84a65deb9fe8ee6f2b928ed817c1d0b... |
Thank you all for reviewing and guiding me through the process. I just accepted the suggested changes, squashed all commits into one, and rebased to the latest. The PR is good to go. |
@bors retry rolled up |
Rollup of 10 pull requests Successful merges: - #67195 ([experiment] Add `-Z no-link` flag) - #68253 (add bare metal ARM Cortex-A targets to rustc) - #68361 (Unbreak linking with lld 9 on FreeBSD 13.0-CURRENT i386) - #68388 (Make `TooGeneric` error in WF checking a proper error) - #68409 (Micro-optimize OutputFilenames) - #68410 (Export weak symbols used by MemorySanitizer) - #68425 (Fix try-op diagnostic in E0277 for methods) - #68440 (bootstrap: update clippy subcmd decription) - #68441 (pprust: use as_deref) - #68462 (librustc_mir: don't allocate vectors where slices will do.) Failed merges: r? @ghost
Your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem. Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Adds a compiler option to allow rustc compile a crate without linking.
With this flag,
rustc
serializes codegen_results into a.rlink
file.Part of Issue #64191