Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some issues with cargo doc and the new feature resolver. #9077

Merged
merged 3 commits into from
Jan 20, 2021

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented Jan 14, 2021

This contains two related commits fixing some issues with cargo doc and the new feature resolver.

The first commit fixes a panic. The old code was using UnitFor::new_normal when computing doc units, but this was wrong. That essentially circumvents the new feature resolver, and breaks determining the correct features to use. I don't remember exactly what I was thinking when I wrote that, other than "rustdoc shouldn't care", but it does matter.

Unfortunately changing that makes collisions worse because it aggravates #6313, so the second commit adds some logic for removing some collisions automatically. Specifically:

  • Prefers dependencies for the target over dependencies for the host (such as proc-macros).
  • Prefers the "newest" version if it comes from the same source.

There are still plenty of situations where there can be collisions, but I'm uncertain on the best way to handle those.

Fixes #9064

ehuss added 2 commits January 12, 2021 09:17
There are some cases where `cargo doc` will try to document two things
with the same crate_name. This attempts to automatically remove some of
those duplicates based on some rules:

- Prefers dependencies for the target over dependencies for the host
  (such as proc-macros).
- Prefers the "newest" version if it comes from the same source.

There are still plenty of situations where there can be collisions, but
I'm uncertain on the best way to handle those.
@rust-highfive
Copy link

r? @alexcrichton

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 14, 2021
@alexcrichton
Copy link
Member

Oof.

I was gonna review this before heading out for today, but I think it might be best if I left this for next week. FWIW I'll be out until next Tuesday, but I'll get to this then! If this is urgent to fix in nightly, however, feel free to land and I'll review after-the-fact. A quick skim and seems fine to me, but I'd like to review the pruning of the unit graph more closely.

@ehuss
Copy link
Contributor Author

ehuss commented Jan 14, 2021

No worries, and no rush, next week is plenty fast.

There is a comment at the start of remove_duplicate_doc, where I am uncertain about creating orphans in the tree. I'm fine with adding something to prune the orphans, I just wasn't certain if it was justified.

This feels a bit hacky, but I couldn't really think of a cleaner solution (other than redesigning rust, cargo, and rustdoc from scratch 😜 ).

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear this is indeed quite subtle. I do think the best solution here is some extra flags to rustdoc, but that's more of a long-term issue than one to solve right now.

Otherwise can you beef up the documentation for each of the cases where duplicates are removed? Explaining in words a small example of what we're deduplicating is sort of what I'm thinking.

This is definitely going to be buggy in the sense that by selecting the newest version that means that broken links could be generated if links are supposed to reference an older version. Again though this is all just an artifact of rustdoc's current design which wasn't intended to cover cases like this (it was designed in a simpler time...)

/// - Different sources. See `collision_doc_sources` test.
///
/// Ideally this would not be necessary.
fn remove_duplicate_doc(build_config: &BuildConfig, unit_graph: &mut UnitGraph) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW long-term I don't think that this function should exist. Ideally rustdoc would have a feature that we could specify the directory/lookup name and we could pass that in to avoid all duplication issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. However, it is not clear to me how that should work. Because the directories are exposed as URLs, I think it would be unpleasant for those to have Cargo-style hashes in them. There could be some pattern like $NAME-$VERSION and then have extra disambiguators when necessary like $NAME-$VERSION-bin-foo. But there are some collisions that are harder (like when they only differ in the features).

What do you think about having a hash in the URL?

This feels like a bit of a mess, and I'm uncertain how to climb out of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for cargo doc locally I feel like the hash doesn't matter too much since you'll just be clicking on links anyway (assuming we generate some well-known landing page which we don't already do today). For everything else though (like doc.rust-lang.org or docs.rs) I wouldn't want Cargo-style hashes in URLs.

This function is likely gonna be here for a long long time which is by no means the end of the world. I think it'll take some more thinking to figure out what it might look like to not have this function.

for unit_deps in unit_graph.values_mut() {
unit_deps.retain(|unit_dep| !units.iter().any(|unit| *unit == unit_dep.unit));
}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly worried about the performance of this since this could be in the worst case O(n^2). Would it be possible to maintain a set of units to remove and they're all processed together at the very end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yea. Pushed a change to filter them once.

}
}
// Prefer newer versions over older.
let mut source_map: HashMap<(InternedString, SourceId, CompileKind), Vec<Unit>> =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the key here perhaps be BTreeMap<Version, Unit>? (to avoid the extra sort later)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so (per the other comment, there can be multiple entries with the same version).

let newest_version = units.last().unwrap().pkg.version().clone();
let (to_remove, keep_units): (Vec<Unit>, Vec<Unit>) = units
.into_iter()
.partition(|unit| unit.pkg.version() < &newest_version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given a source it's not possible to have multiple crates of the same name at the same version, right?

Is it possible for keep_units to have length greater than 1? (I'm probably missing something...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be very rare, but it is possible.

One example is cargo doc --bins --lib. This will try to document both the library and the binary, which have the same name and thus end up colliding.

I'm not sure how that should behave. Should it be an error? Should it ignore the binary? Should rustdoc put those in separate directories?

There are a bunch of collision opportunities, and this isn't intended to cover all of them, just a few that seemed reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm nah that's a good point. I don't know of a better thing to do here other than warn really at this time unfortunately.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Jan 20, 2021

📌 Commit c5e3f17 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 20, 2021
@bors
Copy link
Contributor

bors commented Jan 20, 2021

⌛ Testing commit c5e3f17 with merge 783bc43...

@bors
Copy link
Contributor

bors commented Jan 20, 2021

☀️ Test successful - checks-actions
Approved by: alexcrichton
Pushing 783bc43 to master...

@bors bors merged commit 783bc43 into rust-lang:master Jan 20, 2021
bors added a commit that referenced this pull request Feb 5, 2021
Fix panic with doc collision orphan.

As I feared, the collision removal added in #9077 caused problems due to orphans left in the unit graph. Ironically, it was the collision warning detection code which failed, as it iterates over all the keys of the graph.

The solution is to remove all orphans from the graph after collisions have been removed.

Fixes #9136
ehuss pushed a commit to ehuss/cargo that referenced this pull request Feb 22, 2021
Fix panic with doc collision orphan.

As I feared, the collision removal added in rust-lang#9077 caused problems due to orphans left in the unit graph. Ironically, it was the collision warning detection code which failed, as it iterates over all the keys of the graph.

The solution is to remove all orphans from the graph after collisions have been removed.

Fixes rust-lang#9136
@ehuss ehuss added this to the 1.51.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic in cargo doc with resolver = "2" when indirectly depending on proc-macro2
4 participants