Skip to content

Commit

Permalink
Rollup merge of rust-lang#109422 - notriddle:notriddle/impl-disambigu…
Browse files Browse the repository at this point in the history
…ate-search, r=GuillaumeGomez

rustdoc-search: add impl disambiguator to duplicate assoc items

Preview (to see the difference, click the link and pay attention to the specific function that comes up):

| Before | After |
|--|--|
| [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) |
| [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool)

Helps with rust-lang#90929

This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names.

This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it:

* I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers.

* Breaking `struct.Vec.html#method.len` would also be a disappointment.

On the other hand:

* The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing.

* This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones.

### The bug

On the "Before" links, this example search calls for `i64`:

![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2)

But if I click any of the results, I get `f64` instead.

![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82)

The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name.

More detailed description of the problem at:
rust-lang#109422 (comment)

> When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls.
>
> When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc.
>
> It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP.
>
> To fix this, we need three things:
>
> 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`)
> 2. A search index that stores methods with enough information to disambiguate the impl they were on.
> 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages?
>
> For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
  • Loading branch information
matthiaskrgr authored Oct 10, 2023
2 parents 061c330 + 2a4c9d0 commit 9bfcc6d
Show file tree
Hide file tree
Showing 20 changed files with 350 additions and 63 deletions.
73 changes: 42 additions & 31 deletions src/librustdoc/formats/cache.rs
Original file line number Diff line number Diff line change
Expand Up @@ -223,17 +223,16 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {

// If the impl is from a masked crate or references something from a
// masked crate then remove it completely.
if let clean::ImplItem(ref i) = *item.kind {
if self.cache.masked_crates.contains(&item.item_id.krate())
if let clean::ImplItem(ref i) = *item.kind &&
(self.cache.masked_crates.contains(&item.item_id.krate())
|| i.trait_
.as_ref()
.map_or(false, |t| self.cache.masked_crates.contains(&t.def_id().krate))
|| i.for_
.def_id(self.cache)
.map_or(false, |d| self.cache.masked_crates.contains(&d.krate))
{
return None;
}
.map_or(false, |d| self.cache.masked_crates.contains(&d.krate)))
{
return None;
}

// Propagate a trait method's documentation to all implementors of the
Expand Down Expand Up @@ -334,33 +333,37 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
// A crate has a module at its root, containing all items,
// which should not be indexed. The crate-item itself is
// inserted later on when serializing the search-index.
if item.item_id.as_def_id().map_or(false, |idx| !idx.is_crate_root()) {
if item.item_id.as_def_id().map_or(false, |idx| !idx.is_crate_root())
&& let ty = item.type_()
&& (ty != ItemType::StructField
|| u16::from_str_radix(s.as_str(), 10).is_err())
{
let desc =
short_markdown_summary(&item.doc_value(), &item.link_names(self.cache));
let ty = item.type_();
if ty != ItemType::StructField
|| u16::from_str_radix(s.as_str(), 10).is_err()
{
// In case this is a field from a tuple struct, we don't add it into
// the search index because its name is something like "0", which is
// not useful for rustdoc search.
self.cache.search_index.push(IndexItem {
ty,
name: s,
path: join_with_double_colon(path),
desc,
parent,
parent_idx: None,
search_type: get_function_type_for_search(
&item,
self.tcx,
clean_impl_generics(self.cache.parent_stack.last()).as_ref(),
self.cache,
),
aliases: item.attrs.get_doc_aliases(),
deprecation: item.deprecation(self.tcx),
});
}
// In case this is a field from a tuple struct, we don't add it into
// the search index because its name is something like "0", which is
// not useful for rustdoc search.
self.cache.search_index.push(IndexItem {
ty,
name: s,
path: join_with_double_colon(path),
desc,
parent,
parent_idx: None,
impl_id: if let Some(ParentStackItem::Impl { item_id, .. }) = self.cache.parent_stack.last() {
item_id.as_def_id()
} else {
None
},
search_type: get_function_type_for_search(
&item,
self.tcx,
clean_impl_generics(self.cache.parent_stack.last()).as_ref(),
self.cache,
),
aliases: item.attrs.get_doc_aliases(),
deprecation: item.deprecation(self.tcx),
});
}
}
(Some(parent), None) if is_inherent_impl_item => {
Expand All @@ -371,6 +374,13 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
parent,
item: item.clone(),
impl_generics,
impl_id: if let Some(ParentStackItem::Impl { item_id, .. }) =
self.cache.parent_stack.last()
{
item_id.as_def_id()
} else {
None
},
});
}
_ => {}
Expand Down Expand Up @@ -541,6 +551,7 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {

pub(crate) struct OrphanImplItem {
pub(crate) parent: DefId,
pub(crate) impl_id: Option<DefId>,
pub(crate) item: clean::Item,
pub(crate) impl_generics: Option<(clean::Type, clean::Generics)>,
}
Expand Down
41 changes: 28 additions & 13 deletions src/librustdoc/html/render/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ use rustc_data_structures::fx::{FxHashMap, FxHashSet};
use rustc_hir::def_id::{DefId, DefIdSet};
use rustc_hir::Mutability;
use rustc_middle::middle::stability;
use rustc_middle::ty::TyCtxt;
use rustc_middle::ty::{self, TyCtxt};
use rustc_span::{
symbol::{sym, Symbol},
BytePos, FileName, RealFileName,
Expand Down Expand Up @@ -102,6 +102,7 @@ pub(crate) struct IndexItem {
pub(crate) desc: String,
pub(crate) parent: Option<DefId>,
pub(crate) parent_idx: Option<isize>,
pub(crate) impl_id: Option<DefId>,
pub(crate) search_type: Option<IndexItemFunctionType>,
pub(crate) aliases: Box<[Symbol]>,
pub(crate) deprecation: Option<Deprecation>,
Expand Down Expand Up @@ -1877,7 +1878,7 @@ pub(crate) fn render_impl_summary(
aliases: &[String],
) {
let inner_impl = i.inner_impl();
let id = cx.derive_id(get_id_for_impl(&inner_impl.for_, inner_impl.trait_.as_ref(), cx));
let id = cx.derive_id(get_id_for_impl(cx.tcx(), i.impl_item.item_id));
let aliases = if aliases.is_empty() {
String::new()
} else {
Expand Down Expand Up @@ -1994,21 +1995,35 @@ pub(crate) fn small_url_encode(s: String) -> String {
}
}

fn get_id_for_impl(for_: &clean::Type, trait_: Option<&clean::Path>, cx: &Context<'_>) -> String {
match trait_ {
Some(t) => small_url_encode(format!("impl-{:#}-for-{:#}", t.print(cx), for_.print(cx))),
None => small_url_encode(format!("impl-{:#}", for_.print(cx))),
}
fn get_id_for_impl<'tcx>(tcx: TyCtxt<'tcx>, impl_id: ItemId) -> String {
use rustc_middle::ty::print::with_forced_trimmed_paths;
let (type_, trait_) = match impl_id {
ItemId::Auto { trait_, for_ } => {
let ty = tcx.type_of(for_).skip_binder();
(ty, Some(ty::TraitRef::new(tcx, trait_, [ty])))
}
ItemId::Blanket { impl_id, .. } | ItemId::DefId(impl_id) => {
match tcx.impl_subject(impl_id).skip_binder() {
ty::ImplSubject::Trait(trait_ref) => {
(trait_ref.args[0].expect_ty(), Some(trait_ref))
}
ty::ImplSubject::Inherent(ty) => (ty, None),
}
}
};
with_forced_trimmed_paths!(small_url_encode(if let Some(trait_) = trait_ {
format!("impl-{trait_}-for-{type_}", trait_ = trait_.print_only_trait_path())
} else {
format!("impl-{type_}")
}))
}

fn extract_for_impl_name(item: &clean::Item, cx: &Context<'_>) -> Option<(String, String)> {
match *item.kind {
clean::ItemKind::ImplItem(ref i) => {
i.trait_.as_ref().map(|trait_| {
// Alternative format produces no URLs,
// so this parameter does nothing.
(format!("{:#}", i.for_.print(cx)), get_id_for_impl(&i.for_, Some(trait_), cx))
})
clean::ItemKind::ImplItem(ref i) if i.trait_.is_some() => {
// Alternative format produces no URLs,
// so this parameter does nothing.
Some((format!("{:#}", i.for_.print(cx)), get_id_for_impl(cx.tcx(), item.item_id)))
}
_ => None,
}
Expand Down
33 changes: 31 additions & 2 deletions src/librustdoc/html/render/search_index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ use crate::formats::cache::{Cache, OrphanImplItem};
use crate::formats::item_type::ItemType;
use crate::html::format::join_with_double_colon;
use crate::html::markdown::short_markdown_summary;
use crate::html::render::{IndexItem, IndexItemFunctionType, RenderType, RenderTypeId};
use crate::html::render::{self, IndexItem, IndexItemFunctionType, RenderType, RenderTypeId};

/// Builds the search index from the collected metadata
pub(crate) fn build_index<'tcx>(
Expand All @@ -26,7 +26,8 @@ pub(crate) fn build_index<'tcx>(

// Attach all orphan items to the type's definition if the type
// has since been learned.
for &OrphanImplItem { parent, ref item, ref impl_generics } in &cache.orphan_impl_items {
for &OrphanImplItem { impl_id, parent, ref item, ref impl_generics } in &cache.orphan_impl_items
{
if let Some((fqp, _)) = cache.paths.get(&parent) {
let desc = short_markdown_summary(&item.doc_value(), &item.link_names(cache));
cache.search_index.push(IndexItem {
Expand All @@ -36,6 +37,7 @@ pub(crate) fn build_index<'tcx>(
desc,
parent: Some(parent),
parent_idx: None,
impl_id,
search_type: get_function_type_for_search(item, tcx, impl_generics.as_ref(), cache),
aliases: item.attrs.get_doc_aliases(),
deprecation: item.deprecation(tcx),
Expand Down Expand Up @@ -222,6 +224,29 @@ pub(crate) fn build_index<'tcx>(
})
.collect();

// Find associated items that need disambiguators
let mut associated_item_duplicates = FxHashMap::<(isize, ItemType, Symbol), usize>::default();

for &item in &crate_items {
if item.impl_id.is_some() && let Some(parent_idx) = item.parent_idx {
let count = associated_item_duplicates
.entry((parent_idx, item.ty, item.name))
.or_insert(0);
*count += 1;
}
}

let associated_item_disambiguators = crate_items
.iter()
.enumerate()
.filter_map(|(index, item)| {
let impl_id = ItemId::DefId(item.impl_id?);
let parent_idx = item.parent_idx?;
let count = *associated_item_duplicates.get(&(parent_idx, item.ty, item.name))?;
if count > 1 { Some((index, render::get_id_for_impl(tcx, impl_id))) } else { None }
})
.collect::<Vec<_>>();

struct CrateData<'a> {
doc: String,
items: Vec<&'a IndexItem>,
Expand All @@ -230,6 +255,8 @@ pub(crate) fn build_index<'tcx>(
//
// To be noted: the `usize` elements are indexes to `items`.
aliases: &'a BTreeMap<String, Vec<usize>>,
// Used when a type has more than one impl with an associated item with the same name.
associated_item_disambiguators: &'a Vec<(usize, String)>,
}

struct Paths {
Expand Down Expand Up @@ -382,6 +409,7 @@ pub(crate) fn build_index<'tcx>(
crate_data.serialize_field("f", &functions)?;
crate_data.serialize_field("c", &deprecated)?;
crate_data.serialize_field("p", &paths)?;
crate_data.serialize_field("b", &self.associated_item_disambiguators)?;
if has_aliases {
crate_data.serialize_field("a", &self.aliases)?;
}
Expand All @@ -398,6 +426,7 @@ pub(crate) fn build_index<'tcx>(
items: crate_items,
paths: crate_paths,
aliases: &aliases,
associated_item_disambiguators: &associated_item_disambiguators,
})
.expect("failed serde conversion")
// All these `replace` calls are because we have to go through JS string for JSON content.
Expand Down
3 changes: 1 addition & 2 deletions src/librustdoc/html/render/sidebar.rs
Original file line number Diff line number Diff line change
Expand Up @@ -503,8 +503,7 @@ fn sidebar_render_assoc_items(
.iter()
.filter_map(|it| {
let trait_ = it.inner_impl().trait_.as_ref()?;
let encoded =
id_map.derive(super::get_id_for_impl(&it.inner_impl().for_, Some(trait_), cx));
let encoded = id_map.derive(super::get_id_for_impl(cx.tcx(), it.impl_item.item_id));

let prefix = match it.inner_impl().polarity {
ty::ImplPolarity::Positive | ty::ImplPolarity::Reservation => "",
Expand Down
28 changes: 28 additions & 0 deletions src/librustdoc/html/static/js/main.js
Original file line number Diff line number Diff line change
Expand Up @@ -354,6 +354,34 @@ function preLoadCss(cssUrl) {
expandSection(pageId);
}
}
if (savedHash.startsWith("impl-")) {
// impl-disambiguated links, used by the search engine
// format: impl-X[-for-Y]/method.WHATEVER
// turn this into method.WHATEVER[-NUMBER]
const splitAt = savedHash.indexOf("/");
if (splitAt !== -1) {
const implId = savedHash.slice(0, splitAt);
const assocId = savedHash.slice(splitAt + 1);
const implElem = document.getElementById(implId);
if (implElem && implElem.parentElement.tagName === "SUMMARY" &&
implElem.parentElement.parentElement.tagName === "DETAILS") {
onEachLazy(implElem.parentElement.parentElement.querySelectorAll(
`[id^="${assocId}"]`),
item => {
const numbered = /([^-]+)-([0-9]+)/.exec(item.id);
if (item.id === assocId || (numbered && numbered[1] === assocId)) {
openParentDetails(item);
item.scrollIntoView();
// Let the section expand itself before trying to highlight
setTimeout(() => {
window.location.replace("#" + item.id);
}, 0);
}
}
);
}
}
}
}

function onHashChange(ev) {
Expand Down
19 changes: 16 additions & 3 deletions src/librustdoc/html/static/js/search.js
Original file line number Diff line number Diff line change
Expand Up @@ -1752,6 +1752,7 @@ function initSearch(rawSearchIndex) {
type: item.type,
is_alias: true,
deprecated: item.deprecated,
implDisambiguator: item.implDisambiguator,
};
}

Expand Down Expand Up @@ -2218,7 +2219,7 @@ function initSearch(rawSearchIndex) {
href = ROOT_PATH + name + "/index.html";
} else if (item.parent !== undefined) {
const myparent = item.parent;
let anchor = "#" + type + "." + name;
let anchor = type + "." + name;
const parentType = itemTypes[myparent.ty];
let pageType = parentType;
let pageName = myparent.name;
Expand All @@ -2232,16 +2233,19 @@ function initSearch(rawSearchIndex) {
const enumName = item.path.substr(enumNameIdx + 2);
path = item.path.substr(0, enumNameIdx);
displayPath = path + "::" + enumName + "::" + myparent.name + "::";
anchor = "#variant." + myparent.name + ".field." + name;
anchor = "variant." + myparent.name + ".field." + name;
pageType = "enum";
pageName = enumName;
} else {
displayPath = path + "::" + myparent.name + "::";
}
if (item.implDisambiguator !== null) {
anchor = item.implDisambiguator + "/" + anchor;
}
href = ROOT_PATH + path.replace(/::/g, "/") +
"/" + pageType +
"." + pageName +
".html" + anchor;
".html#" + anchor;
} else {
displayPath = item.path + "::";
href = ROOT_PATH + item.path.replace(/::/g, "/") +
Expand Down Expand Up @@ -2727,6 +2731,10 @@ ${item.displayPath}<span class="${type}">${name}</span>\
* Types are also represented as arrays; the first item is an index into the `p`
* array, while the second is a list of types representing any generic parameters.
*
* b[i] contains an item's impl disambiguator. This is only present if an item
* is defined in an impl block and, the impl block's type has more than one associated
* item with the same name.
*
* `a` defines aliases with an Array of pairs: [name, offset], where `offset`
* points into the n/t/d/q/i/f arrays.
*
Expand All @@ -2746,6 +2754,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
* i: Array<Number>,
* f: Array<RawFunctionSearchType>,
* p: Array<Object>,
* b: Array<[Number, String]>,
* c: Array<Number>
* }}
*/
Expand All @@ -2766,6 +2775,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
id: id,
normalizedName: crate.indexOf("_") === -1 ? crate : crate.replace(/_/g, ""),
deprecated: null,
implDisambiguator: null,
};
id += 1;
searchIndex.push(crateRow);
Expand All @@ -2789,6 +2799,8 @@ ${item.displayPath}<span class="${type}">${name}</span>\
const itemFunctionSearchTypes = crateCorpus.f;
// an array of (Number) indices for the deprecated items
const deprecatedItems = new Set(crateCorpus.c);
// an array of (Number) indices for the deprecated items
const implDisambiguator = new Map(crateCorpus.b);
// an array of [(Number) item type,
// (String) name]
const paths = crateCorpus.p;
Expand Down Expand Up @@ -2849,6 +2861,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
id: id,
normalizedName: word.indexOf("_") === -1 ? word : word.replace(/_/g, ""),
deprecated: deprecatedItems.has(i),
implDisambiguator: implDisambiguator.has(i) ? implDisambiguator.get(i) : null,
};
id += 1;
searchIndex.push(row);
Expand Down
Loading

0 comments on commit 9bfcc6d

Please sign in to comment.