Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup markdown span handling, take 2 #80550

Merged
merged 5 commits into from
Jan 2, 2021

Conversation

bugadani
Copy link
Contributor

This PR includes the cleanups made in #80244 except for the removal of locate().

While the biggest conceptual part in #80244 was the removal of locate(), it introduced a diagnostic regression.

Additional cleanup:

  • Use RefCell to avoid building two separate vectors for the links

Work to do:

cc @jyn514 This is the best I can do without patching Pulldown to provide multiple ranges for reference-style links. Also, since locate is probably more efficient than rfind (at least it's constant time), I decided to not check the link type and just cover every &str as it was before.

@rust-highfive

This comment has been minimized.

@rust-highfive
Copy link
Collaborator

r? @jyn514

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 31, 2020
@jyn514 jyn514 added A-intra-doc-links Area: Intra-doc links, the ability to link to items in docs by name T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. A-diagnostics Area: Messages for errors, warnings, and lints labels Dec 31, 2020
@jyn514
Copy link
Member

jyn514 commented Dec 31, 2020

This is the best I can do without patching Pulldown to provide multiple ranges for reference-style links

I think we should consider asking pulldown if they're willing to do this; I think it would be useful and in scope.

Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decide if locate() can be simplified by assuming s is always in md

I would expect this to cause unsoundness if wrong, so I'd prefer not to do that. I don't think current pulldown API guarentees it will be borrowed from the original string, it could be borrowed from Box::leak or something.

Overall this is great though ❤️

Comment on lines 1154 to 1162
// Ranges provided for Borrowed strings may differ from where the actual string comes
// from. Since the &str points into `md`, we can calculate its actual location.
CowStr::Borrowed(s) => locate(s, span),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why only Borrowed variants have the wrong span? I would expect instead for it to be wrong for all spans, but we'd only be able to use locate on Borrowed variants because otherwise it points to a different allocation.

That brings up a couple other questions I have:

  1. Why does locate only work for things within the same allocation? Can we improve it to work regardless?
  2. When would pulldown give us a Boxed or Inlined span?

I think this PR would be good to land regardless, but the comment seems off, we should at least fix that before landing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get the easy ones out of the way first:

  • The comment tries to hint at the current problem: we get a Borrowed string that contains the destination of a reference-style link, and a range that points at the link usage itself.

  • I'm not sure about Inlined, but it seems like it's a small string optimization for the Boxed variant. It exists to avoid allocating on clone if the boxed string is short enough.

As far as I can see, Pulldown provides a Borrowed if the string directly comes from the source. If any transformation is necessary (e.g. unescaping used when scanning ref defs or such), Pulldown builds and hands out a Boxed string.

The issue is not that Borrowed strings can have an incorrect span, it's that for those, we can actually restore the span we need. For any other variant, it's reasonable (I think) to assume that the string contained in the variant is some sort of derivative of the original source, and can not be located by either pointer arithmetic or string search. (Just as I write this, if I'm correct this makes rfind-ing on the Boxed variant kinda pointless, because the string will not be an exact match anyway. Otherwise, Pulldown would have handed out a Borrowed).

Hope this makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is not that Borrowed strings can have an incorrect span, it's that for those, we can actually restore the span we need.

Right, that's what I expected. Can you say that in the comment? Right now it makes it sound like the range will only be different for Borrowed, which isn't true.

Can you take a look at 1 to see if it's possible to get rid of the unsafe? (and hopefully make this work for Boxed variants too, but after your later comments that might not be possible.)

Copy link
Contributor Author

@bugadani bugadani Jan 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not completely certain what you're referring to as "1" (ah I realized you meant your own numbered point). Getting rid of the unsafe is as easy as casting the pointer to usize but that's just masking the problem. Otherwise, I would not change locate if there's a chance for a pulldown update that allows us to remove it entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, r=me with the comment fixed then, and hopefully we can get rid of locate in a pulldown update soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found where my expectation break down: LinkReplacer can give us Borrow variants that borrow from something other than md. I made a note of this.

@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 31, 2020
@jyn514
Copy link
Member

jyn514 commented Jan 1, 2021

@bors delegate=bugadani

#80550 (comment)

@bors
Copy link
Contributor

bors commented Jan 1, 2021

✌️ @bugadani can now approve this pull request

@bugadani
Copy link
Contributor Author

bugadani commented Jan 2, 2021

@bors r=jyn514

@bors
Copy link
Contributor

bors commented Jan 2, 2021

📌 Commit f073543 has been approved by jyn514

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 2, 2021
@bugadani bugadani marked this pull request as ready for review January 2, 2021 09:40
@bors
Copy link
Contributor

bors commented Jan 2, 2021

⌛ Testing commit f073543 with merge fd85ca0...

@bors
Copy link
Contributor

bors commented Jan 2, 2021

☀️ Test successful - checks-actions
Approved by: jyn514
Pushing fd85ca0 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 2, 2021
@bors bors merged commit fd85ca0 into rust-lang:master Jan 2, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-intra-doc-links Area: Intra-doc links, the ability to link to items in docs by name merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants