-
-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Style guide suggestion: Avoid duplicate links #52
Comments
+1. I'd put this in the same conceptual group as introducing initalisms & acronymns once and then using the shorter form. A |
What is the scope for avoiding repeated links? For long pages, it could be useful to use links in different sections. The nature of webpages is you don't read the whole page, but skip to what is relevant, and may even arrive part way down via an anchor link. |
I'd agree with @JelleZijlstra that, as a general rule, you should link to the same thing only once per paragraph. If the paragraph is long enough that you start wondering if you need to add another link, it's probably time to break up the paragraph. I think it's fine to link to the same thing as soon as you start a new paragraph, in most cases. |
Per paragraph I'd say, as with Alex / Serhiy / Jelle. There are edge cases (enumerations, single sentence paragraphs followed by a full paragraph), but any style guide is incomplete. A |
Like for AmE/BrE spelling, Wikipedia's MoS specifies not repeating wikilinks in close proximity. IIRC in that case it is per-section, but once per paragraph-ish seems fine, particularly for reference docs, with room for context-appropriate discretion (very short paragraphs, etc.). |
IMO, paragraphs are a good rule of thumb for reference docs. |
I've thought a fair bit more about this and put it into practice various places; I typically do so per lowest-level section (or equivalent unit, e.g. class, function or method in API docs) since the reader may have jumped directly to the section via a link, or otherwise only consulting that specific section. In particular, readers of reference docs may only be reading at a high level of granularity and each unit (function, method, etc) should stand on its own, and readers of how-to guides are likely only reading the specific section covering their specific task, same with an individual explanation section, etc. So long as one makes proper use of Sphinx's very powerful linking and cross-referencing capabilities (even between different Sphinx sites with intersphinx), it is very cheap to link/reference things, so the main constraint becomes oversaturating the reader with duplicate links as long as they would be expected to read both (e.g. same paragraph/section). |
It’s not just duplicates. Sometimes just the sheer number of links in a paragraph can be distracting. (Making the links stand out less does not really answer this, it’s no use if there are links but you can’t see them.) |
This recently came up in python/cpython#94636 (comment) too.
However, if additional occurrences are not linked, they will render differently, potentially being more distracting. For example:
|
Just to clarify for others, more or less these guidelines (linking only on first usage within a unit, and not linking something we're inside of already) is what @erlend-aasland (and by extension, myself) have been following on the If we're codifying this in the style guide, I would suggest making the general rule "one link per least readible unit", or whatever we want to call the minimum unit a reader would be expected to read given the Diataxis type. Generally speaking, this means the lowest-level addressable block, e.g. the lowest-level section in a How-To Guide, or per-function parameter in an API Reference. This guideline is more flexible, generally applicable and tailored to the Diataxis type of the documentation while helping guide authors on how to judge this—plus, a lot of places don't have actual "paragraphs" at all. Furthermore, a more rigid application of the fixed "paragraph rule" could lead to overly high link density in some places (successive dependent paragraphs in the same lowest-level How-To section), while missing links in some places they are crucial (each parameter in an API Reference).
I assume you're referring specifically to I see two possibilities, both of them with benefits and tradeoffs:
This isn't a super-strong preference, but while conceptually I like the idea of being consistent on every usage, given the practical tradeoffs for readers I favor linking first usage, mostly to reduce link density for readers and ensure link budget is spent on more reader-useful links, and since the formatting difference is just of the link (which provides meaningful information to readers) and consistent with how prose links are treated. |
For me, this boils down to is this reference going to make life easier/better for the user1? IMO, I do not see how duplicate links and a high link density can possibly improve life for the user; I seems to me an anti-pattern. We should not add refs just because we can (i.e. there is something to link to => I need to link to it). We should only add refs if they improve the docs for the end user. IMO, reducing the link density and making sure that there is only unique refs/links within a paragraph is a great improvement for the end user. Consistency? Yes, making sure that there is only unique links within a paragraph is a schema; following a schema leads to consistency. IMO, in the documentation for method x, we should not link to the documentation for method x2; ditto for functions, classes, and even modules3. Footnotes
|
Just to note, my argument for making the guidance "first use in atomic unit" follows the same basis as yours, and just takes it further—it is focused specifically on ensuring readers see one and only one link to a given target within each "atomic unit" of the docs (e.g. section, admonition, parameter description, etc), which:
|
A possible counterargument is that the markup we are adding (e.g. I wonder if, instead, it would be possible to use roles consistently (thus reducing the cognitive load of the writer, and making updates easier) and have Sphinx automatically add links only to the first occurrence within an atomic unit. This will also save us for rewriting all repeated references in all the documents. |
If that's possible, that would be wonderful! |
Actually, from reading the devguide section on inline markup for python/devguide#1000 , I've come up with a better idea. Per said section on cross referencing roles:
Therefore, instead of doing This has a superset of the advantages of the Also, the same approach can be used for references to an object (module, class, function, etc) inside itself, with the same benefits and essentially no downsides relative to either. Source: * If you prefix the content with ``!``, no reference/hyperlink will be created.
For example, :class:`sqlite3.Connection`
vs. :class:`!sqlite3.Connection`. Result:
It quite likely is possible; we could e.g. write a Transform that iterates through xref doctree nodes and either sets them to non-resolving or subs in the appropriate subclass of xref nodes to not resolve. However, it would be quite non-trivial, mostly due to having to programmatically define at the doctree level exactly what constitutes an "atomic unit"—which again, would be theoretically possible (since there should be enough semantic information to fairly reliably conclude this) but it would require handling a fair number of edge and corner cases due to the number of different node types and nesting, and if it gets one wrong there's no real way for an author to use their judgement and override this. Also, particularly if substantial time is not spent on optimization, it could be quite expensive since it requires iterating through a large number of low-level nested nodes in every single document, perhaps multiple times; given that doc build times (given the large size and even larger build matrix) are already a serious concern, even a few percent slowdown could have significant implications. So, while it possibly could be explored, IMO given it only requires adding one extra character (and the far more time-intensive part is properly linking things to begin with, which this could of course not solve) it might be better to just use manual discretion here, at least for now. |
At least once I intentionally made the second occurrence a link, not the first one. Also, sometimes the unit is larger than paragraph, so only human can make a decision. |
Indeed, very recently I did the same for some carefully thought out reasons
There is theoretically enough semantic information in the doctree to determine appropriate unit size (hardcoding it to paragraph nodes would be a nonstarter as it wouldn't work in a large number of common cases), but as you mention there's still enough special cases this wouldn't handle, and wouldn't provide an easy means to override without even more complexity, that I don't think that aspect is worth it given it is easy to do manually following the newly-suggested above approach. |
Using |
Can you combine the two modifiers? |
@ezio-melotti Its already the very next bullet point :)
@merwok You could also just do |
I know, that’s why I asked! Great that the combo works, thanks Ezio for checking 🙂 |
Sorry, just wanted to clarify :) |
Actually, This could be changed in Sphinx by a few different methods (overriding create_non_xref_node in PyXrefMixin to run parse_reftarget first and use that as the title text, in order to support However, none of them are trivial to implement and test, and would only come with a yet-unreleased Sphinx version, so I'm not sure if it is worth it for a case that could just be handled by manually omitting the undesired parts of the dotted name. @AA-Turner , thoughts? |
Tooling aside; I see a consensus here, would someone like to write it up for the https://devguide.python.org/documentation/style-guide/? |
I've made a pull request for the devguide: python/devguide#1294 |
In python/cpython#117005 (comment), @AlexWaygood suggests using
instead of
because it renders the same, but is less clutter in the .rst file. Semantically, it's a link-suppressed reference to a non-existent method, but the readers' experience will be the same. Thoughts? |
Never mind: I am now seeing that this has been discussed above, and CI fails with |
I think we can close this now python/devguide#1294 is merged, please re-open or comment if there's more to do. |
python/cpython#92199 (comment):
I agree with @serhiy-storchaka; this would be good to mention in the docs style guide.
The text was updated successfully, but these errors were encountered: