[Markdown] Decide what to do about class="summary" and class="seoSummary" #3923
I've done a bit more analysis of the JS docs here. Usage of
<p><span class="seoSummary">The <strong><code>export</code></strong> statement is used
when creating JavaScript modules to export live bindings to functions, objects, or
primitive values from the module so they can be used by other programs with the
{{jsxref("Statements/import", "import")}} statement. Bindings that are exported can
still be modified locally; when imported, although they can only be read by the
importing module the value updates whenever it is updated by the exporting
module.</span></p>
So it's not like we are doing this consistently at all, at the moment.
Usage of summary in JavaScript
Of 22 uses of summary:
- 20 match the first paragraph, so the class is a no-op
- in 1 case the summary is used to truncate the first paragraph (https://github.com/mdn/content/blob/main/files/en-us/web/javascript/closures/index.html)
- in 1 case the summary is used to select something different, and completely inappropriate to the page summary:
content/files/en-us/web/javascript/reference/operators/new.target/index.html
Lines 87 to 91 in 1422bad
<p class="summary">Thus from the above example of class <code>C</code> and <code>D</code>,
it seems that <code>new.target</code> points to the class definition of class which is
initialized. For example, when <code>d</code> was initialized using
<code>new D()</code>, the class definition of <code>D</code> was printed; and similarly,
in case of <code>c</code>, the class <code>C</code> was printed.</p>
(but note that in this case the bit marked up by the summary is not the bit in the <meta> tag: view-source:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/new.target. So there's something I don't understand here about how these <meta> tags are built...)
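As a rough illustration of the three outcomes listed above (no-op, truncation, different text), here is a minimal sketch. It is not the script used for the analysis: the function names `normalize` and `classifyMarkedText` are hypothetical, and it assumes the text of the first paragraph and the text of the marked-up node have already been extracted as strings.

```javascript
// Hypothetical helper: collapse runs of whitespace so that line wrapping
// in the HTML source does not affect the comparison.
function normalize(text) {
  return text.replace(/\s+/g, " ").trim();
}

// Hypothetical classifier, assuming both inputs are plain text:
// - "no-op": the marked text is the whole first paragraph
// - "truncates": the marked text is only a part of the first paragraph
// - "different": the marked text comes from somewhere else in the page
function classifyMarkedText(firstParagraph, markedText) {
  const first = normalize(firstParagraph);
  const marked = normalize(markedText);
  if (marked === first) return "no-op";
  if (first.includes(marked)) return "truncates";
  return "different";
}
```

Only the "truncates" and "different" outcomes would change the page summary if the classes were removed.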
Next steps
I'd say next steps for JS would be:
- remove all the uses of summary and seoSummary that are no-ops
- decide whether the use of these classes to truncate the first paragraph is worth it
- deal with the few cases where a quite different text is selected, perhaps by changing the content.
It might make sense to share the scripts you've used for the analysis, so others can more easily replicate the results (or look for flawed logic if something slipped your attention).
They're pretty rough, and just variations of something like https://gist.github.com/wbamberg/20d0d7a99f38ea9c188b6003a653661c. They just find ".html" files, scrape off the front matter, then use jsdom on the result. Then it's easy to query the doc for things like classes. To check whether the node identified by I did do a bunch of manual checking of the sources, to see if it made sense. But I'd be very happy to accept improvement suggestions!
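For anyone replicating this, the "scrape off the front matter" step might look something like the sketch below. This is not taken from the gist: `stripFrontMatter` is a hypothetical name, and the sketch assumes the front matter is a block at the very top of the file delimited by `---` lines. The jsdom parsing step is left out so that the example has no dependencies.

```javascript
// Hypothetical helper, assuming front matter is a leading block of the form:
// ---
// title: ...
// ---
// Returns the rest of the file, or the input unchanged when no such block exists.
function stripFrontMatter(source) {
  const match = source.match(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/);
  return match ? source.slice(match[0].length) : source;
}
```

The returned string is what would then be handed to an HTML parser such as jsdom to query for the classes.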
I agree with your approach here, @wbamberg. the
It definitely seems to be used here: kumascript/src/info.js and I have tested that it can affect the value of the
So if there is an agreement … what would be the next steps? At least one for documentation.
Yes, there are a couple of pieces of this.
For JS: the immediate goal is to convert the JS docs. For JS this is small enough that we can just go through removing all the For everything else: I think we can choose:
I did a bit more digging, and made this table:
(https://docs.google.com/spreadsheets/d/16iSsUHpdDgOQnb1ndBYfvUKJdDFjOse2ONDmtr_nlDo/edit?usp=sharing). The numbers are probably a bit rough - there are some heuristics in deciding which is the "first paragraph", which is supposed to roughly match up with https://github.com/mdn/yari/blob/main/kumascript/src/info.js#L202. But it is close enough I think. This has 5 columns:
Cases 4 and 5 are those where these classes have an effect, and this tells us that it's about 1000 pages, or a little under 10%. My experience of MDN is that almost all first paragraphs are short, so I think in general breaking the "substring" cases is not likely to make the docs much worse. If we did want to look at all these, it would be easy to generate the whole list of pages that would need looking at, and file a bug for it. And it would be quite easy for different people to work on this in parallel, although there would be some overhead of managing things so people didn't tread on each other's toes. It does take some careful thought sometimes in how to rework the pages; it's not quite mechanical.
@wbamberg Great analysis. I think that the "summary is different" cases are likely to need some work, and IMO it might be a good idea to generate lists of those and start looking at them before switching over (in an ideal world a list showing both the current and resulting text for each title). FWIW I prefer docs where the first sentence/para is a "summary". Usually as a reader it aids in quickly determining relevance (though as a writer it can force you to write "unnaturally"). P.S. This might be a good task for our bigger volunteer pool. I am also happy to help.
Yes, me too.
OK, then I think I will start by filing an issue like this for the JS docs, which is our first priority, and maybe we can get help. Very happy for you to help either by fixing docs and/or reviewing and helping to guide volunteers. I'll try to get a detailed list out tomorrow.
All sounds good to me. I do wonder whether it is worth creating a sample of the titles and summaries for 100 sample pages that have had these classes removed, and seeing how many of the summaries read weirdly or wrongly as a result, to give us an idea of the impact of going with @wbamberg's first option (
FWIW I had a look at the "type 4 fixes" fixed in #4085 using the ListSummaries macro and the manually fixed ones look a lot better than they would have otherwise, and more useful/succinct than their unchanged peers.
I've updated #3350 (comment) with converter guidance for
This will not be a problem for the JS docs but we will have content work to do in the others.
seoSummary and summary are classes that get applied to prose content in MDN pages. They're used to help calculate a thing we'll call a "page summary". The calculation is a bit complicated (https://github.com/mdn/yari/blob/main/kumascript/src/info.js#L202), but in general, it's something like:
- if a span.seoSummary exists in the page, then its content is taken as the page summary
- otherwise, if a .summary element exists in the page, then its content is taken as the page summary
- otherwise, the page summary is the first p element in the document that does not live in a note or warning div.
So essentially these classes are used to provide a custom page summary, and if they are not given we default to using the first paragraph.
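The fallback order described above can be sketched roughly as follows. This is a simplified model, not the actual code in kumascript/src/info.js: the function name `pickPageSummary` and the shape of the `blocks` objects (`tag`, `classes`, `text`, `inNoteOrWarning`) are assumptions made for illustration, standing in for real DOM nodes.

```javascript
// Simplified, hypothetical model of the page summary fallback order.
// Each block is a plain object rather than a DOM node:
// { tag: string, classes: string[], text: string, inNoteOrWarning: boolean }
function pickPageSummary(blocks) {
  // 1. A span.seoSummary anywhere in the page takes priority.
  const seo = blocks.find(
    (b) => b.tag === "span" && b.classes.includes("seoSummary")
  );
  if (seo) return seo.text;

  // 2. Otherwise, use an element carrying the summary class.
  const summary = blocks.find((b) => b.classes.includes("summary"));
  if (summary) return summary.text;

  // 3. Otherwise, default to the first paragraph that is not inside
  //    a note or warning div.
  const firstParagraph = blocks.find(
    (b) => b.tag === "p" && !b.inNoteOrWarning
  );
  return firstParagraph ? firstParagraph.text : "";
}
```

In this model, dropping support for the two classes amounts to keeping only step 3.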
What's the page summary used for?
It's used in (at least) a couple of ways:
- as described in https://developer.mozilla.org/en-US/docs/MDN/Guidelines/CSS_style_guide#.seosummary (although the .summary part is not documented) it gets used as the value of a <meta> tag in the page <head>, to provide a description that is embedded in search engine results and other embedding contexts.
- it's available to KumaScript, and seems to be used in the following macros:
How widely are summary and seoSummary used?
Across all en-US, I count:
- summary class
- seoSummary class
Across just JavaScript:
- summary class
- seoSummary class
How should we handle them in Markdown-land?
The most obvious option is: stop supporting them, and just always use the first non-note paragraph for the page summary. It would be good to know if we would be regressing many pages by doing this.
Many pages just mark up the first paragraph with summary or seoSummary, so removing those classes would have no actual impact. For example:
content/files/en-us/learn/accessibility/index.html
Line 19 in 1422bad
However, some actually do use it to override the default:
content/files/en-us/web/media/formats/video_codecs/index.html
Line 40 in 1422bad
It would I think be worthwhile to:
We should also do an analysis of how KS macros are using the page summary, and whether they would be broken by removing the ability to override the default.
If we did decide it's important to let pages override the "first paragraph rule", we could consider having an optional front matter key for this, and we might also consider whether this is the same as the "short description" that we've talked about before as a component of our pages.
For now, we could limit this exercise to the JavaScript docs.
( @escattone , we talked about this issue yesterday)