-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate Heading IDs #181
Generate Heading IDs #181
Conversation
So, as i mentioned in issue #89, it seems this is a rather widely attempted issue. With that said, this is one of the simplest working solutions i have seen yet. Very nice! |
After using this, i feel like an improvement could be made to track previous ID names, to ensure no duplicates. Possible append an index to the end of the id |
Yeah, I thought about that. However, even though duplicate IDs would be problematic, I think it would be a bad idea to account for it in code. There are a number of use-cases for these headers having deep-permalinks.
Use cases 1 & 2 could be supported by random, opaque IDs. There is nothing inherent in the value of the ID that would impede these use cases. Use case # 3, on the other hand, requires a known and predictable value for the ID attribute. Since the value of the ID must be known prior to its generation, the best method is to simply use the textual content of the header itself (even accounting for predictable conversion rules to generate IDs valid for the attribute). Appending an index would hamper the predictability of the generated IDs. (only slightly, to be sure, since the index could be derived simply by placement in the page; but IMO, still harmful enough to use case 3) Since I don't believe de-duping should be handled in the code, it's obvious it falls on the shoulders of the author to ensure unique headers. This is a good thing, anyhow, as I would consider it a bad idea to have textually-identical headers on the same page. |
You think it would be bad to have textually identical headers? I don't mean to argue with you (really, i don't care haha), it's just that this use:
Isn't that style of writing relatively common? By definition, should all headers be unique? Assuming i understand you, use case 3, of the other linking to headers manually, it's a lose-lose use case, isn't it? If he has two identical headers, he can't link to the second header. If the second header has a different name that he expects (say, if it was appended an index) then he still doesn't know. So assuming i understand use case 3, it seems like appending the index is the lesser of the two evils. At least he has the ability to link to that second header, no? |
Perhaps. I'm throwing my anecdotal opinion out there on this one. My gut is saying duplicate text headers are just a bad idea, generally. You've clearly shown a refuting example, though I would personally try to make the sub headers more contextually specific by making them unique. I mean, if they aren't unique, should they really be headers? 🤷 (I'm thinking outlines of a thesis paper or presentation, chapter titles of books, etc. Anything so generic as to be duplicated in two locations isn't really header material. -- again, opinion)
Unless he embeds the second (duplicate) header directly as HTML with a manually-set ID. (In addition, there are other flavors of markdown that let you specify arbitrary attributes on some types of elements.)
I'm assuming the implementation you have in mind is to append an index only to the duplicate headers and not to every header? (otherwise, you'd lose predictability of every header, ever to gain support for an edge case) So the implementation would have to keep track of generated IDs?
That gives you 95% of your uses, with a single (terse) line of code. Anything more doesn't meet the value proposition for me. /opinion |
Personally, I would favor a particular feature that let's the author specify the ID attribute directly in markdown. I've already come across a flavor of markdown that did this, though I can't recall which one it was. The syntax was fairly nice: # my header {#my-header-id} Or something similar. That would allow auto-creation of IDs for the general case; while allowing a clean syntax for overriding the IDs for the edge case. (While not doing anything pseudo-magical or unexpected) |
Yup, exactly. No need to modify existing uniques, make index 0 empty by default
I think i would have to agree with you there. While i disagree about a unique requirement for a header (though, perhaps i am thinking about them more structurally, and less contextually), i would agree that duplicates easily fall in the 20% line. The only time i would really disagree, is if duplicate IDs are error prone somehow, but i have no idea on this. If, for some reason, something broke hard due to them, then that 20% is fatal, and worthy of a fix. Nevertheless, i agree with your 20% point, and withdraw my opinion hehe. With all of this discussed, i wonder how long it will take for all of this to be adopted? I've got a lot of links riding on this specific PR, but who knows what PR is actually going to get accepted. I have the feeling that when this issue is finally addressed, i'll be manually reassigning a hundred links hah. |
I could imagine actual errors being caused by this if there was any javascript depending on those IDs. But the workaround would be to embed that particular heading in straight HTML, with a custom ID.
I, too, though not in the hundreds. :-) Until this feature is added in some form or another, I've just been writing all my headers in direct HTML. Rather annoying, but at least it works correctly today, without needing to run a fork locally. |
I'd like to chime in on name vs. ID... My project (Markdown Here) uses Marked.js to format email. When email is displayed by Gmail (and other clients, probably), IDs get stripped out of elements, but names are left intact. So ID-based anchoring won't work for me, but name-based will. (And I do have users who specifically want to have ToC links in email.) Can both be used? Or there be an option for which is used? |
Looking at the diff, there appears to be a bunch of old unused code leftover here. I might just grab the relevant part and commit it myself. |
Yeah, I started this pr as a branch off another pr. I'll clean it up and |
I'm unclear as to the purpose of the various subdirectories under |
Uses the heading text itself as the ID for the H*. The only transformation made is to replace whitespace with '-' (hyphen). This conforms to the HTML5 spec wherein ID attributes can contain any non-whitespace character. (The previous HTML4 restrictions on ID attribute values have been relaxed.) This feature is limited in scope to strictly generating heading IDs. A future enhancement would be a simple way of letting the author specify the ID attribute. Perhaps: # Some: Crazy Header {#custom_id} to generate: <h1 id="custom_id">Some: Crazy Header</h1>
Previous commits have been squashed down to the minimum necessary. New tests were scrapped since the existing tests (once modified) provided coverage for this feature. Only thing still worth discussing, I think, is the choice of hyphen vs underscore as the whitespace replacement character. Otherwise, it's ready to merge. |
I have nothing new to add to the case I stated above, but in case saying it again will somehow sway anyone: For use in email, anchors need (Of course, if inline element rendering were customizable...) |
@adam-p While I respect the corner you're stuck in, I would consider that a rare edge case. The HTML spec now considers
|
It is (I guess) possible to use cheerio to generate names based on ids. |
@jasonkarns, I guess I should make this clear in the readme: test/new is for any test that isn't part of the original markdown test suite. test/original is only for the original markdown test suite. test/tests is where they both reside after being combined (and the markdown test suite being slightly altered) via the |
My only question with this, that i believe i forgot to raise, is it possible to make hover-anchor links like Github with this system? This is why i originally opted for the "more html" method, rather than the plain ID option. Are automatic links basically impossible with this method? Probably need to look into the custom generation branches then? |
@leeolayvar GitHub-style hover anchors are not possible with this method, at least not automatically. Though they could be added with javascript after the fact. |
k, didn't think so, thanks for the confirmation. I guess i'll wait for a token/parser modification system to be implemented to generate real anchors. This will be a useful stopgap until then. |
Wow, why was this merged in? The regex on this needs to be improved dramatically before this is actually usable. Expecting someone to remember to add commas and colons in reference links is a recipe for broken links and inconsistent results. Also, adding the Id directly to the heading is not a good solution as it makes it far more difficult to hang additional styles on the anchor (such as an icon on hover) without messing up your typography. It just seems like this PR has a lot of personal preference associated with it, and not enough regard for popular conventions. |
I now see 75dff71, which is good, but using headings instead of anchors is still just a bad idea. |
@jonschlinkert Personally, I think using the text-as-id is simpler for authors. The use case is for in-document links so the link author is the same person as the header author. Remembering to add commas and such is easier than trying to remember which characters get replaced and which character is used for replacement. The author simply copies the header text and replaces whitespace. But 'simple' is subjective. Adding any additional markup (like anchors) would only go further into the realm of personal taste. Giving headers an ID that can be targeted covers the core use case. Everything else (styling and clickable permalinks) are just enhancements on that core functionality. Getting the core functionality in place and shipped is more important than waiting for resolution on the numerous PRs that each do something slightly different. ( #89 #92 #129 #134 #207 ) The best future version of marked would indeed do some sort of anchor generation, or at least allow customized renderers. But who knows how long it could take to get those PRs resolved and merged? Much better to at least get marked to support linking to headers at all. |
Fair enough, I imagine we all take our positions based on either what we're familiar with or what we find most advantageous based on what we tend to focus on. IMHO it sounds like your use-cases are more oriented to basic documents, single documents, and/or a small amount of documentation (I really don't intend that as a slight or any kind of jab, so please don't take it that way. These are examples of my use cases:
I'm also doing the documentation for the new lesscss.org website and I'm managing docs for close to 100 other projects. Again, please don't think I'm trying to make a big deal out of "my projects", because I'm just one dude. But are my use cases, and hopefully it helps explain why I don't think this PR is sufficient.
I disagree. I'm not sure where you got your ideas about that, or whose core use case you're representing, but IMHO, most people would agree that this should be implemented with proper anchors, not a shortcut to slim down on markdown at the cost of being more correct.
In some cases, but that's a broad generalization that's difficult to debate. In this case, however, I think it would have been better to wait for something idiomatic.
Lol, for you. That, in particular, is what stands out to me about this PR. It has personal taste and preferences written all over it. At the end of the day, it's not a big enough deal for me to keep debating it. We can just fork it or find another solution. |
Well, i'll be interested if Anchors are added somehow. Like i was mentioning earlier, i would like a way to make easy linkable anchors (like Github) without having to layer JavaScript onto for seemingly no reason. With that said, i do feel the IDs are a nice simple solution if the other PRs involving modifying how the HTML is generated are pulled. If we are given a in-program ability to modify the output to our needs, than i feel the default should be just IDs. They're clean, simple, and provide a basic use case. If we are not given the ability to modify HTML output anytime soon, then i think just ID based anchor links are woefully inadequate. |
Here's how I arrived at my definition of 'core functionality': There are two features we're discussing.
The ability to do No. 2 is predicated on No. 1 already working. Granted, using anchors solves both at the same time. But as we can see from the number of pull requests that attempt to do both, there is disagreement on how it should be implemented. But fundamentally, the existence of a clickable anchor+link, and the ability to style it (hover, etc) are enhancements to the core functionality of "ability to deep-link to an arbitrary header". Adding IDs to the headers solves the basic use case, with minimal fuss. This way we can start linking to headers now without waiting for the larger implementation using anchors (or custom renderers). |
What about exposing an option like case 'heading': {
return this.options.headerHTML(this.token) || '<h'
+ this.token.depth
+ ' id="'
+ this.token.text.toLowerCase().replace(/[^\w]+/g, '-')
+ '">'
+ this.inline.output(this.token.text)
+ '</h'
+ this.token.depth
+ '>\n';
} Basically there is a simple default implementation, but users are also allowed to do cool stuff like GitHub's hover links if they define a Thoughts? |
"I could imagine actual errors being caused by this if there was any javascript depending on those IDs." It's also a problem for epub file validation or, basically, for anything else that does strict parsing of HTML. The fundamental issue here is that legal Markdown is being transformed into illegal HTML. That should never happen, IMO. The argument that authors shouldn't use duplicate headings doesn't seem persuasive. For one, that's a matter of style, not standards compliance. For another, the Markdown content may not be under your direct control (e.g., web site that allows users to enter markdown which is then transformed into HTML for another purpose). |
For anyone wondering here's what I did to get full GFM rendering using github style and anchor links: var renderer = new marked.Renderer();
marked.setOptions({
gfm: true,
renderer: renderer,
highlight: function (code) {
return require('highlight.js').highlightAuto(code).value;
}
});
renderer.heading = function (text, level) {
var escapedText = text.toLowerCase().replace(/[^\w]+/g, '-');
return '<h' + level + ' class="header"><a name="' +
escapedText +
'" class="anchor" href="#' +
escapedText +
'"><span class="octicon octicon-link"></span></a>' +
text + '</h' + level + '>';
}; Then use this CSS file: https://github.com/sindresorhus/github-markdown-css/blob/gh-pages/github-markdown.css |
Generate Heading IDs
This PR is a modification of #117 with the following differences:
anchor
tags, justid
attributes)name
attribute is obsolete in HTML5.id
attribute is recommendedid
attribute accepts any non-white-space character per HTML5 spec (and is backwards compatible)Changes:
anchors
,anchorClass
, andanchorContent
)