Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(tocObj): skip permalink symbol #175

Merged
merged 3 commits into from
Feb 23, 2020
Merged

Conversation

curbengh
Copy link
Contributor

@curbengh curbengh commented Jan 13, 2020

Fixes #174 Closes hexojs/hexo-renderer-markdown-it#102

Permalink (in this context) is a single non-word character, word = [a-Z0-9], and wrapped in <a>.
It may be wrapped in whitespace(s)

Scenario Input Output (text) Explanation
1 <h1 id="foo"><a>#</a>bar</h1>

<h1 id="foo">bar<a>#</a></h1>

<h1 id="foo"><a>#</a>bar<a>#</a></h1>
bar

bar

bar
# is treated as a permalink and skipped
2 <h1>号码<a>牌</a></h1> 号码 is treated as a permalink
3 <h1><a>#</a></h1> # If there is only permalink, parse it
4 <h1><a>b</a> two</h1> b two b is not a permalink

returns empty string '' instead of null. This is following convention. htmlparser2 DomUtils.getText(elNoText) and camaro camaro.transform(xml, {foo:'invalid'}) also return empty string. This is compatible with toc, as empty string is still falsy (same as null and undefined).

@curbengh curbengh requested a review from SukkaW January 13, 2020 04:12
@coveralls
Copy link

coveralls commented Jan 13, 2020

Coverage Status

Coverage increased (+0.03%) to 97.017% when pulling 38a0e5f on curbengh:tocobj-child into 7e5633a on hexojs:master.

@curbengh curbengh force-pushed the tocobj-child branch 2 times, most recently from 73ac878 to efaada6 Compare January 13, 2020 04:26
const input = [
'<h1 id="foo"><a>#</a>bar</h1>',
'<h1 id="foo">bar<a>#</a></h1>',
'<h1 id="foo"><a>#</a>bar<a>#</a></h1>'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'<h1 id="foo"><a>#</a>bar<a>#</a></h1>'
'<h1 id="foo"><a>#</a>bar<a>#</a></h1>',
'<h1 id="foo"><a>#</a><span>bar</span><a>#</a></h1>'

Copy link
Contributor Author

@curbengh curbengh Jan 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<h1 id="foo"><a>#</a><span>bar</span><a>#</a></h1> would result in #bar# though since bar is wrapped in a children element.

Copy link
Contributor Author

@curbengh curbengh Jan 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A workaround can be if (element.type === 'text' || element.type === 'span').

<h1><a>#</a><span>bar</span><a>#</a></h1> -> bar

But the workaround is not foolproof,

<h1><a>#</a>hey<span>bar</span><a>#</a></h1> -> hey
<h1><a>#</a><span>bar</span>hey<a>#</a></h1> -> bar

Since for..of doesn't necessarily follow order, the result can be erratic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The children tag could be anything: <span>, <a> even <p>.

SukkaW
SukkaW previously approved these changes Jan 13, 2020
Copy link
Member

@SukkaW SukkaW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@curbengh curbengh changed the title fix(tocObj): prioritize text-only children element fix(tocObj): prioritize text of immediate descendant of a heading Jan 13, 2020
@curbengh
Copy link
Contributor Author

<h2>foo<i>bar</i>baz</h2> only results in foo. I think tocObj should just avoid <a>.

@SukkaW
Copy link
Member

SukkaW commented Jan 13, 2020

<h2>foo<i>bar</i>baz</h2> only results in foo. I think tocObj should just avoid <a>.

@curbengh So what happened if <h2><a href="">foobar</a></h2> or <h2><a>foo</a> bar</h2>?

@curbengh
Copy link
Contributor Author

curbengh commented Jan 13, 2020

with the new workaround (will commit in a minute),

<h2><a href="">foobar</a></h2> -> foobar
<h2><a>foo</a> bar</h2> -> bar (foo is assumed to be a permalink symbol)

I think 2nd case is rare, if a user wants to add a link, 1st case would be used.


I just thought of another approach, skip <a> with a single symbol.

@SukkaW
Copy link
Member

SukkaW commented Jan 13, 2020

@curbengh I have a better idea. Using replace to remove the symbol after getText. The symbol should be passed as a parameter.

@curbengh
Copy link
Contributor Author

The symbol should be passed as a parameter

parameter of which function?

@curbengh curbengh changed the title fix(tocObj): prioritize text of immediate descendant of a heading fix(tocObj): skip permalink symbol Jan 13, 2020
@curbengh curbengh force-pushed the tocobj-child branch 2 times, most recently from 0f84a37 to 5550c70 Compare January 13, 2020 06:56
@curbengh curbengh requested a review from SukkaW February 17, 2020 07:28
SukkaW
SukkaW previously approved these changes Feb 17, 2020
Copy link
Member

@SukkaW SukkaW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current implementation LGTM!

@curbengh
Copy link
Contributor Author

rebased to include #181.

@SukkaW SukkaW merged commit 544a6f6 into hexojs:master Feb 23, 2020
@noraj
Copy link

noraj commented Feb 23, 2020

@curbengh Awesome, this will be released in 1.9.0? 1.8.2?

@curbengh
Copy link
Contributor Author

should be part of 1.9.0

@curbengh curbengh deleted the tocobj-child branch February 25, 2020 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants