-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Token.stringify
will now call util.encode
#1844
Core: Token.stringify
will now call util.encode
#1844
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe get rid of util.encode(…)
altogether by moving the string replacement code into Token.stringify(…)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Hmm on a second thought, Token is accessible from the outside isn't it? This could be considered a BC breaking change, even though Token was probably undocumented. |
The method is marked as private in the documentation: #1782. |
Which is a good thing, but this doc is not merged yet, so we have to account for what the users have been given so far, which is mainly this page of the website... where |
Unfortunately, @types/prismjs had documented it, albeit very wrongly. The current version has a warning about it being private. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d say let’s merge this.
I'm actually with @Golmote on this one. This is technically a breaking change. Let's merge this after JSDoc is through, so people can know how the API changed. (PrismJS v2.0 when?) |
@mAAdhaTTah @Golmote I also recently read an article where this was mentioned. So I made a little benchmark and by using a simpler The simplified function used: var ATTR_REGEX = /[&<"]/;
/**
* This is a fast HTML escaping function.
*
* All credit goes to Vladimir Kutepov (@frenzzy on GitHub).
*
* @param {string} html
* @returns {string}
*/
function escapeHTML(html) {
var match = ATTR_REGEX.exec(html);
if (!match) return html;
var index = 0;
var lastIndex = 0;
var out = '';
var escape = '';
for (index = match.index; index < html.length; index++) {
switch (html.charCodeAt(index)) {
case 34: // "
escape = '"';
break;
case 38: // &
escape = '&';
break;
case 60: // <
escape = '<';
break;
default:
continue;
}
if (lastIndex !== index) out += html.substring(lastIndex, index);
lastIndex = index + 1;
out += escape;
}
return lastIndex !== index ? out + html.substring(lastIndex, index) : out;
} Benchmark log:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this rebases and passes tests I think we can release it as v1.26? the benchmarks look promising :)
I tagged this as v2.0 because this is potentially a breaking change to some people. As mentioned above, I also don't see any urgency to merge this. This PR doesn't block anything, nor is the improvement felt by our users.
That benchmark isn't for this change but the faster HTML escape function. Despite removing a deep copy of every token stream, this PR doesn't actually improve performance. |
Implemented in v2. |
This PR changes the behavior of
Token.stringify
so that the function will now encode strings itself rather than relying on pre-encoded token streams.This also simplified
util.encode
because it doesn't need to be able to handle token streams anymore.It's also more efficient like this because
encode
doesn't have to create a deep copy of the token stream.Btw. No language or plugin uses
encode
, so this should be a purely internal change.