-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buffer: improve fill & normalizeEncoding performance #18790
Conversation
b2f5427
to
f082114
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a good CITGM run.
doc/api/buffer.md
Outdated
@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3)); | |||
<!-- YAML | |||
added: v0.5.0 | |||
changes: | |||
- version: REPLACEME | |||
pr-url: https://github.com/nodejs/node/pull/REPLACEME | |||
description: Negative `end` values throw an `ERR_INDEX_OF_OUT_BOUNDS` error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERR_INDEX_OUT_OF_RANGE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
I think it would be interesting to add fill
with larger values to the benchmark.
Actual changes LGTM
} | ||
|
||
function slowCases(enc) { | ||
switch (enc.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this improves performance noticeably.
@benjamingr the performance gain for Buffer#fill goes down the bigger the buffer is. The break even point should be at about 25kb. Above that the filling will be the main time consumer. |
@BridgeAR right, but since we're adding a benchmark that will run when this code changes for a while, I think there is value in adding a larger buffer for the test case - I suspect allocating large'ish buffers is a pretty common use case. Even if there won't be a big difference here. |
|
doc/api/buffer.md
Outdated
@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3)); | |||
<!-- YAML | |||
added: v0.5.0 | |||
changes: | |||
- version: REPLACEME | |||
pr-url: https://github.com/nodejs/node/pull/REPLACEME | |||
description: Negative `end` values throw an `ERR_INDEX_OF_OUT_RANGE` error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OF_OUT
-> OUT_OF
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
1) This improves the performance for Buffer#fill by using shortcuts. 2) It also ports throwing errors to JS. That way they contain the proper error code. 3) Using negative `end` values will from now on result in an error instead of just doing nothing. 4) Passing in `null` as encoding is from now on accepted as 'utf8'.
This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark.
Due to a consolidation the isEncoding function got less strict in version 5.x.x. This commit makes sure we do not return `true` for empty strings.
332ac6e
to
c5aa244
Compare
Rebased due to conflicts. |
Landed in d3af120...452eed9 |
1) This improves the performance for Buffer#fill by using shortcuts. 2) It also ports throwing errors to JS. That way they contain the proper error code. 3) Using negative `end` values will from now on result in an error instead of just doing nothing. 4) Passing in `null` as encoding is from now on accepted as 'utf8'. PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark. PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
Due to code consolidation in nodejs#7207 the isEncoding function got less strict. This commit makes sure isEncoding returns false for empty strings as before the consolidation. PR-URL: nodejs#18790 Refs: nodejs#7207 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
1) This improves the performance for Buffer#fill by using shortcuts. 2) It also ports throwing errors to JS. That way they contain the proper error code. 3) Using negative `end` values will from now on result in an error instead of just doing nothing. 4) Passing in `null` as encoding is from now on accepted as 'utf8'. PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark. PR-URL: nodejs#18790 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
Due to code consolidation in nodejs#7207 the isEncoding function got less strict. This commit makes sure isEncoding returns false for empty strings as before the consolidation. PR-URL: nodejs#18790 Refs: nodejs#7207 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
@nodejs/security thoughts on preventing such changes in the future? My opinion is that a testcase should have been introduced at the same time when the comment was. Or, perhaps, even instead of the comment. Upd: filed #22492. |
This improves the performance of
Buffer#fill
and ofnormalizeEncoding
. The latter focuses on the common cases as can be seen in the benchmarks.I made the
Buffer.isEncoding()
implementation stricter again after it was loosened in #7207. It will not return true for an empty string anymore.normalizeEncoding
is now also stricter and it returnsundefined
forfalse
,NaN
and0
.undefined
,null
and''
are still "valid"utf8
encodings.The
Buffer#fill
implementation will now also throw an OOB error in caseend
is a negative value. This makes it consistent withstart
and it helps to identify issues since before it would just been ignored instead.Buffer#fill
will throw the errors in JS from now on in case the OOB is detected in c++ and those errors contain the proper error code from now on.Buffer#fill
will from now on also acceptnull
as validutf8
encoding in case a string is provided. That was not the case before but we do accept it in other places and that makes it more consistent.Buffer#fill performance
Normalize encoding performance
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
buffer