crypto: improve randomUUID performance #37243
a31a956
to
d504e9a
Compare
@rangoo94 could you please retarget this PR onto the master branch? (You can do it from the GitHub UI by clicking Edit and then selecting a different branch.)
d504e9a
to
0f68e30
Compare
@devsnek, thanks, and sorry, I overlooked that. I have now rebased it on top of the master branch.
a466943
to
4bbe9fe
Compare
lib/internal/crypto/random.js
Outdated
let uuidBatch = 0;

let hexBytesCache;
function getHexBytes() {
  if (hexBytesCache === undefined) {
    hexBytesCache = new Array(256);
    for (let i = 0; i < hexBytesCache.length; i++) {
      const hex = NumberPrototypeToString(i, 16);
      hexBytesCache[i] = StringPrototypePadStart(hex, 2, '0');
    }
  }
  return hexBytesCache;
}

function serializeUUID(buf, offset = 0) {
  const kHexBytes = getHexBytes();
To simplify things a bit further here, just generate the hex array on module load:
- let uuidBatch = 0;
- let hexBytesCache;
- function getHexBytes() {
-   if (hexBytesCache === undefined) {
-     hexBytesCache = new Array(256);
-     for (let i = 0; i < hexBytesCache.length; i++) {
-       const hex = NumberPrototypeToString(i, 16);
-       hexBytesCache[i] = StringPrototypePadStart(hex, 2, '0');
-     }
-   }
-   return hexBytesCache;
- }
- function serializeUUID(buf, offset = 0) {
-   const kHexBytes = getHexBytes();
+ let uuidBatch = 0;
+ const kHexBytes = new Array(256);
+ for (let i = 0; i < kHexBytes.length; i++) {
+   const hex = NumberPrototypeToString(i, 16);
+   kHexBytes[i] = StringPrototypePadStart(hex, 2, '0');
+ }
+ function serializeUUID(buf, offset = 0) {
Thanks! I like this simplification (4th variant), but it reserves 2KB of data immediately after loading crypto. I couldn't imagine a case where that would be a problem, but, for safety, I introduced a lazy getter to avoid it.
Just to confirm, does it mean that this 2KB allocation is negligible? :)
On the other hand, when kHexBytes is initialized statically without the for loop, the results are ~8% better. The downside is that it costs code space too.
Do you think that it's worth speeding it up this way instead?
const kHexBytes = [
'00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '0a', '0b', '0c',
'0d', '0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
'1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', '24', '25', '26',
'27', '28', '29', '2a', '2b', '2c', '2d', '2e', '2f', '30', '31', '32', '33',
'34', '35', '36', '37', '38', '39', '3a', '3b', '3c', '3d', '3e', '3f', '40',
'41', '42', '43', '44', '45', '46', '47', '48', '49', '4a', '4b', '4c', '4d',
'4e', '4f', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '5a',
'5b', '5c', '5d', '5e', '5f', '60', '61', '62', '63', '64', '65', '66', '67',
'68', '69', '6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74',
'75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f', '80', '81',
'82', '83', '84', '85', '86', '87', '88', '89', '8a', '8b', '8c', '8d', '8e',
'8f', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '9a', '9b',
'9c', '9d', '9e', '9f', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8',
'a9', 'aa', 'ab', 'ac', 'ad', 'ae', 'af', 'b0', 'b1', 'b2', 'b3', 'b4', 'b5',
'b6', 'b7', 'b8', 'b9', 'ba', 'bb', 'bc', 'bd', 'be', 'bf', 'c0', 'c1', 'c2',
'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'ca', 'cb', 'cc', 'cd', 'ce', 'cf',
'd0', 'd1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'da', 'db', 'dc',
'dd', 'de', 'df', 'e0', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9',
'ea', 'eb', 'ec', 'ed', 'ee', 'ef', 'f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6',
'f7', 'f8', 'f9', 'fa', 'fb', 'fc', 'fd', 'fe', 'ff'
];
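For readers following the thread, here is a minimal sketch of how a 256-entry lookup table such as kHexBytes can turn 16 random bytes into a UUIDv4 string. The body of serializeUUID is not shown in this conversation, so serializeUUIDSketch below is a hypothetical stand-in, not the code from the PR. It also builds the table with a loop for brevity rather than repeating the literal above.

```javascript
'use strict';
const { randomFillSync } = require('node:crypto');

// Build the table once: kHexBytes[0x0a] === '0a', kHexBytes[0xff] === 'ff'.
const kHexBytes = Array.from({ length: 256 }, (_, i) =>
  i.toString(16).padStart(2, '0'));

// Hypothetical helper (not the PR's serializeUUID): formats the 16 bytes
// starting at `offset` as a version 4 UUID without mutating `buf`.
function serializeUUIDSketch(buf, offset = 0) {
  // Version nibble 0100 in byte 6, variant bits 10 in byte 8.
  const b6 = (buf[offset + 6] & 0x0f) | 0x40;
  const b8 = (buf[offset + 8] & 0x3f) | 0x80;
  return (
    kHexBytes[buf[offset]] + kHexBytes[buf[offset + 1]] +
    kHexBytes[buf[offset + 2]] + kHexBytes[buf[offset + 3]] + '-' +
    kHexBytes[buf[offset + 4]] + kHexBytes[buf[offset + 5]] + '-' +
    kHexBytes[b6] + kHexBytes[buf[offset + 7]] + '-' +
    kHexBytes[b8] + kHexBytes[buf[offset + 9]] + '-' +
    kHexBytes[buf[offset + 10]] + kHexBytes[buf[offset + 11]] +
    kHexBytes[buf[offset + 12]] + kHexBytes[buf[offset + 13]] +
    kHexBytes[buf[offset + 14]] + kHexBytes[buf[offset + 15]]
  );
}

const bytes = randomFillSync(Buffer.alloc(16));
console.log(serializeUUIDSketch(bytes));
```

The table lookup replaces a toString(16) plus padding call per byte, which is where the speedup discussed above is expected to come from.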
LGTM, but I'd prefer to have approval from @nodejs/crypto before landing this.
@rangoo94 can you please rebase on top of master to solve the git conflict?
1874e77
to
3eb8e89
Compare
@aduh95 sure, rebased :)
Co-authored-by: mscdex <[email protected]>
Co-authored-by: Antoine du Hamel <[email protected]>
Co-authored-by: James M Snell <[email protected]>
3eb8e89
to
61f67cf
Compare
Rebased once again on top of |
CI: https://ci.nodejs.org/job/node-test-pull-request/36368/ Benchmark results:
PR-URL: #37243 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
Landed in 5694f7f
I arrived at similar conclusions (using the same approach) a year ago: The first implementation of my module actually only advanced the cache window by 1 byte (instead of 16), meaning that a shorter buffer could last longer. This was actually a suggestion from a developer who implemented the approach as a Crystal library (which eventually made its way into Crystal's stdlib). While it's not as secure, perhaps this could be added as an option to avoid regenerating buffers more than is necessary?
Hi @lukeed, thanks for the comment! The solution with incrementing the offset by 1 seems interesting in terms of performance (~15% faster), but it may introduce security issues in two dimensions:
While collisions could be acceptable (maybe not in the stdlib anyway), the lack of uniqueness is both very dangerous and not applicable to the standard. As an example, if the online shop would generate
Basically, I think that this idea is really great, but only in very specific circumstances, so it should rather be done as a separate library.
Right, it's less secure. That's why the suggestion came with a "behind an option" requirement :) It should definitely not be the default, but there may be use cases where the developer need not be concerned with an end-user guessing new variants.
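To make the trade-off in this exchange concrete, here is a small sketch of a cached entropy source whose 16-byte window advances by a configurable step. The names createWindowedSource and sharesShiftedBytes are hypothetical and appear in neither the PR nor the module mentioned above. The sketch only shows that with a step of 1, two consecutive windows reuse 15 of their 16 bytes.

```javascript
'use strict';
const { randomFillSync } = require('node:crypto');

// Hypothetical source: returns 16-byte windows from a cached random buffer,
// moving the window forward by `step` bytes on every call.
function createWindowedSource(cacheSize, step) {
  const cache = Buffer.alloc(cacheSize);
  let offset = cacheSize; // forces a fill on the first call
  return function next() {
    if (offset + 16 > cacheSize) {
      randomFillSync(cache);
      offset = 0;
    }
    // Copy the window so later refills cannot change a returned value.
    const window = Buffer.from(cache.subarray(offset, offset + 16));
    offset += step;
    return window;
  };
}

// True when bytes 1..15 of `a` equal bytes 0..14 of `b`, i.e. `b` is `a`
// shifted by one byte with a single new byte appended.
function sharesShiftedBytes(a, b) {
  return a.subarray(1).equals(b.subarray(0, 15));
}

const nextBy1 = createWindowedSource(64, 1);
const first = nextBy1();
const second = nextBy1();
console.log(sharesShiftedBytes(first, second)); // true
```

With a step of 16 the windows never overlap, so each returned value is made of fresh random bytes, at the cost of refilling the cache more often.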
I really like the idea of including UUIDv4 in Node.js core. I think that maximizing its performance could lead to further standardization. The initial version still had potential for improvement, so I put some effort into it.
Benchmark results
After each step (separate commit), I ran the crypto/randomUUID.js benchmark to observe the performance difference.
[Benchmark table not recovered. Surviving fragments mention: crypto module initialization; the crypto.randomUUID() call; kHexDigits; slice on the entropy cache; disableEntropyCache access; 00-ff strings (kHexBytes); and kHexBytes on the first randomUUID() call.]
Entropy cache size
Entropy cache size contributes to the performance, so I prepared a matrix of different sizes on different variants for comparison.
Increasing the entropy cache could be considered for variants 4 and 5, as it improves performance by ~10% per 1KB of additional cache.
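As a rough illustration of why the cache size matters, the sketch below counts how often a batch-filled entropy cache has to be refilled for a given number of UUIDs. The names createEntropyCache and countRefills and the batch sizes are assumptions made for this example and are not taken from the PR. The sketch counts refills only and does not reproduce the measured timings.

```javascript
'use strict';
const { randomFillSync } = require('node:crypto');

// Hypothetical cache: holds entropy for `uuidsPerBatch` UUIDs (16 bytes each)
// and refills itself with one randomFillSync() call when exhausted.
function createEntropyCache(uuidsPerBatch) {
  const cache = Buffer.alloc(uuidsPerBatch * 16);
  let used = uuidsPerBatch; // start exhausted so the first call fills
  let refills = 0;
  return {
    cache,
    // Returns the byte offset of the next unused 16-byte slot.
    nextOffset() {
      if (used === uuidsPerBatch) {
        randomFillSync(cache);
        used = 0;
        refills++;
      }
      return used++ * 16;
    },
    get refills() { return refills; },
  };
}

// Number of refills needed to serve `calls` UUIDs.
function countRefills(uuidsPerBatch, calls) {
  const source = createEntropyCache(uuidsPerBatch);
  for (let i = 0; i < calls; i++) source.nextOffset();
  return source.refills;
}

console.log(countRefills(128, 1024)); // 8 refills with a 2KB cache
console.log(countRefills(256, 1024)); // 4 refills with a 4KB cache
```

Doubling the cache halves the number of calls into the random number generator, which is the kind of fixed per-call overhead a larger cache amortizes.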
Summary
There are 3 approaches to include the improvements, depending on what is expected:
[The list of approaches was not recovered. Surviving fragment: crypto, even without randomUUID, will take ~2KB of memory.]
What are your thoughts about that?
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes