-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NO QA] Fix random numbers again #10700
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks way simpler and great write up.
Definitely curious to see where the previous approach fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this implementation is vulnerable to something I just learned about called a "modulo bias"?
- https://stackoverflow.com/questions/10984974/why-do-people-say-there-is-modulo-bias-when-using-a-random-number-generator/10984975#10984975
- https://crypto.stackexchange.com/questions/39186/what-does-it-mean-for-a-random-number-generator-to-be-cryptographically-secure/39188#39188?newreg=6e4d42032f4f452f8257c311c81413bd
- https://pthree.org/2018/06/13/why-the-multiply-and-floor-rng-method-is-biased/
- https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/
} | ||
|
||
return result; | ||
return (Math.floor(Math.random() * (2 ** 21)) * (2 ** 32)) + Math.floor(Math.random() * (2 ** 32)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see this moved over to NumberUtils.js instead of ReportUtils.js
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that, but it really shouldn't be used for anything but report ID's, everything else will be 64-bits. I don't have a hugely strong opinion, but leaving it here at least implies not to use it for other IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll be using it for IOUReportIDs too, since those are integers as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that leaving it in reportUtils is fine since it's only used for reportIDs
Yes, probably. It's probably vulnerable to probably most RNG problems. It's specifically not cryptographically secure per the spec for Math.random(), which is why Crypto.getRandomValues() exists. In theory, if we generate 32 bit numbers and only ever multiply them by even powers of two, then the modulo problem shouldn't be a thing. I.e., if we generate an underlying 32 bit random number and scale it to fit in 21 bits (like we do here), then this shouldn't be an issue, since these numbers always divide into each other easily. For example, 2^21 / 2^32 = exactly 2048. Each number in a 2^21 range should have exactly the same chance (2048 of the total 4294967296 possible options) of getting generated from 32 bits of randomness. But we don't know the details of every Math.random() implementation we're using, so we're not really sure. It shouldn't matter critically that we have slightly less than perfectly even distributions here, since the amount of possible options is still so high. If you have a There's lots of details to care about with RNGs, but this 53-bit generator just has to be good enough to get us through until we can deprecate oldDot with the fewest number of collisions in the meantime (then we need to worry about our 64-bit generator). The "real" solution is probably to use The real goal here is just to be able to generate numbers that don't overlap for a year or two, so we don't need to be as good as NSA encryption keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining!
Yes, thank's for explaining!
So a potential problem is that we don't know how many bits are used in |
✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release. |
We're basically basing our data on this 7-year-old SO post: https://stackoverflow.com/questions/3344447/precision-of-math-random#:~:text=It's%20browser%2FJavaScript%20engine%20dependent,16%20decimals%2C%20see%20Sly1024's%20answer There's not really a way to know without looking at the source for |
There's actually a comment in there saying:
That might be true, I haven't looked at recent Chrome source. |
🚀 Deployed to staging by @roryabraham in version: 1.1.95-0 🚀
|
A bit more testing required. Feel free to review.
Thread is here: https://expensify.slack.com/archives/C02HWMSMZEC/p1661439005224499
Details
Upon discussion with @quinthar, I got to testing the existing implementation here and found it resulting in a significant number of collisions (5707 collisions in 20,000,000 IDs).
David had a suggestion for a simpler implementation. it seems like it does almost exactly the same thing but does not result in the very high number of collisions. I have not yet discovered the problem with the older algorithm that resulted in the collisions.
Let me explain the new code:
We take:
And add:
And return it.
The second part is pretty straightforward, we add a random 32-bit number. What about the first part? First we generate a 21-bit number (we're aiming for 53 total bits, remember. 32 + 21 = 53):
Then we multiply by:
This has the effect of moving the 21 bits we generated 32 places to the left. If you're bad at binary math, think of it with decimal. Let's say we want an 8-digit number, but we can only generate 6 digits at a time (because 32 digits is a lot to type). Remember that a bit is a binary digit.
Anyway, we want an 8 digit decimal number. First we generate a two digit decimal number:
Now we over it over to the left by 6 places:
10^6 is 1,000,000. Multiplying our example value
42
by1000000
gives42000000
. See, we've moved 42 6 places to the left. Now binary is the same, except instead of raising 10 to the 6th power for six digits, we raise 2 to the 32nd power for 32 binary digits.Then we add the rest to the end. Generate a second 6-digit number:
It doesn't matter what number this generates, it will fit in the six
0
s at the right ofrandomNum
, so we can just add it to42,000,000
and we'll have some random 8-digit decimal number.We've done the exact same thing in binary but with longer numbers.
Fixed Issues
$ https://github.com/Expensify/Expensify/issues/224792
Tests
Here's the elaborate testing process that was used for this:
Using the following test page, generate 1,000,000 random reportIDs in chrome. Repeat this 20x by refreshing, each time copying the output to the end of a text file:
I saved the output in
final_20m_test.txt
. I was going to attach it here, but it's 337mb, so I put it onbastion1.sjc
if you want to get it, it's in/home/tyler/final_20m_test.txt.zip
for you to look at.Then run the following command to sort the output from your random number generator:
This takes a few minutes (it's 337mb of text) and saves all the ids in (lexically, not numerically, but it doesn't matter) sorted order.
Once sorted, run it through
uniq
:This will output any line that occurs twice. If there's no output, there's no collisions. The above set I generated has no collisions. I've done this twice to verify. You're welcome to repeat.
Note: this has only been tried with Chrome's
Math.random()
and nothing is guaranteed from any other implementation, but this seems to work correctly.Simple test run making sure web doesn't seem broken:
PR Review Checklist
Contributor (PR Author) Checklist
### Fixed Issues
section aboveTests
sectionQA steps
sectiontoggleReport
and notonIconClick
)src/languages/*
filesSTYLE.md
) were followedAvatar
, I verified the components usingAvatar
are working as expected)/** comment above it */
displayName
propertythis
properly so there are no scoping issues (i.e. foronClick={this.submit}
the methodthis.submit
should be bound tothis
in the constructor)this
are necessary to be bound (i.e. avoidthis.submit = this.submit.bind(this);
ifthis.submit
is never passed to a component event handler likeonClick
)StyleUtils.getBackgroundAndBorderStyle(themeColors.componentBG
)Avatar
is modified, I verified thatAvatar
is working as expected in all cases)PR Reviewer Checklist
The Contributor+ will copy/paste it into a new comment and complete it after the author checklist is completed
### Fixed Issues
section aboveTests
sectionQA steps
sectiontoggleReport
and notonIconClick
).src/languages/*
filesSTYLE.md
) were followedAvatar
, I verified the components usingAvatar
have been tested & I retested again)/** comment above it */
displayName
propertythis
properly so there are no scoping issues (i.e. foronClick={this.submit}
the methodthis.submit
should be bound tothis
in the constructor)this
are necessary to be bound (i.e. avoidthis.submit = this.submit.bind(this);
ifthis.submit
is never passed to a component event handler likeonClick
)StyleUtils.getBackgroundAndBorderStyle(themeColors.componentBG
)Avatar
is modified, I verified thatAvatar
is working as expected in all cases)QA Steps
Screenshots
Web
Mobile Web
Desktop
iOS
Android