-
-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HighPrecision<f32/64> #531
Conversation
Oh, and I intentionally didn't include the code for HighPrecision01 from #372 since I don't want to take credit for @pitdicker's work. |
Your benchmarks show 6-10× the runtime:
Which is reasonable enough if the precision is actually needed, though as you say there may not be many use-cases where it is. So the big question is do we want to include this in Rand? You say it is orthogonal to #494 but it's not exactly, for example we could put all high-precision implementations in a companion crate.
The git commits still track who created the commit (even if you rebase or adjust it), so not a problem. |
Yeah, sticking this in a separate crate is definitely an option. Also related to discussion in #290. I'll see if I can rebase the other patch. |
1c96ce5
to
fa1f932
Compare
(The mean test is totally inadequate for checking high precision.)
Sorry for not looking at your code for so long. Can you maybe give a high-level description how
Feel free to treat the code like it is your own 😄. |
No worries. Mainly did this because it was a fun challenge. I'm not particularly opinionated about where the final implementation ends up living since I'm not sure in which situations this code is needed. It's easiest to think about only positive numbers as a start, and using 32bit numbers as an example. The implementation takes Unless we had underflow, both mantissas will be 24-bit numbers. We then convert both numbers to use the higher exponent, and shift down the mantissa to compensate. Here that means right-shifting low_mantissa, and increasing low_exponent, by 10. During shifting, if we're shifting the low value, round towards negative infinity. If we're shifting the high value, round towards positive infinity. This gives low_mantissa: 0x0000_2004 (At this point this information is saved in the Distribution object). Conceptually what we do after this is that we treat the high and the low as infinitely long numbers, by adding 0s at the end, and generate an infinitely long random number between them. We then grab enough bits of this random number, rounding towards negative infinity, that we get a 24bit number. In our case: Generate a number between In practice, the way this is done is to generate a number between the low and high mantissa using an This is repeated until we have a mantissa that is exactly 24 bits, or until we get into the underflow range (i.e. when exponent is 1). Usually only one loop is needed, except if we started with a mantissa of exactly 0. We could at this point shift left more, and add more random bits at the end. However if we do that and then shift right to get to a 24bit value we'd get back exactly to where we are now. So at this point we know that even if we had added infinitely more random bits at the end, rounding would get us back to exactly the mantissa that we have now. Finally we check that our number: is within the range that we set out to generate a number in. The only reason that we could get a number outside of the range is the rounding we did in the beginning to make both our limits have the same exponent. If we ended up with a number that's too big or too small, then start over from scratch. That's about all that's needed. Other than taking care of negative numbers. Negative numbers are handled by giving encoding the two mantissas as normal These signed integers are then used as limits in the uniform distribution. I.e. the distribution is a Once have a generated number, we check if it is negative and then grab its absolute value. This absolute value is what all subsequent math is done on. This in order to correctly measure that we have a 24bit value since this matches the IEEE754 encoding. Everything is done exactly the same other than two things:
I've skipped some details above, but this is mainly it. Do let me know if anything is still unclear. |
Oh, and we might want to move these distributions to their own file. There's very little sharing with the rest of the code in |
As mentioned in #494, I will now close this. It's a nice idea, but doesn't seem useful in |
This implements a distribution like Uniform<f32/f64>, but using the full precision of the floating point types.
There's some unfortunate things in there, like the HPFloatHelper trait. However it seems needed for the same reason that we have that we have the UniformSampler and SampleUniform traits. I went with a slightly simplified version of those traits since we don't have the same need of extensibility here.
I wrote some pretty thorough tests in order to convince myself that the implementation is correct. The full set of tests take well over 60 seconds to run on my modern MacBook Pro, so I instead have a smaller set of tests which are run by default. But the full set is also there and can be enabled by anyone tinkering with this code in the future.
Regarding relationship to issue #494, I think this PR is actually relatively orthogonal. I don't see the distribution in this PR as something that someone would ever want to use as input for other randomness related algorithms, such as weighted choices or bool generation. It's simply too bad performance and too high precision.
In fact, we might want to name this FullPrecision<...> or MaxPrecision<...> instead. And leave the name HighPrecision for something with better performance.