
faster randn!(::MersenneTwister, ::Array{Float64}) #35078

Merged · 2 commits · Apr 24, 2020

Conversation

@rfourquet (Member) commented Mar 11, 2020

The main idea is that

  1. randn() essentially needs one random Float64 value (a bit more in rare cases)
  2. rand!(rng::MersenneTwister, ::Array{Float64}) is optimized compared to calling rand(rng, Float64) in a loop

So for randn!, let's use this optimized version: fill the array with uniform float values, then do a second pass to transform them into normal values. The same applies to randexp!, with a smaller improvement.
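The two-pass idea can be sketched as follows. This is an illustrative version, not the PR's actual code: it uses a Box–Muller transform in the second pass for self-containedness, whereas the PR reuses the ziggurat-based transform from scalar randn; the name `randn_twopass!` is made up for this sketch.

```julia
using Random

# Pass 1: fill the array with uniforms via the optimized bulk rand!.
# Pass 2: transform in place to normal values (Box–Muller here).
function randn_twopass!(rng::AbstractRNG, A::Array{Float64})
    n = length(A)
    rand!(rng, A)                    # pass 1: fast bulk uniform fill
    i = 1
    while i + 1 <= n                 # pass 2: consume uniforms in pairs
        r = sqrt(-2 * log1p(-A[i]))  # log1p(-u) avoids log(0) at u == 0
        θ = 2π * A[i + 1]
        A[i], A[i + 1] = r * cos(θ), r * sin(θ)
        i += 2
    end
    isodd(n) && (A[n] = randn(rng))  # odd leftover element: scalar path
    return A
end
```

As in the PR, a real implementation would fall back to the plain scalar loop below some length threshold, since the second pass only pays off for larger arrays.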

Below an array-length threshold somewhere between 10 and 20, this refined algorithm is less performant, so we switch algorithms depending on the length. Here are the speed-ups on my machine:

| length   | 5    | 10   | 50   | 100  | 300  | 600  | 1 000 | 10 000 | 100 000 | 1 000 000 |
|----------|------|------|------|------|------|------|-------|--------|---------|-----------|
| randn!   | 0.88 | 1.04 | 1.57 | 1.70 | 1.81 | 1.89 | 1.92  | 1.94   | 1.92    | 1.85      |
| randexp! | 0.87 | 1.14 | 1.27 | 1.37 | 1.37 | 1.34 | 1.42  | 1.39   | 1.37    | 1.31      |

So for very small arrays there is a slight slowdown.

As future work, it would be interesting to see whether this also applies to other AbstractArrays, and to other RNGs that have optimized array generation; in that case we could add a "trait" for enabling this improvement.
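Such a trait might look like the following. This is purely hypothetical (nothing like it is in the PR); `has_fast_array_rand` and `pick_randn_path` are invented names for illustration:

```julia
using Random

# Hypothetical trait: RNG types opt in to the two-pass array path
# when they have an optimized bulk rand! for arrays.
has_fast_array_rand(::Type{<:AbstractRNG}) = false
has_fast_array_rand(::Type{MersenneTwister}) = true

# Dispatch on the trait to pick an algorithm (illustrative only).
pick_randn_path(rng::AbstractRNG) =
    has_fast_array_rand(typeof(rng)) ? :two_pass : :scalar_loop
```

An RNG package could then enable the fast path for its own type with a one-line method definition.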

@rfourquet rfourquet added performance Must go faster randomness Random number generation and the Random stdlib labels Mar 11, 2020
@KristofferC (Member)

@nanosoldier runtests(ALL; vs =":master")

@KristofferC (Member)

I just want to see how much of the ecosystem tests this will break.

@nanosoldier (Collaborator)

Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt

@rfourquet (Member, Author)

Wow, when I had the idea for this change last week, I was surprised not to have thought of it earlier, and then I just found a 5+ year-old local branch with this exact same change 🙈 (maybe the performance improvement wasn't worthwhile back then?)

Anyway, someone willing to review?

@rfourquet rfourquet force-pushed the rf/faster-randn-bang branch from c235be7 to b44c0d5 Compare April 21, 2020 10:48
@rfourquet (Member, Author)

Ok to merge?

@StefanKarpinski (Member)

Go for it!

@KristofferC (Member)

Would be good to open issues at the packages that started failing due to this change.

@rfourquet (Member, Author)

> Would be good to open issues at the packages that started failing due to this change.

I'm surprised by this suggestion and would shy away from it; I would be worried about spamming. Is there precedent for that? I remember fixing a few repos before Julia 0.7 to adjust them to similar changes at the time, but now the random streams are expected to change in minor versions, and package authors will be notified anyway, either by reading the NEWS or by running their tests. I can do it, but I'm reluctant.

@StefanKarpinski (Member)

If upgrading Julia breaks the tests, I'd imagine most package authors would want to be proactively notified about that, so I'm not sure that's terribly spammy. They can always ignore or close issues.

@KristofferC (Member) commented Apr 21, 2020

> I'm surprised by this suggestion, and would shy away from it

Fine, but it will just end up with me filing these issues when I run PkgEval for the 1.5 release, with 63 extra regressions in the logs to go through, trying to figure out whether each test failure is due to changes in the random number stream or to something we actually broke in Julia.

@rfourquet (Member, Author)

> but it will just end up with me filing these issues when I run PkgEval for the 1.5 release

Oh, I didn't realize you were doing that; then sure, I will file them for these errors. But I'm afraid that in many cases this won't be fixed by the time you run PkgEval for 1.5, so you will still have some work :-/

@KristofferC (Member)

Sure, but at least there will be an issue there and I can quickly move on.

People do tend to fix these issues though:

[screenshot omitted]

@rfourquet (Member, Author)

> People do tend to fix these issues though:

Ok cool, I was just imagining that many maintainers don't build Julia from source and would wait for prerelease builds before trying to fix the failures.

> there will be an issue there and I can quickly move on.

I will include "Julia 1.5" in the titles, in case that helps you find these issues. I might wait until after #29240 is merged, in case it causes test failures in the same packages.

@rfourquet rfourquet added the minor change Marginal behavior change acceptable for a minor release label Apr 22, 2020
@rfourquet rfourquet force-pushed the rf/faster-randn-bang branch from b44c0d5 to aaa8f3a Compare April 22, 2020 10:31
@rfourquet (Member, Author)

I fixed a small bug, which makes randn! even slightly faster, but unfortunately it means I have to re-run PkgEval...

@rfourquet (Member, Author)

@nanosoldier runtests(ALL; vs =":master")

@nanosoldier (Collaborator)

Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt

This was referenced Apr 24, 2020
@davidssmith (Contributor)

You guys are making it very hard to use Julia for reproducible research when you change something as fundamental as this in a supposedly stable version of your language.

@KristofferC (Member) commented Aug 23, 2020

Use the same Julia version for reproducible research? Getting the exact same random numbers for a given seed has nothing to do with the stability of the language.

@davidssmith (Contributor)

True, but my end users keep grabbing the latest version and then complaining when something doesn't work, because you guys are changing so much post-supposedly-stable-1.0.x. This doesn't happen with Matlab. If a barely computer literate user grabs the latest version of Matlab and wants to run your script from a paper published five years ago, it will usually just work. When is Julia planning on getting there?

Also, I bet calling something like Random.seed!(myseed) with an unannotated scalar is the most common type of initialization, and if you want people to port their algorithms to Julia, that use case needs to create the same sequence as it would in other languages. Maybe it does now, but it can't have done so both before and now, since it was changed.

@StefanKarpinski (Member) commented Aug 23, 2020

The default RNG is documented not to be reproducible across Julia versions. I can understand that you might find this annoying, but it is an intentional policy that allows us to make major improvements to the default RNGs, including several that landed in the 1.5 release of Julia.

Arguably more importantly, it allows even deeper changes like #34852, which will likely be in Julia 1.6. That PR replaces the now-quite-old Mersenne Twister (MT) RNG with the much faster, smaller, and statistically better xoshiro* RNG. This not only speeds up sequential RNG performance even more than the 1.5 changes to the default RNG do, but also guarantees that, for the same Julia version, RNG sequences are reproducible even in the presence of multithreading. There is no way that kind of guarantee could be satisfied with an RNG as large as MT while maintaining acceptable performance characteristics.

There is no way any of this would be possible if we did what Matlab does and guaranteed the same exact RNG output forever. Matlab has made a different choice, favoring never changing the default RNG outputs over major speed gains. As a result, they are now stuck with an old, slow, bloated and statistically unsound RNG (MT) and many suboptimal algorithms for using it to generate sequences. To address this question:

> When is Julia planning on getting there?

The answer is "never"—that is not a goal for the project. If a new, better RNG comes along that's better than xoshiro*, we will change the behavior again. In fact, it may be a good idea for us to intentionally change RNG sequences with every release so that people are forced not to rely on it.

There are various ways to deal with changing RNG sequences in Julia:

  • Don't give users the impression that they should be able to reproduce exact RNG sequences.
  • Explicitly document the version of Julia that is required in order to reproduce exact results.
  • Use https://github.com/rfourquet/StableRNGs.jl to provide explicitly reproducible RNG sequences.
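To make the distinction concrete, here is a stdlib-only sketch of the within-version guarantee; for cross-version stability, the StableRNGs.jl package mentioned above offers the same pattern with a version-stable stream:

```julia
using Random

# Seeding reproduces the stream within a single Julia version only.
# For cross-version stability, swap in StableRNGs.jl, e.g.:
#   using StableRNGs; rng = StableRNG(123)
a = randn(MersenneTwister(2020), 5)
b = randn(MersenneTwister(2020), 5)
@assert a == b   # same version + same seed => same stream
```

A script that records both its seed and the Julia version it ran under covers the second bullet point as well.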

@davidssmith (Contributor)

I understand -- thanks for the detailed explanation, Stefan. I had always assumed Julia's goal was to be "Matlab done right" and assumed that it meant reaching some asymptotic behavior, but I can see the argument against that.

This old dude will need to re-evaluate how he uses Julia for open-sourced research codes. Users want to get all the speed improvements of the new releases, but busy PIs like me don't want to have to patch old code bases every release either. It would be nice to have an advice document on how to use Julia for research codes so that this conflict is minimized, e.g. stuff like your bullet points above.

@StefanKarpinski (Member)

You definitely shouldn't have to patch things with each release. A best practices doc for RNGs would be good.
