Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract all PNRG methods to a shard under crystal-lang org, leaving only PCG32 in the stdlib #4634

Closed
mverzilli opened this issue Jun 29, 2017 · 23 comments · Fixed by #4636
Closed

Comments

@mverzilli
Copy link

PR #4536 introduced the best PNRG method Crystal has had to date. As we want to remove anything that isn't essential from the standard library, we will only keep PCG32 and create a new official shard to provide the rest of the PNRG methods.

PRs to remove the methods from the stdlib are welcome!

@mverzilli
Copy link
Author

Also, create a shard with the extracted methods here: https://github.com/crystal-lang/crystal-random

@konovod
Copy link
Contributor

konovod commented Jun 29, 2017

It seems I can't create a PR to shard as it's empty.

@oprypin
Copy link
Member

oprypin commented Jun 29, 2017

PCG32 has not proven itself by being used in the standard library of any serious programming language, or otherwise. Making it the default was bad, removing alternatives is madness.

@straight-shoota
Copy link
Member

If it doesn't prove itself, it can certainly be replaced before 1.0. But there is surely no value in having more than one generator in the stdlib.

@mverzilli
Copy link
Author

@oprypin do you know of any test suites to compare PRNGs? Everyone in the #4536 agreed that it was an appropriate default for the stdlib. The only way to settle this discussion is by running tests and comparing metrics: performance, coverage, etc.

@oprypin
Copy link
Member

oprypin commented Jun 29, 2017

comparing metrics: performance, coverage

That gets us nowhere. return 4 would easily win.

Without being deeply involved in the field, the only way to compare PRNGs is by the level of trust. There is no rigorous way to prove randomness.

@konovod
Copy link
Contributor

konovod commented Jun 29, 2017

PCG32 has not proven itself by being used in the standard library of any serious programming language

Yes, and I wonder why. Xorshift varions are used in several languares though (as I mentioned in #4493) - would you prefer it? Is it discussion of Xorshift vs PCG or MT19937+ISAAC vs PCG?
Few more links about PRNGS:
https://sunoru.github.io/RandomNumbers.jl/stable/man/benchmark/
https://cocoawithlove.com/blog/2016/05/19/random-numbers.html
https://www.reddit.com/r/programming/comments/4gtlfz/xoroshiro128_the_fastest_fullperiod_pseudorandom/
It seems that most people agree that xoroshiro is slightly faster than PCG. Same was results on my PC, but @ysbaddaden have results on both x32 and x64 machines where PCG was faster. The only thing everybody agree - Mersenne Twister is outdated. It is slower, uses more memory, easily predictable and fail some statistical tests. So IMHO replacing it with any modern PRNG is a good thing. Of those PRNGS the choice is (again, IMHO) between PCG and Xorshift. Personally i'm slightly inclined towards PCG, as I was told about it on local gamedev forum by some people with serious mathematic background. But yes, several languages prefer xorshift, and PCG is somewhy isn't used in any stdlib.

There is no rigorous way to prove randomness.

There is a TestU01. Both PCG and Xorshift passes it.

@oprypin
Copy link
Member

oprypin commented Jun 29, 2017

I don't care so much about merits of PRNGs and I am not directing the communication that way. All I'm saying is the standard library should have a trusted PRNG, not only one that performs fast or whatever.

@konovod
Copy link
Contributor

konovod commented Jun 29, 2017

@oprypin so is a PRNG that is used by Erlang, JavaScript V8, Nim and Rust (Rust is using ISAAC by default though) trusted enough? Would it better if we switch to it instead of PCG?

Another idea is to let ISAAC stay (as @ysbaddaden proposed IIRC) - it is cryptographically secure, time-proven and requires zero efforts as it is already here.

@ysbaddaden
Copy link
Contributor

I'm not knowledgeable enough on the topic of randomness, so I asked @jedisct1 (libsodium author):

PCG’s good. I would use ChaCha20/12 if speed is not the primary concern, though.

Everything I read says the same: PCG32 and xorshift* are both good. They pass the TestU01 statistical test suite, and they're fast. They aren't cryptographically secure AFAIK, so documentation should state that, and redirect to Random::System, and ChaCha20 (maybe ISAAC), available in the https://github.com/crystal-lang/crystal-random Shard.

@sempervictus
Copy link

True entropy is really the provenance of the OS - has access to boot time entropy pools, mixes inputs, and has direct hwrng access if available. Software-only solutions in userspace are inherently at a disadvantage against attackers, so I'd lean towards a performant prng in stdlib, with instructions for users on getting heavy lifting done by the OS if possible, or using a proven slower one from a separate lib.

@ysbaddaden
Copy link
Contributor

I think distributing a proven and trusted secure PRNG along the faster default and the system source is a good idea. We'd have a fast yet good default, a secure but system-specific solution, and a secure solution using the same PRNG, whatever the underlying system.

Maybe we could introduce ChaCha12 (or ChaCha20 if we're conservative) instead of ISAAC. The implementation is simpler, it's well trusted and used in many places. ChaCha20 performed poorly in non release mode (which may or may not a concern) but I suppose ChaCha12 would do better?

@RX14
Copy link
Contributor

RX14 commented Jul 16, 2017

this stackexchange answer seems to agree that ChaCha12 is a fine choice. PCG32 seems to be fast and "challenging" to predict compared to MT19937, but still not fully secure. Therefore PCG32 appears to be a very good all-rounder, a good default. ChaCha12 seems to be a good choice for cryptographically secure random. In addition to providing /dev/urandom, I think this is the best solution.

@konovod
Copy link
Contributor

konovod commented Jul 16, 2017

ChaCha12 should take 60% time of ChaCha20 +- overhead. I've updated my benchmark at https://github.com/konovod/prngs with it, and added to table results in non-release mode.
Note that this is an implementation based on ChaCha20 RFC/docs, not an optimized SSE version mentioned at pcg-random or (possibly optimized?) version used in arc4random.

Speaking about performance, I have only basic experience in optimization, didn't tried to check generated assembly code (and don't know SSE anyway).
But there is a huge increase in ChaCha12\20 speed in release mode (and it is still passing specs in it), so perhaps LLVM knows it's work.

Also, both PCG32 and ChaCha20 (and also XorShift) use rotl command (rotating shift). According to @funny-falcon comment, crystal fails to convert it to single instruction (i didn't checked it in my case). So perhaps all three algorithms could be made much faster if there would be the way to generate rot instruction (but it seems more like an LLVM problem).

About ISAAC - i don't know about attacks on it, there is even unclaimed bounty on it's page for discovering a seed from sequence, but it is based on RC4 that is known to be vulnerable. So ChaCha is a safer bet.

@funny-falcon
Copy link
Contributor

crystal fails to convert it to single instruction

I didn't said it always fails. But it fails in unpredictable situations, so it should be always checked.

Looks like, combining shift+or into rotl is late in optimisations chain, and previous optimisations sometimes break this conversion.

@funny-falcon
Copy link
Contributor

but it is based on RC4

Same way SHA2 is based on MD5. But SHA2 is still not broken.

There is improved version ISAAC+ from Jean-Phillipe Aumasson "On the pseudo-random generator ISAAC"(https://131002.net/data/papers/Aum06b.pdf)

@ysbaddaden
Copy link
Contributor

Does LLVM provide intrinsics for rot or rotl that we could take advantage of?

@konovod
Copy link
Contributor

konovod commented Jul 18, 2017

AFAIK no.

@akzhan
Copy link
Contributor

akzhan commented Jul 18, 2017

There is no rot intrinsic.

LLVM sources have only macros like this:

#define rot( x, k ) (((x)<<(k)) | ((x)>>(32-(k))))

Usually it is optimized by LLVM into rot instruction as @funny-falcon already said.

@konovod
Copy link
Contributor

konovod commented Aug 4, 2017

There is improved version ISAAC+ from Jean-Phillipe Aumasson "On the pseudo-random generator ISAAC"(https://131002.net/data/papers/Aum06b.pdf)

This require very little changes to ISAAC - replace shifts to rotation, one addition to xor, add xor in one place. I did it in my prngs shard. Makes thing slower by 5-10%. The only thing that stops me from making PR - i haven't found any implementations of it. So no specs or test vectors - i can only generate some values and call it spec, hoping that i implemented it right.

@ysbaddaden
Copy link
Contributor

Interesting, but no test vectors nor usages is a show stopper, yes.

@mverzilli
Copy link
Author

Just created an empty shared at this repo: https://github.com/crystal-lang/crystal-random. PRs to move stuff from the stdlib welcome!

@straight-shoota
Copy link
Member

This issue can be closed. In stdlib there's only Random::PCG32, Random::ISAAC and Random::Secure. Everything else has been moved to https://github.com/crystal-lang/crystal-random

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants