Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate std/mersenne #18395

Merged
merged 20 commits into from
Jul 5, 2021
Merged

deprecate std/mersenne #18395

merged 20 commits into from
Jul 5, 2021

Conversation

qaziquza
Copy link
Contributor

I added a couple of functions to the Mersenne Twister module. I am sure that very few people use the module, so I picked it to be safe, as I am an incompetent algorithm writer.

@timotheecour
Copy link
Member

timotheecour commented Jun 30, 2021

  • not DRY
  • not generic
  • not idiomatic
  • comments like ## Might be useful at some point to someone. (wrongly placed for docgen and, well, ...)

what we need instead:

allowing reusing std/random with a generic interface allowing to plugin MersenneTwister or sysrand instead of the default random number generator

@qaziquza
Copy link
Contributor Author

qaziquza commented Jun 30, 2021

I meant "Replace lots of useless code with a generic proc."

@timotheecour
Copy link
Member

I meant "Replace lots of useless code with a generic proc."

you can git rebase -i and force-push to change past commits

the problem with proc sample*[T](mt: var MersenneTwister, arr: seq[T]): T = is that it tries to mirror random.sample, and then soon enough we'll have to mirror the other procs too (gauss etc).

Clearly, this API mirroring is not a good design.

A good design would allow calling gauss, sample, etc generically on the random generator type

@konsumlamm
Copy link
Contributor

A good design would allow calling gauss, sample, etc generically on the random generator type

I feel like a RNG concept would be nice here (ofc that'd have to wait for concepts to stabilize), similar to Rust's Rng trait.

@Araq
Copy link
Member

Araq commented Jun 30, 2021

What's wrong with the default random number generator? I don't want interfaces everywhere, the parametrization either causes code bloat or indirect function calls...

@konsumlamm
Copy link
Contributor

What's wrong with the default random number generator? I don't want interfaces everywhere, the parametrization either causes code bloat or indirect function calls...

For one, the default RNG isn't cryptographically secure. Though I don't know why the mersenne module exists in the first place. But even for only two different RNGs, it would be nice to have a common interface, instead of manually duplicating all the functions, imo.

@Araq
Copy link
Member

Araq commented Jun 30, 2021

But even for only two different RNGs, it would be nice to have a common interface, instead of manually duplicating all the functions, imo.

Well we can consider that once we have a second RNG that anybody out there cares about. :-)

@konsumlamm
Copy link
Contributor

But even for only two different RNGs, it would be nice to have a common interface, instead of manually duplicating all the functions, imo.

Well we can consider that once we have a second RNG that anybody out there cares about. :-)

So if apparently noone cares about std/mersenne and std/sysrand, then why were they added in the first place?

@Araq
Copy link
Member

Araq commented Jun 30, 2021

So if apparently noone cares about std/mersenne and std/sysrand, then why were they added in the first place?

mersenne is archaic and I don't know to which extend sysrand can implement a sane interface as there is some fallback to /dev/urandom (which is terrible).

@konsumlamm
Copy link
Contributor

konsumlamm commented Jun 30, 2021

mersenne is archaic

Hmm, perhaps we should just deprecate it then (and recommend random instead)?

@Vindaar
Copy link
Contributor

Vindaar commented Jun 30, 2021

So if apparently noone cares about std/mersenne and std/sysrand, then why were they added in the first place?

I'm a physicist and care about it for example. If only, because in ROOT (CERN's HEP software framework) the class TRandom3 is based on the Mersenne twister and is still very commonly used for Monte Carlo simulations in high energy physics.

@qaziquza
Copy link
Contributor Author

qaziquza commented Jun 30, 2021

Make it a nimble package? An alternative to the stdlib random? Those who want it can use it, and those who don't are not bothered?

@Vindaar
Copy link
Contributor

Vindaar commented Jun 30, 2021

Make it a nimble package? An alternative to the stdlib random? Those who want it can use it, and those who don't are not bothered?

Not sure if you're replying to me or not.

But personally I don't care whether it lives in the stdlib or elsewhere. I care about the fact that the thing is implemented.
For sampling I use alea already anyway, which precisely allows to wrap any RNG.

@Araq
Copy link
Member

Araq commented Jul 1, 2021

Make it a nimble package? An alternative to the stdlib random? Those who want it can use it, and those who don't are not bothered?

Please make it a Nimble package.

@timotheecour
Copy link
Member

timotheecour commented Jul 1, 2021

What's wrong with the default random number generator? I don't want interfaces everywhere, the parametrization either causes code bloat or indirect function calls...

there's little point in RT parametrization on the RNG type so let's just discuss the CT parametrization via eg

proc gauss(r: var Rand; mu = 0.0; sigma = 1.0): float
etc...

=>
type RngConcept* = concept
  proc next(a: var Self): uint64
  proc skipRandomNumbers(a: var Self) # could be optional

proc gauss(r: var RngConcept; mu = 0.0; sigma = 1.0): float
etc...

really not much difference complexity wise.

I don't know to which extend sysrand can implement a sane interface as there is some fallback to /dev/urandom (which is terrible).

it wouldn't cause code bloat in the generated binary any more than what the program calls, so if you only instantiate it with only std/random's default RNG, it wouldn't increase binary size.

I'm not sure why it matters that it can fallback to /dev/urandom on some systems; such systems would anyway have that limitation if they wanted to implement gauss from scratch from urandom source of randomness. Note that urandom data can be grabbed in batches to avoid 1 call per requested value analog to how fread works via buffering

The point is we can have gauss, sample, etc available generically to std/sysrandom, the default RNG in std/random, as well as 3rd party libs that conform to RngConcept, for free.

@Araq
Copy link
Member

Araq commented Jul 1, 2021

I'm not sure why it matters that it can fallback to /dev/urandom on some systems

It means that the operation can fail.

@qaziquza
Copy link
Contributor Author

qaziquza commented Jul 1, 2021

Should I remove or deprecate Mersenne, and create a new Github repo with Mersenne in it?

@timotheecour
Copy link
Member

timotheecour commented Jul 1, 2021

It means that the operation can fail.

/dev/urandom (unlike /dev/random) is non-blocking, and what should be used as fallback (and that's indeed what we do). https://www.2uo.de/myths-about-urandom Is the best resource I found on this topic, and a good read.

If it fails (say with EINTR) we just retry, just like we do in io.readLine:

# in readLine:
    while true:
      # fixes #9634; this pattern may need to be abstracted as a template if reused;
      # likely other io procs need this for correctness.
      fgetsSuccess = c_fgets(addr line[pos], sp.cint, f) != nil
      if fgetsSuccess: break
      if errno == EINTR:
        errno = 0
        c_clearerr(f)
        continue
      checkErr(f)
      break

Again, the main point is that someone who wants to implement gauss/sample/uniform from scratch on top of a CSPRNG like system randomness would have no benefit in doing so over simply reusing existing code from std/random generalized as mentioned earlier to be generic on RngConcept.

@Araq
Copy link
Member

Araq commented Jul 2, 2021

https://www.2uo.de/myths-about-urandom Is the best resource I found on this topic, and a good read.

It's irrelevant. You simply cannot read from /dev/urandom and know what you're gonna get, it might just be a file of this name. Yes, I know if this happens that the system is seriously compromised and there are lots of whack-a-mole solutions just so that we can continue to live in the dream land where pseudo-paths can be used instead of APIs. But the correct solution is to finally admit that Unix is a terrible design and to use a real API, and only the API.

@timotheecour
Copy link
Member

You simply cannot read from /dev/urandom and know what you're gonna get, it might just be a file of this name

meh, we can just add enforce cfile.st_mode == S_ISCHR, "/dev/urandom is not of the expected kind (expecting character special), your system is broken"

Which I've never heard of, seems like a hypothetical scenario. Again, this is a fallback, almost all systems (and all the OSes officially supported by nim) in use have APIs for that and don't need to /dev/urandom direcly (as can be seen in https://nim-lang.github.io/Nim/sysrand.html#urandom%2CNatural)

@Araq
Copy link
Member

Araq commented Jul 2, 2021

Just because it's "character special" that doesn't mean that it produces random numbers... ;-)

@c-blake
Copy link
Contributor

c-blake commented Jul 2, 2021

If a system is so compromised then hard linking /dev/zero (in some sense the least random numbers) to /dev/urandom might not even be a surprising strategy and /dev/zero is a character device file. Lol. Yes, these poor folks are clearly screwed security-wise...It is true that this is just a fallback, but some perfect RN situation also seems off point.

One, probably slow source of true random numbers for seeding/crypto/other performance insensitive applications and one PRNG seem enough for the stdlib. The application areas are distinct enough that I don't think a "true Gaussian" adds much value. It is also unreasonable/out of scope for the stdlib to keep up with all the probability distributions that might get requested. So, I doubt we need layering/multisource RNs in the in-stdlib.

There are already not one but three nimble packages with their own implementations of mersenne, actually - nim-random and nimscripter and libtcod. alea which allows layering backend sources via concepts could be made to pick one of those to depend upon (or add its own 4th!).

So, this PR could simply deprecate mersenne. If @pyautogui wants to lift it into a 4th package that only provides just that then this also seems reasonable. Then after a deprecation cycle, it can become deletion.

To help understand how much disturbance this might cause, I searched a clone of the nimbleverse I have. The only packages I could find that seem to depend upon mersenne being in the stdlib are two with example code only:

desim http://github.com/jayvanderwall/desim
plotly https://github.com/brentp/nim-plotly

and two that actually use it:

perlin  https://github.com/Nycto/PerlinNim
  (just 1 line of code; easily changed)
opengl-sandbox  https://github.com/krux02/opengl-sandbox
  (this uses it the most - like 4 or 5 places)

There is who knows how much non-nimble "dark code" out there, of course..but only one real usage in 1500 packages suggests extremely weak reliance/low perturbation to deprecating now.

@qaziquza
Copy link
Contributor Author

qaziquza commented Jul 2, 2021

Yes, that sounds good. I am added {. deprecated .} pragmas currently.

lib/pure/mersenne.nim Outdated Show resolved Hide resolved
lib/pure/mersenne.nim Outdated Show resolved Hide resolved
@konsumlamm
Copy link
Contributor

I'd also add {.deprecated: "use std/random instead".} at the top level, to deprecate the module itself and not only everything inside it.

@timotheecour
Copy link
Member

timotheecour commented Jul 2, 2021

I'd also add

I'd only add. No need to deprecate individual procs, just the module

lib/pure/mersenne.nim Outdated Show resolved Hide resolved
lib/pure/mersenne.nim Outdated Show resolved Hide resolved
lib/pure/mersenne.nim Show resolved Hide resolved
lib/pure/mersenne.nim Outdated Show resolved Hide resolved
tests/stdlib/tmersenne.nim Outdated Show resolved Hide resolved
pyautogui and others added 4 commits July 2, 2021 22:53
@c-blake
Copy link
Contributor

c-blake commented Jul 3, 2021

You should maybe change the title of this PR to "deprecate mersenne" or similar and downvoters should perhaps reconsider their votes. :-) The whitespace changes seem a bit superfluous, but 🤷 also harmless. Other than that, it's looking fine.

lib/pure/mersenne.nim Outdated Show resolved Hide resolved
@timotheecour timotheecour changed the title Add to the Mersenne Twister. deprecate std/mersenne Jul 3, 2021
Co-authored-by: Timothee Cour <[email protected]>
@Araq
Copy link
Member

Araq commented Jul 3, 2021

Needs a changelog entry.

changelog.md Outdated Show resolved Hide resolved
changelog.md Outdated Show resolved Hide resolved
@timotheecour timotheecour merged commit 927a832 into nim-lang:devel Jul 5, 2021
@konsumlamm konsumlamm mentioned this pull request Jul 6, 2021
PMunch pushed a commit to PMunch/Nim that referenced this pull request Mar 28, 2022
Co-authored-by: konsumlamm <[email protected]>
Co-authored-by: Timothee Cour <[email protected]>
Co-authored-by: Andreas Rumpf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants