Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change uuid to use Random.RandomDevice() instead of Random.default_rng() #35872

Merged
merged 11 commits into from
May 27, 2020

Conversation

ssikdar1
Copy link
Contributor

Description

Fixes #35860

Change Random.default_rng() in the UUID functions to Random.RandomDevice()

Testing

Ran $ ./julia stdlib/UUIDs/test/runtests.jl with no issues

@StefanKarpinski
Copy link
Member

Looks good to me. A simple test that could be done after this is setting the global RNG seed to the same value twice and generating a new UUID by each of these methods after and checking that they are not the same. Before this, that would have failed, so that's a good regression test.

@rfourquet rfourquet added minor change Marginal behavior change acceptable for a minor release randomness Random number generation and the Random stdlib needs docs Documentation for this change is required needs news A NEWS entry is required for this change labels May 14, 2020
@rfourquet
Copy link
Member

Also, this needs docs and news. And maybe a "!!! compat" annotation saying that this new behavior starts in version 1.6.

@JeffBezanson
Copy link
Member

This may be the right decision but I believe RandomDevice can be reeeally slow --- are we sure this is ok?

@tkf
Copy link
Member

tkf commented May 15, 2020

(pinging @Keno who had the same concern #32954 (comment))

I think the default should be the safe one. If you know you need tons of random UUIDs, I think you can always swap the RNG argument. But warning this in the docstring would be nice.

@rfourquet
Copy link
Member

I'm not totally sold on the idea either. There is something nice in knowing that generally all randomness comes from the default_rng or explicit RNGs and reproducibility can be achieved with seed!. Generally, the default one is seeded only in tests, and I would tend to consider it bad practice when it's seeded in a package (not in tests). So using RandomDevice() here goes in the way of the user who needs reproducibility (for whatever reason, e.g. debugging). On the other hand, it's easy for applications to explicitly use RandomDevice() for UUIDs. Using RandomDevice() here is a special case and opens the door to more special cases, and special cases increase complexity.

@StefanKarpinski
Copy link
Member

I think the danger of producing the same UUIDs repeatedly when they are supposed to be universally unique is way too high. This is definitely one of those things that has caused problems because people (myself included) just don't realize when they call seed! that they will cause the same UUIDs to be generated repeatedly. Super dangerous. Especially because of the decision which I don't think we can roll back at this point to automatically call seed! at the start of each testset. The other case that's also really bad is using a default RNG for generating names of temp files, but I think that has now been fixed?

@StefanKarpinski
Copy link
Member

Combined with the fact that UUIDs are sometimes relevant for security, this seems good to me.

@rfourquet rfourquet removed needs docs Documentation for this change is required needs news A NEWS entry is required for this change labels May 15, 2020
@JeffBezanson
Copy link
Member

How about using a dedicated default RNG for UUIDs that's initialized from RandomDevice at startup?

@StefanKarpinski
Copy link
Member

StefanKarpinski commented May 15, 2020

I think that would be fine. It might be good to do something that is more secure as well, e.g. initialize a seed from RandomDevice on the first call and then do hash-based UUID generation combining that seed value with a counter using SHA256 or something like that.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented May 15, 2020

I feel like this is pretty simple and we can just do this and see if anyone actually cares about the performance of generating UUIDs. If they do, then we can consider something more complicated.

@@ -53,7 +55,7 @@ UUID("cfc395e8-590f-11e8-1f13-43a2532b2fa8")
!!! compat "Julia 1.6"
The default rng has been changed to use Random.RandomDevice as of Julia 1.6
"""
function uuid1(rng::AbstractRNG=Random.RandomDevice())
function uuid1(rng::AbstractRNG=uuid_rng)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you make this change? Creating a new RandomDevice() is extremely cheap on non-Windows, and on Windows having a global uuid_rng is not thread-safe...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rfourquet I guess misunderstood @JeffBezanson 's comment above?

How about using a dedicated default RNG for UUIDs that's initialized from RandomDevice at startup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My interpretation of it was using an RNG like default_rng() (currently MersenneTwister() which is thread-safely initialized), which is therefore immune against the unqualified Random.seed!() but way faster than RandomDevice(), but "initialized from RandomDevice()" means it's seeded from RandomDevice() at startup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But Stefan's last comment suggests to keep it how it was, because the poor performance of RandomDevice might considered as a non-problem until someone complains.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! I reverted 507b454 . I'll leave it alone for now until there's something I should explicitly add/change/remove.

@tkf
Copy link
Member

tkf commented May 24, 2020

Since it is possible that the default RNG to be changed, I think it'd be better to avoid putting Random.RandomDevice() in the signature. Also, it makes future changes easier if we only describe the behavior and not the implementation. How about something like this?

uuid4([rng::AbstractRNG]) -> UUID

Generates a version 4 (random or pseudo-random) universally unique identifier (UUID), as specified by RFC 4122.

The default rng used by uuid4 is not GLOBAL_RNG and every invocation of uuid4() without an argument should be expected to return a unique identifier. Importantly, the outputs of uuid4 do not repeat even when Random.seed!(seed) is called. Currently (as of Julia 1.6), uuid4 uses Random.RandomDevice as the default rng. However, this is an implementation detail that may change in the future.

compat "Julia 1.6"
The output of uuid4 does not dependent on GLOBAL_RNG as of Julia 1.6.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me — simple and safe, with the side benefit that it doesn't pull the complexity of all of Random to the default implementation of uuid4 (which helps greatly in #35894).

Any objections to merging and seeing how it turns out?

stdlib/UUIDs/src/UUIDs.jl Outdated Show resolved Hide resolved
stdlib/UUIDs/src/UUIDs.jl Outdated Show resolved Hide resolved
ssikdar1 and others added 2 commits May 26, 2020 12:37
Co-authored-by: Rafael Fourquet <[email protected]>
Co-authored-by: Rafael Fourquet <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
minor change Marginal behavior change acceptable for a minor release randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use RandomDevice in UUIDs instead of default_rng?
6 participants