Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crystal::System::Random namespace #4450

Merged

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented May 23, 2017

Introduces a Crystal::System::Random namespace proposal to abstract the platform specifics of generating random numbers, using a secure random source. This follows and examplifies what is proposed in #4406.

The API is limited to two methods that must be implemented for a platform:

  • Crystal::System::Random.random_bytes(buf : Bytes) : Nil to fill a Slice with random bytes;
  • Crystal::System::Random.next_u : UIntX to return an unsigned random number (from an UInt8 to an UInt64).

The implementation has been extracted from SecureRandom. It uses getrandom on linux (if supported by the kernel) or reads from /dev/urandom on other POSIX systems. In addition OpenBSD now uses arc4random.

This patch also introduces a new Random::System class, which permits to use a secure system source for generating random numbers, instead of a PRNG. This includes patches taken from @oprypin in #3434

In addition, maybe Random.new_seed should now use Crystal::System::Random instead of using a pseudo-random (if not deterministic) value?

@bararchy
Copy link
Contributor

Just noting that /dev/urandom will give non-random data if entropy does not exists

When read, the /dev/random device will only return random bytes within the estimated number of bits of noise in the entropy pool. /dev/random should be suitable for uses that need very high quality randomness such as one-time pad or key generation. When the entropy pool is empty, reads from /dev/random will block until additional environmental noise is gathered.

A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current unclassified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.

class Random::System
include Random

def initialize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, otherwise Random::System.new will call Random.new(seed = new_seed) which won't work out.

@Sija
Copy link
Contributor

Sija commented May 23, 2017

Why Sys instead of System?

@RX14
Copy link
Contributor

RX14 commented May 23, 2017

@bararchy that is a cryptographic myth that kernel developers keep propagating. You cannot "run out" of entropy.

@ysbaddaden ysbaddaden force-pushed the extract-sys-random-namespace branch from e1330f1 to a5b8d6e Compare May 23, 2017 14:35
@bew
Copy link
Contributor

bew commented May 23, 2017

@Sija see #4406 Sys is meant to be a namespace for per-plateform low-level abstraction.

System is more high-level, directly usable by users (to get the hostname, the cpu_count (#4449) ).

@ysbaddaden ysbaddaden force-pushed the extract-sys-random-namespace branch from a5b8d6e to e328ca9 Compare May 23, 2017 14:37
@ysbaddaden
Copy link
Contributor Author

@Sija because System already exists and is unrelated (returns informations about the system, like hostname or cpu_count). Also Rust uses sys and I like it. No strong feelings though; I'm open to suggestions (in #4406 please).

@bararchy see https://bugs.ruby-lang.org/issues/9569. The issue links to a bunch of documentation.

@bararchy
Copy link
Contributor

@ysbaddaden @RX14
You learn something every day I guess XD Thanks for the links

@@ -30,6 +30,21 @@ module Sys
end
end

# Returns a random UInt32.
def self.random_number : UInt32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's quite arbitrary to consume the random source 4 bytes at a time... But it happens to be a good compromise between reading 1 byte at a time and having a proper implementation getting as many bytes as needed (see #3434).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat. You avoid calling next_u twice when generating a 64bits number but always read at least a next_u (i.e. 4 bytes) when the maximum value fits.

Merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you understand that pull request.
This whole rand_type business is already in the code base and already ensures that as few next_u invocations are done as needed, it's not tied to the integer type but to the size of the requested range. Copying it here (without attribution, too) is not changing anything.
The thing that makes a difference is the granularity of next_u which could return one byte instead of arbitrarily grabbing 4.
But then you start thinking about doing too many system calls, which my PR tries to mitigate with that random_bytes thing which is introduced as an alternative to next_u

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stupidly missed the next_u : UInt8 in your PR, and couldn't understand the granularity in next_u you were talking about. I renamed random_number to next_u for symmetry with Random#next_u and implementations now return an UInt8, except OpenBSD which calls arc4random and thus returns an UInt32.

rng = Random::System.new

it "returns random numbers from the secure system source" do
typeof(rng.next_u).should eq(UInt32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation as presented in this PR happens to have next_u returning an UInt32 but I don't think it should be part of the specification. module Random does not require next_u to return an UInt32, anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong assumption on my side. I was lured by arc4random, libsodium's sysrandom, our ISAAC and MT19937 implementations all returning an UInt32. I see the ISAAC64 algorithm returns an UInt64, so as long as Random handles any size, I'll remove the spec limitation.


describe "Random::System" do
rng = Random::System.new

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth testing one nontrivial case, like generating a 64 bit signed integer. It is not part of Random::System directly but having this called in real code can be better at pointing the compiler at something that accidentally went wrong.

# `getrandom` (if the kernel supports it) and fallbacks on reading from
# `/dev/urandom` on UNIX systems.
class Random::System
include Random
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not this:

module Random::System
  extend Random

I think it would look awesome!

There is no point in creating an instance of an object without state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both Random::ISAAC and Random::MT1993 merely include Random and class methods seem to be specific to Random. What would extend bring?

Copy link
Member

@oprypin oprypin May 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use Random::System instead of Random::System.new. It can act as an instance of Random for all purposes, but without a need to have an object on the heap.
#3434 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making it a struct instead? I'd like PRNG implementations to follow the same design, but there is no context, so a zero-sized struct should fit.

Copy link
Member

@oprypin oprypin May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this would break any design. It's already broken because the RNG does not have initialization with a seed (and there's no reason to require RNGs to do that). If you're dropping the (seed) part from .new(seed), might as well drop the whole thing.

This way we can at least have Random::DEFAULT and Random::System be interchangable. It's just not nice to have to create these fake instances. Makes no sense to keep one and makes no sense to create one every time.

I'm really keen to showcase the power of the language this way, and also emphasise that Crystal does not keep state regarding this RNG.

@bcardiff
Copy link
Member

@ysbaddaden @Sija I like Sys, yet we could also use Platform since that word is used when describing the intent of the namespace. No strong feelings.

@RX14
Copy link
Contributor

RX14 commented May 23, 2017

I'd very much prefer platform as it is more descriptive and having Sys and System could get confusing.

@Sija
Copy link
Contributor

Sija commented May 23, 2017

@ysbaddaden I'd vote for @bcardiff proposition of using Platform instead, for the reasons @RX14 so aptly pointed out.

@ysbaddaden ysbaddaden force-pushed the extract-sys-random-namespace branch 2 times, most recently from 1c417e9 to 7f8e157 Compare May 23, 2017 21:42
@@ -17,5 +17,15 @@ module Sys
raise "Failed to access secure source to generate random bytes!"
end
end

def self.random_number : UInt32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically the main problem I have with this PR is this method arbitrarily getting 4 bytes and being called just random_number.

@ysbaddaden ysbaddaden force-pushed the extract-sys-random-namespace branch from 7f8e157 to 8f7a78f Compare May 24, 2017 00:04
@oprypin
Copy link
Member

oprypin commented May 24, 2017

Looks awesome now! Thanks for bearing with me.

I'll keep an eye on this -- and keep arguing about making Random::System a module 😋

Extracts the platform specific parts out of SecureRandom into a
simple API in `sys/random` that should be implemented for each
platform.
Allows to generate random numbers using a secure source provided by
the system. It actually uses the same source as SecureRandom.

Includes changes by Oleh Prypin (@prypin) to try and read as few
bytes are required from `/dev/urandom`.
@ysbaddaden ysbaddaden force-pushed the extract-sys-random-namespace branch from 8f7a78f to 704a05b Compare May 25, 2017 14:26
@ysbaddaden ysbaddaden changed the title Sys::Random namespace Crystal::System::Random namespace May 25, 2017
@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented May 25, 2017

I moved everything to the Crystal::System namespace, which I believe is a better place (thanks @luislavena). What do you think?

@RX14
Copy link
Contributor

RX14 commented May 25, 2017

I still think that platform describes the contents of the namespace better. But its pretty arbitrary so I won't push it.

@bcardiff
Copy link
Member

I'm ok with Crystal::System.

@ysbaddaden note that we have now the skip_file macro so in some cases it might come handy. As it is coded now, including crystal/** or crystal/system/** will not compile. Maybe it's ok, since the std should use just the crystal/system parts that they depends on.

@RX14
Copy link
Contributor

RX14 commented May 25, 2017

skip_file is for exactly the kind of platform-specific files this pr introduces. This pr is the best way to test the feature in practice. If it's not the best solution for this pr then it may as well be removed as it's useless.

@ysbaddaden
Copy link
Contributor Author

I started adding the skip_file macros. Here are my impressions:

the good:

  • it tells which system(s) an implementation is for (great);

the less good:

  • the default / fallback implementation file must list incompatible systems and systems we have other/better implementations (not really useful);

the bad:

  • it duplicates the dispatch in the main file, adding a new system or an optimized implementation requires to update multiple places: the dispatch and the skips of the default implementation; there are 3 places in total (dispatch, the system file, the default file) instead of 1.

@bcardiff bcardiff mentioned this pull request May 26, 2017
@bcardiff
Copy link
Member

@ysbaddaden For me the best part of skip file is that it allows the globs to keep working in any directory.

There could be other solutions for this problem: like ignoring in require globs files prefixed with _ for example. With some rule like that the if can remain in the non prefixed main file like it is right now and the user will be able to require crystal/system/**.

@RX14
Copy link
Contributor

RX14 commented May 26, 2017

@ysbaddaden I don't quite understand what you mean by "it duplicates the dispatch in the main file". Surely you'd just require all the relevant files and let skip_file do it's thing.

And as for the default file, I think there being an entirely cross-platform fallback will be quite rare, so it's best to have a whitelist in all files, (we know the entire set of targets). In that case, adding a new system requires only updating the file of the implementation for that system to add the system to the whitelist.

@bcardiff
Copy link
Member

@RX14 , he probably meant that in linux version would say skip_file unless flag?(:linux), and in the unix if should say. skip_file if flag?(:linux).

Later on, for example in windows, the windows file will say skip_file unless flag?(:windows), but the unix file will need to mention flag?(:windows). That duplication/explosion is not nice.

Having some chained if flag? as currently is easier to get which file will be included given certain flags and is less error prone.

But I do value more the require ../** in this case.

@RX14 RX14 mentioned this pull request May 26, 2017
24 tasks
@RX14
Copy link
Contributor

RX14 commented May 26, 2017

@bcardiff I was suggesting simply not using unless at all and listing all the flags which are not windows or linux by hand. This is a lot more explicit in exactly what platforms the file is used for and solves the problem of multiple updates.

@ysbaddaden
Copy link
Contributor Author

Does a glob of crystal/system/** really makes sense? Why require many files just to skip them when only 1 has the actual implementation? It makes seeking implementations for system X harder, too. Maybe we don't need everything from crystal/system either, just like we don't need everything from lib_c.

@RX14 defaults won't be that rare. Quite often we'll have "if windows else unix" and not much more. NetBSD or another UNIX system, for example, would require to put a bunch of || flag?(:netbsd) everywhere when only a few places would need it (if any).

@RX14
Copy link
Contributor

RX14 commented May 26, 2017

@ysbaddaden How about adding a :unix flag? There might be non-unix non-windows systems eventually.

@bcardiff
Copy link
Member

My concern with the glob is mainly consistency. The lib_c case is not the same since the path is selected from the triplet there.

If the feature of this PR is fine let's move forward as is. There other PR willing to move things to Crystal::System and wit should be used for windows.

Crystal::System should not be used directly so it's almost fine if globs are not supported maybe. Things can be improved later.

@oprypin
Copy link
Member

oprypin commented May 26, 2017

Is this really the place to discuss all of this?
I'm really interested in the technical side of this implementation but following this thread has been painful.

@ysbaddaden
Copy link
Contributor Author

@oprypin sorry, you may click unsubscribe in the right column to stop being notified. I don't expect the technical side to change, and if it does, I'll notify you!

@RX14 an :unix flag would be great, but sometimes a system has a particular implementation; for example :openbsd and :linux here, so it wouldn't help, unless we do if flag?(:unix) && !flag?(:openbsd) && !flag?(:linux) :(

@RX14
Copy link
Contributor

RX14 commented May 27, 2017

@ysbaddaden I guess a "cover file" which just uses macros to require the correct file is a better solution than skip_file. If we structure the file hierarchy correctly, we can require "crystal/system/*" (not **) and have it work. Surely though, we'd just have a crystal/system.cr which did the correct thing to require all of Crystal::System. Then there would be no need for globs (apart from in crystal/system.cr).

@bcardiff
Copy link
Member

Eventually Crystal::System might have more things in the hierarchy / directory structure.
What if we spit the files in crystal/system and crystal/system.impl so require crystal/system/** will continue to work?

File in crystal/system will contain the condition as they are right now and include things in ../system.imp.

@ysbaddaden, if you don't like it (even with other folder name), no worries and feel free to merge this. So we can start moving other system specific code :-)

@RX14
Copy link
Contributor

RX14 commented May 29, 2017

@bcardiff Rather then creating a messy and confusing folder layout how about just not using require "crystal/system/**". require "crystal/system" should have the exact same effect of requiring everything. require "lib_c/**" already doesn't work and that's not a bug. Please just keep it simple.

@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented May 30, 2017

I thought a little more about this and I'll add the skip_file in the file headers, merely as a protection against glob issues. This shouldn't happen but let's play it safe. Let's see if keeping both the skip_file and dispatch working is annoying or not. After all, once implemented that won't change much.

Maybe I can add the :unix flag too? We can't use it right now, but in a next compiler release we could simplify some skip_file and dispatchs as in if unix elsif windows else raise.

Protects against glob requires of `crystal/system/**` that would
fail, since it would load conflicting or incompatible
implementations.
This is automatically set based on the target triple for known unix
systems. This should eventually simplify skip_file conditions for
generic UNIX implementations.
@ysbaddaden ysbaddaden merged commit eeac6ac into crystal-lang:master May 30, 2017
@ysbaddaden ysbaddaden deleted the extract-sys-random-namespace branch May 30, 2017 14:52
@mverzilli mverzilli added this to the Next milestone Jun 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants