diff --git a/src/guide-data.md b/src/guide-data.md index d491ff3..b958769 100644 --- a/src/guide-data.md +++ b/src/guide-data.md @@ -63,12 +63,10 @@ random value (an unbiased random value) is in several contexts: random value (it would be infinitely large, and use infinite amounts of memory) - `f64`: we treat this as an approximation of the real numbers, and, *by convention*, restrict to the range 0 to 1 (if not otherwise specified). - Note that this type has finite precision, so we use the coin-flipping method - above (but with random bits instead of coins) until we get as much precision - as the type can represent; however, since floating-point numbers are much - more precise close to 0 than they are near 1, we typically simplify here and - stop once we have enough precision to differentiate between 1 and the next - smallest value representable (`1 - ε/2`). + We will come back to the conversions used later; + for now note that these produce 52-53 bits of precision (depending on which + conversion is used, output will be in steps of `ε` or `ε/2`, where `1+ε` is + the smallest representable value greater than `1`). ## Random data diff --git a/src/guide-dist.md b/src/guide-dist.md index ea73cb9..9564110 100644 --- a/src/guide-dist.md +++ b/src/guide-dist.md @@ -10,7 +10,7 @@ pub trait Distribution { fn sample(&self, rng: &mut R) -> T; // a convenience function defined using sample: - fn sample_iter<'a, R>(&'a self, rng: &'a mut R) -> DistIter<'a, Self, R, T> + fn sample_iter(self, rng: R) -> DistIter where Self: Sized, R: Rng, @@ -18,39 +18,129 @@ pub trait Distribution { } ``` -Rand provides implementations of many different distributions; for the full -list see the [`distributions`] module; the most common are highlighted below. +Rand provides implementations of many different distributions; we cover the most +common of these here, but for full details refer to the [`distributions`] module +and the [`rand_distr`] crate. -## Uniform distributions +# Uniform distributions The most obvious type of distribution is the one we already discussed: one without pattern, where each value or range of values is equally likely. This is known as *uniform*. -Rand actually has several variants of this: - -- [`Standard`] requires no parameters and produces uniformly distributed - values over the entire range of the output type (for `bool` and integers) - or over the range from 0 to 1 (for floats) or over valid Unicode code - points. It also has extensions to tuples, array types and `Option`. -- [`Uniform`] is parametrised with `low` and `high` points, and produces - values uniformly distributed within this range. -- [`Alphanumeric`] is uniform over the values `0-9A-Za-z` -- [`Open01`] and [`OpenClosed01`] are variations of [`Standard`] for floating - point numbers between 0 and 1 (partially) exclusive of end points. - -For convenience, [`Rng::gen`] and [`random`] are short-cuts to [`Standard`], -and [`Rng::gen_range`] is a short-cut to [`Uniform`], allowing things like: - -```rust -# extern crate rand; -# use rand::prelude::*; -let mut rng = thread_rng(); -let coord: (f64, f64) = rng.gen(); -let die_roll = rng.gen_range(1, 7); -``` - -## More continuous distributions +Rand actually has several variants of this, repesenting different ranges: + +- [`Standard`] requires no parameters and samples values uniformly according + to the type. [`Rng::gen`] provides a short-cut to this distribution. +- [`Uniform`] is parametrised by `Uniform::new(low, high)` (including `low`, + excluding `high`) or `Uniform::new_inclusive(low, high)` (including both), + and samples values uniformly within this range. + [`Rng::gen_range`] is a convenience method defined over + [`Uniform::sample_single`], optimised for single-sample usage. +- [`Alphanumeric`] is uniform over the `char` values `0-9A-Za-z`. +- [`Open01`] and [`OpenClosed01`] are provide alternate sampling ranges for + floating-point types (see below). + +## Uniform sampling by type + +Lets go over the distributions by type: + +- For `bool`, [`Standard`] samples each value with probability 50%. +- For `Option`, the [`Standard`] distribution samples `None` with + probability 50%, otherwise `Some(value)` is sampled, according to its type. +- For integers (`u8` through to `u128`, `usize`, and `i*` variants), + [`Standard`] samples from all possible values while + [`Uniform`] samples from the parameterised range. +- For `NonZeroU8` and other "non-zero" types, [`Standard`] samples uniformly + from all non-zero values (rejection method). +- `Wrapping` integer types are sampled as for the corresponding integer + type by the [`Standard`] distribution. +- For floats (`f32`, `f64`), + + - [`Standard`] samples from the half-open range `[0, 1)` with 24 or 53 + bits of precision (for `f32` and `f64` respectively) + - [`OpenClosed01`] samples from the half-open range `(0, 1]` with 24 or + 53 bits of precision + - [`Open01`] samples from the open range `(0, 1)` with 23 or 52 bits of + precision + - [`Uniform`] samples from a given range with 23 or 52 bits of precision +- For the `char` type, the [`Standard`] distribution samples from all + available Unicode code points, uniformly; many of these values may not be + printable (depending on font support). The [`Alphanumeric`] samples from + only a-z, A-Z and 0-9 uniformly. +- For tuples and arrays, each element is sampled as above, where supported. + The [`Standard`] and [`Uniform`] distributions each support a selection of + these types (up to 12-tuples and 32-element arrays). + This includes the empty tuple `()` and array. +- For SIMD types, each element is sampled as above, for [`Standard`] and + [`Uniform`] (for the latter, `low` and `high` parameters are *also* SIMD + types, effectively sampling from multiple ranges simultaneously). SIMD + support is gated behind a [feature flag](../features.html#simd-support). + +# Non-uniform distributions + +Non-uniform distributions can be divided into two categories, as follows. +Some of these discrete and all of the continuous distributions have been moved +from the main [`rand`] crate to a dedicated [`rand_distr`] crate. + +## Discrete non-uniform distributions + +Discrete distributions sample from boolean or integer types. As above, these +can be sampled uniformly, or, as below, via a non-uniform distribution. + +Potentially a discrete distribution could sample directly from a set of discrete +values such as a slice or an `enum`. See the section on [Sequences] regarding +Rand's traits for slice and iterator types. Rand does not provide direct +sampling from `enum`s, with the exception of `Option` (see above). + +### Booleans + +The [`Bernoulli`] distribution is a fancy name for generating a boolean +with a given a probability `p` of being `true`, or defined via a +`success : failure` ratio. Often this is described as a *trial* with +probability `p` of *success* (`true`). + +The methods [`Rng::gen_bool`] and [`Rng::gen_ratio`] are short-cuts to this +distribution. + +### Integers + +The [`Binomial`] distribution is related to the [`Bernoulli`] in that it +models running `n` independent trials each with probability `p` of success, +then counts the number of successes. + +Note that for large `n` the [`Binomial`] distribution's implementation is +much faster than sampling `n` trials individually. + +The [`Poisson`] distribution expresses the expected number of events +occurring within a fixed interval, given that events occur with fixed rate λ. + +### Weighted sequences + +The [`WeightedIndex`] distribution samples an index from sequence of weights. +See the [Sequences] section for convenience wrappers directly sampling a slice +element. + +For example, weighted sampling could be used to model the colour of a marble +sampled from a bucket containing 5 green, 15 red and 80 blue marbles. + +Currently the Rand lib only implements *sampling with replacement*, i.e. +repeated sampling assumes the same distribution (that any sampled marble +has been replaced). An alternative distribution implementing +*sampling without replacement* has been +[requested](https://github.com/rust-random/rand/issues/596). + +Note also that two implementations of [`WeightedIndex`] are available; the +first is optimised for a small number of samples while +[`alias_method::WeightedIndex`] is optimised for a large number of samples +(where "large" may mean "> 1000"; benchmarks recommended). + +## Continuous non-uniform distributions + +Continuous distributions model samples drawn from the real number line ℝ, or in +some cases a point from a higher dimension (ℝ², ℝ³, etc.). We provide +implementations for `f64` and for `f32` output in most cases, although currently +the `f32` implementations simply reduce the precision of an `f64` sample. The exponential distribution, [`Exp`], simulates time until decay, assuming a fixed rate of decay (i.e. exponential decay). @@ -72,28 +162,11 @@ The [`Beta`] distribution is a two-parameter probability distribution, whose output values lie between 0 and 1. The [`Dirichlet`] distribution is a generalisation to any positive number of parameters. -## Discrete distributions - -The [`Bernoulli`] distribution is very simple: given a probability `p` (or a -ratio `num / denom`), a boolean value is produced with the given probability -of being `true` (simulating a trial with probability `p` of success). - -For convenience, [`Rng::gen_bool`] and [`Rng::gen_ratio`] are short-cuts to [`Bernoulli`]. - -The [`Binomial`] distribution is related: given a probability `p` and a number -`n`, this distribution simulates running `n` Bernoulli trials and tells you the -number which were successful. - -The [`Poisson`] distribution expresses the expected number of events occurring -within a fixed interval, given that events occur with fixed rate λ. - -## Weighted sampling - -Finally, [`WeightedIndex`] is a discrete distribution sampling from a finite -selection of choices each with given weight. - +[Sequences]: ../guide-seq.html [`Distribution`]: ../rand/rand/distributions/trait.Distribution.html [`distributions`]: ../rand/rand/distributions/index.html +[`rand`]: ../rand/rand/index.html +[`rand_distr`]: ../rand/rand_distr/index.html [`Rng::gen_range`]: ../rand/rand/trait.Rng.html#method.gen_range [`random`]: ../rand/rand/fn.random.html [`Rng::gen_bool`]: ../rand/rand/trait.Rng.html#method.gen_bool @@ -102,6 +175,7 @@ selection of choices each with given weight. [`Rng`]: ../rand/rand/trait.Rng.html [`Standard`]: ../rand/rand/distributions/struct.Standard.html [`Uniform`]: ../rand/rand/distributions/struct.Uniform.html +[`Uniform::sample_single`]: ../rand/rand/distributions/struct.Uniform.html#method.sample_single [`Alphanumeric`]: ../rand/rand/distributions/struct.Alphanumeric.html [`Open01`]: ../rand/rand/distributions/struct.Open01.html [`OpenClosed01`]: ../rand/rand/distributions/struct.OpenClosed01.html @@ -117,3 +191,4 @@ selection of choices each with given weight. [`Beta`]: ../rand/rand/distributions/struct.Beta.html [`Dirichlet`]: ../rand/rand/distributions/struct.Dirichlet.html [`WeightedIndex`]: ../rand/rand/distributions/weighted/struct.WeightedIndex.html +[`alias_method::WeightedIndex`]: ../rand/rand/distributions/weighted/alias_method/struct.WeightedIndex.html diff --git a/src/guide-values.md b/src/guide-values.md index 828c332..40cf981 100644 --- a/src/guide-values.md +++ b/src/guide-values.md @@ -7,29 +7,34 @@ This is a trick question: we need to know both the *range* we want and the type of *distribution* of this value (which is what the [`next`](guide-dist.md) section is all about). +## The `Rng` trait + For convenience, all generators automatically implement the [`Rng`] trait, which provides short-cuts to a few ways of generating values. This has several convenience functions for producing uniformly distributed values: -- [`gen`] generates an unbiased random value from a range appropriate for the +- [`Rng::gen`] generates an unbiased (uniform) random value from a range + appropriate for the type. For integers this is normally the full representable range (e.g. from `0u32` to `std::u32::MAX`), for floats this is between 0 and 1, and some other types are supported, including arrays and tuples. - More on this in the [next section](guide-dist.html#uniform-distributions). -- [`gen_range`] generates an unbiased random value with given bounds + + This method is a convenience wrapper around the [`Standard`] distribution, + as documented in the [next section](guide-dist.html#uniform-distributions). +- [`Rng::gen_range`] generates an unbiased random value with given bounds `low` (inclusive) and `high` (exclusive) -- [`fill`] and [`try_fill`] are optimised functions for filling any byte or +- [`Rng::fill`] and [`Rng::try_fill`] are optimised functions for filling any byte or integer slice with random values It also has convenience functions for producing non-uniform boolean values: -- [`gen_bool`] generates a boolean with the given probability -- [`gen_ratio`] also generates a boolean, where the probability is defined +- [`Rng::gen_bool`] generates a boolean with the given probability +- [`Rng::gen_ratio`] also generates a boolean, where the probability is defined via a fraction Finally, it has a function to sample from arbitrary distributions: -- [`sample`] samples directly from some [distribution](guide-dist.md) +- [`Rng::sample`] samples directly from some [distribution](guide-dist.md) Examples: @@ -48,7 +53,7 @@ let x: f64 = rng.gen(); let roll = rng.gen_range(1, 7); ``` -Additionally, the [`random`] function is a short-cut to [`gen`] on the [`thread_rng`]: +Additionally, the [`random`] function is a short-cut to [`Rng::gen`] on the [`thread_rng`]: ```rust if rand::random() { println!("we got lucky!"); @@ -56,12 +61,13 @@ if rand::random() { ``` [`Rng`]: ../rand/rand/trait.Rng.html -[`gen`]: ../rand/rand/trait.Rng.html#method.gen -[`gen_range`]: ../rand/rand/trait.Rng.html#method.gen_range -[`sample`]: ../rand/rand/trait.Rng.html#method.sample -[`gen_bool`]: ../rand/rand/trait.Rng.html#method.gen_bool -[`gen_ratio`]: ../rand/rand/trait.Rng.html#method.gen_ratio -[`fill`]: ../rand/rand/trait.Rng.html#method.fill -[`try_fill`]: ../rand/rand/trait.Rng.html#method.try_fill +[`Rng::gen`]: ../rand/rand/trait.Rng.html#method.gen +[`Rng::gen_range`]: ../rand/rand/trait.Rng.html#method.gen_range +[`Rng::sample`]: ../rand/rand/trait.Rng.html#method.sample +[`Rng::gen_bool`]: ../rand/rand/trait.Rng.html#method.gen_bool +[`Rng::gen_ratio`]: ../rand/rand/trait.Rng.html#method.gen_ratio +[`Rng::fill`]: ../rand/rand/trait.Rng.html#method.fill +[`Rng::try_fill`]: ../rand/rand/trait.Rng.html#method.try_fill [`random`]: ../rand/rand/fn.random.htm [`thread_rng`]: ../rand/rand/fn.thread_rng.html +[`Standard`]: ../rand/rand/distributions/struct.Standard.html