-
-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement various choose() methods #490
Conversation
Looks like wasm builds don't like using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is quite a lot of overlap with #483 in fact. I see you decided to keep all the methods on Rng
. I think my version is cleaner, but I don't know if it comes at a cost to usability.
I haven't looked into the "choose weighted" stuff much recently; I think @pitdicker did more. I wonder, is it ever worth iterating over weights as in your API instead of simply collecting the weights into a vector (possibly cumulatively)?
src/lib.rs
Outdated
/// This function iterates over `weights` twice. Once to get the total | ||
/// weight, and once while choosing the random value. If you know the total | ||
/// weight, or plan to call this function multiple times, you should | ||
/// consider using [`choose_weighted_with_total`] instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc is copied from other function?
The main reason that I had in mind for using an let my_items = vec![("foo", 10), ("bar", 3), ("baz", 1)];
let weights = my_items.iter().map(|x| x.1);
println!("{:?}", rng.choose_weighted(my_items.iter(), weights)); Or let my_items = get_vec_of_structs();
let weights = my_items.iter().map(|x| x.get_weight());
println!("{:?}", rng.choose_weighted(my_items.iter(), weights)); An alternative way to accomplish this goal would be to do something like: fn choose_weighted<Iter, F, Weight>(&mut self,
items: Iter,
f: F) -> Option<Iter::Item>
where Iter: Iterator + Clone,
F: FnMut(&Iter::Item) -> Weight,
Weight: SampleUniform +
Default +
core::ops::Add<IterWeights::Item, Output=IterWeights::Item> +
core::cmp::PartialOrd<IterWeights::Item> +
Clone This would work as well, but means that if the weights and items are stored in different vectors, then the caller will have to |
src/seq.rs
Outdated
} | ||
} | ||
|
||
/// XXX Any way to avoid '+ Sized' here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in an API that consumes self (i.e. fn choose(self, ...)
). It's possible to use where Self: Sized
on each method instead, but in practice I don't think that gains anything (unless we want to add methods taking &self
, e.g. like size_hint
does).
SliceExt → SliceRandom, IteratorExt → IteratorRandom
f93e5f9
to
2a786a8
Compare
I've rebased this on top of #483 and moved the This leaves it with a signature like this: pub fn choose_weighted<R, IterItems, IterWeights>(rng: &mut R,
items: IterItems,
weights: IterWeights) -> Option<IterItems::Item>
where R: Rng + ?Sized,
IterItems: Iterator,
IterWeights: Iterator + Clone,
IterWeights::Item: SampleUniform + ... Same thing for One alternative approach would be only let this function deal with the This would leave us with a singature like: pub trait IteratorRandom: Iterator + Sized {
...
fn choose_index_weighted<R>(mut self,
rng: &mut R) -> Option<usize>
where R: Rng + ?Sized,
Self: Clone,
Self::Item: SampleUniform + ...
{
... There's a few of upsides to this.
But also a few downsides:
Ultimately I'm leaning towards this new way being a better solution though. Some code differences: // tuples or structs
let items = [('a', 2), ('b', 1), ('c', 1)];
let weights = items.iter().map(|&(_, w)| w);
let item = choose_weighted(&mut rng, items.iter(), weights).unwrap();
// separate arrays
let items = ['a', 'b', 'c'];
let weights = [2, 1, 1];
let item = choose_weighted(&mut rng, items.iter(), weights.iter()).unwrap();
// iterators
let items = get_items();
let weights = get_item_weights();
let item = choose_weighted(&mut rng, items, weights).unwrap(); New: // tuples or structs
let items = [('a', 2), ('b', 1), ('c', 1)];
let item = items.iter().map(|&(_, w)| w)
.choose_index_weighted(&mut rng)
.map(|pos| items[pos]).unwrap();
// separate arrays
let items = ['a', 'b', 'c'];
let weights = [2, 1, 1];
let item = weights.iter()
.choose_index_weighted(&mut rng)
.map(|pos| items[pos]).unwrap();
// iterators
let items = get_items();
let weights = get_item_weights();
let item = weights
.choose_index_weighted(&mut rng)
.and_then(|pos| items.nth(pos)).unwrap(); |
2a786a8
to
27e323a
Compare
A fn choose_weighted<W, WF>(&self, weights: WF) -> T
where WF: Fn(&T, usize) -> W,
W: /* weight type requirements */ This may be usable like so: let items = [('a', 2), ('b', 1), ('c', 1)];
let item = items.choose_weighted(|x, _| x.1).unwrap().0;
// separate arrays
let items = ['a', 'b', 'c'];
let weights = [2, 1, 1];
let item = items.choose_weighted(|_, i| weights[i]).unwrap(); I'm not sure if providing both an index and reference to the item in this function is overkill? I'm also not sure how well it optimises. |
I'll create separate PRs for the tests and for choose_weighted. The rest of these changes are covered by #483 |
On top of the existing
Rng.choose()
andRng.choose_mut()
, this implementsRng.choose_from_iterator()
,Rng.choose_weighted()
,Rng.choose_weighted_with_total()
,SliceRandom.choose()
,SliceRandom.choose_mut()
, andIteratorRandom.choose()
.Regarding the
vec.shuffle(rng)
vs.rng.shuffle(vec)
issue, my strategy for where to stick functions was basically:Rng
trait. Since this helps with documentation and discoverability.get_stuff().do_random_op(&mut rng).do_other_stuff()
, I've also added functions onSliceRandom
andIteratorRandom
which forward to the implementation on theRng
trait.This seems like a reasonable trade-off to me. But happy to discuss more.
Also, this patch relies on that issue #476 will be fixed. Otherwise using floating point weights will on very rare occasions panic.