-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change frequency generator to immediately shrink #406
Conversation
Why does the outcome change? ( |
Oh, I meant to mention - I think this changes how the seed is used. I tried using a stable seed, but for some reason you get different results (though it's still stable). Does that make sense? Basically, using Edit: I think this is happening because of this new line: gens <- traverse toTreeMaybeT (upTo n xs0) which will end up splitting the seed for the call to |
|
||
-- Pick a single generator from the weighted list of generators. We prune | ||
-- here as shrinks no longer need to care about bias. | ||
n <- prune $ integral $ Range.constant 1 total |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of integral
and prune
, just use integral_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't know about that!
Thanks Ollie. Cool investigation. I've had a play and I think there's a slightly neater solution which doesn't require the tree hacking. The intuition is the same, but just uses a modified shrink on integral to not include the duplicates. frequency :: MonadGen m => [(Int, m a)] -> m a
frequency = \case
[] ->
error "Hedgehog.Gen.frequency: used with empty list"
xs0 -> do
let
pick n = \case
[] ->
error "Hedgehog.Gen.frequency/pick: used with empty list"
(k, x) : xs ->
if n <= k then
x
else
pick (n - k) xs
iis =
scanl1 (+) (fmap fst xs0)
total =
last iis
n <- shrink (\n -> takeWhile (< n) iis) $ integral_ $ Range.constant 1 total
pick n xs0 What do you think?
|
That sounds great! I certainly wasn't happy about how many guts I had to open up to get this to work. I'll have a play around with your code tomorrow and see what I can make of it. |
Yes, that makes sense. However, I don't understand why each case stopped where it did. In your example, do all values correspond to a failed test? That is, does the shrinking keep happening until there is nothing left to shrink (not because the test passed)? |
@TysonMN what you're seeing above is the output of I forget exactly what |
@HuwCampbell Thanks for your suggestion, it looks like it works great!
Originally, it could be
so this is much better! |
Yeah I'm pretty happy with the result. It does address the duplicates seen in #118 as well. |
@ocharles can you squash this to one commit with both of us as authors? |
e2a63eb
to
628ff8e
Compare
@HuwCampbell How does that look? |
628ff8e
to
4dba9df
Compare
Thanks! that second shrink example is probably off now though. |
I think it's still pretty reasonable, it might not be exactly right, but it illustrates that there are no duplicates. I think the only material difference is that |
I only mentioned because it was brought up that the initial value changed, while that's no longer the case. |
Ah good point, I'll generate a new output. |
The frequency generators works by totalling all weights, and then picking a random number between [0, total]. It then selects the first generator who's cummulative weight matches this number. This works fine, but the current implementation leads to poor shrinking behavior. In particular, the shrinking is driven by repeatedly picking smaller numbers in the range [0, total]. For skewed distributions, this can mean repeatedly reselecting the same generator. This commit makes the observation that the weight only matters for the first pick. Once you have made a choice and observed a test failure, you can give all generators equal weight, as we're going to perform a breadth-first enumeration to try and find a shrink anyway. Before this patch, we would see behavior like: ``` > printWith 30 (Seed 4 1) $ Hedgehog.Gen.maybe alphaNum === Outcome === Just 'o' === Shrinks === Nothing Just 'o' Just 'o' Just 'o' Just 'o' Just 'a' Just 'h' Just 'l' Just 'n' ``` Note that `Just 'o'` appears many times - this is because the maybe generator puts a heavy bias towards 'Just' constructors. With this commit, we get the much more minimal output: ``` > printWith 30 (Seed 4 1) $ Hedgehog.Gen.maybe alphaNum === Outcome === Just 'o' === Shrinks === Nothing Just 'a' Just 'h' Just 'l' Just 'n' ``` Note that no characters were duplicated. Co-authored-by: Huw Campbell <[email protected]>
4dba9df
to
b724ecc
Compare
Commit and pull request updated ✍️ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍 Great collab work as well 🎸
Change frequency generator immediately shrink
The frequency generators works by totalling all weights, and then
picking a random number between [0, total]. It then selects the first
generator who's cummulative weight matches this number. This works fine,
but the current implementation leads to poor shrinking behavior. In
particular, the shrinking is driven by repeatedly picking smaller
numbers in the range [0, total]. For skewed distributions, this can mean
repeatedly reselecting the same generator.
This commit makes the observation that the weight only matters for the
first pick. Once you have made a choice and observed a test failure, you
can give all generators equal weight, as we're going to perform a
breadth-first enumeration to try and find a shrink anyway.
Before this patch, we would see behavior like:
Note that
Just 'o'
appears many times - this is because the maybegenerator puts a heavy bias towards 'Just' constructors.
With this commit, we get the much more minimal output:
Note that no characters were duplicated.