-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve fidelity of Arbitrary-based proptests #1105
Comments
An alternative to consider here is to not suggest using proptest for now, and then only come back to recommending it if we find a way to improve it. Proptest is convenient, but that could be harmful if that convenience results in people writing not-great proptests when they might have otherwise written better fuzz tests. |
I do think arbitrary-proptests are useful enough to recommend, with the potential caveat about how arbitrary-proptest-interop chooses the input size mentioned below. At some point I took a look at how Arbitrary converts to proptest, was discouraged with how the two interact, and couldn't come up with an obviously better way to do it. There was something about the assumptions Arbitrary makes vs what proptest expects that made me think the quality of input to proptest was low and would be hard to improve. My recollection is fuzzy but... One problem is that proptest has type-specific generators and "reducers" (or whatever it calls the process of simplifying failing input), but Arbitrary's input is just bytes and in arbitrary-proptest that's all that proptest sees, so the strategy it uses to generate and reduce test cases is very naive. Because proptest is just reducing a byte array to find minimal test inputs, this reduction process is generally kinda nonsense for arbitrary-proptests. (This may be why I never see arbitrary-proptests testing obvious boundary values like 0 and i32::max - because the way it would find these values is through the reduction process, which doesn't work here.) I also recall discovering that most Arbitrary implementations accept input that is not long enough to actually generate values (though the interface supports rejecting short inputs), treating the input as if it was padded with enough 0s to generate the needed value, and this had some negative interaction with proptest. Trying to recall, I think the problem with this is that although proptest generally generates the starting input to a test randomly, there is a great chance it will simply generate too few bytes to fulfill the Arbitrary value, but the value will be generated anyway from 0s. Proptest generators are supposed to detect this situation and reject the case, but because Arbitrary impls accept short inputs, we can't actually reject them. This "feature" of Arbitrary impls would seemingly also affect the fuzzer, but I assume with enough generations the fuzzer will generate longer and longer inputs. Looking again now, I see that proptest-arbitrary-interop defaults to generating 256 bytes: https://github.com/graydon/proptest-arbitrary-interop/blob/main/src/lib.rs#L188; suggesting that inputs longer than that are going to generate 0-values, unless the test author understands this problem and uses the I do think proptests based on Arbitrary impls are probably lower quality than proptests based on the proptest native traits on rust types.I also think they are probably still useful as-is. (There's another crate similar to arbitrary-interop-proptest, https://docs.rs/heckcheck/latest/heckcheck/, with accompanying blog post https://blog.yoshuawuyts.com/bridging-fuzzing-and-property-testing/. I haven't looked at it). We could consider directly supporting the proptest traits the way we do Arbitrary. cc @graydon |
What problem does your feature solve?
Our fuzzing docs emphasize the ability to convert fuzz tests to proptests using https://github.com/graydon/proptest-arbitrary-interop
In my brief look at the quality of input generated by proptest-arbitrary-interop I was not seeing some of input values I expect from a type-driven property tester, e.g. min/max integer values.
So I suspect that proptests generated in this fashion are not getting the quality of coverage they might expect.
What would you like to see?
Investigate closer what is happening in proptest-arbitrary-interop vs the stock proptest traits and see if the quality of values they produce can be improved.
What alternatives are there?
The text was updated successfully, but these errors were encountered: