-
-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hitting a recursion limit while running cargo semver-check
on diesel
#108
Comments
Thanks for the detailed report. I was worried something like this might become a problem, and I'd love your thoughts on the possible options here. The underlying issue is that we're trying to deserialize a very deeply nested JSON object, and exceeding serde's hardcoded recursion limit of 128.
|
This might be an opportunity to provide feedback to rustdoc folks on the design of the data format. I'm somewhat tempted to say we allow unbounded depth considering the source is, in theory, trusted. |
WRT miniserde, It only supports enums with unit variants, which makes it a non starter for WRT rustdoc-types (the crate, not the upstream version), I'd be happy to merge a change that added an additional (de)?serialization trait impls. Note that this would need to be done in the update script and not the code directly, and probably also controlled via a cargo feature. The best place to fix this seems like changing |
It's not really clear to me that the rustdoc JSON format is lacking and needs to change, to be honest. If the folks in charge of the rustdoc JSON format pushed back on this, I wouldn't think it unreasonable. Diesel uses many more (and more complex) generics than the median crate — in fact I'd bet it's past the 99.9%ile on that distribution. We are privileged because Trustfall and its adapter will absorb ~any change to the JSON format. But for most rustdoc JSON users, it might not make sense to make the format harder to use in exchange for supporting more crates at the extreme end of that distribution.
It isn't clear to me that anything guarantees fail-stop behavior if a bad JSON file is deserialized. Panics would be unfortunate, but exploitable stack overflow causing a security vulnerability would be way worse. I'm worried about paths away from the happy ones:
|
If it nests that deeply, another strategy should at least be considered for the schema.
I see this as being more of a short range solution while something else is worked out. |
Perhaps an optional "use at own risk" feature that enables the corresponding serde feature, then? That seems like it could be a reasonable, measured solution for the time being. |
Maybe a |
Thanks for all your discussions here. Let me add a few arguments here:
I do not feel that's a good argument. Yes diesel is more complex that the average crate, but also it's much more popular than the average crate. To be clear here: It's fine to say: That's to complex we will not support that as long as this tool is not part of the official rust tool chain. As soon as you talk about the inclusion in
I personally would consider this an issue with the JSON emitted by rustdoc. At very least I would assume that
What exactly do you expect as MCVE? Minimizing diesel is something that's really hard. I can at least provide the JSON document that's currently produced by rustdoc. |
A few Assuming Diesel's rustdoc JSON file is
It would be helpful to include at least the max-depth number + some of the paths that are at that depth together with the file, just to make it easier to minimize the file and narrow in on the problem from there.
I'm happy to put a branch together that uses unlimited recursion — I'd appreciate pointers / PR review on the clap magic necessary to parse the options in the most ergonomic way 😊 I can clone Diesel and generate the rustdoc files. |
Thanks for the pointers on how to extract these information. I'm out for holidays in the next weeks, but I can try to gather that information after I'm back. |
I finally had some time to get the requested numbers:
Funnily that fails with The generated json file is ~500MB large so it's quite hard to tell what's wrong there. For reference the generated json file as zip: diesel.zip I did hack together some small serde_json script that disables the recursion limit. It seems like the max depth is 903 based on that. The longest path seems to be:
The other information for the item with the id "0:40213" indicate that it is some impl generated here: https://github.com/diesel-rs/diesel/blob/master/diesel/src/query_builder/insert_statement/insert_with_default_for_sqlite.rs#L368-L375 So this seems to be easily fixable if cargo semver-check would provide a way to specify the feature flags to check or if it would use the default features. The script used to get these results: https://gist.github.com/weiznich/78a541904ea08c0375c64f2cf7a96eb0 |
Thanks for putting this together, it's very useful! By default the build is indeed @tonowak and I came up with a viable plan that should allow only checking user-specified features, and I believe some progress on it has already been made. |
Couldn't you just do what |
I think To ensure the long-term health of the project, I have to be very conservative with adopting things that may lead to an increased maintenance burden on me. This is especially the case for things that increase complexity but don't come with ongoing monetary sponsorship. Disabling the recursion limit felt like a taking a huge risk to satisfy a rare edge case, which might endanger my ability to build the many other things that would be more widely applicable. Also, since So that seems to make this issue an even lower priority for now. |
Steps to reproduce the bug with the above code
Actual Behaviour
(The column number varies between different runs)
Expected Behaviour
cargo semver-checks check-release
reports a number of API breaking changes between diesel 1.4 and diesel 2.0Additional Context
No response
Debug Output
Software version
cargo-semver-checks 0.9.1
Operating system
Linux 5.18.18-200.fc36.x86_64
Command-line
cargo nightly version
Compile time information
The text was updated successfully, but these errors were encountered: