-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize a couple of nested loops in the scanner. #537
Conversation
this looks good to me |
Should this be a v3 pr? (is the same problem present in v3?) |
v2 is the default branch. v3 is in prerelease. |
Let's get the issue fixed in v2 first and then we can easily port it forward to v3. I will have a deeper look into the issue and the suggested solution and get back to you soon. |
First attempt had a bug dealing w/ complex keys added to the stack immediately after another key was "removed" (keys are not actually taken out of the stack, but do get overwritten, and the new key might not be a possible simple_key, so min_possible needs to account for that).
|
It's good to avoid force pushing in those cases, so that we can see what changed between the revisions. Either way, as we chatted over email I hadn't touched this logic from the ported C library so far as it seemed to do its job, but I'd like to try taking a step back before adding more complexity to see if there's an alternative approach that could get rid of the issues found while simplifying the implementation instead of making it harder to understand and maintain. I believe this is the case here. I will push an alternative algorithm to v3 that is both simpler and faster so you can have a look. |
This should simplify the logic and significantly improve performance in edge cases as found and reported on #537 by CJ Cullen.
Please have a look at e228e37 and let me know what you think. |
TL;DR: I think your patch fixes the issue, but I got fairly confused along the way :). I haven't looked to much at the code on v3, but it appears to be more strict on some sketchy (but I think valid) YAML. E.g.
produces:
produces: The DeepFlow benchmark I added would now fail because of the reasons above, but runs out of memory on my machine before it can fail. That test looks something like:
Where the maps are nested to max depth, with ~1000kB of crap (~500,000 tokens) at the bottom. I made the bottom map in the test smaller to gather some numbers.
I think this is because we are generating per-pair error messages for each re-defined key here: https://github.com/go-yaml/yaml/blob/v3/decode.go#L749-L760. That is probably fairly easy to clean up. When I flip
I think most of that is just v3-specific stuff (I don't think I'm running into this, but I could be), because comparing it to v2 w/ just your patch applied gets much better numbers:
|
If you're happy with it as a solution, yes, that would be great, thanks!
It is more strict indeed, and the prevention of duplicated keys by default is intended. It's not supported by the specification, so v3 makes this harder to overlook. That said, the It violates this part of the spec:
For that same reason, note that Would be nice to fix the proposed tests so they don't depend on invalid yaml.
Ah, good catch! Let's fix that as well then. |
This should simplify the logic and significantly improve performance in edge cases as found and reported on go-yaml#537 by CJ Cullen.
This should simplify the logic and significantly improve performance in edge cases as found and reported on go-yaml#537 by CJ Cullen.
Opened #543 for the port to v2 (with tests) |
One detail I forgot commenting on your earlier post, and which is certainly due out of respect for the proper research done, is the fact the suggested simpler implementation is slower than the original one on this PR. This is true, but assuming the simpler version solves the abuse well enough (when I tested originally it was about 95% improvement according to benchcmp), I think it's worth the simpler version on the basis that the more complex version won't really affect normal uses at all. That assumes most yaml is trivially nested, which sounds fair. If that's incorrect somehow, please do let me know. |
In my testing, your patch ported onto v2 was not significantly slower, except in a couple of the max-depth error cases, and even then, we're talking about 50ms vs. 1ms.
|
This should simplify the logic and significantly improve performance in edge cases as found and reported on #537 by CJ Cullen.
This PR has been obsoleted by the follow ups, per comments above. Thank you very much for the collaboration here, @cjcullen. |
There are two places in scannerc.go where we iterate through the full simple_keys stack that I think we can avoid:
I've included a couple of test cases that show the bad behavior, even with the recent depth limits :