-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v2 - loading the document requires knowing the format in advance #1918
Comments
We could create an overload that looks like this:
We first try and parse as JSON. If the JSON fails, we can say "either you have bad JSON, or you gave me YAML, and you have not registered the YAML reader, in which case you need to take a dependency on the OpenAPI.Readers package, and register the YAML reader."
|
This approach seems a bit problematic especially for large YAML documents. Once JSON parsing fails here: OpenAPI.NET/src/Microsoft.OpenApi/Reader/OpenApiJsonReader.cs Lines 43 to 48 in 1338905
Assuming we attempt the YAML parsing within the catch block, we can't reset the TextReader's input position once it shifts as it doesn't support seeking and the input has either been partially or fully consumed at this point leaving no content for the YAML parser. We could try buffering the content into memory for reuse but this becomes expensive especially for large YAML files. |
Thank you for the additional information. I don't understand why we to direct the parsing between two different parsing logics:
We could simply feed everything to the YAML parser, couldn't we? It it happens to "only be JSON" there's extra parsing logic that would be used. It might have a performance impact since yaml parsing ultimately converts back to JSON nodes. It's also worth noting this is what the library used to do before the recent changes. Then for people who really care about the performance impact, we could still leave in place the overloads that accept a format argument, bypassing this overhead. Also for any method that accepts a URI, we could look at the response content type and call the right format parser. Thoughts? |
@baywet Parsing Graph beta as JSON takes ~450ms. Using YamlSharp it take around 5secs. I think that is a difference we should try harder to take advantage of. Especially because many consumers of the library will not tell us what format they are sending. @Maggie. Maybe we should revisit sniffing the content. We might be able to "peek" at the first few bytes without consuming the content. Also, after reviewing the full set of APIs yesterday I have a few suggested changes. One of them includes making the TextReader go away. I don't think it is useful for JSON. Scenarios. We may be able to support it just in the OpenAPIYamlReader. |
Then I think we can continue reviewing the work I did in this PR which includes sniffing the content to detect the format #1929 |
(or its async variant)
Did not require knowing the format of the document (JSON/YAML) in advance. This was handy as when building general purpose tools, it's not something that I know in advance. Parsing the file extension/url extension/response content type header might help discovering it, but it's not reliable.
The equivalent method
Requires passing the format, which I believe is a regression in functionality. We should work to restore the same level of functionality.
Also, there are a number of places in the documentation and doc comments that still refer to the old initialization and require to be updated before GA.
https://github.com/search?q=repo%3Amicrosoft%2FOpenAPI.NET%20OpenApiStreamReader&type=code
The text was updated successfully, but these errors were encountered: