v2 - loading the document requires knowing the format in advance #1918

baywet · 2024-11-12T13:33:34Z

new OpenApiStreamReader().Read(stream, out var diagnostic);

(or its async variant)

Did not require knowing the format of the document (JSON/YAML) in advance. This was handy because, when building general-purpose tools, the format is not something I know in advance. Parsing the file extension, URL extension, or response Content-Type header might help discover it, but that is not reliable.

The equivalent method

OpenApiDocument.LoadAsync(input, "unknown", settings)

Requires passing the format, which I believe is a regression in functionality. We should work to restore the same level of functionality.

Also, there are a number of places in the documentation and doc comments that still refer to the old initialization and need to be updated before GA.

https://github.com/search?q=repo%3Amicrosoft%2FOpenAPI.NET%20OpenApiStreamReader&type=code

darrelmiller · 2024-11-19T14:37:01Z

We could create an overload that looks like this:

OpenApiDocument.LoadAsync(input, settings);

We first try and parse as JSON. If the JSON fails, we can say "either you have bad JSON, or you gave me YAML, and you have not registered the YAML reader, in which case you need to take a dependency on the OpenAPI.Readers package, and register the YAML reader."

Try and parse as JSON
If it fails, check to see if YAML parser is registered
If it is, try parsing with the YAML parser

MaggieKimani1 · 2024-11-25T09:25:23Z

This approach seems a bit problematic especially for large YAML documents. Once JSON parsing fails here:

OpenAPI.NET/src/Microsoft.OpenApi/Reader/OpenApiJsonReader.cs

Lines 43 to 48 in 1338905

    
           try
           {
               jsonNode = LoadJsonNodes(input);
           }
           catch (JsonException ex)
           {

Assuming we attempt the YAML parsing within the catch block, we can't reset the TextReader's position once it has advanced, because TextReader doesn't support seeking. At that point the input has been partially or fully consumed, leaving no content for the YAML parser.

We could try buffering the content into memory for reuse, but this becomes expensive, especially for large YAML files.
I'd suggest we check whether the YAML parser has already been registered in the factory and, if so, use it right off the bat before trying the JSON parser.
Thoughts?
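
For illustration, the buffering option mentioned above could look roughly like the sketch below. It is not code from OpenAPI.NET: `BufferedFallbackLoader`, `ParseJsonThenYaml`, and the `yamlParser` delegate are hypothetical names, and the delegate only stands in for whatever YAML reader happens to be registered. Only `System.Text.Json` calls are real.

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Text.Json.Nodes;

public static class BufferedFallbackLoader
{
    // Hypothetical helper, not part of OpenAPI.NET.
    // yamlParser stands in for a registered YAML reader; pass null when none is registered.
    public static JsonNode ParseJsonThenYaml(TextReader input, Func<string, JsonNode> yamlParser)
    {
        // Read everything once: a TextReader cannot be rewound after a failed parse.
        string content = input.ReadToEnd();

        try
        {
            return JsonNode.Parse(content);
        }
        catch (JsonException ex)
        {
            if (yamlParser == null)
            {
                throw new InvalidOperationException(
                    "Input is not valid JSON and no YAML reader is registered.", ex);
            }

            // The buffered string is untouched, so the YAML parser sees the whole document.
            return yamlParser(content);
        }
    }
}
```

The whole document is held in memory as a string, which is exactly the cost raised above for large YAML files, and a YAML document still pays for a failed JSON attempt first.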

baywet · 2024-11-25T12:54:45Z

Thank you for the additional information.

I don't understand why we need to direct the parsing between two different parsing logics:

The YAML 1.2 specification was published in 2009. Its primary focus was making YAML a strict superset of JSON.

source

We could simply feed everything to the YAML parser, couldn't we? If it happens to "only be JSON", some extra parsing logic would run. It might have a performance impact since YAML parsing ultimately converts back to JSON nodes.

It's also worth noting this is what the library used to do before the recent changes.

Then for people who really care about the performance impact, we could still leave in place the overloads that accept a format argument, bypassing this overhead. Also for any method that accepts a URI, we could look at the response content type and call the right format parser.

Thoughts?
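
As a rough sketch of the content-type idea for URI-based overloads, the mapping could be as small as the helper below. `ContentTypeFormat` and `TryGetFormat` are hypothetical names, not OpenAPI.NET APIs, and the suffix matching is an assumption about which media types servers typically send for OpenAPI documents.

```csharp
using System;

public static class ContentTypeFormat
{
    // Hypothetical helper, not part of OpenAPI.NET.
    // Returns true and sets format to "json" or "yaml" when the header value is recognized.
    public static bool TryGetFormat(string contentType, out string format)
    {
        format = string.Empty;
        if (string.IsNullOrWhiteSpace(contentType))
        {
            return false;
        }

        // Drop parameters such as "; charset=utf-8" and normalize case.
        string mediaType = contentType.Split(';')[0].Trim().ToLowerInvariant();

        if (mediaType.EndsWith("json", StringComparison.Ordinal))
        {
            format = "json"; // application/json, application/vnd.oai.openapi+json, ...
        }
        else if (mediaType.EndsWith("yaml", StringComparison.Ordinal)
              || mediaType.EndsWith("yml", StringComparison.Ordinal))
        {
            format = "yaml"; // application/yaml, text/yaml, application/x-yaml, ...
        }

        return format.Length > 0;
    }
}
```

When the header is missing or generic (for example `text/plain`), the helper returns false and the caller would still need another way to decide, which matches the earlier point that headers alone are not reliable.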

darrelmiller · 2024-11-25T13:50:09Z

@baywet Parsing Graph beta as JSON takes ~450ms. Using YamlSharp it takes around 5 secs. I think that is a difference we should try harder to take advantage of. Especially because many consumers of the library will not tell us what format they are sending.

@Maggie. Maybe we should revisit sniffing the content. We might be able to "peek" at the first few bytes without consuming the content. Also, after reviewing the full set of APIs yesterday I have a few suggested changes. One of them includes making the TextReader go away. I don't think it is useful for JSON scenarios. We may be able to support it just in the OpenAPIYamlReader.
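
A minimal sketch of what "peeking" could look like, assuming a seekable `Stream`. This is not the implementation from the PR discussed in this thread: `FormatSniffer` and `SniffFormat` are hypothetical names, and the heuristic (a leading `{` or `[` means JSON) is an assumption for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class FormatSniffer
{
    // Hypothetical helper, not part of OpenAPI.NET.
    // Looks at the first bytes of a seekable stream, then rewinds it.
    public static string SniffFormat(Stream input, int probeSize = 256)
    {
        if (!input.CanSeek)
        {
            throw new NotSupportedException("This sketch only handles seekable streams.");
        }

        long start = input.Position;
        var buffer = new byte[probeSize];
        int read = input.Read(buffer, 0, buffer.Length);
        input.Position = start; // rewind so the real parser sees the full content

        // Assumes UTF-8 input; skip a byte order mark and leading whitespace.
        string prefix = Encoding.UTF8.GetString(buffer, 0, read)
            .TrimStart('\uFEFF', ' ', '\t', '\r', '\n');

        bool looksLikeJson = prefix.StartsWith("{", StringComparison.Ordinal)
                          || prefix.StartsWith("[", StringComparison.Ordinal);
        return looksLikeJson ? "json" : "yaml";
    }
}
```

Two limitations are worth keeping in mind: a non-seekable stream (such as a raw network response) would need a small prefix buffer replayed in front of the remaining content, and a YAML document written in flow style can also start with `{`, so a "json" guess may still need a YAML fallback.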

MaggieKimani1 · 2024-11-25T15:45:34Z

Then I think we can continue reviewing the work I did in this PR, which includes sniffing the content to detect the format: #1929

baywet added the priority:p0 Blocking issue/ loss of critical functions. An ICM may be filed to communicate urgency. SLA<=48hrs label Nov 12, 2024

MaggieKimani1 mentioned this issue Nov 14, 2024

Fix: loading the document requires knowing the format in advance #1929

Closed

darrelmiller added this to the NET:2.0 milestone Nov 26, 2024

baywet mentioned this issue Nov 27, 2024

Add support for handling OpenAPI v3.1 definitions microsoft/kiota#3914

Open

MaggieKimani1 mentioned this issue Dec 5, 2024

[API review] Evaluate format default value in Load methods #1964

Closed

darrelmiller modified the milestones: v2 - Preview1, V2 - Preview3 Dec 19, 2024

baywet mentioned this issue Dec 20, 2024

Refactor readers to reduce surface area #1975

Merged

baywet closed this as completed in #1975 Dec 20, 2024
