Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async support for all IO operations #430

Open
aaubry opened this issue Aug 28, 2019 · 18 comments
Open

Add async support for all IO operations #430

aaubry opened this issue Aug 28, 2019 · 18 comments

Comments

@aaubry
Copy link
Owner

aaubry commented Aug 28, 2019

Add async overloads of all methods that perform IO.
I am opening this ticket because I feel that this should eventually be implemented. If this is a feature that would be valuable to you, please say so in the comments.

@samsmithnz
Copy link

I wonder what the performance benefit would actually be here, since in my experience this is already pretty quick!

Would you create a second set of functionality with the Async suffix? Or would this be an upgrade and replace of all non async functionality to be async?

@aaubry
Copy link
Owner Author

aaubry commented Dec 7, 2019

The intention is more to be able to support workflows that only allow async operations. For example, the aspnet request or response streams, which by default require async. I don't have a clear design in mind yet, but I think that the synchronous API would still be necessary, therefore there would need to be a second set of APIs with the async suffix. Forcing async usage in all cases would have a performance cost for all memory-only use cases, which I think are relatively common.
What are your thoughts on this ?

@samsmithnz
Copy link

samsmithnz commented Dec 7, 2019

I like your plan, it makes sense. If you want to identify particular functions that you'd like to convert, I'm happy to contribute and work on this. I have some experience writing async functions...

@aaubry
Copy link
Owner Author

aaubry commented Dec 8, 2019

Sure. At the moment I can't dedicate too much time to this feature, so any help is welcome!

@samsmithnz
Copy link

samsmithnz commented Dec 8, 2019

I've created an initial commit and would value your input. Would you mind having a peak to make sure it meets your standards before I continue?

Note that this is only for "serialize" so far, and I've only written one async test (I tried to write an alternative for a (relatively) long running unit test, which you can see, appears to have a significant performance boost):

image

Pull Request: #457 (build currently failing due to some performance test issue I will also investigate)

@willson556
Copy link
Contributor

I took a look at picking up #457 -- I agree with the comment there that async should start at the Emitter and propagate up the stack. I think due to the number of callsites to Write() (which would need to be async and then the callsites would themselves need to be within async methods) that doing this while retaining the synchronous API would result in a lot of code duplication. I think there are 3 possible paths:

  1. Drop the synchronous API
  2. Make a thin-synchronous wrapper around the otherwise async implementation (generally frowned on)
  3. Invent some new architecture that somehow reduces how far async would propagate (I don't have any ideas here)

Personally, those paths are in order of preference and it's likely I could devote some significant development time towards 1 or 2.

@aaubry
Copy link
Owner Author

aaubry commented Apr 22, 2020

Regarding the options that you suggest, I think that only the first one is viable. Option 2 would add very little value, I think, and if there was an easy way to implement option 3, it would probably already be built into the C# language.

Not keeping the synchronous API is a problem because we target platforms that do not support async. I'm reluctant to discontinue their support and would rather not have async APIs if this proved too difficult to maintain.

I have another concern which is that the emitter currently performs a lot of small writes to the underlying TextWriter. Most of them will probably end up just writing to a buffer and return synchronously. This means that we'll end up having a lot of overhead and maybe the end result will be worse than using the synchronous APIs... We would need to perform some tests before fully committing to using async. I believe that using System.IO.Pipelines could help here but would require a complete reimplementation of the parser and emitter.

Do you have ideas to address these issues ?

@willson556
Copy link
Contributor

I agree with everything stated above. I started to look at how other serializers had handled this. It looks like Newtonsoft.Json hasn't fully implemented async support -- they seem to have support in the parser but not in the serializer. I then took a look at System.Text.Json and they took an interesting approach to leave async out of most of the codebase: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Write.Stream.cs#L76

They seem to be calling the normal synchronous code and then just copying the output to the stream asynchronously. This sort of approach would seem to be a lot easier to implement in YamlDotNet but I'm not sure how much it really improves performance? I may try and throw together a test of that approach in the next couple of days to see.

@willson556
Copy link
Contributor

Regardless of performance with more conventional streams, the JsonSerializer approach would meet the stated goal of supporting async-only streams.

@willson556
Copy link
Contributor

willson556 commented Apr 27, 2020

Just came across https://devblogs.microsoft.com/dotnet/system-io-pipelines-high-performance-io-in-net/ which seems highly applicable here (and I now realize you pointed out in your last comment).

@aaubry
Copy link
Owner Author

aaubry commented May 1, 2020

Yes, this would probably be the way forward, although we would have issues with some of the targetted platforms. I don't think that it is very important to keep supporting .NET 2.0, but Unity uses .NET 3.5 and I believe that there are many users of the library on that platform.

@willson556
Copy link
Contributor

I think with that approach it might be possible to avoid excessive code duplication and use conditional compiles and package references to maintain compatibility. I haven't yet had time to prototype anything concrete to see if it pans out.

@lesair
Copy link

lesair commented May 21, 2020

Antoine, do you need help on this?

@aaubry
Copy link
Owner Author

aaubry commented May 22, 2020

I'm not currently working on this, but maybe @willson556 does. I'm currently working on a different feature, schemas, and can't spend time on this feature at the moment.

@willson556
Copy link
Contributor

I haven't started work on this either. I was sort of waiting to see the outcome of #482 and #486 before proceeding with further contributions though I do have a long-run interest in building this.

It is my current belief based on some work we've done since April that it would not be too difficult to do this using the producer/consumer pattern utilizing System.IO.Pipelines or System.Threading.Channels to bridge the gap between calls to Write() in the emitter and async IO. I haven't looked at the read side but I imagine a similar pattern would be possible there as well.

@lesair
Copy link

lesair commented Jun 9, 2020

Sounds like fun. How can I help?

@goncalo-oliveira
Copy link

Any news on this topic since 2020? Some .NET parts do block synchronous IO operations, so having an async API here would definitely make things smoother.

@goldsam
Copy link

goldsam commented Oct 26, 2023

+1 to a high-speed implementation based on System.IO.Pipelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants