Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved async method code gen for exceptions #65863

Open
stephentoub opened this issue Dec 8, 2022 · 31 comments
Open

Improved async method code gen for exceptions #65863

stephentoub opened this issue Dec 8, 2022 · 31 comments
Assignees
Labels
Area-Compilers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented Dec 8, 2022

Exceptions have significant overhead and are for exceptional situations. The guidance then is to avoid them in performance-sensitive code. However, there are times when the cost of exceptions does matter, and while developers might be able to recognize such situations after the fact and do work to react to prevent performance-related issues there in the future, it'd be nice if the overhead associated with exceptions could be reduced. There is work happening in the .NET runtime to reduce the cost of exceptions significantly in general, but even with that work there will still be significant overhead, and that overhead is magnified by async. The async model for propagating exceptions involves throwing and catching an exception in every "stack frame" as part of the async call chain, so an exception propagating through a long chain of async methods can have an order (or two) of magnitude aggregate higher cost than the corresponding exception in a synchronous call stack.

This is a proposal to do better. There have been similar discussions over the years in a variety of issues/discussions on csharplang, but I'm hoping we can use this issue as the center of gravity and decide if/what to do about this (which could include deciding once and for all to do nothing).

Proposal

Today, for an async method like the following:

public async Task Method()
{
    ...
    await SomethingAsync();
    ...
}

the compiler emits a MoveNext method containing scaffolding like the following:

try
{
    ...
    awaiter = SomethingAsync().GetAwaiter();
    if (!$awaiter.IsCompleted)
    {
        <>t__state = 42;
        <>u__awaiter1 = awaiter;
        <>t__builder.AwaitUnsafeOnCompleted(ref $awaiter, ref this);
        return;
        Label42:
        awaiter = <>u__awaiter1;
        <>t__state = -1;
    }
    awaiter.GetResult();
    ...
}
catch (Exception exception)
{
    <>1__state = -2;
    <>t__builder.SetException(exception);
    return;
}
<>1__state = -2;
<>t__builder.SetResult();

That awaiter.GetResult() call throws an exception for a faulted awaitee, with that exception then getting caught by the compiler-generated catch, which stores the exception into the builder associated with this method, completing the operation. The GetResult() method is defined as part of the await pattern, a parameterless method whose return value is either void or matches the TResult of the operation being awaited.

Pattern

We update the await pattern to optionally include a method of the form:

TResult GetResult(out Exception? exception);

where the TResult may be void if the operation doesn't return a value. If an awaiter exposes a GetResult of this form, it should not throw an exception even for failed operations. Instead, if the operation completes successfully, it should return a value as does the existing GetResult method, with the exception out argument set to null. If the operation completes unsuccessfully, it should return default(TResult), with the exception out argument set to a non-null instance of an Exception.

Codegen

When an await is nested inside of any user-written try block, nothing changes.

For any await outside of all user-written try blocks, if the awaiter exposes the new GetResult(out Exception? exception) overload, the compiler will prefer to use it, e.g.

Exception? e = null;
try
{
    ...
    awaiter.GetResult(out e);
    if (e is not null) goto Exceptional;
    ...
}
catch (Exception exception)
{
    <>1__state = -2;
    <>t__builder.SetException(exception);
    return;
}

<>1__state = -2;
<>t__builder.SetResult();
return;

Exceptional:
ExceptionDispatchInfo.AppendCurrentStackFrame(e);
<>t__builder.SetException(e);

In doing so, we avoid the expensive layer of throw/catch in order to propagate the exception from the awaiter to the method's builder.

A centralized Exceptional: section like this would keep the additional code size to a minimum, albeit it would lose some diagnostic information about the location of the error. Alternatively, for more code size, every await could include:

...
awaiter.GetResult(out e);
if (e is not null)
{
    ExceptionDispatchInfo.AppendCurrentStackFrame(e);
    <>t__builder.SetException(e);
    return;
}
...

New Runtime APIs

ExceptionDispatchInfo.AppendCurrentStackFrame. Just passing the exception from GetResult(out Exception? exception) to builder.SetException would result in a diagnostic gap, as the exception would no longer contain data about this async method, data that would normally be populated as part of the throw/catch. Instead, we add a new ExceptionDispatchInfo.AppendCurrentStackFrame that does the minimal work necessary to gather the same data about the current stack frame it would as part of a stack walk, and append relevant data to the exception's stack trace.

All of the public awaiters in CoreLib are updated to expose a new GetResult overload:

  • TaskAwaiter
  • TaskAwaiter<TResult>
  • ValueTaskAwaiter
  • ValueTaskAwaiter<TResult>
  • ConfiguredTaskAwaitable.ConfiguredTaskAwaiter
  • ConfiguredTaskAwaitable<TResult>.ConfiguredTaskAwaiter
  • ConfiguredValueTaskAwaitable.ConfiguredValueTaskAwaiter
  • ConfiguredValueTaskAwaitable<TResult>.ConfiguredValueTaskAwaiter

And we use default interface method support to add new GetResult overloads to IValueTaskSource and IValueTaskSource<TResult>:

public interface IValueTaskSource
{
    ...
    public virtual void GetResult(out Exception? exception)
    {
        try
        {
            GetResult();
            exception = null;
        }
        catch (Exception e)
        {
            exception = e;
        }
    }
}


public interface IValueTaskSource<TResult>
{
    ...
    public virtual TResult GetResult(out Exception? exception)
    {
        try
        {
            exception = null;
            return GetResult();
        }
        catch (Exception e)
        {
            exception = e;
            return default;
        }
    }
}

such that these new overloads may be used by the ValueTask awaiters. dotnet/runtime's implementations of these interfaces would override GetResult(out Exception) to avoid the default throw/catch.

Risks

  • This adds additional boilerplate code to async methods in support of the exceptional path. That means some increase in binary size to support improving the performance of a scenario that's supposed to be exceptional and non-hot-path. This could be a reason to not do this feature.
  • Similarly, this adds non-pay-for-play overhead to every await (at a minimum a null check / branch, though hopefully reasonably well predicted) in support of the exceptional case but paid on the success path. We would need to measure the incurred overhead to make sure it's in the noise... anything more than that should make us not do this feature.
  • This changes the code gen for async methods, so tools which recognize it (e.g. decompilers) will need to be updated.
  • We'd need to measure the proposed ExceptionDispatchInfo.AppendCurrentStackFrame method's overhead to make sure it's meaningfully-enough better than a throw/catch.

Alternatives

The simplest alternative is for developers to keep on keeping on as they do today. Regardless of whether this feature exists or not, we still want exceptions to be exceptional, and so developers should continue to try to keep exceptions off the hot path. In situations where a developer detects an expensive use of exception throw/catch due to async, they can manually change the code to avoid throwing via a custom await helper, e.g.

internal readonly struct NoThrowAwaiter : ICriticalNotifyCompletion
{
    private readonly Task _task;
    public NoThrowAwaiter(Task task) => _task = task;
    public NoThrowAwaiter GetAwaiter() => this;
    public bool IsCompleted => _task.IsCompleted;
    public void GetResult() { }
    public void OnCompleted(Action continuation) => _task.GetAwaiter().OnCompleted(continuation);
    public void UnsafeOnCompleted(Action continuation) => _task.GetAwaiter().UnsafeOnCompleted(continuation);
}
...
await new NoThrowAwaiter(task);
if (task.IsFaulted)
{
    Use(task.Exception.InnerException);
}

dotnet/runtime#22144 tracks adding new public API for this in dotnet/runtime and will hopefully be addressed in .NET 8, regardless of this issue. This issue ends up being about helping to reduce the situations in which a developer would need to do that and need to know to do that.

Related discussion:
dotnet/csharplang#4450

@stephentoub stephentoub added Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request labels Dec 8, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Dec 8, 2022
@stephentoub
Copy link
Member Author

cc: @davidwrighton, @janvorli, @davidfowl, @jcouv

@davidwrighton
Copy link
Member

@tommcdon I'm curious what you think of the diagnostic impact

@janvorli
Copy link
Member

janvorli commented Dec 8, 2022

cc: @mangod9 FYI

@CyrusNajmabadi
Copy link
Member

I'd definitely be motivated on this if the numbers showed it to be worthwhile. Any benchmarks for this? Would def be curious how this helps for things like cancellation exceptions. Thanks

@stephentoub
Copy link
Member Author

stephentoub commented Dec 8, 2022

Any benchmarks for this?

Sure. I took this code:

public class Original
{
    public static async Task A() => await B();
    private static async Task B() => await C();
    private static async Task C() => await D();
    private static async Task D() => await E();
    private static async Task E() => throw new Exception();
}

and decompiled it, then duplicated the resulting C# and modified the Original into a Proposal that tweaks the GetResult calls approximately as suggested. I then wrapped each in a benchmark:

[Benchmark(Baseline = true)]
public async Task OriginalVersion()
{
    try
    {
        await Original.A();
    }
    catch { }
}

[Benchmark]
public async Task ProposedVersion()
{
    try
    {
        await Proposal.A();
    }
    catch { }
}

So in each case, we have 6 async methods. In OriginalVersion, all 6 will throw/catch an exception. In ProposedVersion, A, B, C, and D will all propagate the exception in the proposed manner, so only 2 of the 6 will throw/catch an exception. Thus we end up removing ~66% of the exceptions being thrown/caught for this relatively small chain... longer chains would have more savings. This is obviously an approximation, as not all throw/catches have equivalent costs, and I've not added anything in here to assist with diagnostics (e.g. the AppendCurrentStackFrame method mentioned). On my machine, this benchmark shows the proposal doubling the throughput and cutting allocation by more than half (but, this isn't decisive... we'd need to measure a more complete prototype, for both success and failure cases, and in more real-world usage... this is just a microbenchmark):

Benchmark
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

[MemoryDiagnoser]
public partial class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    [Benchmark(Baseline = true)]
    public async Task OriginalVersion()
    {
        try
        {
            await Original.A();
        }
        catch { }
    }

    [Benchmark]
    public async Task ProposedVersion()
    {
        try
        {
            await Proposal.A();
        }
        catch { }
    }
}

//public class Original
//{
//    public static async Task A() => await B();
//    private static async Task B() => await C();
//    private static async Task C() => await D();
//    private static async Task D() => await E();
//    private static async Task E() => throw new Exception();
//}

public class Original
{
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Ad__0 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = B().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Bd__1 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = C().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Cd__2 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = D().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Dd__3 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = E().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }

    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Ed__4 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private void MoveNext()
        {
            try
            {
                throw new OperationCanceledException();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
            }
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }

    [AsyncStateMachine(typeof(Ad__0))]
    public static Task A()
    {
        Ad__0 stateMachine = default(Ad__0);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Bd__1))]
    private static Task B()
    {
        Bd__1 stateMachine = default(Bd__1);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Cd__2))]
    private static Task C()
    {
        Cd__2 stateMachine = default(Cd__2);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Dd__3))]
    private static Task D()
    {
        Dd__3 stateMachine = default(Dd__3);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Ed__4))]
    private static Task E()
    {
        Ed__4 stateMachine = default(Ed__4);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }
}

// Actual version would use:
//     awaiter.GetResult(out Exception e);
//     if (e is not null)
//     {
//         ...
//     }
// As that's not available, for benchmarking purposes it's instead using:
//     Task originalTask = Unsafe.As<TaskAwaiter, Task>(awaiter);
//     if (!Unsafe.As<TaskAwaiter, Task>(awaiter).IsCompletedSuccessfully)
//     {
//         Exception e = Unsafe.As<TaskAwaiter, Task>(awaiter).Exception.InnerException;
//         ...
//     }
// DO NOT DO THIS IN REAL CODE!!!!!!!!!!!

public class Proposal
{
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Ad__0 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            Exception propagated = null;
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = B().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                if (!Unsafe.As<TaskAwaiter, Task>(ref awaiter).IsCompletedSuccessfully)
                {
                    propagated = Unsafe.As<TaskAwaiter, Task>(ref awaiter).Exception.InnerException;
                    goto Exceptional;
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
            return;
        Exceptional:
            one__state = -2;
            t__builder.SetException(propagated);
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Bd__1 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            Exception propagated = null;
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = C().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                if (!Unsafe.As<TaskAwaiter, Task>(ref awaiter).IsCompletedSuccessfully)
                {
                    propagated = Unsafe.As<TaskAwaiter, Task>(ref awaiter).Exception.InnerException;
                    goto Exceptional;
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
            return;
        Exceptional:
            one__state = -2;
            t__builder.SetException(propagated);
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Cd__2 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            Exception propagated = null;
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = D().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                if (!Unsafe.As<TaskAwaiter, Task>(ref awaiter).IsCompletedSuccessfully)
                {
                    propagated = Unsafe.As<TaskAwaiter, Task>(ref awaiter).Exception.InnerException;
                    goto Exceptional;
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
            return;
        Exceptional:
            one__state = -2;
            t__builder.SetException(propagated);
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Dd__3 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private TaskAwaiter u__1;

        private void MoveNext()
        {
            Exception propagated = null;
            int num = one__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = E().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (one__state = 0);
                        u__1 = awaiter;
                        t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = u__1;
                    u__1 = default(TaskAwaiter);
                    num = (one__state = -1);
                }
                if (!Unsafe.As<TaskAwaiter, Task>(ref awaiter).IsCompletedSuccessfully)
                {
                    propagated = Unsafe.As<TaskAwaiter, Task>(ref awaiter).Exception.InnerException;
                    goto Exceptional;
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
                return;
            }
            one__state = -2;
            t__builder.SetResult();
            return;
        Exceptional:
            one__state = -2;
            t__builder.SetException(propagated);
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }

    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct Ed__4 : IAsyncStateMachine
    {
        public int one__state;

        public AsyncTaskMethodBuilder t__builder;

        private void MoveNext()
        {
            try
            {
                throw new OperationCanceledException();
            }
            catch (Exception exception)
            {
                one__state = -2;
                t__builder.SetException(exception);
            }
        }

        void IAsyncStateMachine.MoveNext()
        {
            this.MoveNext();
        }

        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            t__builder.SetStateMachine(stateMachine);
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.SetStateMachine(stateMachine);
        }
    }

    [AsyncStateMachine(typeof(Ad__0))]
    public static Task A()
    {
        Ad__0 stateMachine = default(Ad__0);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Bd__1))]
    private static Task B()
    {
        Bd__1 stateMachine = default(Bd__1);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Cd__2))]
    private static Task C()
    {
        Cd__2 stateMachine = default(Cd__2);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Dd__3))]
    private static Task D()
    {
        Dd__3 stateMachine = default(Dd__3);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }

    [AsyncStateMachine(typeof(Ed__4))]
    private static Task E()
    {
        Ed__4 stateMachine = default(Ed__4);
        stateMachine.t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.one__state = -1;
        stateMachine.t__builder.Start(ref stateMachine);
        return stateMachine.t__builder.Task;
    }
}
Method Mean Error StdDev Ratio Gen0 Allocated Alloc Ratio
OriginalVersion 48.38 us 0.173 us 0.162 us 1.00 0.0610 7.45 KB 1.00
ProposedVersion 24.16 us 0.200 us 0.167 us 0.50 0.0305 3.2 KB 0.43

@CyrusNajmabadi
Copy link
Member

Oh wow. That's quite a benefit. I didn't realize there as a memory win here as well. This def seems like a worthwhile area to investigate.

Similarly, this adds non-pay-for-play overhead to every await (at a minimum a null check / branch, though hopefully reasonably well predicted) in support of the exceptional case but paid on the success path. We would need to measure the incurred overhead to make sure it's in the noise... anything more than that should make us not do this feature

This def seems important. Does the runtime have perf tests that exercise async?

@mangod9
Copy link
Member

mangod9 commented Dec 8, 2022

@jkotas as well. Assume this doesn't require any runtime changes?

@jcouv jcouv self-assigned this Dec 8, 2022
@jcouv jcouv added this to the C# 12.0 milestone Dec 8, 2022
@jcouv jcouv removed the untriaged Issues and PRs which have not yet been triaged by a lead label Dec 8, 2022
@stephentoub
Copy link
Member Author

Assume this doesn't require any runtime changes?

Just CoreLib, plus anything that might be needed in support of an efficient mechanism to augment the Exception with diagnostics information for the current stack frame.

@stephentoub
Copy link
Member Author

Does the runtime have perf tests that exercise async?

Yes, but all microbenchmarks.

@jkotas
Copy link
Member

jkotas commented Dec 8, 2022

This adds additional boilerplate code to async methods in support of the exceptional path. That means some increase in binary size to support improving the performance of a scenario that's supposed to be exceptional and non-hot-path.

For reference, a simple async method like async Task A() => await B(); costs about 3.5kB of workingset at runtime (with JIT) when the async path is not taken. The cost grows to about 9kB of workingset once the async path is taken. The binary footprint of AOT compiled async boilerplate code is in similar range. It is a lot - dotnet/runtime#79204 has an example where the async boilerplate code is about 15% of the total AOT binary size.

@Suchiman
Copy link
Contributor

Suchiman commented Dec 9, 2022

Would this even be relevant, if the green threads experiment succeeds?

@stephentoub
Copy link
Member Author

Would this even be relevant, if the green threads experiment succeeds?

Yes.
a) That's a long-term experiment. There's no immediate plans for shipping anything based on that.
b) Even if it does ship, it'll be in addition to not instead of async/await, which the .NET ecosystem is based on now and will be for a very, very long time.
c) As currently defined in that experiment, the lightweight threads aren't usable everywhere async/await its, e.g. in situations where the actual thread used matters, like UI.

@tommcdon
Copy link
Member

tommcdon commented Dec 9, 2022

Thanks for including me on the discussion. I haven't read through all of the details yet, but on the surface it looks like this change might have impact on async debugging, and so we will need to evaluate impact to diagnostics.

@gregg-miskelly
Copy link
Contributor

I think the primary diagnostics impact would be that it would no longer be possible to stop at each link along the chain of the async stack. In some cases, this might be an improvement actually. But, especially in a JMC scenario when a portion of the stack was non-user code, this could cause a significant regression. I think this could be addressed by making ExceptionDispatchInfo.AppendCurrentStackFrame raise something along the lines of a first-chance exception to the debugger (perhaps also renaming the method since it is now doing two things).

@theodorzoulias
Copy link

dotnet/runtime#22144 tracks adding new public API for this in dotnet/runtime and will hopefully be addressed in .NET 8, regardless of this issue.

I am a bit pessimistic that the Support await'ing a Task without throwing #22144 proposal can go forward, without the IValueTaskSource.GetResult(out Exception? exception) enhancement proposed here. Currently the options for implementing that feature for value-tasks are:

  1. Lose forever the exception that caused the value-task to complete in a faulted state, or
  2. Preserve all information by doing an internal try/catch (only for IValueTaskSource-based value-tasks).

Both options are unsatisfactory. Especially the second, because the NoThrow functionality is desirable mainly for performance, not for convenience. I hope that this proposal will go forward, so that the NoThrow proposal can have a satisfactory outcome.

@timcassell
Copy link

dotnet/csharplang#4565 Is a fuller proposal (compared to the discussion in dotnet/csharplang#4450 that is linked in the OP). Thanks @sharwell for linking.

Could the solution proposed there be used in conjunction with this? Or is this expected to supersede everything async-throw related?

As proposed here, canceling async functions would still be quite a bit more expensive than the proposal in dotnet/csharplang#4565 (1000x vs 2x), thanks to the ExceptionDispatchInfo.AppendCurrentStackFrame. I would imagine trying to optimize the cancelation case by using a singleton OperationCanceledException instance, but that AppendCurrentStackFrame would end up introducing a memory leak, so that leaves only allocating a new OperationCanceledException().

Also, in this proposal, it suggests the compiler only emit this new code if the awaiter has the new GetResult overload method. Could the CoreLib types expose a new API to return new awaiters that contain the overload for users to more explicitly generate this exception-optimized code (so normal code would not take a hit if it expects to never encounter an exception)? I suggested the same idea in dotnet/csharplang#4565.

As an aside, I do like the TResult GetResult(out Exception? exception);, perhaps a similar TResult GetResult(out bool isCanceled); could be used for the cancelation case as well, instead of a property.

@Kirill-Maurin
Copy link

Is an AppendCurrentStackFrame call really necessary?
We don't want to catch an exception, we just want to throw it.
Without this call, the code should be radically faster
Even if this call is needed for normal exceptions, TaskCanceledException and its descendants can definitely do without it

@CyrusNajmabadi
Copy link
Member

Even if this call is needed for normal exceptions, TaskCanceledException and its descendants can definitely do without it

I've definitely had to track down crashes related to cancellation tokens. I don't think not having stacks in that case would be ok :(

@timcassell
Copy link

I've definitely had to track down crashes related to cancellation tokens. I don't think not having stacks in that case would be ok :(

Because of async void, or something else? Tbh, I think async void should suppress thrown OperationCanceledExceptions.

@CyrusNajmabadi
Copy link
Member

No. things like 'incorrect cancellation token used' also come up. Where we are doing semi-complex things with linked cancellation tokens and we end up returning the wrong cancellation token out of something.

@Kirill-Maurin
Copy link

I've definitely had to track down crashes related to cancellation tokens. I don't think not having stacks in that case would be ok :(

I understand that without a stack there are difficulties in debugging.
But with stacks TaskCanceledException is absolutely useless
In my case it took 10 minutes to stop the microservice at 100% load of all processor cores
The optimized version stopped in less than two seconds without increasing CPU load

@stephentoub
Copy link
Member Author

stephentoub commented Mar 15, 2023

But with stacks TaskCanceledException is absolutely useless

The extra frame this proposal suggests adding to the exception's trace should be cheap. It's not a throw, nor is it a stack walk. It could be as simple as appending a const string generated by the compiler, similar to CallerMethodName, not even needing reflection of any kind.

I don't see a good reason to special-case OperationCanceledException here, nor would I want to complicate code gen for it even more, nor sacrifice debugability.

@Kirill-Maurin
Copy link

The extra frame this proposal suggests adding to the exception's trace should be cheap. It's not a throw, nor is it a stack walk. It could be as simple as appending a const string generated by the compiler, similar to CallerMethodName, not even needing reflection of any kind.

This is a very good solution, thank you
Then it seems strange why the benchmark showed such an insignificant performance gain after optimization
In my case there were about two thousand tasks with a stack depth of 5-10 calls

@stephentoub
Copy link
Member Author

stephentoub commented Mar 15, 2023

showed such an insignificant performance gain after optimization

Which benchmark, and after what optimization?

@CyrusNajmabadi
Copy link
Member

In my case it took 10 minutes to stop the microservice at 100% load of all processor cores

Can you provide a trace. While cancellation is me expensive, it shouldn't ever be that bad.

@sharwell
Copy link
Member

The extra frame this proposal suggests adding to the exception's trace should be cheap. It's not a throw, nor is it a stack walk. It could be as simple as appending a const string generated by the compiler, similar to CallerMethodName, not even needing reflection of any kind.

This is a very good solution, thank you Then it seems strange why the benchmark showed such an insignificant performance gain after optimization In my case there were about two thousand tasks with a stack depth of 5-10 calls

The benchmark in this issue demonstrates the performance advantage of moving from throwing an exception twice (current behavior) to once (new behavior). The benchmark in the linked issue shows the performance advantage of moving from throwing an exception once (current behavior) to not throwing an exception at all (new behavior). It's a slightly different scenario. In practice, both proposals have similar performance characteristics (i.e. many code situations will show noticeable benefits, while a subset of those will show exceptional benefits).

@timcassell
Copy link

If an awaitable stores the exception in an ExceptionDispatchInfo object, I think it will need to expose a new API to get the exception with the state restored without throwing. I'm not sure if ExceptionDispatchInfo.SourceException already does that. The docs don't mention it, so I assume it doesn't. This is necessary for multiple async methods consuming the same awaitable instance.

@jaredpar jaredpar modified the milestones: C# 12.0, 17.9 Sep 11, 2023
@jaredpar jaredpar modified the milestones: 17.9, 17.10 Nov 14, 2023
@jcouv jcouv modified the milestones: 17.10, 17.11 Mar 26, 2024
@jcouv jcouv moved this to Active/Investigating in Compiler: Julien's umbrellas Jun 3, 2024
@jaredpar jaredpar modified the milestones: 17.11, Backlog Jul 23, 2024
@Mart-Bogdan
Copy link

Mart-Bogdan commented Sep 5, 2024

It occurred to me that AppendCurrentStackFrame could be quite costly. We need to get stack frame at run-time. It might be comparable to throw-catch (as it would also append frame).

So I propose to create StackFrame beforehand, and just append frame to current trace. This must be way faster!

There are two ways ho we can do this:

  1. Create nullable static field to cache StackFrame and lazily create new StackFrame() which would capture current frame. In this case we would incur runtime cost only once (or few times if method is called concurrently and I assume we won't do locking here, as it's still safe to create two instances of StackFrame and discard one by overwriting field).
  2. Given that we are already inside a compiler, we already know line number etc. We can create StackFrame with this knowledge and don't need runtime cost at all! We can create static field to cache StackFrame (and initialize it in static .ctor or lazily), and put there new Stack frame, putting into it line number and MethodInfo acquired using IL (methodof). We would need to create constructor or factory method to create StackFrame with method info, line number, file name.

Second option would create even better performance, but unsure if we want to create such a constructor, or to expose line number information in constants, if there are no debug symbols (Obfuscators, IDK, protection from cracking). Me personally don't care about hiding line numbers.

@comm100-allon
Copy link

The performance degradation caused by exceptions in asynchronous call chains can be catastrophic. We use an ASP.NET Core Middleware to handle exceptions uniformly, but this middleware is followed by several other middleware components, such as authentication and routing. Additionally, within MVC controllers, there are many layers of asynchronous calls. This results in more than ten layers of asynchronous operations between the point where an exception is thrown and the final exception handling module. When there are database fluctuations, exceptions can proliferate, drastically reducing system throughput. Under high concurrency, the lengthening of the thread pool's waiting queue and insufficient threads in the pool can lead to performance degradation of over 10,000 times. This is completely unacceptable, and due to this issue, we are considering migrating from .NET to the Java platform.

@jkotas
Copy link
Member

jkotas commented Sep 14, 2024

@comm100-allon You may want to try .NET 9. Performance and scalability of exceptions have been significantly improved in .NET 9. The exception storms that you describe were primary motivating scenario.

@Mart-Bogdan
Copy link

@comm100-allon You might want to use hand written exception handling Middleware, as current implementation is invoking whole middleware chain second time to route exception to exception handle controller. It looks like overkill for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request
Projects
Status: Active/Investigating
Development

No branches or pull requests