Intermittent connection failures when using AMQPS over IPv6 on .NET 6 #242

Closed
peolivei2 opened this issue Mar 8, 2023 · 0 comments

peolivei2 commented Mar 8, 2023

Issue

Establishing new client connections fails intermittently when using the library to connect to an AMQPS endpoint over IPv6 on .NET 6. If we switch the client code back to .NET Core 2.1, or connect to an IPv4 broker, the errors no longer happen.

Repro steps

  1. Start the local test broker on an IPv6 amqps address, like the following:
.\bin\Debug\TestAmqpBroker\net6.0\TestAmqpBroker.exe amqps://[::1]:10196 /cert:localhost
  2. Run the following client code using .NET 6:
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Amqp;
using Microsoft.Azure.Amqp.Transport;
using System.Collections.Generic;

namespace MyApp
{
    internal class Program
    {
        static HashSet<int> ints = new HashSet<int>();

        static void Main(string[] args)
        {
            Run().Wait();
        }

        static async Task Run()
        {
            Uri uri = new Uri("amqps://[::1]:10196/");

            for (int i = 0; i < 100; ++i)
            {
                AmqpConnectionFactory factory = new AmqpConnectionFactory();
                factory.Settings.TransportProviders.Add(new TlsTransportProvider(new TlsTransportSettings()
                {
                    CertificateValidationCallback = (a, b, c, d) => true,
                    CheckCertificateRevocation = false,
                    Protocols = System.Security.Authentication.SslProtocols.Tls12
                }));

                await factory.OpenConnectionAsync(uri, TimeSpan.FromSeconds(30));
                Console.WriteLine("Success");
            }
        }
    }
}

This code simply opens a connection 100 times. On .NET Core 2.1 it works fine, but on .NET 6 it fails after a few iterations with the following exception:

System.IO.IOException : Transport 'tls4' is valid for write operations.
---- System.InvalidOperationException : This operation is only allowed using a successfully authenticated context.

Investigation

After a lengthy investigation, we traced the race condition to the following call at TcpTransportInitiator.cs:44:

bool connectResult = Socket.ConnectAsync(SocketType.Stream, ProtocolType.Tcp, connectEventArgs);

When this call returns true, everything works, which seems to always be the case on .NET Core 2.1 or when connecting to an IPv4 broker. However, when it returns false, indicating that the connect operation completed synchronously, the library breaks. On .NET 6, this call seems to return false from time to time for IPv6 sockets.
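For reference, here is a minimal sketch of how the return value of the static Socket.ConnectAsync overload is usually handled. It is not the library's code: ConnectSketch, ConnectOnceAsync, and Complete are hypothetical names used only for illustration. The point is that when the call returns false, the Completed event is not raised, so the caller has to process the result itself, and must do so exactly once.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Microsoft.Azure.Amqp.
static class ConnectSketch
{
    public static Task<Socket> ConnectOnceAsync(EndPoint endpoint)
    {
        var tcs = new TaskCompletionSource<Socket>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        var args = new SocketAsyncEventArgs { RemoteEndPoint = endpoint };

        // Asynchronous path: raised only when ConnectAsync returned true.
        args.Completed += (sender, e) => Complete(e, tcs);

        // true  => the operation is pending; Completed fires later.
        // false => the operation finished synchronously; Completed will
        //          NOT fire, so the result is handled inline here.
        bool pending = Socket.ConnectAsync(
            SocketType.Stream, ProtocolType.Tcp, args);
        if (!pending)
        {
            Complete(args, tcs);
        }

        return tcs.Task;
    }

    // Reached from exactly one of the two paths above.
    static void Complete(SocketAsyncEventArgs e, TaskCompletionSource<Socket> tcs)
    {
        if (e.SocketError == SocketError.Success)
        {
            tcs.TrySetResult(e.ConnectSocket);
        }
        else
        {
            tcs.TrySetException(new SocketException((int)e.SocketError));
        }
    }
}
```

With this shape, the completion logic runs once whether the connect finishes synchronously or asynchronously.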

More specifically, when Socket.ConnectAsync returns false, the following path at AmqpTransportInitiator.cs:367 is executed twice:

if (!thisPtr.CompleteSelf(args.CompletedSynchronously, args.Exception))
{
    if (args.Transport != null)
    {
        // completed by timer
        args.Transport.Abort();
    }
}

The first execution completes the operation. The second, because the operation has already been completed once, calls Transport.Abort(), which disposes the connection and causes the failures shown above.
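To make the failure mode concrete, here is a small self-contained sketch of the double-completion effect. All types here (FakeTransport, OneShot, DoubleCompletionDemo) are hypothetical stand-ins, not the library's classes, and the one-shot guard is only assumed to behave like CompleteSelf in that it returns false to any caller after the first.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a transport; not the library's type.
sealed class FakeTransport
{
    public bool Aborted { get; private set; }
    public void Abort() => Aborted = true;
}

// Hypothetical one-shot completion guard.
sealed class OneShot
{
    int completed; // 0 = pending, 1 = completed

    // Returns true only for the first caller.
    public bool TryComplete() => Interlocked.Exchange(ref completed, 1) == 0;
}

static class DoubleCompletionDemo
{
    // Same shape as the snippet above: a lost completion is assumed to
    // mean "completed by timer", so the transport is aborted.
    public static void OnCompleted(OneShot op, FakeTransport transport)
    {
        if (!op.TryComplete())
        {
            transport?.Abort();
        }
    }

    // Returns whether the transport ended up aborted.
    public static bool Run()
    {
        var op = new OneShot();
        var transport = new FakeTransport();

        OnCompleted(op, transport); // first call: completes the operation
        OnCompleted(op, transport); // duplicate call: aborts a transport
                                    // that the caller is already using
        return transport.Aborted;
    }
}
```

In this sketch the duplicate callback is indistinguishable from a timeout, which is why the already-delivered transport gets aborted.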

peolivei2 changed the title from "Possible race condition when establishing connections on .NET 6" to "Intermittent connection failures when using AMQPS over IPv6 on .NET 6" on Mar 11, 2023