ConnectionReset exception #45

Open
HolyPrapor opened this issue Dec 29, 2020 · 5 comments · May be fixed by #54

Comments

@HolyPrapor

This client was used for a long time on Windows without any issues. A couple of months ago we started using it on .NET Core and tested it on both Linux and Windows.

In our project we use ZooKeeperClient a lot to read nodes and set watchers.

The Windows version works flawlessly.
However, the Linux version throws a "Connection reset by peer" exception.
I investigated this problem and read the ZooKeeper logs, and found that ZooKeeper did not reset its connection.
I didn't capture any TCP dumps, but I'm pretty sure there are no TCP RST packets.

Upgrading to .NET 5 makes the situation even worse: ConnectionLossExceptions appear more often.

I decided to go deeper into the ZooKeeperClient code.
I found a check which reports a connection loss that did not actually happen (a false positive).

Unfortunately, I was not able to determine what causes this effect or how to reproduce it reliably. It looks like a problem with sockets on Linux.

Removing this check solves the problem.

Also, this client sends KeepAlive pings anyway, so if there is a real connection loss, we will find out soon (either the next time we try to send something or on the next ping).

@HolyPrapor
Author

According to SO, the most proper way to check whether a socket is connected is to call the socket's Poll method and check whether there are any bytes available to read.
This PR resolves the issue with sockets.
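
For readers who have not seen that pattern, here is a minimal sketch of the Poll plus Available check described above. It is not the code from the PR; `SocketProbe` and `IsLikelyConnected` are hypothetical names used only for illustration, and the result is a best-effort hint rather than a guarantee, since the connection can drop right after the call returns.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration; not part of the ZooKeeper client.
static class SocketProbe
{
    public static bool IsLikelyConnected(Socket socket)
    {
        try
        {
            // Poll(0, SelectRead) returns true when data is available to read,
            // or when the connection has been closed, reset, or terminated.
            bool readable = socket.Poll(0, SelectMode.SelectRead);

            // Readable but with zero bytes pending means the peer closed the
            // connection; any other combination is treated as still connected.
            return !(readable && socket.Available == 0);
        }
        catch (SocketException)
        {
            return false;
        }
        catch (ObjectDisposedException)
        {
            return false;
        }
    }
}
```

A check like this only looks at local socket state, so a real connection loss may still go unnoticed until the next send or KeepAlive ping fails.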

@MatsKarlsson

I've been having the same problem. org.apache.zookeeper.KeeperException.ConnectionLossException started to appear more frequently on .NET Core 3, and after trying to upgrade to .NET 5 I get it all the time.

Using macOS Big Sur.

@kuskmen

kuskmen commented Aug 23, 2021

I am afraid we have also started facing the same issue. It would be nice if the PR were reviewed and released, if it solves the issue.

@douggish

douggish commented Mar 21, 2022

We upgraded from .NET Core 3.1 to .NET 6 and now see this very frequently when running in a Linux Docker container.

@madelson

madelson commented Dec 1, 2022

Can the code be changed to follow the guidance from the Microsoft docs for checking whether a socket is still connected?

// .Connect throws an exception if unsuccessful
client.Connect(anEndPoint);

// This is how you can determine whether a socket is still connected.
bool blockingState = client.Blocking;
try
{
    byte [] tmp = new byte[1];

    client.Blocking = false;
    client.Send(tmp, 0, 0);
    Console.WriteLine("Connected!");
}
catch (SocketException e)
{
    // 10035 == WSAEWOULDBLOCK
    if (e.NativeErrorCode.Equals(10035))
    {
        Console.WriteLine("Still Connected, but the Send would block");
    }
    else
    {
        Console.WriteLine("Disconnected: error code {0}!", e.NativeErrorCode);
    }
}
finally
{
    client.Blocking = blockingState;
}

Console.WriteLine("Connected: {0}", client.Connected);
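
Since this thread is about Linux and macOS, one caveat about the sample above: 10035 is the Winsock value of WSAEWOULDBLOCK, and to my understanding `NativeErrorCode` holds the platform's own error number on non-Windows systems, so that comparison may not match there. A hedged variant that compares `SocketErrorCode` against `SocketError.WouldBlock` instead could look like the sketch below. `SendProbe` and `IsConnectedByZeroByteSend` are hypothetical names, and this is not the client's actual code.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration; not part of the ZooKeeper client.
static class SendProbe
{
    public static bool IsConnectedByZeroByteSend(Socket client)
    {
        try
        {
            // Remember the caller's blocking mode so it can be restored.
            bool blockingState = client.Blocking;
            try
            {
                client.Blocking = false;
                // Zero-byte, non-blocking send: offset 0, size 0.
                client.Send(new byte[1], 0, 0, SocketFlags.None);
                return true;
            }
            catch (SocketException e)
            {
                // WouldBlock means the socket is still connected but the send
                // could not complete immediately; other errors mean disconnected.
                return e.SocketErrorCode == SocketError.WouldBlock;
            }
            finally
            {
                client.Blocking = blockingState;
            }
        }
        catch (ObjectDisposedException)
        {
            // A disposed socket is treated as disconnected.
            return false;
        }
    }
}
```

As with the docs sample, this reflects the connection state as of the most recent operation, so it can still miss a peer that disappeared without closing the connection.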
