Seeming race condition for the synchronous client #634

Closed
sigmavirus24 opened this issue Feb 10, 2021 · 9 comments · Fixed by RC-MODULE/rumboot-tools#6

@sigmavirus24

Hi there old friend,

We just upgraded from 4.6.1 to 5.0.4. Prior to the upgrade, our code was working totally fine but now occasionally crashes.

Our code first calls connect(url, transports=["websockets"], headers={"Authorization": "FakeAuthHeader"}), which succeeds, and the library's logging prints Engine.IO connection established. Sometimes it then promptly prints Namespace / is connected. Other times, our code reaches emit("auth", API_KEY), which errors with socketio.exceptions.BadNamespaceError: / is not a connected namespace. but interleaved in the traceback is Namespace / is connected. It seems like there's a race condition in how connect(...) -> _handle_eio_connect -> _send_packet -> _handle_connect flows, and it's not entirely synchronous.

I skimmed the code but didn't immediately see the problem. Hopefully this helps.

@miguelgrinberg
Owner

miguelgrinberg commented Feb 10, 2021

Hey Ian, great to hear from you!

This is not a new problem, but the error is indeed new. The Socket.IO connection happens in the background. When you call connect() you are initiating it, but several exchanges between the client and the server have to happen to get everything set up. You can be sure that it is safe to emit when the connect handler for your namespace is invoked.

In the 4.x and older releases if you called emit() after connect() and the connection wasn't fully set up the message could be lost silently. In the 5.x releases I attempted to "fix" this by adding the BadNamespaceError, which is triggered all the way up to the point where it is 100% safe to emit without the risk of losing a message.

You are not the first to complain about this, so I intend to improve this. The two options that I'm considering are either to delay the return from connect() until the namespaces are connected, or alternatively to cache emits if they are done before the namespace can be used. Either way this is going to be transparent: you will just stop getting BadNamespaceError and everything is going to work.

What I've been telling people as a workaround is to move your emit to the connect event handler, and then it'll work 100%. But for some people this is weird, they prefer to have a sequential logic in their client, so I've seen code that does a sleep(1) between the connect and the emit and that also sort of works.
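
The workaround described above (emitting from the connect handler) could look roughly like the following. This is a minimal sketch, not code from the library or from the reporter's application: the helper names register_auth_on_connect and run are hypothetical, and the "auth" event and API key are taken from the original report.

```python
def register_auth_on_connect(sio, api_key):
    """Register a connect handler that emits the auth message.

    `sio` is expected to be a socketio.Client (python-socketio).
    The helper name is hypothetical and only used for illustration.
    """

    @sio.event
    def connect():
        # This handler runs once the default namespace is connected,
        # so emitting here does not race with the connection setup.
        sio.emit("auth", api_key)

    return connect


def run(url, api_key):
    # Imported lazily so the helper above can be exercised without the package.
    import socketio

    sio = socketio.Client()
    register_auth_on_connect(sio, api_key)
    sio.connect(url, transports=["websocket"])
    sio.wait()  # block until the connection ends
```

With this layout the emit no longer depends on how quickly connect() returns relative to the namespace handshake.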

It doesn't appear that I have any open issues to track this, so I might as well use this one. Thanks!

miguelgrinberg self-assigned this Feb 10, 2021

@sigmavirus24
Author

The emits being silently dropped makes a lot of sense now in the context of other behaviour we were seeing. That's super helpful to understand. Also I had no clue it would be safe to use a connect handler to emit those messages. I think we'll try that and see how it works. Thanks!

@miguelgrinberg
Owner

miguelgrinberg commented Feb 15, 2021

@sigmavirus24 I have attempted to fix this by adding a wait argument to the connect() method. There is no need to change anything in the application code, the default for wait is True. There is also a wait_timeout, which is 1 second by default, but can be made larger for slow networks.

I'll run some more tests in the next couple of days. For now, you can test by installing this package from git.
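
For readers who prefer the sequential style, the wait and wait_timeout arguments mentioned above might be used as in this sketch. It assumes a python-socketio release that includes these arguments; the helper name connect_and_auth and the 5 second default are illustrative choices, not recommendations from the maintainer.

```python
def connect_and_auth(sio, url, api_key, wait_timeout=5):
    """Connect, wait for the namespace to be ready, then emit.

    `sio` is expected to be a socketio.Client. The helper name and the
    5 second default are hypothetical, chosen only for this example.
    """
    # wait=True is already the default; it is spelled out here for clarity.
    sio.connect(
        url,
        transports=["websocket"],
        wait=True,
        wait_timeout=wait_timeout,
    )
    # connect() has returned, so the namespace should be connected.
    sio.emit("auth", api_key)


# Usage (requires python-socketio and a reachable server):
#   import socketio
#   connect_and_auth(socketio.Client(), "http://localhost:5000", "my-api-key")
```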

@jsib0

jsib0 commented May 20, 2021

This problem still persists even with the upgrade. In my case, I am testing the loss of the Wi-Fi connection on a Raspberry Pi 4 by running sudo ifconfig wlan0 down, then sudo ifconfig wlan0 up. Once the internet is back up, I get the same error: is not connected to namespace <namespace>
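
Whether or not this report has the same cause as the original issue, one general way to avoid emitting while a namespace is down during a reconnect is to gate emits on the connect and disconnect events. The following is a minimal sketch under that assumption; the ConnectionGate class is hypothetical and is not part of python-socketio.

```python
import threading


class ConnectionGate:
    """Tracks whether the default namespace is connected (hypothetical helper)."""

    def __init__(self, sio):
        # `sio` is expected to be a socketio.Client.
        self._sio = sio
        self._ready = threading.Event()
        sio.on("connect", self._on_connect)
        sio.on("disconnect", self._on_disconnect)

    def _on_connect(self):
        self._ready.set()

    def _on_disconnect(self, *args):
        # Some releases may pass extra arguments; accept and ignore them.
        self._ready.clear()

    def emit(self, event, data=None, timeout=5.0):
        """Emit once connected; return False if the wait times out."""
        if not self._ready.wait(timeout):
            return False
        self._sio.emit(event, data)
        return True
```

A disconnect can still happen between the wait and the emit, so this narrows the window rather than closing it; callers that need stronger guarantees would also have to handle the error raised by emit.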

@miguelgrinberg
Owner

@jsib0 my guess is that your problem is different, but in any case, if you think this is a problem then write a new issue and include detailed logs that cover the incident.

@leolo0626

leolo0626 commented Aug 31, 2022

connect

Can you give me an example of how to move the code inside the connect event handler? @miguelgrinberg

@miguelgrinberg
Owner

@leolo0626 this issue has been fixed long ago. If you have a question that applies to current code, please write it with all the context in the discussions board.

@shrikantpadhy18

@sigmavirus24 I have attempted to fix this by adding a wait argument to the connect() method. There is no need to change anything in the application code, the default for wait is True. There is also a wait_timeout, which is 1 second by default, but can be made larger for slow networks.

I'll run some more test in the next couple of days. For now, you can test by installing this package from git.

Do you mean that in self.sio.connect(hostname, headers={"User-Agent": "python-socketio[client]/socket"}, auth=auth, transports="websocket", wait_timeout=3), increasing wait_timeout can resolve this issue (BadNamespaceError)? Currently it's 3 seconds.

@miguelgrinberg
Owner

@shrikantpadhy18 if you are connecting to a slow server, then yes, increasing the wait timeout may help.
