-
-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AsyncServer.handle_request throws unhandled KeyError exceptions #391
Comments
I have not removed these exceptions because when these conditions occur it is a clear indication that you have a problem that you need to address. You are implying that this is an internal server issue, but it isn't. This error means that your clients are missing the deadlines to respond to PING packets, or that your server is so busy that it is unable to process the PONG responses from the clients in a timely manner (as I explained many many times in the issues this is discussed). Either way if you see these often then you have a problem that needs to be looked into, and reducing the visibility of these errors may make you less aware of the problem. But I guess I could be convinced to downgrade this to an error log that does not include the stack trace. |
Yes, in general, that's what I'm talking about.
In the expected behavior I mentioned, that this issue could be handled by this branch of code python-engineio/src/engineio/async_server.py Lines 287 to 288 in a6e5d92
Here we log the exception and respond with bad request to the client. This way we keep the issue visible for the client and keep it visible for the developer. So we don't remove this error, but provide formatting and handling for this error on EngineIO layer. Is there a specific reason that we need to crash the server by unhandled exception without responding to the client with structured response? |
Also, the reason, why I'm assuming that this is a bug, because if some time passes, the python-engineio/src/engineio/async_server.py Lines 287 to 288 in a6e5d92
So the situation is still the same: e.g. client disconnects from server and continue to send requests in closed session. So it looks like, it is just a race condition, isn't? |
You are assuming that the "Invalid session" error has the same purpose, but it is there to cover mainly other problems, such as the server receiving an invalid sid that never existed, or a sid that is owned by a different worker. In these cases the server does not have enough information to provide a better error message. But when the server receives a sid that was recently closed it can give you a slightly more informative message. You are correct that eventually the server will forget about this sid and if the client continues to send requests they'll be logged with the invalid message, but clients are unlikely to insist anyway, so logging the first (and possibly only) error as a disconnected session makes sense to me, even if it isn't always possible due to timing. |
Got your idea, thanks.
This sounds great. If you need any assistance with pull request, please let me know. Generally, I think it will be enough, because most probably most of developers set up their monitoring systems and track error rate. |
I saw this issue in SocketIO issues section before, but by some reason it wasn't considered a bug.
Describe the bug
When Socket is closed, but hasn't been deleted from memory of BaseServer, it causes KeyError('Session is disconnected`) exceptions which are unhandled by EngineIO or SocketIO servers.
The exception is raised from this method:
python-engineio/src/engineio/base_server.py
Lines 227 to 229 in a6e5d92
when client continue to send polling GET requests and it ends up here (probably POST as well, but we didn't notice errors related to POST)
python-engineio/src/engineio/async_server.py
Lines 286 to 291 in a6e5d92
In some cases it's expected behavior and handled properly, like
python-engineio/src/engineio/async_server.py
Lines 114 to 120 in a6e5d92
and
python-engineio/src/engineio/async_server.py
Lines 189 to 197 in a6e5d92
Apparently, it's some kind of race condition when the socket has been closed, but wasn't deleted by service task
python-engineio/src/engineio/async_server.py
Lines 548 to 556 in a6e5d92
The issue is that: the server shouldn't raise an unhandled exception when the client sends requests which are not consistent with the server's internal state.
Right now we handle it by ourselves, but I believe that exceptions related to server's internal state should be managed by server itself.
To Reproduce
Hard to say how to reproduce it in e2e manner, because it just happens in production, and it's flaky, I didn't find a reason, why the closed Socket stays in sockets dictionary of AsyncServer.
But it could be reproduced in unit tests:
Expected behavior
In that case, I expect that request will end up in branch and I get BadRequest error from server:
python-engineio/src/engineio/async_server.py
Lines 287 to 288 in a6e5d92
Bot not in
python-engineio/src/engineio/async_server.py
Lines 289 to 309 in a6e5d92
Logs
In logs, I see common pattern: socket either was closed by server, either client itself dropped connection, then this unhandled error happens, when client continue to send GET/POST requests (checked about 15 cases manually).
If my assumption isn't correct, and you need detailed logs, please tell me I will collect it for you.
Idea of bugfix
Seems like it would be just enough to rewrite this part for handling GET/POST:
python-engineio/src/engineio/async_server.py
Lines 286 to 290 in a6e5d92
Handling KeyError as missing socket, then if socket is empty, we will go to the logical branch with bad request, otherwise continue processing request.
Additional context
In our team, we use SocketIO along with FastAPI.
We accept requests from clients with different platforms IOS, Android, Web.
Caught it even on low traffic ~1 rps.
Version: 4.9.1 of EngineIO, but as I see, the same code is till present in main branch.
The text was updated successfully, but these errors were encountered: