-
-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TCPConnection._apply_backpressure() problem with _writeable flag? #2620
Comments
@slfritchie Good catch regarding the edge case where the actor is There is no need for the |
(deleted, sorry, I'd commented on the wrong issue) |
Fix backpressure-related TCPConnection busy-loop bug (issue ponylang#2620).
I'm 80% somewhat certain this might be a bug. The
_apply_backpressure()
function inTCPConnection
doesn't appear to be doing the right thing with the object's_writeable
flag.If a TCP connection has been throttled, then all is well if a subsequent write is able to write all buffered data successfully. However, if the object cannot write everything before hitting another
@pony_os_writev()
call returning 0, then we enter bug territory.The
while
loop inside of_pending_writes()
depends in part on the state of_writeable
. Let's say that our subsequent write came from an ASIO event._event_notify()
will set_writeable = true
and then call_pending_writes()
, see https://github.com/ponylang/ponyc/blob/0.21.3/packages/net/tcp_connection.pony#L503-L506However, if
_pending_writes()
cannot flush everything, it will eventually call_apply_backpressure
. But we were throttled earlier & haven't yet been unthrottled, so_throttled = true
. In its current form, for non-Windows platforms at least,_apply_backpressure()
is a noop.What I've witnessed is that, when these conditions described above happens,
_pending_writes()
goes into a tight spin loop, and it stays there until the socket conditions change enough so that all buffered data gets written. When the peer on the other side of the TCP connection is behaving well, this loop exits pretty quickly. But if the peer or the network is screwed up, then I believe the worst case is that the actor's behavior function won't return, and that will tie up a scheduler thread, ouch.I've witnessed the spin loop for short periods of time at https://github.com/ponylang/ponyc/blob/0.21.3/packages/net/tcp_connection.pony#L506
For my tiny little test case, the following appears to fix the bug: turn off
_writeable
and then tell ASIO to tell us when the socket is writeable. However, I'm not sure yet if it'd break something else.The text was updated successfully, but these errors were encountered: