Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCPConnection._apply_backpressure() problem with _writeable flag? #2620

Closed
slfritchie opened this issue Mar 30, 2018 · 2 comments · Fixed by #2627
Closed

TCPConnection._apply_backpressure() problem with _writeable flag? #2620

slfritchie opened this issue Mar 30, 2018 · 2 comments · Fixed by #2627

Comments

@slfritchie
Copy link
Contributor

slfritchie commented Mar 30, 2018

I'm 80% somewhat certain this might be a bug. The _apply_backpressure() function in TCPConnection doesn't appear to be doing the right thing with the object's _writeable flag.

If a TCP connection has been throttled, then all is well if a subsequent write is able to write all buffered data successfully. However, if the object cannot write everything before hitting another @pony_os_writev() call returning 0, then we enter bug territory.

The while loop inside of _pending_writes() depends in part on the state of _writeable. Let's say that our subsequent write came from an ASIO event. _event_notify() will set _writeable = true and then call _pending_writes(), see https://github.com/ponylang/ponyc/blob/0.21.3/packages/net/tcp_connection.pony#L503-L506

However, if _pending_writes() cannot flush everything, it will eventually call _apply_backpressure. But we were throttled earlier & haven't yet been unthrottled, so _throttled = true. In its current form, for non-Windows platforms at least, _apply_backpressure() is a noop.

What I've witnessed is that, when these conditions described above happens, _pending_writes() goes into a tight spin loop, and it stays there until the socket conditions change enough so that all buffered data gets written. When the peer on the other side of the TCP connection is behaving well, this loop exits pretty quickly. But if the peer or the network is screwed up, then I believe the worst case is that the actor's behavior function won't return, and that will tie up a scheduler thread, ouch.
I've witnessed the spin loop for short periods of time at https://github.com/ponylang/ponyc/blob/0.21.3/packages/net/tcp_connection.pony#L506

For my tiny little test case, the following appears to fix the bug: turn off _writeable and then tell ASIO to tell us when the socket is writeable. However, I'm not sure yet if it'd break something else.

  fun ref _apply_backpressure() =>
    if not _throttled then
      _throttled = true
      _notify.throttled(this)
    end
    ifdef not windows then
      if _writeable then
        _writeable = false

        // this is safe because asio thread isn't currently subscribed
        // for a write event so will not be writing to the readable flag
        @pony_asio_event_set_writeable(_event, false)
        @pony_asio_event_resubscribe_write(_event)
      else
        None // TODO: Is it worth being paranoid here with an assertion?
      end
    end
@dipinhora
Copy link
Contributor

@slfritchie Good catch regarding the edge case where the actor is throttled and is unable to write out it's full amount. Your change should be a safe fix to the issue.

There is no need for the if _writeable then check because _apply_backpressure can only be called when _writeable is true from _pending_writes (all other places it might be called from are not relevant because they are for ifdef windows).

@slfritchie
Copy link
Contributor Author

slfritchie commented Apr 2, 2018

(deleted, sorry, I'd commented on the wrong issue)

@jemc jemc closed this as completed in #2627 Apr 2, 2018
jemc pushed a commit that referenced this issue Apr 2, 2018
Fix backpressure-related TCPConnection busy-loop bug (issue #2620).
dipinhora added a commit to WallarooLabs/pony-kafka that referenced this issue Apr 4, 2018
dipinhora pushed a commit to dipinhora/ponyc that referenced this issue Jun 5, 2018
Fix backpressure-related TCPConnection busy-loop bug (issue ponylang#2620).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants