Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibly fragile coordination between mio and tokio for Windows event-triggered simulation #5866

Closed
jasta opened this issue Jul 13, 2023 · 21 comments · Fixed by #6668
Closed
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-net Module: tokio/net

Comments

@jasta
Copy link
Contributor

jasta commented Jul 13, 2023

Version
v1.29.1

Platform
Windows

Description
I have been working on adding poll support to mio (tokio-rs/mio#1687) and came across a surprising gotcha that I couldn't figure out where if I did buffered reads as opposed to single byte reads I would get into states where short reads would cause the socket to get stuck and no longer report readiness. I pored over differences between Windows (which also mimics edge-triggered events) and my poll impl only to find nothing until I started debugging from tokio's side and found this:

https://github.com/tokio-rs/tokio/blob/master/tokio/src/io/poll_evented.rs#L169

This got me thinking if we wanted to clean this up and make tokio a little bit more consistent in how it uses mio we have a few options:

  1. Leave it alone and add cfg!(mio_unsupported_force_poll_poll) to match the windows check. This seems like the easiest option but might end up really hard to maintain because mio_unsupported_force_poll_poll hopefully will eventually be supported for target_os="espidf" which would mean we would need to make sure to update tokio when that time comes otherwise an innocent mio diff will break the entire feature. In other words, this semi-fragile hack in tokio will become a very fragile hack if poll support is merged.
  2. Modify IoSource to have an extra method that passes a flag whether it was a short read or write. Lots of code churn here, but the tests should cover every case I'd bet and in fact the tests can be modified to prove the code is more correct than it was before (e.g. the tcp_stream test assert can be dropped as per https://github.com/tokio-rs/mio/pull/1687/files#r1259147052)
  3. Modify mio to pass back hints as to whether readiness can be cleared or not. This is disruptive for sure, but it has the nice effect of being able to expand the clear_readiness hack to other APIs like UDP's recv_from. The original motivation was for performance (see 28ce4ee), so why not expand the wins to other cases?

I personally prefer (2) as it's probably the least disruptive and fragile, but I'm happy to proceed however ya'll prefer.

@jasta jasta added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Jul 13, 2023
@jasta
Copy link
Contributor Author

jasta commented Jul 13, 2023

Cc @Noah-Kennedy due to #4840

@Noah-Kennedy
Copy link
Contributor

Regarding 3, I'm not sure how this could be extended to UDP as reads from those sockets correspond to discrete packets rather than pull bytes out of a continuous stream.

@Noah-Kennedy
Copy link
Contributor

I'll take some time to think this through later tonight. One thing we need to be careful with here is that this is quite the hot loop, but those perf constraints should be pretty manageable.

@jasta
Copy link
Contributor Author

jasta commented Jul 14, 2023

Regarding 3, I'm not sure how this could be extended to UDP as reads from those sockets correspond to discrete packets rather than pull bytes out of a continuous stream.

You're right ofc, I double checked the man page as well and it confirms it. I wasn't thinking clearly about recv_from specifically because the API doesn't even allow this kind of thing (what if two peers send me packets?), but it does surprise me a little that you can't infer the same thing from send_to (i.e. I try to write 256 bytes, kernel says I wrote 100...isn't this a guarantee that EAGAIN will be returned...?). I'm guessing that there are conditions in which it's not true though I can't imagine what they would be (we already have EMSGSIZE if you send a datagram packet too large). I dunno, not worth pursuing given that epoll so plainly tells you what you can expect.

EDIT: Actually I bet it's that you'd get EAGAIN for peer A but maybe not for peer B.

jasta added a commit to jasta/tokio that referenced this issue Jul 14, 2023
The new mio_unsupported_force_poll_poll behaviour works the same as
Windows (using level-triggered APIs to mimic edge-triggered ones) and it
depends on intercepting an EAGAIN result to start polling the fd again.

Closes tokio-rs#5866
@Darksonn Darksonn added the M-net Module: tokio/net label Jul 14, 2023
@Noah-Kennedy
Copy link
Contributor

Regarding 3, I'm not sure how this could be extended to UDP as reads from those sockets correspond to discrete packets rather than pull bytes out of a continuous stream.

You're right ofc, I double checked the man page as well and it confirms it. I wasn't thinking clearly about recv_from specifically because the API doesn't even allow this kind of thing (what if two peers send me packets?), but it does surprise me a little that you can't infer the same thing from send_to (i.e. I try to write 256 bytes, kernel says I wrote 100...isn't this a guarantee that EAGAIN will be returned...?). I'm guessing that there are conditions in which it's not true though I can't imagine what they would be (we already have EMSGSIZE if you send a datagram packet too large). I dunno, not worth pursuing given that epoll so plainly tells you what you can expect.

EDIT: Actually I bet it's that you'd get EAGAIN for peer A but maybe not for peer B.

UDP writes are super weird under the hood in Linux. The kernel doesn't have a separate TX buffer for each UDP socket, and instead it just places packets directly in the queue to be transmitted to the NIC. I forget what the exact re-arming conditions for UDP writes currently are.

I'd hazard a guess that a short write with UDP might indicate a loss of readiness, but even if it does, this shouldn't be treated as a guarantee.

@jasta
Copy link
Contributor Author

jasta commented Jul 15, 2023

@Noah-Kennedy any thoughts on how we should proceed with the poll support coming? I can post a naive PR that just expands the cfg check to mio_unsuported_force_poll_poll but I feel a bit uneasy about choosing (1) above without discussion of alternatives...

@Noah-Kennedy
Copy link
Contributor

CC @Thomasdezeeuw

@Thomasdezeeuw
Copy link
Contributor

Technically, Tokio is depending on platform specific behaviour here (https://github.com/tokio-rs/tokio/blob/master/tokio/src/io/poll_evented.rs#L169). It works only because we currently only support epoll and kqueue on Unix, but as @jasta mentioned with poll that's about to change. That said, I personally also depend on this behaviour.

We have an open issue to change this: tokio-rs/mio#1611. But there is one catch which is try_io which doesn't get access to any buffers and or read/written number of bytes, so determining a short read/write/etc. is difficult.

The most correct way to resolve this is to remove the short read/write optimisation. But I understand no one wants the loss of performance. So checking for mio_unsuported_force_poll_poll would work... for a while at least, until we add targets that only support the poll(2) implementation (e.g. espidf or haiku), then it will break again (in a hard to debug way). We could go ahead with tokio-rs/mio#1611, but I think we should consider that a breaking change, but I'm not sure yet.

I would check for mio_unsuported_force_poll_poll for now as a short term solution and we can communicate Mio <-> Tokio when we start adding poll(2) targets without the use of mio_unsuported_force_poll_poll.

@jasta
Copy link
Contributor Author

jasta commented Jul 18, 2023

I'll go ahead and throw up a PR with strategy (1) then, thanks @Thomasdezeeuw !

jasta added a commit to jasta/tokio that referenced this issue Jul 18, 2023
The new mio_unsupported_force_poll_poll behaviour works the same as
Windows (using level-triggered APIs to mimic edge-triggered ones) and it
depends on intercepting an EAGAIN result to start polling the fd again.

Closes tokio-rs#5866
@Thomasdezeeuw
Copy link
Contributor

I would check for mio_unsuported_force_poll_poll for now as a short term solution and we can communicate Mio <-> Tokio when we start adding poll(2) targets without the use of mio_unsuported_force_poll_poll.

This day has come. tokio-rs/mio#1721 will add Vita support based on the poll(2) and tokio-rs/mio#1715 for Solaris.

@Darksonn
Copy link
Contributor

How should we detect this going forward?

@Thomasdezeeuw
Copy link
Contributor

How should we detect this going forward?

Technically speaking Tokio is depending on unspecified behaviour of Mio. However in practice we're not going to actively make this worse (as the we would have to work around the OS).

I think the most robust way forward is to invert the detection. Instead of assuming that we always rearm when e.g. having a short read, assume the opposite and only for known platforms trigger the clear_readiness branch. For certain we can say that kqueue(2) and epoll(2) based implementations are ok with this optimisation. See https://github.com/tokio-rs/mio/blob/8eb4010a92bede550850e177d3dd7c4c76eb90ba/src/sys/unix/selector/mod.rs for an up to date list. Note that it means we need to keep the cfg!(mio_unsupported_force_poll_poll) check as that overwrite the implementation to use poll(2) (which is not ok with this optimisation).

@Darksonn
Copy link
Contributor

Could mio expose a const boolean that says whether the current platform behaves like this on short reads?

@Thomasdezeeuw
Copy link
Contributor

Could mio expose a const boolean that says whether the current platform behaves like this on short reads?

I'm not sure. I think that would be exposing to much platform detail to be honest.

@Darksonn
Copy link
Contributor

If you think that the best approach is to hard-code a list of platforms that support it, then we can go for that.

@Thomasdezeeuw
Copy link
Contributor

If you think that the best approach is to hard-code a list of platforms that support it, then we can go for that.

"best" and "hard-code" rarely go hand in hand, but I think it's the best approach in this case.

@Thomasdezeeuw
Copy link
Contributor

This has come up again, now for Fuchsia, in tokio-rs/mio#1809. Also note that Mio now support multiple target that use the poll(2) implementation (on which this issue exists) without any RUSTFLAGS. So the fix added in #5881 is no longer sufficient.

@Darksonn
Copy link
Contributor

I see that we discussed hard-coding the list of platforms earlier, but it doesn't seem like we ever did so. We should make sure to do it.

@Thomasdezeeuw
Copy link
Contributor

@Darksonn
Copy link
Contributor

So we should only enable the optimization when this matches?

#[cfg_attr(all(
    not(mio_unsupported_force_poll_poll),
    any(
        target_os = "android",
        target_os = "illumos",
        target_os = "linux",
        target_os = "redox",
    )
), path = "selector/epoll.rs")]

What about kqueue platforms?

@Thomasdezeeuw
Copy link
Contributor

What about kqueue platforms?

I think kqueue(2) also supports reregistering on short reads, but I'm not a 100% sure for all platforms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-net Module: tokio/net
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@jasta @Darksonn @Thomasdezeeuw @Noah-Kennedy and others