Send WindowUpdate message early #100
So intuitively, the idea is that sending (partial) window updates early should compensate to some degree for senders having to pause due to transmission delays, is that right? Since if the transmission delay is negligible, the sender pretty much gets the window update as soon as the current window is exhausted (assuming the reader can consume data at the same pace as it is sent and received). What exactly do you have in mind in terms of implementation? My first thought would be to introduce a numeric factor to the configuration that influences the threshold for window updates w.r.t. the current receive window, and which is currently fixed to 0. I think it would apply equally to both of the existing `WindowUpdateMode`s.
Correct.
Thus far I have only tested the naive approach of sending the WindowUpdate after half of the window has been exhausted, hoping for the WindowUpdate to arrive at the sender before the sender exhausts the other half. I have not yet measured the overhead of potentially sending twice as many WindowUpdate messages back to the sender.
Off the top of my head I am reluctant to introduce yet another tunable. I find today's configuration surface, while small, already difficult to get right. That said, I have not yet explored the impact different numeric factors would have.
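To make the tunable idea concrete, here is a minimal sketch of what such a factor-based trigger could look like. All names (`max_receive_window`, `current_window`, `update_threshold_factor`) are illustrative, not actual rust-yamux API; a factor of 0.0 reproduces today's behaviour, 0.5 the proposed one.

```rust
/// Hypothetical per-stream state; not the actual rust-yamux types.
struct StreamState {
    /// Full window granted to the remote, e.g. 256 * 1024.
    max_receive_window: u32,
    /// Credit the remote may still use before it has to pause.
    current_window: u32,
    /// 0.0 reproduces today's behaviour (update only at exhaustion);
    /// 0.5 sends the update once half the window has been consumed.
    update_threshold_factor: f64,
}

impl StreamState {
    /// True once the remaining credit has dropped to or below the threshold.
    fn should_send_window_update(&self) -> bool {
        let threshold =
            (self.max_receive_window as f64 * self.update_threshold_factor) as u32;
        self.current_window <= threshold
    }

    /// Grant back exactly the bytes consumed of the current window.
    fn window_update_amount(&self) -> u32 {
        self.max_receive_window - self.current_window
    }
}
```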
Thanks for raising this. Thus far I was under the impression with
I haven't done enough testing, but I can already share a specific use-case which looks promising. Using the altered benchmarks from #102 with an adsl2+ connection (20 Mbit/s, 20 ms RTT) and each message being 4096 bytes, I end up with:
As one would expect, sending an early window update makes little difference when using 10 concurrent streams. Using only a single stream, one can see an increase from 1.7 MBit/s to 2.3 MBit/s, thus a 35 % increase in bandwidth. Note that by sending WindowUpdate messages early, a single stream achieves the same bandwidth as 10 streams do with WindowUpdate messages sent either early or at window exhaustion (0).
For the sake of completeness, here are the numbers for all network types, again comparing WindowUpdate messages sent at window exhaustion (0) vs WindowUpdate messages sent on exhaustion of half the window:
The characteristics described in the previous comment show up once again for the remaining network types, except for the
Numbers presented above are using
Same characteristics described above for
libp2p/rust-libp2p#1849 describes a related performance optimization proposal. As a receiver, when expecting a large message (say 10 MB), one could increase the receive window up-front, allowing the sender to send the entire large message in one go instead of sending chunks bounded by the default receive window. The below benchmarks use a message size of 10 MB, following the example in libp2p/rust-libp2p#1849. First off, let's look at the performance of Yamux based on the current `develop` branch:
As already noted in previous comments, the current implementation on `develop` suffers from decreased bandwidth when using a single channel. Running the same benchmark once more with a receive window set to the message size (10 MB), as suggested in libp2p/rust-libp2p#1849, the above-mentioned decrease when using a single channel is gone (see adsl2+: 2.3 MB/sec == 2.3 MB/sec).
How would the proposal described in this issue cope with a large message size (10MB), sending WindowUpdate messages early but leaving the receive window at the default value (256 KB)? Off the top of my head, I expected similar numbers to the optimization suggested in libp2p/rust-libp2p#1849 given that with early WindowUpdate messages the sender should never be blocked for a whole round-trip waiting for a WindowUpdate message.
My assumption turns out to be wrong. In the single-stream case one achieves a bandwidth similar to the first benchmark on plain `develop`. Why? The answer is simple: when sending a data frame, Yamux will choose the smaller of (1) the available credit (window) and (2) the available data to be sent as the size for the next frame. In the large-message (10 MB) example each frame will thus have the size of the available credit. The receiver considers sending a new WindowUpdate message on each frame received. Given that each frame exhausts the entire window, the receiver never sends early WindowUpdate messages, but instead sends them once the window is 0 (after each frame). The sender ends up being blocked for a whole round-trip waiting for the receiver to send a WindowUpdate message, just like the Yamux version on the current `develop` branch. As mentioned before, Yamux is inspired by HTTP/2, so how does HTTP/2 solve this issue?
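For illustration, a tiny sketch (not the actual rust-yamux code) of the frame-size choice just described:

```rust
// The next frame is the smaller of the remaining send credit and the data
// still pending. With a 10 MB message and a 256 KB window, `pending` always
// exceeds `credit`, so every frame is exactly `credit` bytes, drains the
// whole window in one step, and the half-window trigger never fires early.
fn next_frame_len(credit: usize, pending: usize) -> usize {
    std::cmp::min(credit, pending)
}
```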
https://tools.ietf.org/html/rfc7540#section-4.2
To achieve a similar bandwidth to the proposal in libp2p/rust-libp2p#1849, how about sending only small data frames, as the HTTP/2 specification suggests, allowing the receiver to send early WindowUpdate messages in between frames? Below is the output of a benchmark run sending early WindowUpdate messages, using the default receive window (256 KB) and limiting the size of data frames to half of the default receive window (256 KB / 2 = 128 KB). Sending multiple small frames instead of one large frame introduces an overhead due to the additional Yamux headers. A Yamux header is 12 bytes large; adding an additional header for every 128 KB of payload is a negligible overhead (12 / 131072, i.e. below 0.01 %).
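A minimal sketch of the frame-splitting idea under the assumptions above; `send_frame` is a hypothetical callback standing in for the real frame-writing machinery:

```rust
/// Half the default 256 KB receive window.
const MAX_FRAME_LEN: usize = 128 * 1024;

/// Split a payload into frames of at most `MAX_FRAME_LEN` bytes so the
/// receiver can slip early WindowUpdate messages in between frames. Each
/// chunk costs one additional 12-byte Yamux header.
fn send_chunked(payload: &[u8], mut send_frame: impl FnMut(&[u8])) {
    for chunk in payload.chunks(MAX_FRAME_LEN) {
        send_frame(chunk);
    }
}
```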
The benchmark shows that with (1) early WindowUpdate messages and (2) data frames restricted in size, one achieves the same bandwidth in the single-stream case across the different network types as the proposal described in libp2p/rust-libp2p#1849 achieves. All without the need to configure the window size. (Note: It might still be worth increasing the window size when operating on top of a network with a high bandwidth-delay product (long fat network). As a first guess I would expect Yamux to be operated mostly on networks with a bandwidth-delay product below the default window size (256 KB) (see examples), though I would need to put more thought and benchmarking into this to form a solid opinion.)
Above @romanb started the discussion on when to send an early WindowUpdate message.
Thus far I have only experimented with the naive approach of sending a WindowUpdate message after half or more of the window has been used. How does sending WindowUpdate messages earlier or later than that influence bandwidth? First off, to establish some groundwork, the below benchmark output compares what we already have today, sending the WindowUpdate message at window exhaustion (0), with sending it after half of the window has been consumed (1/2):
As you can see, sending the WindowUpdate message at 0 again results in lower bandwidth than sending it at 1/2. How do other strategies behave? Below is the benchmark output for sending the WindowUpdate message after 1/4 of the window has been consumed:
Sending the WindowUpdate message after 1/4th of the window has been consumed shows no clear advantage over sending it at 1/2. For now I will stick with sending the WindowUpdate message after half of the window has been consumed. I am happy to benchmark other alternative strategies though.
To prevent senders from being blocked every time they have sent RECEIVE_WINDOW (e.g. 256 KB) bytes, waiting for a WindowUpdate message granting further sending credit, this commit makes Yamux send a WindowUpdate message once half or more of the window has been received (and consumed in `WindowUpdateMode::OnRead`). Benchmarking shows that sending WindowUpdate messages early prevents senders from being blocked waiting for additional credit on common network types. For a detailed discussion as well as various benchmark results see libp2p#100.

Next to the above, this commit includes the following changes:

- Use `WindowUpdateMode::OnRead` in benchmarks. I would argue that `OnRead` should be the default setting, given the importance of flow control. With that in mind, I suggest benchmarking that default case.
- Ignore WindowUpdate messages for unknown streams. With this commit, WindowUpdate messages are sent more aggressively. The following scenario would surface: a sender closes its channel and is eventually garbage collected. The receiver, before receiving the closing message, sends out a WindowUpdate message. The sender receives the WindowUpdate message, cannot associate it with a stream, and thus resets the whole connection. This commit ignores WindowUpdate messages for unknown streams instead.

When sending very large messages, WindowUpdate messages are still not returned to the sender in time, so the sender is still blocked in that case. This can be prevented by splitting large messages into smaller Yamux frames, see libp2p#100 (comment). This additional optimization can be done in a future commit without interfering with the optimization introduced in this commit.
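A simplified sketch of the unknown-stream handling described in the commit message; `StreamId`, `Stream`, and the map of open streams are stand-ins for the real rust-yamux internals:

```rust
use std::collections::HashMap;

type StreamId = u32;
struct Stream { credit: u32 }

fn on_window_update(streams: &mut HashMap<StreamId, Stream>, id: StreamId, credit: u32) {
    if let Some(stream) = streams.get_mut(&id) {
        stream.credit += credit;
    }
    // An unknown stream id no longer resets the connection: with early
    // WindowUpdate messages the stream may already have been closed and
    // garbage collected locally, so the message is simply ignored.
}
```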
Following up once more on #100 (comment), more specifically on limiting the maximum size of the payload in a Yamux data frame. The benchmarks below are run on top of #109, using a message size of 10 MiB.
The benchmarks do not give a clear winner. I am leaning towards simply using 16 KiB, given that it outperforms the status quo in each run (except for the unconstrained network type), introduces a low overhead, and is in line with the HTTP/2 spec (though I could not yet find the reasoning for choosing 16 KiB in HTTP/2).
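To put the frame-header overhead into numbers (my own arithmetic, not from the benchmark output): a 12-byte header per 16 KiB frame amounts to 12 / 16384 ≈ 0.07 % of the payload, versus 12 / 131072 ≈ 0.009 % for 128 KiB frames; both are negligible compared to the bandwidth differences measured above.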
To add some data to the above from a more realistic environment, I have set up a virtual server in Helsinki. On this server I deployed:
First off, to establish a baseline, here is the output of an iperf run:
Running option (1) (current develop branch):
Running option (2) (early window updates, small frames):
I would boldly claim that this whole effort does get us a bit closer to what a plain TCP connection can do 🎉.
To prevent senders from being blocked every time they have sent RECEIVE_WINDOW (e.g. 256 KB) bytes, waiting for a WindowUpdate message granting further sending credit, this commit makes Yamux send a WindowUpdate message once half or more of the window has been received (and consumed in `WindowUpdateMode::OnRead`). Benchmarking shows that sending WindowUpdate messages early prevents senders from being blocked waiting for additional credit on common network types. For a detailed discussion as well as various benchmark results see libp2p/rust-yamux#100.

Next to the above, this commit includes the following changes:

- Use `WindowUpdateMode::OnRead` in benchmarks. I would argue that `OnRead` should be the default setting, given the importance of flow control. With that in mind, I suggest benchmarking that default case.
- Ignore WindowUpdate messages for unknown streams. With this commit, WindowUpdate messages are sent more aggressively. The following scenario would surface: a sender closes its channel and is eventually garbage collected. The receiver, before receiving the closing message, sends out a WindowUpdate message. The sender receives the WindowUpdate message, cannot associate it with a stream, and thus resets the whole connection. This commit ignores WindowUpdate messages for unknown streams instead.

When sending very large messages, WindowUpdate messages are still not returned to the sender in time, so the sender is still blocked in that case. This can be prevented by splitting large messages into smaller Yamux frames, see libp2p/rust-yamux#100 (comment). This additional optimization can be done in a future commit without interfering with the optimization introduced in this commit.
Flow control in Yamux and HTTP/2
The Yamux flow control mechanism is very similar to HTTP/2's flow control. This comes as no surprise, given that Yamux is inspired by the early SPDY efforts.
In both Yamux and HTTP/2, the WindowUpdate message is the central piece of the flow control mechanism.
https://github.com/hashicorp/yamux/blob/master/spec.md#flow-control
https://tools.ietf.org/html/rfc7540#section-5.2.1
In HTTP/2 it is up to the receiver to decide when to send a WindowUpdate message. If I understand the short Yamux specification correctly, the same applies to Yamux.
See https://tools.ietf.org/html/rfc7540#section-5.2.1
For a general overview of HTTP/2 flow control I can recommend "HTTP/2 in Action" [1]. Chapter 7.2 on the topic can be (pre-)viewed on the publisher's website.
WindowUpdate message strategies
HTTP/2 implementations can use WindowUpdate messages to implement various (possibly advanced) flow control strategies. One example of a simple strategy is the nghttp2 library, which sends a WindowUpdate message once it (the receiver) has received and consumed more than half of the flow control window.
Today, with `WindowUpdateMode::OnRead`, this Yamux implementation sends a WindowUpdate message once (a) the window credit has been fully depleted and (b) the read buffer is empty, thus all bytes have been consumed. See the implementation for details.
Comparison
Imagine the following simplified scenario:
A sender S is communicating with a receiver R. S wants to send 1 MB in multiple chunks to R. R uses a receive window of 256 KB (Yamux default). S and R are connected on some network inducing both a delay and a bandwidth constraint.
Algorithm 1 (Yamux today): Send a WindowUpdate message once (a) the window is fully depleted by the sender and (b) the receiver has consumed all buffered bytes.
Once S has depleted its window, having sent 256 KB to R, S is blocked and has to wait for a WindowUpdate message to be sent by R. This message is only sent by R once 256 KB have been received and all of those are consumed. Thus every 256 KB the sender is blocked, having to wait for a whole round-trip (for all its data to arrive as well as for the WindowUpdate to be received).
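To attach rough numbers (my own arithmetic, reusing the adsl2+ figures from the benchmarks above): transferring 1 MB in 256 KB windows means S stalls three times between the four window-sized chunks, each stall costing at least one 20 ms round-trip, i.e. roughly 60 ms of dead time on top of the pure transmission time.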
Algorithm 2 (nghttp2): Send a WindowUpdate message once half or more of the window has been received and consumed.
While S sends the first 256 KB of data, R receives and consumes chunks of that data in parallel. Instead of waiting for the window to be fully depleted by S and fully consumed by R, R already sends a small WindowUpdate message (# bytes consumed of the current window) once half or more of the window has been depleted and consumed. This WindowUpdate will likely arrive at S before it depletes the entire window and thus S never stalls.
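A quick sanity check with the adsl2+ figures (again my own arithmetic): once R has consumed 128 KB, S still holds 128 KB of credit, which at 20 Mbit/s takes about 52 ms to transmit; that comfortably covers the 20 ms round-trip of the WindowUpdate, so S indeed never stalls on such a link.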
Summary
Long story short, to prevent senders from being blocked every time they have sent RECEIVE_WINDOW (256 KB) bytes, I am suggesting that Yamux adopt the same WindowUpdate strategy as nghttp2, namely to send a WindowUpdate message once half or more of the window has been received and consumed. An early benchmark with a patched Yamux using an adapted version of @twittner's bench tool looks promising. I still have to do more testing on a high-latency network.
What do people think? Let me know if I am missing something. Happy for any pointers to similar discussions in the past.
[1] Pollard, Barry. HTTP/2 in Action. Manning, 2019.