Retransmissions and send order #523

vasilvv · 2023-06-21T14:10:19Z

RFC 9000, Section 13.3 says "Endpoints SHOULD prioritize retransmission of data over sending new data, unless priorities specified by the application indicate otherwise; see Section 2.3." Since we are defining an API for specifying relative priorities of streams, we may want to define some behavior that developers could rely on.

I believe the main question here would be "should new data on a higher sendOrder stream preempt retransmission of data lost on a lower sendOrder stream?". I believe that the answer should be "yes", and at least one of the proposed MoQ priority schemes relies on that to be efficient (@kixelated should correct me if I am wrong). That said, as far as I understand, this is not what either the Chrome or the Firefox QUIC stack currently does.

(From my personal perspective, being able to control retransmission order is the main point of the sendOrder API, since otherwise the API client can just order the writes itself.)

LPardue · 2023-06-21T15:22:50Z

I agree that sendOrder should be considered.

This spec probably wants to think about probe packets too, e.g. in RFC 9218 we said:

Section 6.2.4 of [QUIC-RECOVERY] also highlights considerations regarding application priorities when sending probe packets after Probe Timeout timer expiration. A QUIC implementation supporting application-indicated priorities might use the relative priority of streams when choosing probe data.

Retransmissions also pose some interesting questions for fairness of bandwidth allocation between streams and datagrams, but maybe that's worth a separate issue.

aboba · 2023-06-21T16:17:49Z

Pre-empting retransmissions can be problematic:

If it causes discardable frames to pre-empt non-discardable frames. Since discardable frames depend on non-discardable frames, pre-empting retransmission of non-discardable frames magnifies loss. For example, I-frames are typically 10+ times larger than P-frames, so that retransmission of I-frames is much more likely than P-frames. As a result, pre-empting retransmission of I-frames dramatically increases I-frame loss probabilities. For example, with a packet loss probability of 1 percent, a 25-packet I-frame will have a loss probability of more than 22 percent if re-transmission is pre-empted.
If it complicates implementation of partial reliability. By setting a timer, it is currently possible to limit the total transmission time for frames. The maximum transmission time may be set higher for non-discardable frames than for discardable frames. Pre-empting the retransmission of non-discardable frames makes it very difficult to set an appropriate timer.
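
As a rough check of the 22 percent figure above, here is a minimal sketch. It assumes packet losses are independent and that a frame counts as lost if any one of its packets is lost and never retransmitted. The function name and the 2-packet P-frame size are illustrative assumptions, not values from the thread.

```python
def frame_loss_probability(packet_loss: float, packets: int) -> float:
    """Probability that at least one of `packets` packets is lost.

    Assumes independent losses and no retransmission of lost packets.
    """
    return 1.0 - (1.0 - packet_loss) ** packets


# 25-packet I-frame at 1 percent packet loss, as in the example above.
i_frame = frame_loss_probability(0.01, 25)

# Hypothetical 2-packet P-frame, roughly a tenth of the I-frame size.
p_frame = frame_loss_probability(0.01, 2)

print(f"I-frame: {i_frame:.3f}")  # about 0.222
print(f"P-frame: {p_frame:.3f}")  # about 0.020
```

Under these assumptions the I-frame is about eleven times more likely than the P-frame to be hit by a loss, which is the magnification the comment describes.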

kixelated · 2023-06-21T18:34:30Z

So I implemented this at Twitch. I went with retransmissions according to send order because it was easier to implement, but I could have gone either way.

Deprioritize retransmissions

Flow control can be tricky when you deprioritize retransmissions. Any gap in a stream means that the tail is not flushed to the application but counts towards flow control. With a high send order, a stream may be starved and stay in this state indefinitely.

Eventually you might hit the MAX_DATA limit, in which case the endpoint must go back and retransmit these old streams exclusively for the purpose of freeing up flow control (if they haven't been reset by now). However, you can't transmit new data while at the MAX_DATA limit, so you won't be able to send a full flight and throughput will suffer.

So I would say that if you're going to deprioritize retransmissions, you should only do so while there's a full flight available in MAX_DATA, otherwise you risk a momentary stall. I feel like something is wrong if you've hit this point though, and it's jarring to switch from transmitting new frames to retransmitting old frames once you hit some invisible line.
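
To make the flow control concern concrete, here is a small receive-side sketch. It assumes a simplified model in which credit is consumed up to the highest offset received and can only be returned once the application has read the data in order. `ReceiveStream` and its methods are hypothetical names for illustration, not part of any QUIC stack.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiveStream:
    """Hypothetical receive-side model of a single stream (not a real QUIC API)."""

    received: set = field(default_factory=set)  # byte offsets received so far
    read_offset: int = 0  # bytes already delivered to the application

    def on_data(self, offset: int, length: int) -> None:
        """Record stream data covering [offset, offset + length)."""
        self.received.update(range(offset, offset + length))

    def deliver(self) -> int:
        """Hand contiguous bytes to the application; return how many were delivered."""
        start = self.read_offset
        while self.read_offset in self.received:
            self.read_offset += 1
        return self.read_offset - start

    def held_credit(self) -> int:
        """Flow control credit that is consumed but cannot be released yet."""
        highest = max(self.received) + 1 if self.received else 0
        return highest - self.read_offset


stream = ReceiveStream()
stream.on_data(0, 100)    # first packet arrives
stream.on_data(200, 800)  # bytes 100..199 were lost; the tail still arrives
stream.deliver()
print(stream.held_credit())  # 900: the gap plus the unread tail

stream.on_data(100, 100)  # the retransmission fills the gap
stream.deliver()
print(stream.held_credit())  # 0: everything has been delivered
```

In this model a single 100-byte gap pins 900 bytes of credit until it is retransmitted, which is why a starved stream with a gap can slowly eat into the connection-level limit.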

Prioritize retransmissions

Prioritizing retransmissions independent of send order could degrade the user experience. At the extreme, if you lose an entire flight of packets (e.g. a network outage), then you have no choice but to retransmit that same flight immediately after recovery. The congestion window may be significantly smaller, so you're spending at least a full RTT on lower priority streams.

This might be mostly an academic concern. If the network outage is short, then the send order for streams has likely not changed dramatically by the time the ACK arrives. If the network outage is long, then I'm not sure the user will care about "poor" prioritization shortly after recovery. But I do think that it kind of sucks to receive data from 5s ago after emerging from a tunnel rather than new data, even if it's only for the first few RTTs.

kixelated · 2023-06-21T18:39:01Z

> If it causes discardable frames to pre-empt non-discardable frames. Since discardable frames depend on non-discardable frames, pre-empting retransmission of non-discardable frames magnifies loss. For example, I-frames are typically 10+ times larger than P-frames, so that retransmission of I-frames is much more likely than P-frames. As a result, pre-empting retransmission of I-frames dramatically increases I-frame loss probabilities. For example, with a packet loss probability of 1 percent, a 25-packet I-frame will have a loss probability of more than 22 percent if re-transmission is pre-empted.

That doesn't happen when send order matches dependencies. The I-frame MUST have a lower send order than P-frames that depend on it. Otherwise, with even the slightest congestion (independent of packet loss), the I-frame could be starved by frames that depend on it (tail-of-line blocking).

Unless you're talking about an I-frame from the previous GoP versus a P-frame from a future GoP. It's debatable if you want to transmit the most recent P-frame or (re)transmit part of an I-frame from X seconds ago. In my opinion, the send order is the order that the application WANTS the frames to arrive in, independent of any network conditions, so the relay should transmit accordingly.

> If it complicates implementation of partial reliability. By setting a timer, it is currently possible to limit the total transmission time for frames. The maximum transmission time may be set higher for non-discardable frames than for discardable frames. Pre-empting the retransmission of non-discardable frames makes it very difficult to set an appropriate timer.

That would happen with any prioritization scheme. Even without packet loss, the congestion window can prevent the full stream from being transmitted, and by the time more window is available, higher priority streams are available that take precedence regardless of the expiration timer. But I agree that prioritizing retransmissions based on send order would amplify the effect.

But that sort of partial reliability scheme wouldn't use send order at all. You would send every stream with the same priority (equal bandwidth) and reset them after x ms. Throwing send order into the mix is extremely confusing, and I'm not even sure how both send order and deadlines can work in tandem for congestion response.

martinthomson · 2023-06-22T01:21:12Z

I agree with @kixelated's comments on the substance of the challenges with retransmissions. My view is that a general implementation should prioritize retransmissions to avoid the flow control deadlock problem that doing otherwise might create. Offering a way to abandon (or set time limits on) retransmissions for lower priority (or lower send order) data would allow the stack to avoid the priority inversion that retransmissions might cause. Then, you can make the lifetime of a stream appropriate. For the video example, you might imagine key frames having a lifetime equal to the expected key frame interval, whereas other frames would have a lifetime closer to the frame rate.
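
One way to read the lifetime idea above is sketched below. It interprets "closer to the frame rate" as one frame interval (1 / frame rate), which is an assumption, and both function names and the example numbers are hypothetical.

```python
def stream_lifetime(is_key_frame: bool, key_frame_interval: float, frame_rate: float) -> float:
    """Seconds for which lost data on a stream is still worth retransmitting.

    Key frames live for one key frame interval; other frames live for
    roughly one frame interval (an interpretation, not a specified rule).
    """
    return key_frame_interval if is_key_frame else 1.0 / frame_rate


def should_retransmit(age: float, lifetime: float) -> bool:
    """Retransmit only while the data is younger than its lifetime."""
    return age < lifetime


# Hypothetical numbers: 2 s key frame interval, 30 frames per second.
key_lifetime = stream_lifetime(True, 2.0, 30.0)
delta_lifetime = stream_lifetime(False, 2.0, 30.0)

print(should_retransmit(0.5, key_lifetime))    # True: key frame data still useful
print(should_retransmit(0.5, delta_lifetime))  # False: that frame is long past
```

With per-stream lifetimes like these, retransmissions can still go first without old, low-value data blocking newer data indefinitely.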

jan-ivar · 2023-07-19T00:03:11Z

Meeting:

Ideas: Wherever we set sendOrder, add a boolean toggle (prioritizeRetransmits) or a timeWindow for that? dunno
@vasilvv: might want to shelve this for now until people who do servers tell us if this matters or not

kixelated · 2023-08-04T17:55:46Z

I think I'm wrong about a few things.

> However you can't transmit new data while at the MAX_DATA limit, so you won't be able to send a full flight and throughput will suffer.

Actually no, because retransmissions aren't subject to the MAX_DATA limit. You won't have a full flight of prioritized data, but you will still have a full flight of data.

> So I would say that if you're going to deprioritize retransmissions, you should only do so while there's a full flight available in MAX_DATA, otherwise you risk a momentary stall.

So there still will be a sudden shift from the application's point of view. The receiver will be in a steady state where only high priority data is being received and then suddenly it receives old and low priority data instead for an RTT. The retransmissions free up flow control so you're back in the steady state until sufficient packet loss occurs again.

However my recommendation doesn't avoid that; it just pushes forward the timeline by an RTT. If this stall is really an issue, then you could amortize it by retransmitting a % based on how close you are to the MAX_DATA limit... but that seems even more complicated to implement.

> Ideas: Wherever we set sendOrder, add a boolean toggle (prioritizeRetransmits) or a timeWindow for that? dunno

The application can implement the timeWindow itself and that's actually how my client works. I try to transmit media for up to 10s, after which I close the stream with a non-fatal error code because it was starved. This is also necessary if there's any local send buffer limit such as a maximum amount of data that can be queued in WebTransport.
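
An application-level time window along these lines could be sketched as follows. This is not the client code described above: `FakeStream`, `STARVED`, and `send_with_deadline` are hypothetical stand-ins, and a real client would call its own stream reset or abort API instead.

```python
import asyncio

STARVED = 1  # hypothetical application-defined, non-fatal error code


class FakeStream:
    """Stand-in for a send stream; not a real WebTransport binding."""

    def __init__(self, delay: float) -> None:
        self.delay = delay  # how long a write takes to drain, in seconds
        self.reset_code = None

    async def write(self, data: bytes) -> None:
        # Simulate a write that only completes once the stream gets bandwidth.
        await asyncio.sleep(self.delay)

    def reset(self, code: int) -> None:
        self.reset_code = code


async def send_with_deadline(stream: FakeStream, data: bytes, window: float) -> bool:
    """Try to send `data` for at most `window` seconds, then give up on the stream."""
    try:
        await asyncio.wait_for(stream.write(data), timeout=window)
        return True
    except asyncio.TimeoutError:
        stream.reset(STARVED)  # the stream was starved; close it non-fatally
        return False


starved = FakeStream(delay=1.0)
print(asyncio.run(send_with_deadline(starved, b"frame", window=0.05)))  # False
print(starved.reset_code)  # 1
```

The window here is 50 ms only to keep the example fast; the comment above uses a window of up to 10 s.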

A boolean for prioritizeRetransmits seems premature. I think we should experiment on the server side first to see if it has any impact. If it does have an impact, I still don't think exposing that knob to the application makes sense. The more I think about it, the more I think the browser could transparently deprioritize transmissions.

wilaw · 2023-08-14T08:38:40Z

Feedback from IETF #117 in San Francisco
Notes here: https://notes.ietf.org/notes-ietf-117-webtrans

Cullen Jennings: If we just did this by default, it would not be great. Lossy networks would have priority inversions from other networks. If you say that the application can set different send orders for the main packets and then a separate one for if it’s retransmitted, you have to let the application control it. ... If you want to separate send orders, you should have two, both set by the application.

Luke Curley: I’ve done two, one with and one without, didn’t see any difference really. There are some flow control issues, unless your packet loss rate is really high, you don’t really see any difference. Always retransmit first, otherwise flow control is hell, you have gaps in streams that still consume flow control.

Mo: I like Cullen’s idea of having specific retransmission order. If I can use it as a hack to get datagram semantics with streams, set it to the lowest possible.

jan-ivar · 2023-08-15T23:39:55Z

Meeting:

Might be too early to see
Chrome: retransmissions always go first
Firefox: pretty sure the same
As a feature this seems like something that could be added later. tempting to defer
Argument against: Such a feature might merely allow folks to over-commit more
A thing that only happens when you're not having a great day already. You only benefit from this if you're expecting things out of order to help your performance

jan-ivar · 2023-08-15T23:46:53Z

Can we mark this ready for PR with a note, e.g.

Ordering of retransmissions is at the discretion of user agents, but retransmissions are encouraged to happen at high priority.

suhasHere · 2024-03-19T08:03:38Z

I gave a talk on simulcast video delivery over MoQ and shared some learnings from it regarding retransmissions and priorities:

https://datatracker.ietf.org/meeting/119/materials/slides-119-moq-simulcast-video-delivery-learnings-01

The gist, as shared by Christian Huitema for the picoquic implementation, is:

Retransmit is done at the same priority as the original stream. So if we have seen packet losses on both a low def high priority stream and a high def low priority stream, the order will be: retransmit of high priority stream, then new data on high priority stream, then retransmit for low priority stream, then new data for low priority stream.
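
The ordering described above can be sketched as a sort key and compared with the "retransmissions always go first" behavior reported for Chrome earlier in the thread. This is an illustration only, not picoquic's or Chrome's code: `Pending` and both function names are hypothetical, and it assumes a lower `priority` value is sent first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pending:
    """Hypothetical unit of data waiting to be sent."""

    stream: str
    priority: int  # assumption: lower value is sent first
    retransmit: bool


def same_priority_order(queue):
    """Stream priority first; within a stream, retransmissions before new data."""
    return sorted(queue, key=lambda p: (p.priority, not p.retransmit))


def retransmit_first_order(queue):
    """All retransmissions first, then new data, each ordered by priority."""
    return sorted(queue, key=lambda p: (not p.retransmit, p.priority))


queue = [
    Pending("high-def", priority=1, retransmit=False),
    Pending("low-def", priority=0, retransmit=False),
    Pending("high-def", priority=1, retransmit=True),
    Pending("low-def", priority=0, retransmit=True),
]

for item in same_priority_order(queue):
    print(item.stream, "retransmit" if item.retransmit else "new")
```

The two policies differ only in which field leads the sort key: with `same_priority_order`, new data on the high priority stream goes out before any retransmission for the low priority stream.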

wilaw added the Discuss at next meeting label Jun 21, 2023

kixelated mentioned this issue Jun 21, 2023

WebTransport: stream priority hyperium/h3#197

Open

jan-ivar added the Ready for PR label Aug 15, 2023

wilaw removed the Discuss at next meeting label Sep 20, 2023

wilaw added this to the Candidate Recommendation milestone Dec 20, 2023

nidhijaju mentioned this issue Mar 25, 2024

Add note about retransmissions and send order #595

Merged

jan-ivar closed this as completed in #595 Mar 26, 2024
