WebTransport/WebCodecs interaction (copies) #231
I think you nailed the use cases, but the assumptions about GPU buffers aren't quite right (I probably didn't communicate this well in our meeting). Here's some background:
In spite of point 3, I think we should still map out the flow from WebTransport and identify which copies of encoded data are forced by the current API shape. For use case #1 (sending encoded data -> web):
For use case #2 (downloading encoded data from WebTransport):
Again, I don't think any of these copies is a big deal. Just trying to map it all out and look for optimizations.
@chcunningham There may be some differences depending on which WebTransport mode is used. WebTransport supports ordered/reliable transport (where data is sent over a single reliable stream), unordered/reliable transport (a distinct reliable stream for each message), or unreliable/unordered transport (HTTP/3 datagrams). Reliable/ordered transport on a single stream might be used with HLS or DASH, with WebCodecs substituted for MSE. For low-latency game streaming, you might want a different transport to minimize latency, such as unreliable/unordered (transporting packetized media over HTTP/3 datagrams, with the application handling robustness via its own RTX/FEC/RED), partially reliable/unordered (e.g. a distinct stream for each EncodedVideoChunk, with a retransmission time limit), or some mixture (e.g. a reliable stream for key frames, unordered/unreliable datagrams for P-frames).

Let me try to walk through the use cases, covering reliable/ordered and unreliable/unordered transport. I believe the reliable/unordered case is similar to reliable/ordered.

For use case #1, if I read Issue w3c/webcodecs#155 correctly: if the goal is to upload containerized media, you would write the EncodedVideoChunk to the reliable stream, which handles segmentation, retransmission and ordering transparently. For datagrams you most likely would not be sending containerized media, so if the EncodedVideoChunk is provided in containerized form, it would be necessary to de-containerize it and then packetize. The application would also be responsible for robustness (e.g. RTX/FEC/RED) and ordering. It would be nice to be able to select portions of the EncodedVideoChunk to be written into the datagram payloads without having to copy, so that the de-containerization/packetization process can be as efficient as possible.
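The datagram packetization step described above can be sketched as follows. This is a minimal illustration only: the `packetize` helper and the `maxPayload` limit are assumptions made up for this example, not spec API. The point is that `subarray()` views reference the chunk's bytes without copying, which is the behavior the comment is asking for:

```javascript
// Hypothetical sketch: split one encoded chunk's bytes into
// datagram-sized payloads. subarray() returns views over the same
// ArrayBuffer, so no bytes are copied here.
function packetize(bytes, maxPayload) {
  const payloads = [];
  for (let offset = 0; offset < bytes.byteLength; offset += maxPayload) {
    payloads.push(
      bytes.subarray(offset, Math.min(offset + maxPayload, bytes.byteLength)));
  }
  return payloads;
}

// In a browser, each payload would then be written to
// transport.datagrams.writable, with the application adding its own
// sequence numbers and RTX/FEC/RED for robustness.
```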
For use case #2, if you are reading the EncodedVideoChunk from a reliable stream and it is received in containerized form, you'd want the segments to be deposited into a contiguous ArrayBuffer that you can then feed to the WebCodecs decoder. For datagrams, if the media was not containerized for transport but needs to be provided to WebCodecs in containerized form, you'd need to de-packetize and containerize the media before feeding it to the WebCodecs decoder.
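The de-packetization side of use case #2 could look like the sketch below. The framing (a 2-byte sequence-number prefix on each payload) is an assumption invented for this example; a real application would define its own framing. What it shows is reassembling out-of-order datagram payloads into one contiguous buffer suitable for the decoder:

```javascript
// Hypothetical sketch: reassemble sequence-numbered datagram payloads
// into a single contiguous Uint8Array before handing it to the decoder.
// Assumes each payload starts with a 2-byte big-endian sequence number
// and that all bodies except the last are maxPayload bytes long.
function depacketize(payloads, chunkByteLength, maxPayload) {
  const out = new Uint8Array(chunkByteLength);
  for (const p of payloads) {
    const seq = (p[0] << 8) | p[1];      // made-up 2-byte header
    out.set(p.subarray(2), seq * maxPayload); // place body at its offset
  }
  return out; // contiguous bytes, ready to wrap in an EncodedVideoChunk
}
```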
@aboba - can you reference the discussions you spoke about in this morning's call? Concerning BYOB readers.
Are both datagrams and reliable streams targets?
@yutakahirano Yes. Some use cases:
@wilaw Here are some references:
Here is my mental model for the reliable streams case, client => server.
[A], [B], [C] are potential copies. [A] comes from the following on the spec side:
On the other hand, we may be able to optimize away the copy with (all of) the following:
In other words, if the contents of the video memory are not accessible to scripts, then it is possible to provide the GPU memory handle directly to the network process and eliminate the copy at [A].

Regarding [B]: I expect at least one copy here because we have a TLS encryption step (though it depends on the definition of "copy"). I can imagine a system where we allocate a buffer backed by the network hardware and only one copy is involved for [B] and [C] combined: the input is the IPC buffer, the output is the buffer backed by the network hardware, and all the HTTP/3 and QUIC protocol processing happens between them. @DavidSchinazi knows much more than me here.

@aboba, does this make sense / is this useful?
One correction:
On reflection, this is problematic in terms of security: the conversion logic should run in the renderer, not in the network process. Still, we should be able to eliminate the copy and the intermediary buffer allocation.
Meeting:
BYOB support has been added as of M108. Since encoded chunks are much smaller than VideoFrames, copies of encoded chunks between WebCodecs and WebTransport are not as big an issue as concurrency in frame/stream transport, which appears to be achievable by removing |
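A BYOB read loop, as referenced above, lets the caller supply the destination buffer so incoming segments land directly in one contiguous allocation rather than in per-read copies. The sketch below is illustrative, not spec text; it works against any byte-oriented `ReadableStream` (such as a WebTransport receive stream in the browser, or Node 18+'s web streams), and assumes the chunk's byte length is known from out-of-band framing:

```javascript
// Sketch: read exactly chunkByteLength bytes into a caller-owned buffer
// using a BYOB reader. Note the buffer is transferred on each read(),
// so it must be reacquired from value.buffer afterward.
async function readChunkInto(stream, chunkByteLength) {
  const reader = stream.getReader({ mode: 'byob' });
  let buffer = new ArrayBuffer(chunkByteLength);
  let offset = 0;
  while (offset < chunkByteLength) {
    const { value, done } = await reader.read(
      new Uint8Array(buffer, offset, chunkByteLength - offset));
    if (done) break;
    buffer = value.buffer;       // reacquire the transferred buffer
    offset += value.byteLength;
  }
  reader.releaseLock();
  return new Uint8Array(buffer, 0, offset); // contiguous, ready for decode
}
```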
@aboba - can we close this issue?
There are a number of use cases where it is envisaged that WebTransport + WebCodecs will be used together:
In use case #1, it is desirable for the EncodedVideoChunk produced by WebCodecs to be passed to WebTransport and sent with as few copies as possible (e.g. just one copy for process separation). Is this possible?
In use case #2, it is desirable for the WebTransport datagram or reliable stream reader to transfer datagrams/segments of the EncodedVideoChunks into a GPU buffer rather than main memory, so as to allow the most efficient ingestion by WebCodecs and avoid a copy. Is this possible?
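Under the current API shape, use case #1 could be sketched roughly as below. This is an illustration of where the copies appear, not a recommended implementation: `copyTo()` forces one copy out of the codec's internal storage, and writing to the transport may add another at the process boundary. The function name and wiring are assumptions; the `copyTo`/`byteLength` members follow the WebCodecs `EncodedVideoChunk` interface:

```javascript
// Sketch: send one encoder output chunk over a WebTransport writable
// stream writer (e.g. from transport.createUnidirectionalStream()).
async function sendEncodedChunk(chunk, writer) {
  const bytes = new Uint8Array(chunk.byteLength); // allocation in the JS heap
  chunk.copyTo(bytes);                            // copy out of codec storage
  await writer.write(bytes);                      // hand off to the transport
}
```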