-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC2354: Device to device streaming file transfers #2354
base: old_master
Are you sure you want to change the base?
Conversation
oh, cool! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only did a quick skim, and have some suggestions/comments.
Thanks for getting the ball rolling on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This certainly seems more reasonable than abusingto_device
messages with base64 encoded blobs :D
Couple ideas:
|
Actually random access or at least resume-from-offset -functionality could be essential for transferring large files in spotty network conditions or while roaming (though I don't know if WebRTC already has some built-in mechanism to overcome those issues). |
Wouldn't this also fix matrix-org/matrix-spec#37? |
"content": { | ||
"transfer_id": "12345", | ||
"lifetime": 60000, | ||
"filename": "example.doc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have a sha256sum field here, to nail down the exact content to be transferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea but think that it should be optional. I can imagine cases where the file is generated on the fly, for example by bots. Alternatively in embedded cases the resource usage may be undesirable. Even end-user-cases such as streaming a large video from network storage should not need to be blocked by downloading the first time to compute a hash and maybe requiring re-downloading some or all of the file to send it.
I also think it doesn't add much either since you are getting the offer on the same channel and the offer + WebRTC provide integrity. But having an additional checksum doesn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checksum checking should definitely be part of the transfer process, however it makes more sense to send any hashes after the transfer is complete.
Co-authored-by: Jonathan de Jong <[email protected]>
Co-authored-by: Jonathan de Jong <[email protected]>
Co-authored-by: Erkin Alp Güney <[email protected]>
the URL. By necessity this has a maximum filesize and also results in a more or less permanent availability | ||
of said file. For larger files, and/or files that should only be sent from point-to-point, it may be desirable | ||
to be able to send from device to device. As matrix already implements | ||
[WebRTC signalling for voip](https://matrix.org/docs/spec/client_server/r0.6.0#voice-over-ip), this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[WebRTC signalling for voip](https://matrix.org/docs/spec/client_server/r0.6.0#voice-over-ip), this | |
[WebRTC signalling for VoIP](https://matrix.org/docs/spec/client_server/r0.6.0#voice-over-ip), this |
[WebRTC signalling for voip](https://matrix.org/docs/spec/client_server/r0.6.0#voice-over-ip), this | ||
functionality can be replicated for streaming file transfers from device to device. It can even be possible | ||
to send files not between devices of two matrix users, but two matrix devices owned by the same user. For | ||
this we can use the webrtc datachannel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this we can use the webrtc datachannel. | |
this we can use the WebRTC datachannel. |
"content": { | ||
"transfer_id": "12345", | ||
"lifetime": 60000, | ||
"filename": "example.doc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea but think that it should be optional. I can imagine cases where the file is generated on the fly, for example by bots. Alternatively in embedded cases the resource usage may be undesirable. Even end-user-cases such as streaming a large video from network storage should not need to be blocked by downloading the first time to compute a hash and maybe requiring re-downloading some or all of the file to send it.
I also think it doesn't add much either since you are getting the offer on the same channel and the offer + WebRTC provide integrity. But having an additional checksum doesn't hurt.
"filename": "example.doc", | ||
"info": { | ||
"mimetype": "application/msword", | ||
"size": 46144 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a similar note, is size required? What if the transfer should be dynamically generated?
It's tempting to want to use this for device to device transfer in public rooms, | ||
but an evil homeserver could hijack the webrtc session by pretending to be a device | ||
for the intended recipient (if that user has an account on the evil server). As such this | ||
should be limited to private rooms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should public/private be replaced with unencrypted/e2ee rooms? IIUC the main concern is that without e2ee the homeservers can send/accept transfers on behalf of their users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm unclear on the risks associated here, but would the perceived risks be mitigated by offloading the actual file transfer to a webtorrent tracker endpoint, similarly to how TURN is required to negotiate calling?
This file transfer will only send the file once, from one device to another. | ||
|
||
While this proposal focusses on streaming file transfer, | ||
a webrtc datachannel could be used for any generic data transfer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a webrtc datachannel could be used for any generic data transfer, | |
a WebRTC datachannel could be used for any generic data transfer, |
"origin_server_ts": 1432735824653, | ||
"room_id": "!jEsUZKDJdhlrceRyVU:example.org", | ||
"sender": "@example:example.org", | ||
"type": "m.d2dfile.invite", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'm not a huge fan of d2dfile
. d2d
isn't self-documenting, you would need to look up to find out what it means. How about file_transfer
. I think "transfer" is a good name to emphasize that this is ephemeral and live. Other options are live_file
, direct_file
, live_file_transfer
or direct_file_transfer
.
which may be unwanted for file transfers. | ||
|
||
Both sender and reciever need to be online simultaneously for this mode of file transfer | ||
to work, which is unexpected in the context of the existing matrix file transfer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use the homeserver as intermediate storage? This would require less resources on the HS side than traditional file sending, while drastically relaxing any time constraints.
Generally I like the proposal, and I would prefer to keep it specific to file transfer only (as opposed to other potential usages of such a connection). One thing that I am missing is more details about how communication over the WebRTC channel works exactly. |
Isn't https://spec.matrix.org/unstable/client-server-api/#send-to-device-messaging a perfect match for this application? The document doesn't seem to touch that at all. I don't think the full handshake should be within the room; why should other's even be aware if one requests a file from an invite? Even if the initial invite needs to be in the room (and even parts of that could be preserved for direct to_device -messages; parts of communication could be in the webrtc session itself), I don't think the actual requests should be. |
I left an inline content related to this, but would it make more sense to offload this as an external feature (i.e. matrix itself supports external storage resources like SharePoint, Box, Dropbox, GDrive, WebDAV, and most importantly in this case webtorrent and returns links to those endpoints as appropriate), rather than rolling this feature into the matrix protocol? In this way it could be configurable on a per server basis what storage endpoints to use as fallback file share mechanisms for files over a given size, and for p2p that fallback would be webtorrent. Fallbacks would require valid endpoints to communicate with and link to (in the case of p2p it would depend on a webtorrent tracker service endpoint) , so a corporate server could ensure files were all shared via governed systems that might be used outside of matrix. This is not dissimilar to needing to depend on external TURN/STUN for connecting calls. The matrix spec does not try to reimplement TURN, and so p2p file transfer over webRTC is a "solved problem" with with webtorrent. |
Speaking as a non-maintainer, for what it's worth: There is a widespread need for easy peer-to-peer file transfer. Existing tools (e.g. Magic Wormhole) can sometimes approximate this, but they tend to be too difficult for average people to use, or too fragile for large files, or too slow even when the network path between the peers is fast. Having direct transfers standardized in Matrix clients, without requiring a server admin to opt in or carry the data (except for cases requiring TURN), could elegantly solve the this problem for a great many people. From what I understand of WebRTC data channels, they seem perfect for this job.
Relying on external services like that would:
Wouldn't that present an obstacle to non-web-browser Matrix clients, since WebTorrent has few implementations compared to WebRTC? Also, does it work for large files? In my attempt to transfer a multi-GiB file using FilePizza, it seemed to be trying to pull the entire file into RAM, and failed miserably, though I don't know if that was the protocol's fault or the web app's fault. People need easy, direct, private file transfers, not another service or admin acting as gatekeeper. Matrix has a rare opportunity here to orchestrate that, delivering exactly what people need. I daresay it would be a compelling reason to adopt Matrix. Perhaps even a "killer feature". |
@foresto there's a lot to unpack in your comment, and i think a better place to address those issues is in a subsequent MSC where i can express my thoughts more thoroughly. i'll be sure to reference this MSC and your comments in my submission 👍 EDIT: on second thought, as i process these ideas more, they end up being unrelated to the matrix spec, and are purely media handling implementation details. i don't think i'll be generating that msc after all. |
@foresto This is very likely the browser's fault, and not going to improve very soon. As far as I understand it, only chrom* currently provides the API to stream a file to disk, and Firefox does not implement this feature. |
Related to matrix-org/matrix-spec#432 and should fix matrix-org/matrix-spec#189
Rendered
Signed-off-by: Mathijs van Gorcum [email protected]