Need to explicitly spell out what media formats are supported #453
Comments
also stickers |
relatedly, we should spell out what formats the |
The thumbnail endpoint should probably use the |
I have a few concerns with returning animated gifs/videos:
This makes me feel quite hesitant about ideas such as allowing servers to optionally support animated gifs, or allowing clients to ignore animations by only rendering the first frame (as they would still pay the cost of downloading the entire file). I guess ideally we might allow clients to specify if they want an animated thumbnail, plus having the server ensure that animated thumbnails are still fairly small in size (e.g. only cutting out frames if it's too big?). |
If the client only advertised support for JPEG in the Call me a millennial but I think that most people want to see animated gifs animating by default. It seems like the two main options are: 1. Require the client to download the original for animation or 2. Require "video" support in |
I think we need to give the option of supporting animated GIFs for folks who want them, and to faithfully bridge from Discord and others. But I agree that clients should be able to request non-animated GIFs somehow (for users who don't want to bog down their apps with animations, or only load the animated GIF on hoverover, etc). I don't see this as splitting the ecosystem, as the onus will be on client developers to provide a way for viewing the animated GIF if available (eg. on hoverover), just as we do for animated GIFs in the timeline. We can spell out this expectation in the spec while we're formalising the supported formats. In terms of whether we advertise that we want animated GIFs via Accept header mime-types or querystring parameters on the media repo... I don't have particularly strong thoughts. We don't use Accept headers much (at all?) currently in the API, but perhaps that's a mistake. |
I think the main downside for using |
I'd say to play with/use the |
To clarify: I think it's fine if it's client controlled, but probably less fine if it's server controlled. |
@ShadowJonathan accept isn't a forbidden request header so there should be no issue setting it from the browser. |
I don't really understand why you are obsessing about There is a bit of a problem with either way of allowing clients to specify a format, which is that the default behaviour of Synapse is to generate thumbnails at upload time. If we want to give clients the option to choose between formats, that implies that Synapse would have to generate many more thumbnails. So, my inclination is to resist giving clients the option of format for thumbnail, and just say that thumbnails must be one of |
Because it is a standardized way for clients to negotiate.
It doesn't. But as stated it does serve as a short-term fix as clients can ask for JPEG to get a still image.
No, this is why I suggested to pick a short list of supported formats. For example if we picked However this means that it is easy for the server to add more formats if it wants. For example synapse may decide that it finds that webp is a good bandwidth saver so it may start supporting that, now any clients that support it may be served that instead. For example some proxies (like Cloudflare) support on-the-fly transcoding, so it would be possible to add that without any protocol or code changes.
This is basically a less flexible version of what I proposed, and I don't see any real benefits. I will reword what I said.
The server still only needs to support and pre-generate 1 thumbnail (minimum) but it provides the option for servers to use better formats if there is mutual support. Optionally, we may also add a requirement on the server such as it must support at least jpeg. As in if jpeg is in Optionally, we may also want to ensure there is at least one format that supports video so that it is a client decision whether animations should be played, so maybe we require animated gif support. |
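To make the negotiation idea above a bit more concrete, here is a minimal sketch of how a server could pick a thumbnail format from the `Accept` header. Everything named here (`choose_thumbnail_format`, the `available` list, treating a missing header as `*/*`) is an assumption for illustration, not anything Synapse or the spec does today.

```python
from typing import Optional


def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_range, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # q defaults to 1 when not given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media_range.lower(), q))
    return ranges


def specificity(media_range: str, candidate: str) -> int:
    """2 = exact match, 1 = type/* match, 0 = */* match, -1 = no match."""
    if media_range == candidate:
        return 2
    range_type, _, range_sub = media_range.partition("/")
    if range_sub == "*" and range_type == candidate.partition("/")[0]:
        return 1
    if media_range == "*/*":
        return 0
    return -1


def choose_thumbnail_format(accept_header: str, available: list) -> Optional[str]:
    """Return the best mutually supported format, or None if there is none.

    `available` is the (hypothetical) list of formats the server has a
    thumbnail for, in server preference order.
    """
    ranges = parse_accept(accept_header.strip() or "*/*")
    best, best_q = None, 0.0
    for candidate in available:
        # The most specific matching range decides the q for this candidate.
        top_spec, q = -1, 0.0
        for media_range, range_q in ranges:
            spec = specificity(media_range, candidate)
            if spec > top_spec:
                top_spec, q = spec, range_q
        if q > best_q:  # strict '>' keeps server order on ties
            best, best_q = candidate, q
    return best
```

With this shape a server that only pre-generates JPEG keeps working, and a client that sends `Accept: image/jpeg` gets a still image. What the server should do when nothing matches (an error, or JPEG regardless) is left open here, as it is in the discussion.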
Matrix should spec sets of mime types that are expected to be supported by clients and servers for dedicated media events like m.image, m.video, m.voice, m.audio, m.sticker, etc. It is necessary because you need to be able to expect that your conversation partners can understand you (i.e. read certain media formats). Obviously you should still be able to send any file as a file, but there are media types you're expected to be able to view inline the conversation. The server should take care that media has an appropriate format for the media type (image, video, ...) before committing it to the media repo if possible and necessary. Duplicates of the same media in different formats (beyond thumbnails) are to be avoided as to not store a 10MB mp4 plus 10MB webm plus 10MB av1 etc. At the same time it does not seem unreasonable to allow multiple file formats for each media type, such as png and jpg for images. On the other hand this creates complicated situations where the thumbnailer will generate a jpg thumb (which is usually reasonable and smaller) from a png with alpha, resulting in an inaccurate representation (jpg has no alpha). Optimally the formats chosen are ubiquitously supported, but other considerations apply, e.g. a reasonably modern animated image format (webp? apng? jpgxl? perhaps a video format like mp4/avc?) may be included over the fairly obsolete gif if that format warrants the effort of making every client include the appropriate codec if there are platforms that do not natively support it. In these cases the server could offer the modern format by default, but if the client does not iOS does not seem to support webm video, so mp4 should be chosen instead. Actually, it might be proper to even spec what actual codecs can be used, in this case mp4 = avc + aac, as avc + mp3 might not play back as well. If I try to upload webm to my home server, either the server will transcode it to mp4 or tell my client to transcode it to mp4 first. 
This implies there needs to be an explicit option to send media "as a file" instead, such that it isn't transcoded. A general issue with server-side transcoding is DoS attacks that request CPU-heavy transcodes over and over again, so it should be limited to clients, with only the canonical format delivered over federation. Further, clients may be rate limited? |
Transcoding won't work with encrypted files |
Completely right, so client-side transcoding it is. A very real example I want to point out: new-ish iOS versions save pictures in HEIC (HEIF) format. This isn't a consistently supported codec across platforms, and iOS clients should
Another point I missed earlier: This also raises the question: should clients try to show Something else to consider is that the mime type field can be arbitrarily set by clients and could possibly be different from the actual file. Could this be an attack vector? I think element-web pretty much just puts the media in an |
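On the point that the declared mime type may not match the actual file: one possible mitigation is for a client to sniff the leading bytes before deciding how to render. The sketch below is only that, a sketch. The function names are made up for illustration, it covers just four well-known image signatures, and it says nothing about what any existing client actually does.

```python
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess an image mime type from well-known magic bytes, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def declared_type_is_consistent(declared: str, data: bytes) -> bool:
    """True only if the sniffed type is known and equals the declared one."""
    sniffed = sniff_image_type(data)
    normalized = declared.split(";")[0].strip().lower()
    return sniffed is not None and sniffed == normalized
```

A client could refuse to render inline (and offer a plain download instead) whenever `declared_type_is_consistent` returns `False`.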
Something related has been done and specified in element-hq/element-android#3444 (comment) |
Linking matrix-org/synapse#1278 as related to this issue. |
This has already happened a long time ago: some client+server+media-repo combinations result in users seeing animated avatars without an option to stop them, leading to an accessibility issue, while others like Nheko intentionally don't play them at all. |
(Animated) vector graphics -- particularly Lottie as used by Telegram (stickers) and suggested in #9 -- are probably out of discussion for the same reason as SVG? |
element-hq/element-x-ios#2374 is a good example of this biting people in the wild - iOS matrix clients will silently fail to play back VP9 codec video muxed in MP4s (despite iOS having a VP9 codec for WebRTC, thanks to Apple trying to push H.265 instead). So, rather than saying "uh, send whatever format a typical browser should be able to play", i really do wonder if we should formally say what formats are supported in detail - obligating the sender to transcode, or send as a |
I wonder if a nicer alternative would be to allow sending multiple formats, similar to how HTML5 handles it these days. That way the client can pick the one that it is best able to display. |
I would try to avoid multiple formats because in e2ee rooms that would mean the sender needs to upload multiple formats. When on mobile on a battery and a potentially expensive connection it is best not to have to transcode and upload multiple formats. The best option is probably making a fairly short list of supported formats and occasionally a new one can be added. But ideally it would be added in advance and clients would avoid actually sending in the new format until it is likely that most receivers have updated their client with support. |
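As a rough illustration of the "short list of supported formats" approach, a sending client could check the mime type against an allowlist and fall back to a plain file otherwise. The `INLINE_FORMATS` table below is a placeholder invented for this example, not a proposed list, and `choose_msgtype` is a hypothetical helper.

```python
# Placeholder allowlist for illustration only; the real list is exactly
# what this issue is asking the spec to define.
INLINE_FORMATS = {
    "m.image": {"image/jpeg", "image/png", "image/gif"},
    "m.video": {"video/mp4"},
    "m.audio": {"audio/mpeg", "audio/ogg"},
}


def choose_msgtype(mimetype: str, inline_formats: dict = INLINE_FORMATS) -> str:
    """Pick the msgtype to send: inline media if allowlisted, else m.file."""
    normalized = mimetype.split(";")[0].strip().lower()
    for msgtype, allowed in inline_formats.items():
        if normalized in allowed:
            return msgtype
    # Not on the list: the sender would either transcode first or send the
    # original untouched as a generic file.
    return "m.file"
```

Adding a new format "in advance" then amounts to receivers learning to display it before senders add it to their own table.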
how feasible would it be for matrix as an open standard to define formats based on open standards as required for these media types? there might be locked down platforms that stick to their proprietary and closed and not widely adopted formats (looking at apple), and the fair thing to do seems to not disadvantage platforms that can't support these proprietary formats, but instead force clients on platforms not natively supporting these formats to work around it themselves (such as ios apps shipping their own codecs to read and write them if necessary). yes, the best case is for everyone to play nice, but there will always be the odd ones being disruptive in a bad way, and i don't think an open protocol can continue to call itself that if a hard-ish dependency is not at least open-ish. |
e.g. should client support displaying SVGs? HEIC? 3GP? etc.
Miniproposal: we should piggyback on HTML5, with the exception of SVG which is particularly vulnerable to exploding clients thanks to billion lol attacks.
However, if clients think they can display more exotic images safely, they can - but otherwise, we should fall back to thumbnails, ideally those provided by the sending device (in case servers cannot thumbnail or decline to risk thumbnailing the origin image).