Add WebP as a core format #1344

Closed
HadrienGardeur opened this issue Oct 14, 2020 · 16 comments
Labels
EPUB33: Issues addressed in the EPUB 3.3 revision
Spec-EPUB3: The issue affects the core EPUB 3.3 Recommendation
Status - Subject to Review: A tentative decision has been made on the issue but may be changed before becoming a recommendation
Topic-PublicationResources: The issue affects support for publications resources

Comments

@HadrienGardeur
Member

In our current EPUB 3.x charter, WebP is clearly identified as a candidate for our list of core formats.

Now that Apple has released iOS 14, WebP support is finally available on all major modern browsers: https://caniuse.com/webp

WebP has many benefits over JPEG and PNG and while AVIF might eventually become a better option, it's probably a few years away from being widely adopted.

@HadrienGardeur
Member Author

HadrienGardeur commented Oct 15, 2020

Since I won't be able to attend the call this week, some additional context:

  • we've looked into the files that we host (around 2 million EPUB files) at De Marque and noticed that images and fonts are by far the two components of an EPUB that take the most storage space
  • images are rarely optimized by publishers, we could considerably lower the size of an EPUB by simply processing these images in a dedicated pipeline (which we're not really allowed to do as a distributor and that's a shame)
  • optimized formats like WebP or AVIF could certainly help with that and we're also considering moving all our covers to WebP (which would probably be served through a CDN where we transcode them upon request)

While this potentially affects all EPUBs, it matters even more for FXL content, and for comics/manga in particular.
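
As a rough illustration of the storage observation above, here is a minimal sketch of how one might tally the bytes in an EPUB by resource category. It assumes only that an EPUB is a ZIP archive; the function name `compressed_bytes_by_category` and the extension map are hypothetical and are not taken from De Marque's tooling.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension-to-category map (an assumption for this sketch).
CATEGORY_BY_EXTENSION = {
    ".jpg": "image", ".jpeg": "image", ".png": "image", ".gif": "image",
    ".svg": "image", ".webp": "image",
    ".ttf": "font", ".otf": "font", ".woff": "font", ".woff2": "font",
    ".xhtml": "text", ".html": "text", ".css": "text",
}


def compressed_bytes_by_category(epub):
    """Sum the compressed size of every entry in an EPUB, grouped by category.

    `epub` may be a filesystem path or a binary file-like object.
    """
    totals = Counter()
    with zipfile.ZipFile(epub) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # ZIP entry names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower()
            category = CATEGORY_BY_EXTENSION.get(suffix, "other")
            totals[category] += info.compress_size
    return dict(totals)
```

Running this over a collection and summing the per-category totals would show which resource types dominate; it reports sizes only and says nothing about how much a conversion to WebP would save.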

@llemeurfr

Codec technology moves slowly (MP3 was born in 1993, JPEG in 1992), but sometimes a new format gains ground with broad support from browsers. WebP seems to be such a case, even if Apple has not completely embraced it.

As usual, the issue with "new" formats is that "old" user agents are not able to deal with them. This is not only an issue for media formats (Opus is another case), this is also true for HTML 5 and CSS, as new features (even flexbox?) may not be supported by devices stating they are EPUB 3 compliant. This is especially the case for e-readers, their software being often frozen for 10 years.

EPUB 3 has chosen the path of evergreen HTML 5 and CSS: it would be counterintuitive to freeze its ability to handle new media formats.

What can we do to deal with technical evolution vs interoperability with old stuff? A "basic profile" of EPUB 3, which would secure very broad interoperability between producers and user agents for a very long period, could certainly ease the issue for basic ebooks. It would replace EPUB 2 (which is already a spec frozen in time), and may help kill EPUB 2 usage. But I agree this is a complex issue.

@wareid
Contributor

wareid commented Oct 16, 2020

As discussed in the meeting on Oct 15, the group resolved to add WebP to the list of core media types.

@iherman
Member

iherman commented Oct 16, 2020

Well... from a standardization point of view, we may have a problem at this point. (Thanks to @mattgarrish who raised this question in #1347.) It is not entirely clear whether, specification-wise, WebP is stable enough to be referred to normatively.

Here are the problems I found.

Based on these, I doubt that a reference to WebP meets the requirements for normative references.

Here are some questions/ways forward:

  • Does anyone have any other references that may be acceptable for our purposes?
  • @GarthConboy (or Brady) may have some contacts among the Google developers who could give us more information
  • I can raise the question in the W3C team to see if there is any information there

However, I am afraid that until this issue is solved, we should not merge #1347...

@mattgarrish
Member

mattgarrish commented Oct 16, 2020

a "basic profile" of EPUB 3, which would secure a very large interoperability between producers and user agents for a very long period, could ease the issue for basic ebooks for sure

Ya, there is also a case to be made that core media types as a specification creation have outlived their usefulness. They're really just an advisory guide to the current state of browser core support, since it's unlikely anyone goes out and specially implements support in HTML/SVG just because we list formats. Video kind of proved that point long ago.

We could probably do away with fallbacks for any non-spine resources and publish a note that lists the current recommended formats to use for universal readability (or as close as you can get to it). As far as shaming developers into support, a note could actually point out who the laggards are in specific cases.

(But it's also Friday, so there's probably some flaw in my reasoning that doesn't make things that simple.)

@iherman
Member

iherman commented Oct 17, 2020

  • I can raise the question in the W3C team to see if there is any information there

... which, in the meantime, I have. @plehegar will talk to our Google contact to see what exactly the situation is.

@dauwhe
Contributor

dauwhe commented Oct 18, 2020

... which, in the meantime, I have. PLH will talk to our Google contact to see what exactly the situation is.

I raised the issue on Twitter, and I think the people working on WebP at Google saw it :)

@iherman
Member

iherman commented Oct 19, 2020

Referring to #1347 (comment) by @dauwhe:

It appears that the developer is not interested in standardization:

The specs used as a reference are:
lossy VP8: https://tools.ietf.org/html/rfc6386

The (large) time investment it takes going to ISO or similar for official spec was diverted to making libwebp better :)

They do not address the question of registering the media type. Is that possible without a specification from a standards body?

I do not know the answer to the last question, but this is not our concern; if the implementers are not interested in formally registering anything, then we cannot change that.

@plehegar is a possible arbiter here, but personally I feel uneasy listing a media type in the document that does not refer to any document I could find. This means that it can change at any time. The specification references are also a bit shaky in this respect. At the moment WebP very much looks to me like a proprietary technology whose stability is not guaranteed.

I would propose that we have a more general discussion on how we handle the core media types; see, for example, the comment of @mattgarrish in #1344 (comment). If the whole section changes to become an advisory section then the situation will become very different. As @mattgarrish said, we have already accepted to do that for video formats.

My proposal is that we suspend this issue, and the corresponding PR #1347, and see what we should do with the concept of core media types. WebP may not be the only issue here; I could see HEIC coming up at some point, too.

(Administratively, raising a separate issue may be better than having this type of discussion spread over this issue and #1347 already.)

Cc @wareid @shiestyle

@mattgarrish
Member

we may want to have a more general discussion on how we handle the core media types

To add to my comment above, my recollection of why we added core media types for 3.0 (or at least one significant reason) was because there was concern that hand-rolled rendering engines like the old RMSDK might still be produced for EPUB 3. In that case, we wanted to ensure minimum conformance of rendering.

Did that happen anywhere, though? If not, I'm not sure how important core media types are to define in the specification (or normatively require rendering support for). We can keep the rendering requirements that matter most, like XHTML+CSS and SVG, but we probably don't need to enforce a profile of support that is essentially going to exist anyway.

But 3.0 was a long time ago, so I freely admit I might be forgetting some other purpose for them.

@HadrienGardeur
Member Author

To add to my comment above, my recollection of why we added core media types for 3.0 (or at least one significant reason) was because there was concern that hand-rolled rendering engines like the old RMSDK might still be produced for EPUB 3. In that case, we wanted to ensure minimum conformance of rendering.

I think we're beyond that point now. RMSDK is not built on a Web rendering engine, that's not a valid reason for rejecting modern CSS, JS or HTML.

We need to adopt a best practice at least for all media formats, not just images. If we want to move away from white listing audio, video and image formats, that's a discussion that we need to have during this revision cycle.

From my perspective, I don't think that we should be in the business of curating such a white list in our specification. I understand the need to recommend specific formats, but it feels more suited for a best practice document than a specification.

@iherman
Member

iherman commented Oct 24, 2020

This issue was discussed in a meeting.

  • RESOLVED: Add WebP as a core media type in EPUB 3.3
View the transcript

WebP and VTT as media types

Wendy Reid: #1344
Wendy Reid: #1299
Wendy Reid: caniuse looks good for WebP
… supported in Big Sur, which is out soon (?)
… the argument is a more efficient image format
… DeMarque feels it significantly reduces the overall size of their EPUBs
… which is good
… the other possible CMT is VTT,
… used for video captions
Marisa DeMeglio: if we had web vtt as a core type
Wendy Reid: it’s text/vtt
Marisa DeMeglio: does it accompany the video?
Wendy Reid: the mystery is how we refer to videos; we don’t have a CMT for video
… it would accompany whatever video format you were using
Shinya Takami (高見真也): (speaks in Japanese)
… I discussed both of these with Mr. Koike… we don’t know what impact they would have on RSs
Brady Duga: Matt seems to think a fallback isn’t needed in this case?
… since VTT is itself a fallback for the track element?
… do we really need to do anything other than clarifying the manifest section
Wendy Reid: that’s a task on its own
Brady Duga: that’s why we have matt :)
… you don’t need a fallback for a foreign resource in this case
… so it might make it confusing to have it as a CMT
Dave Cramer: This is the perfect opportunity to mention that CMT doesn’t mean that something can be a spine item
… Content Docs can be spine items, but CMTs mean they don’t need a fallback
… a RS would be assured of finding something to display
Brady Duga: that’s a good point, but you could put vtt in an object tag :)
… the real problem is, if we required a fallback for VTT, there isn’t one.
… so you couldn’t use VTT. There’s not a format that would work instead. You can’t replace it with HTML.
… I’m assuming it has time-based features.
… if it does require a fallback, we should make it a CMT. If it doesn’t require a fallback, there’s no point.
… thumbs up to WebP
Shinya Takami (高見真也): how about adding 2nd-priority CMT
… maybe JPEG/PNG is already popular
… webp and opus are new technologies, maybe there should be fallbacks
Zheng Xu (徐征): If a CMT can be a spine item…
… what is the core media type we wanted to define to have benefit for end user
… we can use it as content doc without fallback
… if CMT does not have this purpose, even some mimetype can be used as CMT but can’t be put in spine item
… how do we use this scope?
… like web browser we have caniuse
… if we have something like caniuse for reading systems
… but if some reading system can’t support webp, it’s a reading system problem
… we need to draw a line for CMT to ask creator to add fallback
… it’s still a question for me
Wendy Reid: it’s a fair question
… having tests will help
… as we discussed earlier, do we throw away the idea of CMT, and the consensus was we didn’t want to do that.
… this puts a lot of pressure on content creators, and leads to interop problems.
Brady Duga: re: fallbacks for webp and opus
… defeats the whole purpose, which is to make your EPUB smaller
… so that’s why you want to extend CMTs
… a size benefit
… we can add to our list of reasons: Is there a benefit you cannot achieve with existing core media types
… the answer for both WebP and Opus is that they are smaller.
… we have opened things up, it was called FXL
… we have insane hacks that download Times New Roman so that it can match fruit company’s rendering
… free-form content makes it hard to support lots of things
… we also don’t have enough people to make caniuse for epub work
Zheng Xu (徐征): caniuse for RS… I know it would take a while
… they have more content creators
… we have smaller scope of user
… say user wanted to use some type of… do whatever you want
… use any font, but it might break FXL… sure
… if we can define certain type… with web browsers people only test in 3 browsers
… for reading systems it’s not the same
… we have too many types of reading systems
… as RS creator, if we cannot support it, we can put our support status in caniuse
… it’s kind of more use-case driven
… if we draw a line here,
… how can one create a fallback for it?
… this fallback blocks creator from using this feature
… it’s not possible to create fallback
… I think purpose of creator and spec and RS
… expand it, let it be more easy to use
Dave Cramer: I would just remind everyone we tried to do caniuse for EPUB and it failed epically
… there’s a handful of major browsers, and hundreds of reading systems
… which all behave differently on different platforms
… the volunteers could never keep up
… we failed at trying to keep up with it and maintain
… we need more marisas
Marisa DeMeglio: you took the words out of my mouth
… I developed the site
Marisa DeMeglio: https://daisy.github.io/old-epub3-support-grid/features/
Marisa DeMeglio: keeping it up to date was not possible
… we relied on volunteer testers; the tests were not automated
… even though we had some provisions to make it easier
… the rate at which reading systems were changing required too much retesting
… a couple of months ago, BISG asked me to put the static site back
… i’ve put it back
… link ^^^
… this only scratches the surface
… but this is what we tried to do for caniuse for epub
Zheng Xu (徐征): I understand; I respect the amount of work required
… two things
… one, in terms of caniuse or test
… if it needed to be maintained by each RS
… publishers would know how to create content for a RS
… we might have more than a hundred reading systems. No 3rd party can do that.
… webp… based on the current track, it’s hard to add more to CMT
… what is benefit of having this CMT?
… we have spec, we have 3.3 that defines that webp must be CMT
… but we have a RS that does not support this
… that is a problem
Wendy Reid: we have to get back to the core of the core of the talk about core media types
… it sounds like we have consensus on criteria
… they need overwhelming support by browsers and operating systems, with a path to completion
… and they need a direct benefit to content creators and/or reading systems
… based on those criteria, let’s start with WebP
… I’ll put in a proposal
Proposed resolution: Add WebP as a core media type in EPUB 3.3 (Wendy Reid)
Brady Duga: +2 (1 for me, 1 for Garth)
Teenya Franklin: 0
Laura Brady: 0
Shinya Takami (高見真也): +1
Matthew Chan: +1
Marisa DeMeglio: +1
Zheng Xu (徐征): +1
Yu-Wei Chang (Yanni): +1
Ben Schroeter: 0
Wendy Reid: +1
Resolution #1: Add WebP as a core media type in EPUB 3.3
Wendy Reid: I can’t make a strong case for VTT
Dave Cramer: I don’t think we should vote on this right now
… it might not need to be a CMT
… it should be available in EPUB
… let’s figure out how to do that

@mattgarrish mattgarrish added the Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation label Oct 26, 2020
@mattgarrish mattgarrish added Status - Subject to Review A tentative decision has been made on the issue but may be changed before becoming a recommendation Topic-PublicationResources The issue affects support for publications resources EPUB33 Issues addressed in the EPUB 3.3 revision and removed Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation labels Nov 7, 2020
@iherman
Member

iherman commented Dec 4, 2020

The issue was discussed in a meeting on 2020-12-04.

List of resolutions:

  • Resolution 1: EPUB 3.3 will keep the concept of core media types as it is today.

View the transcript

1. Core Media Types

See github issue #1344, #1299, #645.

Wendy Reid: we had discussed this back in sep. that we would vet them one at a time
… we've tried to add some ones
… but run into issues with things that are implemented, but not standardized

Dave Cramer: the general question here: We have 2 competing principles. 1 is supporting what the web supports, and we hope that epub will maintain compatibility with the web.
… 2. but epubs also must work. e.g. someone who buys a book should be able to read it on their device
… an issue esp. because older RSes are still out there
… e.g. I made a little test of webp inside an epub, and it only worked on 50% of the RSes it was tested on
… we could give up on the idea of core media types and just leave the decision to content authors, but that could result in a bad experience

Ivan Herman: we have already made the similar decision in terms of css
… e.g. whatever css can do, it is fair game for authors

Dave Cramer: i think css is different. CSS has very well defined fallback behaviour. Not true with, e.g., new media type in epub

Tzviya Siegman: +1 to dauwhe about fallbacks

Dave Cramer: with CSS, the reading experience may be degraded, but not entirely lost
… there's also lots of experience out there about writing CSS that works even if certain features aren't supported
… not exactly the same with EPUB

Ivan Herman: +1 to dave's response

Brady Duga: agreed
… in terms of modelling epub after CSS, 90% of the issues I fix are to do with CSS not working. So not enthused about using the same model for media types

George Kerscher: where are we in terms of video formats?

Wendy Reid: with video, we've also taken an "open approach" i.e. just use what the RS accepts
… h264 encoded is probably the standard?

George Kerscher: i think the lack of clarity around the video is holding us back. People would love to see video in epubs

Dave Cramer: video is tricky because there are even brand new devices that won't support video - eInk!

Wendy Reid: also, some platforms have upload sizes that make video incompatible

Avneesh Singh: this is a problem i see with both video and audio in epub. The larger the file size of the zipped file, the greater the chances of corruption after download
… going back, we said that epub 3.3 would not be a major revision. And we only have 1 year to finish the spec.
… recall our experience with epub 3.1. The publishing industry, unlike the web, is slow to move to new standards

Dave Cramer: a lot of the trouble with epub 3.1 was with epubcheck not being available (although the spec also had its own issues)
… i think we should keep core media types
… and maybe periodically consider new media types as the underlying technology evolves
… and i think this discussion shows that that decision comes down to the specific media type under consideration

Ivan Herman: i'm okay with what Dave said, but what are the criteria when we decide that something becomes a core media type?
… earlier there was a request that webp become a core media type, which we accepted without lots of discussion, but now (I believe) there's still an open PR about it

Bill Kasdorf: when we originally created core media types, we still allowed the use of other media types, just with the caveat that the author must provide a fallback, true?
… yes, okay
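
The fallback caveat described above can be sketched as a simple manifest check. This is a simplified illustration, not the full EPUB conformance rule: it looks only at the `fallback` attribute on package document `item` elements and ignores fallbacks provided inside content documents. The function name `items_missing_fallback` is hypothetical, and `CORE_MEDIA_TYPES` is an assumed, deliberately incomplete subset of the core media type list.

```python
import xml.etree.ElementTree as ET

# Namespace of the EPUB package document (OPF).
OPF_NS = "{http://www.idpf.org/2007/opf}"

# Illustrative, incomplete subset of core media types (an assumption for this sketch).
CORE_MEDIA_TYPES = frozenset({
    "application/xhtml+xml", "text/css", "image/svg+xml",
    "image/gif", "image/jpeg", "image/png",
    "audio/mpeg", "audio/mp4",
})


def items_missing_fallback(opf_xml, core_types=CORE_MEDIA_TYPES):
    """Return ids of manifest items that are neither core types nor declare a fallback."""
    root = ET.fromstring(opf_xml)
    missing = []
    for item in root.iter(f"{OPF_NS}item"):
        media_type = item.get("media-type", "")
        if media_type not in core_types and item.get("fallback") is None:
            missing.append(item.get("id"))
    return missing
```

With this subset, an `image/webp` item without a `fallback` attribute is reported; passing `CORE_MEDIA_TYPES | {"image/webp"}` as `core_types` makes the same item pass, which is the practical effect of adding WebP to the core list.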

Hadrien Gardeur: but we shouldn't always assume that there will be a working group to oversee the question of core media types, especially with newer media like video/audio
… perhaps we can have a normative document where we would retain the capability to update it when we need to (i.e. between working groups)

Garth Conboy: we currently have a list of types for image, a list for audio, and a vague suggestion for video
… but I don't think the current state of support for video is broken right now
… about what Hadrien said, maybe it could be an external vocabulary document?

Garth Conboy: See the relevant section in the 3.2 spec

Ivan Herman: about Hadrien's point, 1. The new process at W3C will make these types of updates much much easier. Under the new process we can update the spec if there is committee agreement.
… 2. But we also have the option to separate the media type into a separate registry. The W3C will have a more formal way to update registries in the future.

Brady Duga: I like the idea of pulling out the media types to a separate location rather than having them buried inside the main spec
… re. Bill's Q about fallbacks. The specific issue with webp is that webp makes images smaller. If authors had to provide fallbacks when using webp, that would kind of defeat the point (by expanding the epub size)
… there seems to be some disagreement about how to add new core media types

Avneesh Singh: we already tried what Hadrien suggested in epub 3.1, but then with epub 3.2, we put it back in
… externally incorporated documents created additional issues when it came time to take 3.2 to ISO
… maybe we could ask Makoto whether whatever we decide here will create an issue for ISO

Dave Cramer: yes, external registries have definitely been an issue for specs in the past
… and audio and video formats are more amenable to being remediated by fallbacks than images (the fallback can even just be text)

Ivan Herman: to wrap up, we seem to be converging towards the point of not changing anything for now
… we keep core media types as they are today and move on (and perhaps changes in the W3C processes will make these easier to maintain going forward)

Wendy Reid: yes, we want to keep core media types, and we will wait and see about the idea of using external registries
… esp. given that how registries will work in the future is still being sorted out
… and the Process 2020 will allow us to make piecemeal modifications to the spec without revisiting everything
… we're still in favor of adding webp, we just need to work out the implementation issues around webp

Proposed resolution: EPUB 3.3 will keep the concept of core media types as it is today. (Ivan Herman)

Ivan Herman: i think webp is a separate discussion

Tzviya Siegman: +1

Ivan Herman: +1

Matthew Chan: +1

Brady Duga: +1

Hadrien Gardeur: 0

Wendy Reid: +1

Garth Conboy: +1

Charles LaPierre: +1

Juliette McShane: +1

Avneesh Singh: +1

Toshiaki Koike: +1

Bill Kasdorf: +1

Masakazu Kitahara: +1

Garth Conboy: and for video, we're going to keep it as it is for now - i.e. no specific type, just a suggestion

Hadrien Gardeur: isn't that an inconsistency? There's core media types for some types of content, but not for video?

Dave Cramer: In the past that was because video types were evolving so quickly

Hadrien Gardeur: Images seem to be evolving quickly today

Dave Cramer: In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.

Resolution #1: EPUB 3.3 will keep the concept of core media types as it is today.

George Kerscher: +1

Wendy Reid: Well, with image elements, there is a robust assortment of image fallbacks

Gregorio Pellegrino: +1

Brady Duga: Let's please try to keep substantive discussion out of the irc chat. Let's keep the irc chat just for metadata about the meeting only please!
… everyone may not be closely watching the chat log

@mattgarrish
Member

Is WebP formally "in" now?

We still link to this issue in the specification. Should be removed, if so.

@dauwhe
Contributor

dauwhe commented May 12, 2021

We thought it was a bit weird to keep the issue open when it was actually in the spec. I'm OK with a link to a closed issue. I'm OK with removing the link. I'm OK with reopening the issue and leaving the link :)

@mattgarrish
Member

I'm OK with a link to a closed issue.

That's fine. From past experience, respec won't alert you that there are links to closed issues, so just wanted to make sure it still belonged. Maybe as we get closer to REC pubrules might barf a warning if we forget to remove.

@plehegar
Member

fyi, there is an RFC to register the media type for image/webp.

@mattgarrish mattgarrish added the Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation label Sep 14, 2022

7 participants