Are data URLs always disallowed? #1564
Comments
I fully agree with this statement. But the question is whether the premise should be true or not. What happens if, for example, one uses a data URL in a CSS file (it is, in fact, a fairly useful trick; see the example below)? Consistency would require making that invalid, too. I would rather consider having an exception for data URLs, not in the sense of forbidding them, but in terms of dropping the requirement that their corresponding resources be listed in the manifest or spine. (I guess the question is whether this would create issues with RSs.) Here is what I think is a typical usage of data URLs in CSS: adding some sort of watermark to a page:
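As a rough, hypothetical illustration of the kind of usage described here (this is not the commenter's own example), the sketch below builds a base64 data URL from an inline SVG and wraps it in a CSS rule. The function names, the `body` selector, and the SVG content are all assumptions made for illustration.

```python
import base64


def svg_to_data_url(svg: str) -> str:
    """Encode an SVG string as a base64 data: URL with an explicit media type."""
    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
    return f"data:image/svg+xml;base64,{encoded}"


def watermark_rule(svg: str, selector: str = "body") -> str:
    """Return a CSS rule that uses the SVG as a background image (hypothetical helper)."""
    url = svg_to_data_url(svg)
    return (
        f'{selector} {{ background-image: url("{url}"); '
        "background-repeat: no-repeat; background-position: center; }"
    )


if __name__ == "__main__":
    # Made-up watermark content, for illustration only.
    svg = (
        "<svg xmlns='http://www.w3.org/2000/svg' width='200' height='50'>"
        "<text x='10' y='30' fill-opacity='0.2'>DRAFT</text></svg>"
    )
    print(watermark_rule(svg))
```

The point of the trick is that the stylesheet carries the image itself, so there is no separate file that could be listed in the manifest.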
|
I think that also for bitmap images (attribute
|
Well, I wouldn't go quite this far. I think it's okay to embed resources using data URLs since we can still check whether they are core media types or not (we could limit the use to CMTs). But allowing top-level navigation to them seems like a security issue. Aren't browsers beginning to restrict this, too? It also gets us back into the reason why navigating to any resource not in the spine has been disallowed (i.e., reading systems have no reference to where to take the user next). |
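To make the distinction between embedding and navigation concrete, here is a minimal sketch of how a checker might flag only hyperlinks that point at data URLs while leaving embedded uses alone. It uses Python's standard `html.parser`; the class and function names are hypothetical, and this is not how epubcheck is implemented.

```python
from html.parser import HTMLParser
from typing import List


class DataUrlLinkFinder(HTMLParser):
    """Collect href values of <a> elements that use the data: scheme."""

    def __init__(self) -> None:
        super().__init__()
        self.offending: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.strip().lower().startswith("data:"):
                self.offending.append(value)


def find_data_url_links(markup: str) -> List[str]:
    """Return the data: URLs used as navigation targets in the markup."""
    finder = DataUrlLinkFinder()
    finder.feed(markup)
    finder.close()
    return finder.offending


if __name__ == "__main__":
    sample = (
        '<p><a href="data:text/html,hi">bad link</a>'
        '<img src="data:image/png;base64,AAAA"/>'
        '<a href="chapter2.xhtml">fine</a></p>'
    )
    print(find_data_url_links(sample))
```

Only the first link is reported; the `img` with an embedded data URL is deliberately ignored, matching the position that embedding may be acceptable while navigation is not.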
I have just checked, and that is correct. The error console in the browser says:
The question is how we would specify where data URLs are allowed and where they aren't. I have not found (yet) any explicit statement about this in the HTML standard that we could refer to :-( I have found this on the MDN page:
And there are some other security-related issues in the text. It would be good to have some feedback from RS people... |
(I hope you don't mind the intrusion!) I think these are two orthogonal issues: one is whether "embedded resources" (in CSS, link, but in principle even an iframe or an object) should be accepted in the spec, the other is how to treat navigation links. The trouble I have found with navigation links is that different software interprets them differently, and it's all very buggy. I have done some tests with different apps and a physical e-reader using an Embedded resources on the other hand seem non-ambiguous in terms of how they should be used. For reference, this was the discussion about disabling navigation to data URLs in Chrome. There are some interesting use cases for not killing the links altogether; some could apply to an eBook (for instance, an author might want to instruct the users to copy and paste the links in the browser manually). |
Absolutely not. On the contrary, we welcome any relevant comment from the community, so, please, keep them coming! Thank you. |
@xworld21 your separation of concepts is very helpful.
|
Yes, that's why I pointed out at the start that any use of data: URLs is currently invalid per the specifications. But...
Because data: URLs identify their media type, we can still check whether they are CMTs. Since you can't provide a fallback to a CMT using data: URLs, it would be invalid to embed a foreign resource this way. (The absence of a media type would also be invalid since text/plain is not a core media type.) |
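A minimal sketch of that check, assuming a simplified reading of the data URL syntax (RFC 2397): take the part between `data:` and the first comma, drop any parameters, and fall back to `text/plain` when no media type is given. The `CORE_MEDIA_TYPES` set below is an illustrative subset, not the normative list, and the function names are hypothetical.

```python
# Illustrative subset of EPUB core media types (assumption: NOT the normative list).
CORE_MEDIA_TYPES = frozenset({
    "image/gif",
    "image/jpeg",
    "image/png",
    "image/svg+xml",
    "text/css",
    "application/xhtml+xml",
})


def data_url_media_type(url: str) -> str:
    """Return the lowercased media type declared by a data: URL."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() != "data":
        raise ValueError("not a data: URL")
    header, comma, _payload = rest.partition(",")
    if not comma:
        raise ValueError("malformed data: URL (no comma before the payload)")
    # Drop parameters such as ";charset=utf-8" or ";base64".
    media_type = header.split(";", 1)[0].strip().lower()
    # An omitted media type defaults to text/plain.
    return media_type or "text/plain"


def is_core_media_type_data_url(url: str) -> bool:
    """True if the data: URL declares a media type in the (illustrative) CMT set."""
    return data_url_media_type(url) in CORE_MEDIA_TYPES


if __name__ == "__main__":
    print(is_core_media_type_data_url("data:image/png;base64,iVBORw0KGgo="))
    print(is_core_media_type_data_url("data:,no%20media%20type"))
```

With this reading, a data URL that omits its media type resolves to `text/plain` and therefore fails the check, as the comment above describes.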
That is fine but, per the spec today, using a data URL in a, say, I may miss something. |
I suppose the same rules for Foreign Resources would apply, i.e. when an element offers intrinsic fallback, then one MAY use data: URLs of non-CMT? While if there is no fallback mechanism (e.g. CSS does not do fallback), then the data has to be CMT, because you wouldn't be able to use a manifest fallback. Or to be pedantic, you could by duplicating the URL in the manifest, but as @iherman says, that would make using data URLs pointless.
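The rule supposed here can be written down as a small decision function. This is only a sketch of the proposal as worded in this comment, not spec text; the media-type set is an illustrative subset and the function name is hypothetical.

```python
# Illustrative subset of core media types (assumption: NOT the normative list).
CORE_MEDIA_TYPES = frozenset({
    "image/gif",
    "image/jpeg",
    "image/png",
    "image/svg+xml",
    "text/css",
})


def data_url_permitted(media_type: str, has_intrinsic_fallback: bool) -> bool:
    """Apply the proposed rule for a data: URL of the given media type.

    A core media type is always acceptable. A foreign type is acceptable only
    where the referencing element offers an intrinsic fallback, because a
    manifest fallback is not available for a resource that is not in the manifest.
    """
    if media_type.strip().lower() in CORE_MEDIA_TYPES:
        return True
    return has_intrinsic_fallback


if __name__ == "__main__":
    # CSS offers no fallback mechanism, so a foreign type would be rejected there.
    print(data_url_permitted("image/x-example", has_intrinsic_fallback=False))
    # An element with intrinsic fallback could carry the same foreign type.
    print(data_url_permitted("image/x-example", has_intrinsic_fallback=True))
```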
Just a clarification (for me): would that imply that data URLs in |
Yes, I was thinking specifically of where manifest fallbacks are the only means of providing an alternative, like with
I'd expect they'd be banned in both cases, but it would be good to hear from some reading system devs on this. Spawning a browser instance is akin to doing a window.open, so it sounds equally problematic to opening within the reading order (i.e., you still end up with a top-level browsing context). |
I would agree with this. I guess most of the security issues raised by data URL-s are related to this usage anyway. |
I was wondering how to make a minimal change on the content document to allow for the reasonable usage of data URLs (see #1564 (comment) or #1564 (comment)) but avoiding the possible security issues that we listed. What about doing these two changes in the spec:
In my (possibly erroneous) reading of the spec, (1) means that there is no obligation to add the data URL as part of the manifest, because it is not considered to be “located” in the EPUB Container; and (2) puts an extra constraint that avoids a problem with navigation links (but it is o.k. to use data URLs as “embedded resources”). How does that sound? |
Ya, I'm not sure that's where I'd be looking for information about data URLs. The resource is technically following the existing rule of being located in the container; it's just embedded directly within another resource. My first impression is that if we go this route we should specifically explain data URLs in a new subsection under Section 3, since they're relevant to most of the other subsections there. That and/or cover them in the manifest item section as an exception to the requirement to list every resource. But this is likely to take a lot of smaller changes, too. It's probably going to affect the definition of publication resource, the foreign resource requirements, etc. |
+1 to having a separate, explicit text. I am not sure how it would affect the other sections... |
The issue was discussed in a meeting on 2021-03-18 List of resolutions:
3. Data URLs
See github issue #1564.
Dave Cramer: where should data URLs be allowed in epub?
Matt Garrish: if we allow them to be embedded, basically, we still can check if they are core media types
Dave Cramer: so is it good enough to forbid the
Matt Garrish: that would be possible
Brady Duga: i don't think we spec away the security issues here
Dave Cramer: the problem is that i want this in epub check, and to do that we need the spec to say something
Matt Garrish: do we have to spec out what the RS does?
Brady Duga: i'm okay putting this in just for epub check purposes...
Dave Cramer: and i'm okay writing a test for it, e.g. an RS fails if it tries to follow the
Dave Cramer: is the link element an issue here?
Matt Garrish: i wouldn't think that it would matter, because there is no requirement for the RS to do anything
Brady Duga: what about in SVGs?
Wendy Reid: we'll do some research
Matt Garrish: i can take a look at it, yes
|
The issue was discussed in a meeting on 2021-03-26 List of resolutions:
2. Data URLs PR
See github pull request #1582. See github issue #1564.
Matt Garrish: this is from what we discussed last week's call
Brady Duga: its hard to get the language right so that we don't disallow things by mistake, or allow things by mistake
Matt Garrish: this very much seems to be raised and discussed in the browser realm
Ivan Herman: this PR does a lot that is not contentious, i think
Matt Garrish: in that light, do we maybe want to keep the authoring side requirements, but leave out the RS expectations side?
Ivan Herman: pref trying to do something now, even with it being incomplete, raise a separate issue about the terminology, link that issue to the text and ask for external help, e.g., from the TAG.
Brady Duga: would refer this to TAG, yes
Ivan Herman: also, let's mark issues as explicitly "referred to TAG for input"
Dave Cramer: i'm okay with merging for now, on understanding that this is not final final
|
The question of whether a data URL is allowed in an `a` tag came up in the epubcheck tracker.
My understanding is this is disallowed because you can't navigate to non-spine resources.
I've also understood that embedding resources using data URLs is forbidden since they're also not listed in the manifest.
Assuming both these impressions are correct, we should probably explicitly note somewhere for extra clarity that the data: scheme cannot be used.