xml:base in package document and elsewhere #1456

Closed
dauwhe opened this issue Jan 4, 2021 · 20 comments
Assignees
Labels
EPUB33 Issues addressed in the EPUB 3.3 revision Spec-ReadingSystems The issue affects the EPUB Reading Systems 3.3 Recommendation Topic-ContentDocs The issue affects EPUB content documents Topic-PackageDoc The issue affects package documents

Comments

@dauwhe
Contributor

dauwhe commented Jan 4, 2021

EPUBCheck seems unhappy with any use of xml:base in the package document. EPUB requires support for xml:base. I don't even want to think about using xml:base in container.xml.

Is xml:base only for use with EPUB content documents? Can we be clearer in the spec about where it is and isn't allowed?

@dauwhe dauwhe added Topic-ContentDocs The issue affects EPUB content documents Topic-PackageDoc The issue affects package documents Spec-ReadingSystems The issue affects the EPUB Reading Systems 3.3 Recommendation labels Jan 4, 2021
@dauwhe dauwhe changed the title xml:base in package document and elsewhere xml:base in package document and elsewhere Jan 4, 2021
@mattgarrish
Member

The note about HTML and SVG dropping support is probably misplaced, as prior to that being added the only mention of xml:base was that reading systems had to support it. But that was only required for completeness of xml processing, not because the attribute is actually allowed in every xml format. It's never been supported in the package document.

It might be less confusing to copy the note into both the HTML and SVG definitions, since those are the places where it is relevant. Commenting on xml:base for any format someone might include seems unnecessary.

It might also help to move this sentence from the item element definition to a more obvious location:

In the case of relative IRIs, the IRI of the Package Document is used as the base when resolving to absolute IRIs.

This is true for all href attributes and anything else that takes a relative IRI (refines, ?).

But this begins to tie into #1374.
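As a rough illustration of the base rule quoted above, here is a minimal Python sketch that resolves relative references against the location of the package document. The URL `https://example.org/book/EPUB/package.opf` and the helper name `resolve_against_package` are made up for the example, and `urljoin` implements generic URL resolution, so treat this as an approximation of the IRI behavior rather than a statement of what the spec requires.

```python
from urllib.parse import urljoin

# Hypothetical location of the package document (an assumption for this sketch).
PACKAGE_DOC_URL = "https://example.org/book/EPUB/package.opf"


def resolve_against_package(relative_ref, package_url=PACKAGE_DOC_URL):
    """Resolve a relative reference using the package document URL as the base."""
    return urljoin(package_url, relative_ref)


# An href on a manifest item, a reference that climbs out of the folder,
# and a fragment-only reference of the kind used by refines.
print(resolve_against_package("xhtml/chapter1.xhtml"))
print(resolve_against_package("../META-INF/container.xml"))
print(resolve_against_package("#title"))
```

The same base applies to every attribute that takes a relative reference, which is why a single definition in one obvious place would be enough.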

@iherman
Member

iherman commented Jan 5, 2021

@murata2makoto is the expert here. But it is interesting to note that the core XML specification does not include xml:base, which is defined by a separate specification. It is therefore, spec-wise, perfectly fine to disallow its usage, or restrict it explicitly to some content. @murata2makoto should tell us whether this is an acceptable practice spec-wise.

Note, however, that "disallowing" it in HTML is not that simple, because HTML also has the equivalent <base> element...

@murata2makoto
Contributor

<rant>I opposed xml:base when it was invented and I still do.</rant>

I think that disallowing the xml:base attribute in some XML vocabulary is perfectly fine. It is certainly possible to write a schema (in both RELAX NG and W3C XML Schema) so that this attribute is disallowed or restricted. The latest version of WHATWG HTML does not mention xml:base. In my understanding, this attribute is thus disallowed. We only have to use the base element of HTML.

@dauwhe
Contributor Author

dauwhe commented Jan 5, 2021

Proposal: specify that xml:base is allowed only in EPUB Content Documents (XHTML and SVG), and even then is a bad idea.

@mattgarrish
Member

specify that xml:base is allowed only in EPUB Content Documents

I'm a bit concerned we're creating an issue where none exists.

Not that I'm a fan of the attribute, but where else is it allowed that we need to worry about? If nowhere, what does this rule do other than potentially complicate the inclusion of data files and the like? Are those a concern if they're just used by script or travelling harmlessly in the container?

And how is this requirement tested outside of content documents?

@iherman
Member

iherman commented Jan 6, 2021

@mattgarrish naïve XML users may believe that xml:base is an integral part of XML processing, i.e., that it is OK to use in package documents and elsewhere. Making it clear that it isn't helps avoid the confusion.

@llemeurfr

xml:base is not part of the core XML specification (0 items found in the Fifth Edition); it is a "facility" defined in a satellite specification. An XML-based specification that decides to allow its use must normatively reference it (you can find this in the XML Base spec, where XLink is given as an example).

So xml:base should not be treated as an opt-out, as @murata2makoto suggests, but as an opt-in. For example, I found xml:base defined in SVG 1.1 but not in SVG 2.

Nevertheless, because there is no "EPUB best practices" companion for the spec, I'm in favor of a note in the package document spec to make clear that xml:base is not processed by reading systems in such documents.

@murata2makoto
Contributor

@llemeurfr wrote:

So xml:base should not be treated as an opt-out as @murata2makoto suggests

This is not what I suggested. You can disallow it by writing a schema that does not mention it. Also, you can allow it by writing a schema that mentions it.
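To make the "a schema that does not mention it" point concrete, here is a small Python sketch of a closed allow-list check. It is not a RELAX NG or XML Schema validator, and the `ALLOWED_ATTRS` table is an illustrative subset chosen for this example, not the normative EPUB schema. The function name `unlisted_attributes` and the sample document are likewise assumptions.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
OPF_NS = "http://www.idpf.org/2007/opf"

# Illustrative allow-list, NOT the normative EPUB schema: only attributes
# named here are accepted, so xml:base is rejected simply by being absent.
ALLOWED_ATTRS = {
    f"{{{OPF_NS}}}item": {"id", "href", "media-type", "properties"},
}


def unlisted_attributes(xml_text):
    """Return (tag, attribute) pairs that the allow-list does not mention."""
    problems = []
    for el in ET.fromstring(xml_text).iter():
        allowed = ALLOWED_ATTRS.get(el.tag)
        if allowed is None:
            continue  # element not covered by this sketch
        for name in el.attrib:
            if name not in allowed:
                problems.append((el.tag, name))
    return problems


# Hypothetical package document fragment with xml:base on a manifest item.
SAMPLE = f"""<package xmlns="{OPF_NS}" version="3.0">
  <manifest>
    <item id="c1" href="c1.xhtml" media-type="application/xhtml+xml"
          xml:base="text/"/>
  </manifest>
</package>"""

print(unlisted_attributes(SAMPLE))
```

Adding the namespaced `base` attribute to the allow-list would be the equivalent of a schema that mentions it, which is the opt-in reading of the same mechanism.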

@murata2makoto
Contributor

EPUB 3.0 has normative schemas. They allow xml:base only in XHTML (and MathML in XHTML).

@mattgarrish
Member

naïve XML users may believe that xml:base is an integral part of XML processing

Banning something from formats we can't specifically identify is a strange precedent. Plus there's always a bigger naive xml user out there, as the old saying goes. ;)

Adding this requirement doesn't change anything in terms of the package document, though, as the attribute has never been allowed in it. As @dauwhe found out, put it in the package document and epubcheck will already throw errors at you.

I'm also looking at this from an epubcheck perspective and I just don't see this being a realistic requirement to check. We're saying that every xml format has to be validated and errors thrown if xml:base is found, even if xml:base is allowed by the format and regardless of the purpose of the file. (But not for content documents, where nothing will be emitted.)

@murata2makoto
Contributor

murata2makoto commented Jan 6, 2021

Everybody appears to believe that xml:base has been disallowed except for content documents. epubcheck complains if it encounters xml:base. Matt thinks that the spec is already clear and that no further actions are needed. Ivan and Dave think that a note in the spec is useful.

If we introduce a note for xml:base, we should do the same thing to xml:lang and xml:space. Moreover, xsi:type, xsi:nil, xsi:noNamespaceSchemaLocation and xsi:schemaLocation also need something similar. We also have XInclude and XLink. Where should we stop?

@mattgarrish
Member

Ivan and Dave think that a note in the spec is useful.

I'd actually read the proposal as adding a "MUST NOT" for xml:base anywhere outside of content documents.

I'm not advocating for xml:base as much as concerned about the overreach of disallowing it in grammars we don't even control.

If there's confusion about how the package document base is calculated, let's address that separately. If we create a new section that explains how the base of the package document is determined for relative URLs, there shouldn't be a need to explain that an attribute that isn't part of the grammar isn't allowed.

(Noting not to use the attribute in HTML and SVG is still fine with me.)

@dauwhe
Contributor Author

dauwhe commented Jan 6, 2021

If we introduce a note for xml:base, we should do the same thing to xml:lang and xml:space. Moreover, xsi:type, xsi:nil, xsi:noNamespaceSchemaLocation and xsi:schemaLocation also need something similar. We also have XInclude and XLink. Where should we stop?

I think the difference is that the spec says that EPUB Reading Systems MUST support the XML Base spec. We don't require support for XLink or XInclude.
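For readers unsure what supporting XML Base involves in practice, the following Python sketch shows the core idea: an element's base is the nearest xml:base on itself or an ancestor, and each xml:base value is itself resolved against the base it inherits. The vocabulary (`doc`, `section`, `a` with a plain `href`), the URL, and the function name `collect_links` are invented for the example. This is a simplification and does not describe how any particular reading system behaves.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def collect_links(xml_text, document_url):
    """Return resolved href values, honouring xml:base on the element or its ancestors."""
    resolved = []

    def walk(element, base):
        # An xml:base value is resolved against the inherited base;
        # an absent attribute leaves the base unchanged.
        base = urljoin(base, element.get(XML_BASE, ""))
        href = element.get("href")
        if href is not None:
            resolved.append(urljoin(base, href))
        for child in element:
            walk(child, base)

    walk(ET.fromstring(xml_text), document_url)
    return resolved


# Hypothetical document: the second link sits under an xml:base override.
SAMPLE_DOC = """<doc>
  <a href="one.html"/>
  <section xml:base="chapters/">
    <a href="two.html"/>
  </section>
</doc>"""

for link in collect_links(SAMPLE_DOC, "https://example.org/book/index.xml"):
    print(link)
```

A processor that ignores xml:base would resolve both links against the document URL, so the two behaviors only diverge when the attribute is actually present.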

@mattgarrish
Member

EPUB Reading Systems MUST support the XML Base spec.

But that's only true, I believe, because XHTML and SVG used to allow xml:base. Since the specifications we now reference no longer include the attribute, we could potentially drop the requirement as part of warning authors against the use of it. It also seems like another case where support is going to be driven by the browser cores more than anything we say in our specification.

I can't imagine it's ever been widely used in epub, either.

@dauwhe
Contributor Author

dauwhe commented Jan 7, 2021

How about adding this to the RS spec, just to let implementers know what the story is:

Note: XML Base attributes are not allowed in EPUB Package Documents. They may appear in EPUB Content Documents, although both HTML and SVG are removing support.

@iherman
Member

iherman commented Apr 2, 2021

The issue was discussed in a meeting on 2021-04-01

List of resolutions:

View the transcript

2. Clarify base IRI

See github pull request #1468.

Dave Cramer: this is a PR about base IRI in package documents

See github issue #1374, #1456.

Matt Garrish: basically all the PR does is define base IRI for package because it wasn't clear how that was to be calculated
… it is defined for container.xml, but for package doc there was just a stray sentence that absolute IRIs are to be calculated from base IRI of document
… Ivan wanted some clarification
… PR pulls out that statement and elaborates on it
… Laurent suggested that maybe we define everything in core docs as paths rather than IRIs
… not sure why we'd want to do that since we already define abstract container to allow us to use IRI language
… how far do we want to get into relative paths vs absolute paths
… can we just clean up what we already have, or do we want to take up this IRI vs path question at this point?

Dave Cramer: i'm happy with PR
… worried about Laurent's idea because not sure we want to start talking about paths when all the other specs that we rely upon are already happy about how we define things
… also, this issue didn't come out of a concrete problem with a RS or similar, it came from abstract issue about spec language

Matt Garrish: the PR seemed to make Ivan happy
… from the perspective of what we need to describe here, I think we've done enough
… the question about changing to path language leads off into other areas

Dave Cramer: i think we should accept the PR, and then if Laurent wants to raise the other question, maybe he can come back with a more detailed rationale

Matt Garrish: there was an issue about whether relative IRIs MUST be resolved, but it was only ever the intention that it be possible if you need to do it

Proposed resolution: Merge PR 1468 (Wendy Reid)

Dan Lazin: +1

Ben Schroeter: +1

Matt Garrish: +1

Toshiaki Koike: +1

Wendy Reid: +1

Matthew Chan: +1

Masakazu Kitahara: +1

Brady Duga: +1

Shinya Takami (高見真也): +1

Resolution #1: Merge PR 1468

@dauwhe
Contributor Author

dauwhe commented May 4, 2021

I have a new suggestion: What if we just remove the statement saying that an EPUB reading system "MUST be a conformant application as defined by XMLBASE"?

That doesn't prevent a reading system from supporting xml:base. But I think the fact that we currently mention it gives xml:base far more importance than it deserves. If I was building a reading system today, I would certainly not support it.

@iherman
Member

iherman commented May 4, 2021

Do we have any idea whether any EPUB document ever used xml:base? My bet would be that it has never been used...

I would agree with @dauwhe's proposal...

@dauwhe dauwhe added the Agenda+ Issues that should be discussed during the next working group call. label May 19, 2021
@dauwhe dauwhe self-assigned this May 19, 2021
@dauwhe dauwhe removed the Agenda+ Issues that should be discussed during the next working group call. label May 21, 2021
@iherman
Member

iherman commented May 21, 2021

The issue was discussed in a meeting on 2021-05-21

List of resolutions:

View the transcript

2. xml:base in package document and elsewhere

See github issue #1456.

Dave Cramer: XML base
… Have a pull request to remove a line

See github pull request #1678.

Dave Cramer: xml base is disappearing from the world at large
… It isn't forbidden, but not a useful requirement to support it
… Authors shouldn't do it, and we should make it so RSes shouldn't support it in the future

Proposed resolution: dump xml:base from the spec (Ivan Herman)

Brady Duga: +1

Dave Cramer: +1

Gregorio Pellegrino: +1

Toshiaki Koike: +1

Bill Kasdorf: +1

Masakazu Kitahara: +1

Ben Schroeter: 0

Dan Lazin: +1

Matt Garrish: +1

Resolution #2: dump xml:base from the spec

@mattgarrish
Member

Closing with #1678

@mattgarrish mattgarrish added the EPUB33 Issues addressed in the EPUB 3.3 revision label Sep 14, 2022

5 participants