Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify base iri of package document #1468

Merged
merged 7 commits into from
Apr 2, 2021
Merged

Conversation

mattgarrish
Copy link
Member

@mattgarrish mattgarrish commented Jan 14, 2021

Maybe not a complete fix for #1374 (and not for #1456), but I've taken the prose about resolving relative IRIs from the item element definition and created a section in the RS spec on resolving relative IRIs to cover this (resolving to absolute IRIs isn't an authoring concern).

See if there's enough of an explanation of the base IRI when the package document is zipped. I don't think we want to go too deep into resolving files within a zip container, as I can't find there's a single way of doing this. It looks like a "!" (jar scheme), "#", or OS-specific path separator followed by the relative path of the file in the container are viable.

Reading Systems: preview diff


Preview | Diff

Copy link
Member

@iherman iherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth adding a note to the change section of the RS spec. It may be an important clarification relative to the previous version.

@llemeurfr
Copy link

In the EPUB3.3 Core spec, in the item section, we still read: "Each item element identifies a Publication Resource by the IRI [RFC3987] provided in its href attribute. The IRI MAY be absolute or relative. In the case of relative IRIs, the IRI of the Package Document is used as the base when resolving to absolute IRIs. The resulting absolute IRI MUST be unique within the manifest scope."

In this Core spec, the "IRI of the Package Document" is not defined: this was the original issue #1374. Tips must be given to EPUB authors on how to solve this issue, therefore something must be done first in the EPUB 3.3 Core document.

More:

a) In Zip packages, such "relative" strings must in fact be seen as relative paths. As it happens, both relative IRIs and relative Unix paths have the same syntax; but they're not the same thing. It is weird but still real: internal resource are referenced via relative paths, external resources are referenced via absolute URLs, and both can be values of the same href attribute.

The Zip AppNote uses the term path, never URL/URI/IRI. Section 4.4.17.1 of the Zip AppNote gives clear indications about the syntax, but does not define "relative path"; I suppose it was clear for the authors that this path was relative to the "root directory" (which is not a notion defined in the Zip AppNote).

b) I don't think RS developers need to resolve (if resolving means converting to absolute IRIs) relative IRIs/paths in Zip packages. They simply need to find and fetch these files using their relative path, which is a standard Zip API feature.

In consequence, I'd be in favor of

  1. changing the wording in the Core spec first, replacing "relative IRIs" by "relative paths" (relative to the Root Directory which is already defined there).
  2. reviewing sections 3.2.1 and 7.2.1 in the RS spec and thinking about removing them entirely.

Copy link

@llemeurfr llemeurfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my previous comment.

@mattgarrish
Copy link
Member Author

In the EPUB3.3 Core spec, in the item section, we still read:

I changed that if you look at the preview/diff links at the bottom of the first comment (pr-preview always creates those automatically):

Each item element identifies a Publication Resource by the IRI [RFC3987] provided in its href attribute. The IRI MAY be absolute or relative, but each IRI MUST be unique within the manifest scope after resolution to an absolute IRI [EPUB-RS-33].

The last few words link across the the RS section so authors interested in how resolution works can read more without overly complicating that section.

1. changing the wording in the Core spec first, replacing "relative IRIs" by "relative paths"

But isn't this why we have the abstract container defined separately from the zip container? It allows us to talk about the paths consistently as IRIs so we don't get into these kinds of technicalities. This sounds like it leads us to redefining all the standards we rely on to also be paths instead of IRIs simply because an EPUB may be zipped when consumed

I'd argue we're getting into too much detail with the package document base by delving into zip paths, but can live with it for thoroughness.

Base automatically changed from master to main February 4, 2021 05:42
@mattgarrish mattgarrish added the Status-NeedsDiscussion The issue requires further discussion in the working group label Feb 17, 2021
@mattgarrish
Copy link
Member Author

I don't think RS developers need to resolve (if resolving means converting to absolute IRIs) relative IRIs/paths in Zip packages.

Yes, I've fixed this in the latest commit as it's not a requirement in all cases. The intention is only that if you require an absolute you must use the base iri.

@dauwhe
Copy link
Contributor

dauwhe commented Apr 1, 2021

How would you use an absolute IRI in the href attribute of an item? What scheme would you use?

@mattgarrish
Copy link
Member Author

How would you use an absolute IRI in the href attribute of an item?

Which case are you referring to?

Absolute IRIs in general would be critical for any external resources.

Resolved from a relative one might be useful if you're serving it off the web, I suppose.

What scheme would you use?

We've avoided trying to list out allowed and disallowed protocols leaving it to whatever makes sense to use. The only requirement for relative IRIs is that they must resolve within the container. But I believe it's valid to use file: URLs if you want to write out a full path.

@dauwhe
Copy link
Contributor

dauwhe commented Apr 1, 2021

Which case are you referring to?

an item in the spine.

@mattgarrish
Copy link
Member Author

an item in the spine.

Still not sure I understand.

The spine itself only has itemref elements and the ref attributes are IDREFs.

What would be unique about an itemref referencing an item that is an external resource, if that's what you mean? You'd need a fallback, but otherwise there's nothing exceptional about it, is there?

@mattgarrish mattgarrish merged commit 3d908f1 into main Apr 2, 2021
@mattgarrish mattgarrish deleted the fix/issue-1456-packagedoc branch April 2, 2021 00:39
@iherman
Copy link
Member

iherman commented Apr 2, 2021

The issue was discussed in a meeting on 2021-04-01

List of resolutions:

View the transcript

2. Clarify base IRI

See github pull request #1468.

Dave Cramer: this is a PR about base IRI in package documents

See github issue #1374, #1456.

Matt Garrish: basically all the PR does is define base IRI for package because it wasn't clear how that was to be calculated
… it is defined for container.xml, but for package doc there was just a stray sentence that absolute IRIs are to be calculated from base IRI of document
… Ivan wanted some clarification
… PR pulls out that statement and elaborates on it
… Laurent suggested that maybe we define everything in core docs as paths rather than IRIs
… not sure why we'd want to do that since we already define abstract container to allow us to use IRI language
… how far do we want to get into relative paths vs absolute paths
… can we just clean up what we already have, or do we want to take up this IRI vs path question at this point?

Dave Cramer: i'm happy with PR
… worried about Laurent's idea because not sure we want to start talking about paths when all the other specs that we rely upon are already happy about how we define things
… also, this issue didn't come out of a concrete problem with a RS or similar, it came from abstract issue about spec language

Matt Garrish: the PR seemed to make Ivan happy
… from the perspective of what we need to describe here, I think we've done enough
… the question about changing to path language leads off into other areas

Dave Cramer: i think we should accept the PR, and then if Laurent wants to raise the other question, maybe he can come back with a more detailed rationale

Matt Garrish: there was an issue about whether relative IRIs MUST be resolved, but it was only ever the intention that it be possible if you need to do it

Proposed resolution: Merge PR 1468 (Wendy Reid)

Dan Lazin: +1

Ben Schroeter: +1

Matt Garrish: +1

Toshiaki Koike: +1

Wendy Reid: +1

Matthew Chan: +1

Masakazu Kitahara: +1

Brady Duga: +1

Shinya Takami (高見真也): +1

Resolution #1: Merge PR 1468

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status-NeedsDiscussion The issue requires further discussion in the working group
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants