Questions about page-list #1471

Closed
gregoriopellegrino opened this issue Jan 16, 2021 · 6 comments
Labels: Accessibility11 (Issues addressed in the Accessibility 1.1 revision), Cat-Accessibility (Grouping label for all accessibility related issues)

Comments

@gregoriopellegrino
Contributor

gregoriopellegrino commented Jan 16, 2021

Hi,
working in accessible EPUB production, I have some questions about page-list:

  • do content creators have to list all pages, or can they list only some of them (e.g. the page number at the beginning of the chapter)?
  • how to deal with empty pages?
  • what should RSs do if the page-list has jumps in numbering? Interpolate?
  • are there any rules about how RSs should deal with non-Arabic page numbers?
@mattgarrish
Member

  • do content creators have to list all pages, or can they list only some of them (e.g. the page number at the beginning of the chapter)?

Technically we don't require every page, as digital versions may not include all the same content as the print, but the example you give renders the feature mostly useless. It doesn't provide any value, as the table of contents already gets users to the start of chapters. Users typically want to reach other pages.

We might want to address this in the accessibility specification, but it could be tricky to word (e.g., something like "the publication must include page break markers for all print pages represented in the content").

  • how to deal with empty pages?

I believe best practice is to include these for completeness so that users don't get confused that there are pages missing.

I'd offer an alternative of omitting them and stating in the summary that blank pages are not included. Users aren't going to jump to blank pages except by accident. (But this could conflict with a requirement to represent all page breaks.)

  • what should RSs do if the page-list has jumps in numbering? Interpolate?

We generally give reading systems flexibility, but I wouldn't expect them to add missing pages. If the publication only consists of pieces of a work, the user will only want to know which pages are actually available. I'd expect if the user enters a page number that doesn't exist, the reading system would simply inform them of the fact.
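The lookup behaviour described above can be sketched as follows. This is only an illustration, not any reading system's actual code: `PAGE_LIST`, its targets, `go_to_page` and `describe` are hypothetical names. Page labels are treated as opaque strings, and a missing page is reported rather than interpolated.

```python
from typing import Optional

# Hypothetical page list: label -> target, as authored in the navigation document.
# Pages 4-9 are deliberately absent, e.g. because the publication is an excerpt.
PAGE_LIST = {
    "1": "chapter1.xhtml#page1",
    "2": "chapter1.xhtml#page2",
    "3": "chapter1.xhtml#page3",
    "10": "chapter2.xhtml#page10",
}


def go_to_page(label: str) -> Optional[str]:
    """Return the authored target for a page label, or None if it is not listed.

    No interpolation happens: asking for "5" does not land somewhere
    between pages 3 and 10.
    """
    return PAGE_LIST.get(label.strip())


def describe(label: str) -> str:
    """Build the message a reading system might show to the user."""
    target = go_to_page(label)
    if target is None:
        return f"Page {label} is not available in this publication."
    return f"Going to page {label} ({target})."
```

Because the lookup is a plain string match, it works the same way for labels such as "iv" or "A-3" as it does for "10".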

  • are there any rules about how RSs should deal with non-Arabic page numbers?

Not sure what you mean here.

/cc @GeorgeKerscher @avneeshsingh @clapierre

@danielweck
Member

In Thorium, there are 2 distinct implementation use cases:

  1. the nav@epub:type="page-list" in the XHTML Navigation Document => used to implement the "goto page" feature based on user input / "fuzzy" string matching (i.e. not accurate number comparisons), and of course also based on the exposed list of authored "pages" in the GUI (e.g. some kind of selector control for predefined "as-is" values). In both "goto page" cases, a link from the NavDoc is activated and followed into a specific XHTML Content Document at a specific anchor location (the expectation is that this lands on a page-break, see below).
  2. the *@epub:type="pagebreak" in the XHTML Content Documents => used by the reading system engine to figure out what the "current page" is during the reading experience. Put simply, given a DOM location (e.g. range of characters in a paragraph), the nearest preceding page-break is discovered by walking the DOM tree, and if any, it is reported to the application which displays its authored textual value (e.g. the contents of the span element that carries the page-break).

As you can see, this is all string-based: there are no numeric calculations, such as comparisons that would enable some ordering mechanism.
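The two use cases above could be sketched roughly as below. This is not Thorium's code: the sample document, `match_pages` and `current_page` are hypothetical, and the case-insensitive substring match is only one assumed form of "fuzzy" matching. The sketch walks the document in order and remembers the last page break marker seen, using its text content as the label.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Name of the epub:type attribute once the XML namespace is expanded.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

# Hypothetical content document with two page break markers.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <p id="p1">Front matter text.</p>
    <span epub:type="pagebreak" id="page-iv">iv</span>
    <p id="p2">More front matter.</p>
    <span epub:type="pagebreak" id="page-1">1</span>
    <p id="p3">Chapter text.</p>
  </body>
</html>"""


def match_pages(query: str, labels: List[str]) -> List[str]:
    """Case-insensitive substring match on labels; no numeric comparison."""
    q = query.strip().casefold()
    return [label for label in labels if q in label.casefold()]


def current_page(root: ET.Element, element_id: str) -> Optional[str]:
    """Return the label of the nearest page break at or before an element.

    Returns None if no page break precedes the element, or if the
    element id is not found at all.
    """
    label = None
    for el in root.iter():  # iter() visits elements in document order
        if "pagebreak" in (el.get(EPUB_TYPE) or "").split():
            label = "".join(el.itertext()).strip()
        if el.get("id") == element_id:
            return label
    return None
```

With the sample document, a location inside `p2` reports "iv" and a location inside `p3` reports "1"; neither step ever converts a label to a number.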

@gregoriopellegrino
Contributor Author

My concern is that with the current specifications it seems difficult (both on the content creator side and on the reading solution side) to implement effective solutions with respect to the relationship between virtual and paper pages.

I propose adding clear guidance notes for content creators and reading solutions on the issues above.

  • are there any rules about how RSs should deal with non-Arabic page numbers?

Not sure what you mean here.

Sometimes in the same book we can have the introductory pages in Roman numerals (I, II, III, etc.), then the main content in Arabic numerals (1, 2, 3, 4, ...), then the appendices with Latin letters (a, b, c, ...). I imagine that managing these different types of numbering in the same publication by reading solutions can be complex, but perhaps we don't have anything particular to report to implementers.
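To illustrate why interpreting such labels is harder than it looks, here is a small sketch that lists which numbering schemes a label could plausibly belong to. The function name `plausible_schemes` and the three scheme names are assumptions made for this example, not anything defined by the specification.

```python
import re
from typing import Set

# Standard-form Roman numerals, upper or lower case; the lookahead
# rejects the empty string.
ROMAN = re.compile(
    r"(?=[MDCLXVI])M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})",
    re.IGNORECASE,
)


def plausible_schemes(label: str) -> Set[str]:
    """Return every numbering scheme the label could belong to."""
    schemes = set()
    if re.fullmatch(r"[0-9]+", label):
        schemes.add("arabic")
    if ROMAN.fullmatch(label):
        schemes.add("roman")
    if re.fullmatch(r"[A-Za-z]", label):
        schemes.add("letter")
    return schemes
```

A label such as "c" matches both the Roman and the letter scheme (page "c" of an appendix, or Roman 100), so the label alone does not say how it should be ordered.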

@mattgarrish
Member

I imagine that managing these different types of numbering in the same publication by reading solutions can be complex

Sure, I'm just not sure what we can do about that. The page list is required to match the default reading order (not the order of content in the print source), so reading systems already have an ordering by default.

I don't know what we could tell them to do when it comes to internal reordering or optimization if they don't retain the provided ordering. That's beyond our control. All we could do is add a note that page numbering may not follow a single naming convention even within a single work.
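The ordering requirement mentioned above can be expressed as a check on where each page list entry points, without looking at the labels at all. The following is a sketch under assumptions: `SPINE`, `PAGE_LIST`, the numeric position within a document and `follows_reading_order` are all hypothetical, and this is not how any validator is known to implement it.

```python
from typing import List, Tuple

# Hypothetical spine: content documents in default reading order.
SPINE = ["front.xhtml", "chapter1.xhtml", "chapter2.xhtml"]

# Hypothetical page list entries as (label, content document, position),
# where position is the offset of the page break marker within its document.
PAGE_LIST = [
    ("i", "front.xhtml", 0),
    ("ii", "front.xhtml", 40),
    ("1", "chapter1.xhtml", 0),
    ("2", "chapter1.xhtml", 55),
    ("3", "chapter2.xhtml", 0),
]


def follows_reading_order(
    pages: List[Tuple[str, str, int]], spine: List[str]
) -> bool:
    """True if the entries point to locations in non-decreasing reading order.

    Labels are ignored, so mixed Roman and Arabic labels pose no problem.
    Raises ValueError if an entry targets a document missing from the spine.
    """
    keys = [(spine.index(doc), position) for _, doc, position in pages]
    return keys == sorted(keys)
```

Note that a list whose labels run "10" then "7" still passes this check as long as the targets follow the reading order, while a list ordered by print page can fail it if the digital edition moves content around.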

@mattgarrish
Member

I've broken this issue up into more atomic issues (@clapierre, @GeorgeKerscher and I have also been having similar discussions).

I'm going to close this one so we can continue the discussions in their respective issues.

@iherman
Member

iherman commented Apr 2, 2021

The issue was discussed in a meeting on 2021-04-01

3. Pagelist Requirements

See github pull request #1588.

See github issue #1471.

Dave Cramer: we had required that the order of the nav elements had to match the order of the elements in the spine
… then when this was implemented in epubcheck, it turned out that it made a bunch of epubs invalid
… we eventually got that sorted out, but we never went back to address the corresponding issue in the pagelist
… mgarrish, is this going to be covered in epub a11y instead of core now?

Matt Garrish: yes. The issue is that this forces the pagelist to match the order of the digital version, which is not at all helpful when there is an expectation that the pagelist correspond to print

Brady Duga: why are we removing this requirement?
… the toc is not without controversy because it breaks UIs
… we didn't want to make publishers go back and fix their content
… is this similar?

Matt Garrish: here we're just removing the strict requirement that the pagelist match
… so we wouldn't be invalidating anything
… not the same as the toc because the toc is being used by RSes in various ways
… but we don't know that RSes are using pagelist in the same way
… allowing authors to reorder pagelist might actually make it align with expectations

Brady Duga: so does this mean, e.g., that page 7 could come after page 10?

Matt Garrish: yes, but you could still do that right now if that was the order of content in the spine

Wendy Reid: with the current requirements we might accidentally be forcing the page 7 after page 10 thing
… with the PR that might still happen, but it would be by choice, as opposed to being forced by spec

Ben Schroeter: you could also have a case where the print book has a sidebar that in the digital version comes after the body text
… in cases like that you might want the pagelist to come out of order intentionally

Matt Garrish: i'm not sure if that would be an example of where we'd want that for a11y purposes or not, but there are situations where you'd want the pagelist to match print content rather than spine order

Dave Cramer: i'm seeing several cases where we're deciding that these "best practices" are better off in the a11y spec when they directly impact a11y, rather than trying to bake good practice into the epub spec itself, especially when there is such a diversity of content

Ben Schroeter: but i've always wondered why pagelist is an a11y feature
… it is, but it is also useful for everyone

Matt Garrish: that's a possibility we could take up, i.e. just recommend something about page sequencing

Proposed resolution: Merge PR 1588 (Wendy Reid)

Ben Schroeter: +1

Matthew Chan: +1

Matt Garrish: +1

Shinya Takami (高見真也): +1

Wendy Reid: +1

Dan Lazin: +1

Toshiaki Koike: 0

Brady Duga: +0 as I don't fully understand all the implications

Masakazu Kitahara: 0

Resolution #2: Merge PR 1588

Wendy Reid: i understand your concern duga
… and this is something we can take a look at when we do virtual locators as well
… as there is crossover

Brady Duga: it may be fine, but i'd just like to think about it some more first
