(URGENT) Bilingual collection not being generated for v2.01 and v3.01 #242

Closed
manuelfuenmayor opened this issue Dec 11, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@manuelfuenmayor
Contributor

When I execute:

bundle exec metanorma collection sources/si-brochure/2.01/collection.yml

A brochure/ folder is created in which the individual English and French documents are generated properly, but the collection that contains both versions is not.

@manuelfuenmayor manuelfuenmayor added the bug Something isn't working label Dec 11, 2024
@ronaldtse ronaldtse changed the title from "Bilingual collection not being generated for v2.01" to "(URGENT) Bilingual collection not being generated for v2.01 and v3.01" Dec 16, 2024
@ronaldtse ronaldtse assigned opoudjis and unassigned ronaldtse Dec 16, 2024
@ronaldtse ronaldtse moved this to 🌋 Urgent in Metanorma Dec 16, 2024
@ronaldtse
Contributor

The resulting file listing is like this:
Screenshot 2024-12-17 at 1 12 05 AM

I have attached the "brochure" folder as a zip file below. Notice that under "brochure", these files are identical (i.e. each contains nothing but a cover page):

  • collection_en.pdf (this should not be created)
  • collection_fr.pdf (this should not be created)
  • collection.pdf (this file should contain both the English and the French content)

The way to build this is to use the PR #256, i.e.:

git checkout rt-test-collection
bundle exec metanorma site generate
# then you can open the `_site` folder
open _site

brochure.zip

@opoudjis
Contributor

Note that collection_en and collection_fr were being created all along; they are not new.

@opoudjis
Contributor

I'm going to have to pass this on to @Intelligent2013. The Presentation XML for the bilingual brochure is being generated as normal, and Alex is in a better position than me to assess what has gone wrong. I am passing the artefact on to him.

@opoudjis
Contributor

The PDF error file notes a very large number of unresolved anchors, but I do not know if that is enough to explain it.

@opoudjis
Contributor

This was a bug in the PDF stylesheet arising from the Presentation XML refactor. @Intelligent2013 is still reviewing the reported unresolved IDs, and I may follow them up myself. However, he now has the PDF compiling correctly.

@Intelligent2013
Contributor

Intelligent2013 commented Dec 17, 2024

BIPM XSLT updated. PDF generation issue fixed.

There are still 8 unresolved ID reference issues, in 3 categories:

1st category

  • Page 50: Unresolved ID reference "cgpm9th1948r7r7_si-brochure-fr" found.
    In the Presentation XML there is an xref, but no element with that id:
<xref target="cgpm9th1948r7r7_si-brochure-fr">Résolution 7</xref>

2nd category

  • Page 118: Unresolved ID reference "_93cb06a5-66e8-4639-926f-cb193c5dbe62" found.
  • Page 118: Unresolved ID reference "_13daa7c1-71a5-4e7e-8fb5-bc0481530dfa" found.

In the table there are two <fn reference="b"> :

<fn reference="b">
	... Le radian est aussi l’unité cohérente d’angle<bookmark id="_da3c27e3-1a1d-4807-a3dd-1d610eb7bef0"/> de phase. Pour les phénomènes périodiques, l’angle<bookmark id="_fa209505-5cc5-4f9a-9ace-0062a888174e"/> de phase augmente de <stem block="false" type="MathML">
			...
</fn>
<fn reference="b">
	... Le radian est aussi l’unité cohérente d’angle<bookmark id="_93cb06a5-66e8-4639-926f-cb193c5dbe62"/> de phase. Pour les phénomènes périodiques, l’angle<bookmark id="_13daa7c1-71a5-4e7e-8fb5-bc0481530dfa"/> de phase augmente de <stem block="false" type="MathML">
			...
</fn>

The first fn is displayed in the PDF, but the second is a duplicate. However, there are references to the bookmarks in the duplicate fn.

  • Page 118: Unresolved ID reference "_005c42f4-03bf-490f-9a76-d740417e0116" found.
  • Page 118: Unresolved ID reference "_e6054dbd-2a8b-4e32-acdc-211615aeee8d" found.
    See above.

3rd category:

  • Page 119: Unresolved ID reference "regles_ecriture_si-brochure-fr" found.
<bookmark to="regles_ecriture_si-brochure-fr" id="_cd3b0f9e-16a8-4548-90e5-dcb402ebdbc7"/>
...
<xref target="_cd3b0f9e-16a8-4548-90e5-dcb402ebdbc7" to="regles_ecriture_si-brochure-fr" pagenumber="true">

@opoudjis how should I process bookmark/@to? Just add the anchor similar to bookmark/@id?

  • Page 119: Unresolved ID reference "symboles_recommandes_si-brochure-fr" found.
  • Page 120: Unresolved ID reference "symboles_ecriture_si-brochure-fr" found.
    see above.

@Intelligent2013
Contributor

Just add the anchor similar to bookmark/@id?

No, that doesn't work. In this case there would be two elements with the same id, cdm_si-brochure-fr, for this XML:

<bookmark to="cdm_si-brochure-fr" id="_a8dfbb50-052f-49b4-8985-c64d668c57a1"/>
...
<clause id="cdm_si-brochure-fr" obligation="informative" displayorder="3">

So how should bookmark/@to be processed?

@opoudjis
Contributor

cgpm9th1948r7r7_si-brochure-fr:

sections-a1-en/05-9th-cgpm.adoc has

[[cgpm9th1948r7r7]]
==== Resolution 7

Its French counterpart is:

[[cgpm9e1948r7r7]]
==== Résolution 7 (((litre (stem:["unitsml(L)"] ou stem:["unitsml(l)"])))) (((unité(s),symboles)))

So the French crossreference needs to be to cgpm9e1948r7r7, not cgpm9th1948r7r7. Fixing in source.

opoudjis added a commit that referenced this issue Dec 17, 2024
@opoudjis
Contributor

2nd category. Indeed, we have a repeated table footnote, and the footnote repetitions contain index references that are expanded:

<fn reference="b"><p id="_f4133767-387c-a0a2-9c00-fe1ac9467d66_si-brochure-fr">
....
 Le radian est aussi l’unité cohérente d’angle<index><primary>angle</primary></index> de phase.
....
</p></fn>

This results in:

Le radian est aussi l’unité cohérente d’angle<bookmark id="_da3c27e3-1a1d-4807-a3dd-1d610eb7bef0"/> de phase
....
Le radian est aussi l’unité cohérente d’angle<bookmark id="_93cb06a5-66e8-4639-926f-cb193c5dbe62"/> de phase

Now, as it turns out, the footnote is only rendered once, not twice. So the references to the two table footnote instances from the index need to be collapsed into one.

But that, I would argue, is the responsibility of whoever is collapsing the two footnotes into one.

If Presentation XML were collapsing the two footnotes into one, then it would be responsible for collapsing any crossreferences to them, removing cross-references to the content of the duplicate footnote.

But we are leaving this as a format-specific collapse; it may not be observed by all downstream formats, because not all downstream formats may allow the one footnote to be shared among multiple cells. (I think we make them do so anyway, but I'm arguing a principle here.) If HTML and PDF formatters are stripping footnote content as redundant, I think it is their responsibility to identify and remove anything cross-referencing that content.

The clean option is that we should search for and remove any xref pointing to duplicate footnotes which we eliminate, in HTML and in PDF processing....

... But first, that is going to be painful to do without a lot of semantic annotation. Second, it is quite pedantic to insist on separate index entries for repeated footnotes, even if we do preserve the footnote. So I am going to go ahead and detect and remove index entries from footnotes that I know to be duplicates of other footnotes.
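
As a rough illustration of that last step, here is a minimal Python sketch (not the actual isodoc code, which is Ruby and is not shown in this thread). It assumes that two fn elements in the same table with the same @reference are duplicates and that only the first is rendered. The function names and the un-namespaced sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def _remove_keep_tail(parent, child):
    """Detach child from parent without losing the text that follows it."""
    siblings = list(parent)
    pos = siblings.index(child)
    tail = child.tail or ""
    if pos == 0:
        parent.text = (parent.text or "") + tail
    else:
        prev = siblings[pos - 1]
        prev.tail = (prev.tail or "") + tail
    parent.remove(child)


def strip_index_from_duplicate_fns(table):
    """Remove <index> elements from repeated <fn> elements in one table.

    Assumption: footnotes sharing a @reference value within the table are
    duplicates; the first occurrence is kept untouched.
    Returns the number of <index> elements removed.
    """
    seen = set()
    removed = 0
    for fn in list(table.iter("fn")):
        ref = fn.get("reference")
        if ref not in seen:
            seen.add(ref)
            continue
        # Map each descendant to its parent so it can be detached.
        parents = {child: parent for parent in fn.iter() for child in parent}
        for idx in list(fn.iter("index")):
            _remove_keep_tail(parents[idx], idx)
            removed += 1
    return removed


SAMPLE_TABLE = """<table>
  <fn reference="b"><p>angle<index><primary>angle</primary></index> de phase</p></fn>
  <fn reference="b"><p>angle<index><primary>angle</primary></index> de phase</p></fn>
</table>"""

table = ET.fromstring(SAMPLE_TABLE)
removed_count = strip_index_from_duplicate_fns(table)
```

With no index element left in the duplicate footnote, no bookmark is generated for it, so the index no longer points at content that the PDF never renders.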

@opoudjis
Contributor

bookmark/@to results from an index-range:[] expression:

index-range:regles_ecriture[(((grandeurs,règles d’écriture)))]

translates to:

<bookmark to="regles_ecriture_si-brochure-fr" id="_cd3b0f9e-16a8-4548-90e5-dcb402ebdbc7"/>

which is pointed to by:

<xref target="_cd3b0f9e-16a8-4548-90e5-dcb402ebdbc7" to="regles_ecriture_si-brochure-fr" pagenumber="true">
<span class="fmt-element-name">chapitre</span> <semx element="autonum" source="chapter5_si-brochure-fr">5</semx></xref>

The intention is that the index entry "grandeurs,règles d’écriture" cover the span between where the index-range:[] macro is, and the anchor regles_ecriture. See https://www.metanorma.org/author/topics/inline_markup/index_terms/ , Entry ranges.

That means that bookmark/@to is redundant, as that information is added in xref/@to, and in fact has been added in error: it is not foreseen in the grammar, and we do not need milestone expressions indicating range. I will remove bookmark/@to, and you can ignore it.

@opoudjis
Contributor

It is long overdue for me to collapse the ISO and BIPM indexing code, which are largely the same, and to move them to isodoc. BIPM differs from ISO in having subclauses, one per alphabetic letter; that will be treated as a BIPM specialisation.

@opoudjis
Contributor

@Intelligent2013 Please confirm. The change involves all of metanorma-standoc, isodoc, and metanorma-bipm.

@opoudjis opoudjis moved this from 🌋 Urgent to 👀 In review in Metanorma Dec 17, 2024
@Intelligent2013
Contributor

1st category
2nd category

Fixed.

3rd category

@opoudjis there are 3 errors:

Page 118: Unresolved ID reference "regles_ecriture_si-brochure-fr" found.
Page 118: Unresolved ID reference "symboles_recommandes_si-brochure-fr" found.
Page 119: Unresolved ID reference "symboles_ecriture_si-brochure-fr" found.

All these errors arise when there is an xref/@to but no bookmark element whose id matches @to:

<xref target="_badf9a9c-8ff6-484d-bee1-debb6546c2a6" to="regles_ecriture_si-brochure-fr" pagenumber="true">

I can't just ignore xref/@to, because it is used to show the page ranges in the index (see valeur numérique, 36–39):

image

Here is an example with a working xref/@to:

...
doivent être précisés. <bookmark id="valeur_numerique_si-brochure-fr"/>
...
<xref target="_f15fba7b-f078-4462-9471-2e06b01dd464" to="valeur_numerique_si-brochure-fr" pagenumber="true">

As I remember, this behavior was added 4 years ago for metanorma/metanorma-bipm#67 (comment).

How should I modify xref processing?
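
The failing case described above (an xref whose @target or @to has no element carrying that id) can be checked for mechanically. Below is a minimal Python sketch. It assumes an un-namespaced XML fragment (real Presentation XML is namespaced, so tag names would need adjusting), and the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def unresolved_xref_ids(root):
    """List (attribute, value) pairs for xref/@target and xref/@to values
    that do not match the @id of any element in the document."""
    ids = {el.get("id") for el in root.iter() if el.get("id")}
    missing = []
    for xref in root.iter("xref"):
        for attr in ("target", "to"):
            value = xref.get(attr)
            if value and value not in ids:
                missing.append((attr, value))
    return missing


SAMPLE_DOC = """<doc>
  <clause id="chapter5_si-brochure-fr">
    <bookmark id="_f15fba7b-f078-4462-9471-2e06b01dd464"/>
    <bookmark id="valeur_numerique_si-brochure-fr"/>
    <bookmark id="_badf9a9c-8ff6-484d-bee1-debb6546c2a6"/>
  </clause>
  <xref target="_f15fba7b-f078-4462-9471-2e06b01dd464"
        to="valeur_numerique_si-brochure-fr" pagenumber="true"/>
  <xref target="_badf9a9c-8ff6-484d-bee1-debb6546c2a6"
        to="regles_ecriture_si-brochure-fr" pagenumber="true"/>
</doc>"""

for attr, value in unresolved_xref_ids(ET.fromstring(SAMPLE_DOC)):
    print(f"Unresolved xref/@{attr}: {value}")
```

In this sample only the second xref is reported: its @target resolves, but nothing carries the id named in its @to.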

@opoudjis
Contributor

I get it. Clearly for <xref target="A" to="B"> there needs to be an element with @id A, and an element with @id B. As the documentation says, index-range:to[] presupposes the existence of an element with @id = to.

And if there is no element with id = regles_ecriture, that is an editorial problem.

They are here:

[[chapter5]]
== Règles d’écriture des noms et symboles d’unités et expression des valeurs des grandeurs index-range:regles_ecriture[(((grandeurs,règles d’écriture)))] index-range:prefixes_si-3[(((préfixes SI)))] index-range:symboles_ecriture[(((symboles,écriture et emploi des)))] (((symboles,unités))) (((unité(s),noms)))(((unité(s),règles d’écriture)))

...

=== Règles et conventions stylistiques servant à exprimer les valeurs des grandeurs index-range:symboles_recommandes[(((grandeurs,symboles (recommandés))))] index-range:valeur_numerique[(((grandeurs,valeur numérique)))] (((symboles,unités (obligatoires))))

Note that other index-range macros work fine: index-range:prefixes_si-3[] works, because prefixes_si-3 is a bookmark given at the end of the subclause "Noms des unités". So there is nothing to fix in code: the error is in the text.

So these three are bookmarks that have been left out in error on the French side. The English side is indexed quite differently, so I cannot extrapolate them from there.

I am afraid therefore that I will need to refer you to the editors to fill in these missing bookmarks in the French document.

@manuelfuenmayor ? @anermina ?
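
For the editors, a quick way to list the missing bookmarks is to scan the AsciiDoc source for index-range targets that have no matching anchor. This is a minimal Python sketch under two assumptions: that bookmarks are written as [[id]] anchors, and that one source text is scanned at a time. The function name and the shortened sample text are hypothetical.

```python
import re

# index-range:TARGET[...] macros, as used in the headings quoted above
INDEX_RANGE = re.compile(r"index-range:([\w-]+)\[")
# [[id]] or [[id,label]] anchors (assumed form of the bookmarks)
ANCHOR = re.compile(r"\[\[([\w-]+)(?:,[^\]]*)?\]\]")


def missing_index_range_anchors(text):
    """Return index-range targets that have no [[anchor]] in the text,
    in first-seen order and without repeats."""
    targets = INDEX_RANGE.findall(text)
    anchors = set(ANCHOR.findall(text))
    return [t for t in dict.fromkeys(targets) if t not in anchors]


SAMPLE_ADOC = """[[chapter5]]
== Règles d’écriture index-range:regles_ecriture[(((grandeurs,règles d’écriture)))] index-range:prefixes_si-3[(((préfixes SI)))]

=== Noms des unités

Texte de la sous-section. [[prefixes_si-3]]
"""

for target in missing_index_range_anchors(SAMPLE_ADOC):
    print(f"index-range target without anchor: {target}")
```

In this sample, prefixes_si-3 is satisfied by its bookmark, while regles_ecriture is reported as missing.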

@ronaldtse
Contributor

ronaldtse commented Dec 18, 2024

The collection now generates properly. However, I have not checked the index issue you are discussing.

Screenshot 2024-12-18 at 8 28 58 PM

@ronaldtse
Contributor

Closing this ticket for the original issue. The remaining issue of indexes is moved to #273.

@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in Metanorma Dec 20, 2024
Projects
Status: Done
Development

No branches or pull requests

4 participants