Harmonise content of the schema definition files #459

Closed
larsbarring opened this issue Feb 28, 2024 · 14 comments · Fixed by #468
Labels
enhancement Enhancements to the website's presentation or contents

Comments

@larsbarring
Contributor

larsbarring commented Feb 28, 2024

This is one in a string of issues that aims to improve the format of the XML version of the standard name table files, see #457 for background and overview.

This particular issue builds directly on, and implements the XML schema changes brought about by, the following issues:

The first three deal with the content of the XML files. Appendix B specifies that an XML schema file is to be referenced in the XML file.

The changes introduced in the issues above require the schema file(s) to be updated. In addition, as the existing schema files (cf. #433) have not been fully implemented in the versions of the XML files, the proposal is to harmonise the various schemas to create a new version that can be used consistently in all published versions of the XML files until the schema may be changed in the future.

In essence, the associated PR #468 does the following:

  • allows an alias entry to point to two standard names (GDT files #509)
  • adds the <conventions> tag to the "header part" of the XML file (500)
  • adds the <first_published_date> tag to the "header part" of the XML file (restore .nojekyll, which I accidentally deleted last week #511)
  • adds references to the specific issues in the <annotation> tag of the elements that have been added/changed
  • adds a comment (date, author) about these updates
  • increases readability by unifying tabs/spaces and newlines.
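
As a rough illustration of the first three points, the sketch below reads a miniature table with Python's standard library. Only the `<conventions>` and `<first_published_date>` tag names come from this issue; the other element names (`standard_name_table`, `entry`, `alias`, `entry_id`) and all values are assumptions made up for the example, not the content of any published file.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature table. Only <conventions> and <first_published_date>
# are named in this issue; every other element name and all values are
# assumptions for illustration.
SAMPLE = """<standard_name_table>
  <version_number>0</version_number>
  <conventions>CF-StandardNameTable-0</conventions>
  <first_published_date>2000-01-01T00:00:00Z</first_published_date>
  <entry id="name_a"/>
  <entry id="name_b"/>
  <alias id="old_name">
    <entry_id>name_a</entry_id>
    <entry_id>name_b</entry_id>
  </alias>
</standard_name_table>
"""


def summarise(xml_text):
    """Return the two header fields and a mapping alias id -> target names."""
    root = ET.fromstring(xml_text)
    header = {
        tag: root.findtext(tag)
        for tag in ("conventions", "first_published_date")
    }
    aliases = {
        alias.get("id"): [target.text for target in alias.findall("entry_id")]
        for alias in root.findall("alias")
    }
    return header, aliases


header, aliases = summarise(SAMPLE)
print(header)
print(aliases)
```

Here the single alias carries two `entry_id` children, which is one way an alias could point to two standard names.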
@larsbarring
Contributor Author

The associated draft PR (#468) should be activated only after PR #458 has been merged, because the directory structure is established in that PR.

But the actual content of the schema file, which is the focus of this issue, can be discussed independently of where the file eventually should end up.

@JonathanGregory
Contributor

I haven't read an xsd file before, but as far as I can tell your new one looks fine. Thanks! Because of the change of location, PR 468 creates a new file, which means we can't see the differences. Your replacement of TABs, which I think is helpful, is also an obstacle. Therefore, in a repo of my own I have created a PR that converts the old version (1.1), modified to have the new-style spacing, into your new one (2.0). This allows me to see the differences in content, which maybe you or others might find helpful. (Perhaps there's some neater way to do that?)

@larsbarring
Contributor Author

larsbarring commented Mar 24, 2024

Thanks for checking this. Another "crash test" is to actually apply the schema file to one of the XML files and see that the errors are as expected. That is, any errors must point at what we can identify as formatting issues in the XML file. On the other hand, if any error points at things we know are as intended, it is an error in the schema file.

larsbarring added a commit to larsbarring/cf-convention.github.io that referenced this issue Mar 25, 2024
@larsbarring
Contributor Author

Dear Jonathan,

As this is in the cf-convention.github.io repo, support from one qualified person should (if I get the procedure right) be enough to start the countdown clock. Would your comment from March 24 enable us to start the countdown? Of course, an extra pair of eyes would as always be good (hence the ping to @sadielbartholomew).

Many thanks,
Lars

@JonathanGregory
Contributor

Dear Lars

I agree that it would be helpful if someone else had a look, especially if there's anyone who's familiar with xsd and xml. Could we accept this change in three weeks from now, provided that someone else has checked by then and thinks it's OK?

Best wishes

Jonathan

@sadielbartholomew
Member

Hi Lars and Jonathan,

I am happy to take a look, though it will probably not be until after Easter that I have a chance to. Just to check precisely what is useful here: you want someone to review #468, so I should do that?

@larsbarring
Contributor Author

larsbarring commented Mar 27, 2024

Hi Sadie,

Many thanks for this - much appreciated!!
After Easter will be fine :-)

Lars

@larsbarring
Contributor Author

larsbarring commented Apr 6, 2024

In the PR there have been some comments that evolved into more of a conversation about the technical scope of the PR, and thus of this issue. Hence I am copying them over here:

  • @DocOtak (March, 27):
    "Should the xsd name space be versioned as 1.0 or 1.1? Looks like the spec says you don't need the version if you don't want to and that all 1.0 files should be valid 1.1."

  • @larsbarring (April, 2):
    "I am not sure about which version to use because I focussed on making as few changes as possible. And from my testing it seems that version 1.0 worked as expected. I have no deeper insight into these matters, and if there are reasons to change it would be simple to implement."

  • @ethanrd (April, 2):
    "Are you asking about the version in the name of the .xsd file? There does not appear to be an XML Namespace defined (the XSD does not specify a targetNamespace and the XML files use a noNamespaceSchemaLocation attribute to reference the .xsd file).
    (I haven't been following this closely. Sorry if I missed earlier discussions.)"

  • @DocOtak (April, 4):
    "@ethanrd I was asking if the xmlns in line 8 in this PR should be versioned more specifically, but I don't know the trade offs or potential repercussions, so was hoping there is expertise here. https://www.w3.org/TR/xmlschema11-1/#langids"

  • @larsbarring (April, 5):
    "I am not (at all) an expert on these matters, but having looked at https://www.w3.org/TR/xmlschema11-1/#ns-bindings, it seems clear that all the namespace prefixes xs and xsi will have to be changed throughout the file. Of course this can be done, but I think that this is only warranted if we can establish a tangible advantage."

  • @ethanrd (April, 6):
    "Hi @DocOtak and @larsbarring - I think the only reason to look at updating the XML Schema versions would be if newer versions had new features that we wanted to use in the standard names XSD files. I suspect we use pretty standard XML Schema capabilities so probably don't have any need to update.

    Also, the referenced W3C document seems to be the latest version. And the "xs" and "xsi" namespace URLs in the CF standard name .xsd and .xml files match what is listed in Section 1.3.3 Conventional Namespace Bindings:

  • @DocOtak (April, 6):
    "@ethanrd Only thing I can think of is if we decide to have some ordering of elements enforced in the schema, e.g. aliases must be in some lex sorted order; then I'm pretty sure that's a 1.1 feature"

  • @ethanrd (April, 6):
    "The referenced document is XML Schema 1.1 which is the latest version. So I would think we would be good with the namespaces specified in section 1.3.3. Though the "2001" in the URL when 1.1 was approved in 2012 seems odd. I expect it is a backward compatibility thing."
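
The two observations in this exchange (no `targetNamespace` in the XSD, and a `noNamespaceSchemaLocation` attribute in the XML files) can be checked with a few lines of standard-library Python. This is only a sketch: the two inline documents are hypothetical stand-ins for the real files, and `example.xsd` is a made-up file name. The `xs` and `xsi` namespace URLs are the conventional bindings listed in the W3C document.

```python
import xml.etree.ElementTree as ET

# Conventional namespace bindings for the "xs" and "xsi" prefixes.
XS = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical stand-ins for the real .xsd and .xml files.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="standard_name_table"/>
</xs:schema>
"""
XML_TEXT = """<standard_name_table
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="example.xsd"/>
"""


def describe_schema(xsd_text):
    """Report whether the root is xs:schema and which targetNamespace it sets."""
    root = ET.fromstring(xsd_text)
    return {
        "is_xs_schema": root.tag == f"{{{XS}}}schema",
        "target_namespace": root.get("targetNamespace"),  # None if absent
    }


def schema_location(xml_text):
    """Return the xsi:noNamespaceSchemaLocation value, or None if absent."""
    root = ET.fromstring(xml_text)
    return root.get(f"{{{XSI}}}noNamespaceSchemaLocation")


print(describe_schema(XSD_TEXT))
print(schema_location(XML_TEXT))
```

ElementTree expands prefixes into `{namespace}name` form, so the check depends on the namespace URLs rather than on the particular prefixes a file happens to use.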

@larsbarring
Contributor Author

larsbarring commented Apr 6, 2024

Hi @DocOtak and @ethanrd -- thanks for these useful comments :-) If I understand the comments the situation is as follows:

  • The current schema version in the PR is basically fine as is.
  • If we want to enforce sorting of the standard name entries and/or alias entries we have to update to a newer version that supports this.

In the xml files the current situation is that the standard name entries are already sorted (or at least it seems so -- I have not checked) but the alias entries are not. I fully agree that having both types of entries sorted would be helpful.

However, the sorting as such will have to be done when producing the xml file. For the already published standard name table files this will be done in #470 (or the associated PR (to come)). The xsd file we are dealing with here can at best only help to check and enforce such a sorting. We also have to consider that this schema file will have to work with new table versions yet to be produced. Hence I suggest that we defer (but not forget!) enforcing this to a later issue.
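
Until ordering is enforced in a schema, a check can be done outside the xsd file. The sketch below is one possible way, assuming that entries and aliases are direct children of the root element and carry an `id` attribute, and that "sorted" means plain string comparison of those ids; both are assumptions, and the alias ids in the sample are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: entries in order, aliases deliberately out of order.
SAMPLE = """<standard_name_table>
  <entry id="air_pressure"/>
  <entry id="air_temperature"/>
  <alias id="zeta"><entry_id>air_pressure</entry_id></alias>
  <alias id="alpha"><entry_id>air_temperature</entry_id></alias>
</standard_name_table>
"""


def unsorted_ids(xml_text, tag):
    """Return ids of <tag> elements that sort before their predecessor."""
    ids = [el.get("id") for el in ET.fromstring(xml_text).findall(tag)]
    return [cur for prev, cur in zip(ids, ids[1:]) if cur < prev]


print(unsorted_ids(SAMPLE, "entry"))
print(unsorted_ids(SAMPLE, "alias"))
```

An empty list means the elements of that type are already in order; any id returned marks a place where the order breaks.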

@larsbarring
Contributor Author

@ethanrd and @DocOtak, are you OK to move any discussion of implementing ordering in the XSD file to a separate issue (as was also suggested by @JonathanGregory in a different issue)? If so, are you otherwise happy with the current version in the PR?

@ethanrd
Member

ethanrd commented Apr 16, 2024

Hi Lars @larsbarring - Yes, I'm good moving this issue/PR forward and moving any ordering discussion to a separate issue.

@sadielbartholomew
Member

Hi @larsbarring, I am coming back to read this since I promised I would review the corresponding PR after Easter (which was a while back now). But there have been a lot of comments since. Please can I check what the status of #468 is, notably is it ready for review or should I wait for some update(s)?

@larsbarring
Contributor Author

Hi Sadie,

I think that it is ready for final review; your sharp eyes are always appreciated. Ethan already supports the PR, as does @JonathanGregory, conditional on another review (Ethan and you). If you do not find anything specific I think that the PR is good to be merged. Would you then mind doing that?

Many thanks,
Lars

@sadielbartholomew
Member

Thanks Lars for clarifying. In that case, I will try to review that (and merge assuming no issues found) this afternoon, if not by tomorrow.

4 participants