Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON to XML SSP transform injects a <br/> element in the output #1151

Closed
GaryGapinski opened this issue Feb 24, 2022 · 5 comments
Closed

JSON to XML SSP transform injects a <br/> element in the output #1151

GaryGapinski opened this issue Feb 24, 2022 · 5 comments
Assignees
Labels

Comments

@GaryGapinski
Copy link

GaryGapinski commented Feb 24, 2022

Describe the bug

CMS recently released a version of their Acceptable Risk Safeguards (ARS) v5.0 in JSON format.

I forked that repository, cloned the fork to a local copy, made a branch, and then invoked the oscal_catalog_json-to-xml-converter.xsl transform to obtain the XML equivalent. The command

java -jar /opt/saxon/saxon-he-11.2.jar -xsl:"oscal_catalog_json-to-xml-converter.xsl" -o:"CMS_ARS_5_0.xml" -it:"from-json" file="CMS_ARS_5_0.json"

was used.

When the transform result document was inspected using the OSCAL catalog schema, a syntax error was found at CMS_ARS_5_0.xml line 1761 which was associated with CMS_ARS_5_0.json line 4908.

(A separate syntax error was the result of an input document syntax error.)

A conversation thread on Gitter which began with "Any reason why a <br/> would end up in a transformation of CMS_ARS_5_0.json (line 4908) to CMS_ARS_5_0.xml (line 1761) using oscal_catalog_json-to-xml-converter.xsl?" elicited a suspicion that the input JSON and the transform were responsible.

Who is the bug affecting?

Users of oscal_catalog_json-to-xml-converter.xsl.

What is affected by this bug?

The transform emits an unacceptable element.

When does this occur?

When the above input and transform are used.

How do we replicate the issue?

Perform the steps in this issue's description.

If applicable, add screenshots to help explain your problem.}

Expected behavior (i.e. solution)

Disregarding input flaws, the output of the transform should be syntactically correct.

Other Comments

wendellpiez added a commit to wendellpiez/metaschema that referenced this issue Jul 26, 2022
wendellpiez added a commit to wendellpiez/metaschema that referenced this issue Jul 28, 2022
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests
david-waltermire pushed a commit to wendellpiez/metaschema that referenced this issue Jul 28, 2022
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests
david-waltermire pushed a commit to usnistgov/metaschema that referenced this issue Jul 28, 2022
… successful unit tests demonstrating Markdown/markup conversion #213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (#218)
david-waltermire pushed a commit to usnistgov/metaschema that referenced this issue Dec 7, 2022
… successful unit tests demonstrating Markdown/markup conversion #213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 9, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 10, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 10, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 10, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 10, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
aj-stein-nist pushed a commit to aj-stein-nist/metaschema that referenced this issue Jan 10, 2023
… successful unit tests demonstrating Markdown/markup conversion usnistgov#213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (usnistgov#218)
@aj-stein-nist
Copy link
Contributor

aj-stein-nist commented Mar 1, 2023

  • AJ is to open a cross-repo report as this is an issue with Metaschema for the converter generator area, where the potential issue originates from
  • Work with testing the OSCAL content and determine root cause
  • If necessary, determine if there is a content work-around for NIST or community members to perform in the interim if it requires a long-term fix

david-waltermire pushed a commit to usnistgov/metaschema that referenced this issue Mar 9, 2023
… successful unit tests demonstrating Markdown/markup conversion #213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (#218)
@nikitawootten-nist
Copy link
Contributor

  • AJ is to open a cross-repo report as this is an issue with Metaschema for the converter generator area, where the potential issue originates from

This bug is already reported in usnistgov/metaschema-xslt#21

@nikitawootten-nist
Copy link
Contributor

nikitawootten-nist commented Mar 23, 2023

If necessary, determine if there is a content work-around for NIST or community members to perform in the interim if it requires a long-term fix

After some brief testing, it seems the json-to-xml-converter pipeline is fairly finicky when it comes to markdown conversion of lists.

For now, it seems the converter is most happy with lists whose top level are separated by an extra line, and whose lower level is not separated by an extra line, e.g.:

* List item 1

* List item 2
  * Sublist item 2.1
  * Sublist item 2.2

* List item 3

Mixing * and - seems to trigger the insertion of <br/> elements, but other spacing variations lead to all sorts of other nasty behaviors.

@wendellpiez
Copy link
Contributor

This is awesome - the analysis confirms what we knew, namely that the implementation fails (both to conform to Commonmark-alikeness, and by some general definition) out where the specification gets soft. More importantly, it gives us a place to define 'correct' for a correct implementation. "Correct" here means both seeing to it that a known-invalid <br/> is never emitted, but also that Markdown inputs map to expected markup constructs (HTML) when converted (and not some other construct that is wrong but valid and thus hard to detect). The first thing, of course, would be easy to do without the second, but maybe not in the spirit of it.

Suggested next steps:

  • Discuss and potentially walk through the specs for list handling insofar as known and exposed - where is this documented in addition to tests themselves? let's consider from users' point of view and try to be as comprehensive as is practically possible
  • How do we operationalize this unit testing for the XSLT-based dev (and/or others)? Align with @david-waltermire-nist on CommonMark implementation testing and OSCAL schema alignment? Is this YAML something we could cast into XSpec or otherwise replicate, or do we just run it? (What's the model for devs?)
  • With @nikitawootten-nist and anyone else interested, we can (any time) pull out the current XSLT M4 markdown-to-markup XSLT and start shaking it out. It is not beginner-level XSLT but it is where our solution will eventually be (or part of it) and it includes its own testing (not useless). But there might not be any point destabilizing this until the testing is stabilized.

Also - should the specs include how to handle odd-but-not-illegal inputs or are we simply trying to feature-match the tool?

* List item 1
    * List item 1.i
  * List item 1.b

check out the Github rendering:

  • List item 1
    • List item 1.i
    • List item 1.b

Github sees a sublist but no sub-sub-list, and item 1.i is promoted a level.

This handling is not round-trippable and hence somewhat concerning. My guess is Commonmark does something similar?

@nikitawootten-nist nikitawootten-nist moved this from Todo to In Progress in NIST OSCAL Work Board Mar 27, 2023
@aj-stein-nist aj-stein-nist moved this from In Progress to Blocked in NIST OSCAL Work Board Mar 31, 2023
@aj-stein-nist
Copy link
Contributor

If necessary, determine if there is a content work-around for NIST or community members to perform in the interim if it requires a long-term fix

After some brief testing, it seems the json-to-xml-converter pipeline is fairly finicky when it comes to markdown conversion of lists.

For now, it seems the converter is most happy with lists whose top level are separated by an extra line, and whose lower level is not separated by an extra line, e.g.:

* List item 1

* List item 2
  * Sublist item 2.1
  * Sublist item 2.2

* List item 3

Mixing * and - seems to trigger the insertion of <br/> elements, but other spacing variations lead to all sorts of other nasty behaviors.

Thanks to @nikitawootten-nist and @wendellpiez for their work on this. As it stands, the quoted message from Nikita above documents a workaround for using the XSLT-M4 implementation. Nikita and Wendell confirmed for me today that the oscal-cli does not have the same issue. As far as the current specs of OSCAL are concerned, this is not a bug in said specs, but a known area of ambiguity in the Metaschema specs and how the XSLT M4 implementation has deferred enhancements until the under-defined mechanics of around the complexity of lists referenced in that Metaschema issue.

We recommend you follow the workaround for now if you use the XSLT-M4 toolchain, or use the oscal-cli as an alternative. When the Metaschema issue link tracks enhanced testing and development regarding Commonmark to HTML processing, follow-on work for M4 can be revisited.

@github-project-automation github-project-automation bot moved this from Blocked to Done in NIST OSCAL Work Board Mar 31, 2023
nikitawootten-nist pushed a commit to nikitawootten-nist/metaschema-xslt that referenced this issue Jul 21, 2023
… successful unit tests demonstrating Markdown/markup conversion #213 cf usnistgov/OSCAL#1151 (one XSpec each way); Obfuscated one of the unit tests (#218)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

No branches or pull requests

4 participants