Getting started
By default, with no options specified, the tool emits the document passed in verbatim to standard output, encoded as UTF-8. This output can be suppressed by specifying option -q.
An XML report of any errors encountered by the (non-validating) parser, together with details of any DOCTYPE declaration, is emitted to standard error.
The report takes this format:
<report uri='my-file.xml'>
<doctype root='foo' systemID='bar' publicID='blort'/>
</report>
The <doctype> attributes occur only if the corresponding information is present in the DOCTYPE declaration.
If there is no DOCTYPE declaration, <doctype> is omitted from the report.
Each error appears as an <error> element, for example:
<report uri='my-file.xml'>
<error>my-file.xml:1:0: no element found</error>
</report>
Note that if a parseable DOCTYPE declaration is encountered and the overall document is not well-formed, the declaration will still be reported, along with the well-formedness error.
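As an illustration of consuming the report described above, here is a minimal Python sketch that reads a report string and pulls out the <doctype> attributes and any <error> messages. The function name summarize_report and the sample string are hypothetical; only the element and attribute names come from the report format shown above.

```python
import xml.etree.ElementTree as ET


def summarize_report(report_xml):
    """Return (doctype_attrs, errors) extracted from a report string.

    doctype_attrs is None when the report has no <doctype> element,
    which is the case when the document has no DOCTYPE declaration.
    """
    root = ET.fromstring(report_xml)
    doctype = root.find("doctype")
    doctype_attrs = dict(doctype.attrib) if doctype is not None else None
    errors = [element.text for element in root.findall("error")]
    return doctype_attrs, errors


# Hypothetical report combining a DOCTYPE declaration and an error.
sample = (
    "<report uri='my-file.xml'>"
    "<doctype root='foo' systemID='bar' publicID='blort'/>"
    "<error>my-file.xml:1:0: no element found</error>"
    "</report>"
)

if __name__ == "__main__":
    print(summarize_report(sample))
```

In practice the report string would be captured from the tool's standard error stream rather than written inline.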
Other options amend the DOCTYPE declaration in the emitted document as follows. Options may be combined, except for combinations that would both specify and omit SYSTEM and PUBLIC identifiers.
Specifying option -P changes e.g. this:
<!DOCTYPE foo PUBLIC "somePublicID" "my.dtd">...
to this:
<!DOCTYPE foo SYSTEM "my.dtd">...
Specifying option -p<value> changes e.g. this:
<!DOCTYPE foo PUBLIC "somePublicID" "my.dtd">...
to this:
<!DOCTYPE foo PUBLIC "otherPublicID" "my.dtd">...
Specifying option -r<value> changes e.g. this:
<!DOCTYPE foo ...>
to this:
<!DOCTYPE bar ...>
Specifying option -S changes e.g. this:
<!DOCTYPE foo PUBLIC "somePublicID" "my.dtd">...
to this:
<!DOCTYPE foo>...
Specifying option -s<value> changes e.g. this:
<!DOCTYPE foo SYSTEM "my.dtd">...
to this:
<!DOCTYPE foo SYSTEM "your.dtd">...
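The effect of the options above can be modelled in a few lines of Python. This is only a sketch of the behaviour implied by the examples, not the tool's actual code: the function amend_doctype and its keyword names are hypothetical, with drop_public, new_public, new_root, drop_system and new_system standing in for -P, -p, -r, -S and -s respectively. Treating a missing system identifier as a bare declaration is an assumption based on the -S example.

```python
def amend_doctype(root, public_id=None, system_id=None, *,
                  drop_public=False, new_public=None, new_root=None,
                  drop_system=False, new_system=None):
    """Return the DOCTYPE declaration after applying the modelled options."""
    # Combinations that both specify and omit an identifier are rejected.
    if drop_public and new_public is not None:
        raise ValueError("cannot both omit and specify the PUBLIC identifier")
    if drop_system and new_system is not None:
        raise ValueError("cannot both omit and specify the SYSTEM identifier")

    if new_root is not None:
        root = new_root
    if drop_public:
        public_id = None
    elif new_public is not None:
        public_id = new_public
    if drop_system:
        system_id = None
    elif new_system is not None:
        system_id = new_system

    # Assumption: without a system identifier nothing else is emitted,
    # as in the -S example.
    if system_id is None:
        return f"<!DOCTYPE {root}>"
    if public_id is None:
        return f'<!DOCTYPE {root} SYSTEM "{system_id}">'
    return f'<!DOCTYPE {root} PUBLIC "{public_id}" "{system_id}">'


if __name__ == "__main__":
    print(amend_doctype("foo", "somePublicID", "my.dtd", drop_public=True))
    print(amend_doctype("foo", "somePublicID", "my.dtd", new_root="bar"))
```

No escaping or validation of the values is attempted here, so quotation marks inside an identifier would produce a malformed declaration.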
Arguments containing spaces must be escaped in the usual way at the command line.
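When a command line is assembled programmatically, the escaping can be delegated to a library. The sketch below uses Python's shlex.join to quote each argument for a POSIX shell. The executable name doctype-tool is a placeholder, and passing the input file as a trailing positional argument is an assumption; only the -p and -s option letters are taken from this document.

```python
import shlex


def build_command(tool, input_path, public_id=None, system_id=None):
    """Build a POSIX shell command line with every argument safely quoted."""
    args = [tool]
    if public_id is not None:
        args.append(f"-p{public_id}")   # -p<value>, as described above
    if system_id is not None:
        args.append(f"-s{system_id}")   # -s<value>, as described above
    args.append(input_path)             # assumption: file given positionally
    return shlex.join(args)


if __name__ == "__main__":
    # "doctype-tool" is a hypothetical executable name.
    print(build_command("doctype-tool", "my file.xml",
                        public_id="-//Example//DTD Sample 1.0//EN"))
```

Splitting the resulting string with shlex.split gives back the original argument list, which is a convenient way to check that the spaces survived quoting.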
N.B. If the DOCTYPE declaration is amended by the options specified, the resulting report features the updated values rather than those found in the original document.
Currently, the internal subset is processed, and any entities it declares are resolved in the resulting document, but the subset itself is not preserved lexically in the output.
The application makes no attempt to check whether values passed in would result in well-formed output. Non-well-formed input will be emitted as non-well-formed output.
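Because no such check is made, callers may want to test well-formedness themselves, either on the input or on the emitted output. The following is a minimal sketch using Python's standard expat binding; the function name well_formedness_error is hypothetical, and whether the tool uses the same parser is not stated here, so messages may differ from those in its report.

```python
import xml.parsers.expat


def well_formedness_error(data, name="input.xml"):
    """Return a 'name:line:column: message' string, or None if well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        message = xml.parsers.expat.ErrorString(exc.code)
        return f"{name}:{exc.lineno}:{exc.offset}: {message}"
    return None


if __name__ == "__main__":
    print(well_formedness_error(b"<foo><bar></foo>", "my-file.xml"))
    print(well_formedness_error(b"<foo/>", "my-file.xml"))
```

Like the parser described above, this check is non-validating: it detects well-formedness errors only and does not fetch or apply an external DTD.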