Merge KDL v2 #286
Merged
106 commits
910f6e9 Do not escape / (Solidus, Forwardslash) (#197) (@danini-the-panini)
69ac280 KQL: require operator and change operator grammar a bit (#221) (@zkat)
2d5e543 KQL: remove map operator and accessors (#222) (@zkat)
1bf4d74 Allow "empty" single line comments in the spec (#234) (@basile-henry)
78a2d5f Draft changelog (@zkat)
f38edc7 add failing test for removed solidus escape (@zkat)
ffeea8e Use forward slash in solidus-escape test (#288) (@bgotink)
337bd1b Update expected output of test with changed input (#289) (@bgotink)
825ff2c Add escaped whitespace to KDL strings (#290) (@Lucretiel)
0a4a14d Add escaped whitespace note to v2 changelog (#291) (@hkolbeck)
d437cf2 Add test for empty single-line comment (#292) (@bgotink)
06d1d67 Add draft grammar for KQL 1.0.0 (#303) (@larsgw)
3b39e29 Add vertical tab to whitespace. Closes #331 (@tabatkins)
568c096 Document the vertical tab addition. (@tabatkins)
0836df1 Restrict idents from looking like raw strings. Closes #200, closes #2… (@tabatkins)
eb55930 Update formal grammar for KDL 2.0 (#285) (@CAD97)
99abeef fix some confusion in grammar syntax, and actually specify the syntax… (@zkat)
e6356d5 allow ,<> as identifier characters since they no longer need to be re… (@zkat)
85aa3a0 treat bare identifiers and strings in value locations (#358) (@zkat)
2694146 # is just plain illegal now (@zkat)
5e89c45 Update all examples to use most changes (@zkat)
fada1fc Update KQL text, too (@zkat)
63feef7 Update schema spec (@zkat)
31fd7bd Update JiK and XiK too (@zkat)
b42b6c8 Clarify that multiline comments are allowed after line continuations,… (@zkat)
5a7b339 Constrain code points to unicode scalar values (@zkat)
c8488db Make last semicolon optional for inline nodes (@zkat)
13799de Allow whitespace in more places (@zkat)
49402cc allow BOM only in the first unicode scalar in a document (@zkat)
fc1b594 add support for dedented multi-line strings and raw strings (@zkat)
7790505 Merge branch 'main' into kdl-v2 (@zkat)
8de7df6 formatting (@zkat)
a0d5030 Release 2.0 draft 1 (@zkat)
54df7f0 Update README (@zkat)
817a7dc fixes from review (@zkat)
9f06153 Add explicit attribution for logo (@zkat)
56f399b Add \s to the list of escapes (@zkat)
b51859e update tests (@zkat)
50d378f update readme a bit (@zkat)
90cd0b1 make unicodey equals signs valid property assignment characters (@zkat)
0022536 small rewording (@zkat)
39b9fac fix stray quote (@zkat)
055de4e better organization of how we talk about identifiers/strings and comm… (@zkat)
511ab6b missed a spot (@zkat)
d433332 Add LRM/RLM to the direction control char list (@zkat)
d53d99f test fixes (@zkat)
057e8c8 Rewrite intro paragraph for strings to make their usage clearer. (@tabatkins)
419995f typos (@tabatkins)
6d359d2 Remove now-irrelevant comment about idents acting like strings (they … (@tabatkins)
b635470 be more specific (@tabatkins)
491cc46 Fix the disallowed low ASCIIs (@tabatkins)
6d091fd Use consistent codepoint spelling (@tabatkins)
f02ba59 Make multi-line ws prefix determined by the last line. (@tabatkins)
935d054 Fix more multiline tests (@tabatkins)
1294f97 Fix tests about # in an ident string (@tabatkins)
094a615 Tests are invalid (contained U+FFFD, not surrogates) and are in gener… (@tabatkins)
c273d24 Dang it, forgot to save README when fixing multiline earlier. (@tabatkins)
de37e11 Comments are now allowed in and around types (along with other types … (@tabatkins)
24cd214 Disallow idents like '.1' to avoid footguns (@tabatkins)
bc2b995 Rename/rearrange the string productions to match the spec text better. (@tabatkins)
1f28fb0 [editorial] Move keyword production to a better spot. Rephrase bool/k… (@tabatkins)
1d6809e Whoops, missed allowing '+.' (@tabatkins)
af91cc6 Add tests for .1 and general 'ident ambiguous with a number' cases. (@tabatkins)
2949500 KDL V2 Test Fixes (#368) (@IceDragon200)
c15b5c2 make note of .1/+.1 illegality in the changelog (@zkat)
172c67b Release 2.0.0 draft 2 (@zkat)
522ce85 clarify multi-line strings further (@zkat)
35ac19b fix stray legacy bool in example (@zkat)
2d4bcd0 Release 2.0.0 draft 3 (@zkat)
f767472 small readme improvements (@zkat)
40d8c83 unicode character support clarifications (@zkat)
b1163e1 more small fixes (@zkat)
f81fcfa minor reword (@zkat)
f0f9589 example tweaks (@zkat)
793a9d4 normalize literal newlines in multiline strings (@zkat)
abae1f9 more fixes (@zkat)
7ab8658 iterate a bit on KQL (@zkat)
ec7880d Fix broken formatting in grammar language example (#375) (@wackbyte)
9212117 Remove extra indent in CI example (#376) (@wackbyte)
631ec14 allow /- at the very beginning of a document (@zkat)
fa816ca add floats (@zkat)
e773747 Release 2.0 draft 4 (@zkat)
2710c90 facepalm: forgot the full grammar change for float keywords (@zkat)
2fcf6d4 Update tests/test_cases/expected_kdl/multiline_string_indented.kdl (@zkat)
dadcfdf Update tests/test_cases/expected_kdl/multiline_raw_string_indented.kdl (@zkat)
9132a96 Quote identifiers that contain an equals sign (#381) (@bgotink)
9e7b958 Ensure spec allows slashdash right after node separator (#382) (@bgotink)
b294e9c Update README.md (@zkat)
2de2ddc Update README.md (@zkat)
aeb41cc Update examples/ci.kdl (@zkat)
d0b30c3 Update SPEC.md (@zkat)
281de7e review fixes (@zkat)
d064bc9 clarify multi-line strings and escapes interaction (@zkat)
fa9d303 remove duplication of keyword-number (@zkat)
bea0f67 turn it around: escapes should be resolved _before_ dedenting (@zkat)
c9134e3 change escape resolution order again (@zkat)
fa204ce unicode was not defined in grammar (@zkat)
6a77436 kql: only allow top() at start of selector (#388) (@alightgoesout)
bcfb332 Tweak rules for escaped whitespace in multi-line strings (#392) (@tjol)
1e924bc clarifications around multiline prefixes (@zkat)
93c4400 clarify that numbers don't need to be IEEE 754 floats (@zkat)
fa3050c add 128-bit ints (@zkat)
1588b1f get rid of syntactically significant unicode equals signs (#400) (@zkat)
90e22bc [v2] more predictable slashdash (#407) (@zkat)
76a1de5 Release 2.0.0 draft 5 (@zkat)
8aa4c15 prep readme for merging to main (@zkat)
@@ -0,0 +1,95 @@
# KDL Changelog

## 2.0.0-draft.5 (2024-11-28)

* Equals signs other than `=` are no longer supported in properties.
* 128-bit integer type annotations have been added to the list of "well-known"
  type annotations.
* Multiline string escape rules have been tweaked significantly.
* `\s` is now a valid escape within a string, representing a space character.
* Slashdash (`/-`)-compatible locations and related grammar adjusted to be more
  clear and intuitive. This includes some changes relating to whitespace,
  including comments and newlines, which are breaking changes.
* Various updates to test suite to reflect changes.
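
As a rough illustration of what the new 128-bit annotations imply for an implementation that chooses to validate annotated numbers, here is a minimal Python sketch. The helper name `fits_annotation` and the `INT_RANGES` table are hypothetical and not part of any KDL library; the changelog does not say whether implementations must enforce these ranges.

```python
# Hypothetical range table for the two 128-bit annotations named in the changelog.
# Nothing here is a real KDL library API; it only models the integer ranges.
INT_RANGES = {
    "u128": (0, 2**128 - 1),            # unsigned 128-bit
    "i128": (-(2**127), 2**127 - 1),    # signed 128-bit, two's complement range
}


def fits_annotation(value: int, annotation: str) -> bool:
    """Return True if `value` is representable under the given annotation."""
    low, high = INT_RANGES[annotation]
    return low <= value <= high
```

Python integers are arbitrary precision, so the comparison is exact even at the boundaries.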

## 2.0.0 (Unreleased)

### Grammar

* Solidus/Forward slash (`/`) is no longer an escaped character.
* Space (`U+0020`) can now be written into quoted strings with the `\s`
  escape.
* Single line comments (`//`) can now be immediately followed by a newline.
* All literal whitespace following a `\` in a string is now discarded.
* Vertical tabs (`U+000B`) are now considered to be whitespace.
* The grammar syntax itself has been described, and some confusing definitions
  in the grammar have been fixed accordingly (mostly related to escaped
  characters).
* `,`, `<`, and `>` are now legal identifier characters. They were previously
  reserved for KQL but this is no longer necessary.
* Code points under `0x20` (except newline and whitespace code points), code
  points above `0x10FFFF`, Delete control character (`0x7F`), and the [unicode
  "direction control"
  characters](https://www.w3.org/International/questions/qa-bidi-unicode-controls)
  are now completely banned from appearing literally in KDL documents. They
  can now only be represented in regular strings, and there's no facilities to
  represent them in raw strings. This should be considered a security
  improvement.
* Raw strings no longer require an `r` prefix: they are now specified by using
  `#""#`.
* Line continuations can be followed by an EOF now, instead of requiring a
  newline (or comment). `node \<EOF>` is now a legal KDL document.
* `#` is no longer a legal identifier character.
* `null`, `true`, and `false` are now `#null`, `#true`, and `#false`. Using
  the unprefixed versions of these values is a syntax error.
* The spec prose has more explicitly stated that whitespace and newlines are
  not valid identifier characters, even though the grammar already expressed
  this.
* Bare identifiers can now be used as values in Arguments and Properties, and are interpreted as string values.
* The spec prose now more explicitly states that strings and raw strings can
  be used as type annotations.
* Removed a statement in the spec prose that said "It is reasonable for an
  implementation to ignore null values altogether when deserializing". This is
  no longer encouraged or desired.
* Code points have been constrained to [Unicode Scalar
  Values](https://unicode.org/glossary/#unicode_scalar_value) only, including
  values used in string escapes (`\u{}`). All KDL documents and string values
  should be valid UTF-8 now, as was intended.
* The last node in a child block no longer needs to be terminated with `;`,
  even if the closing `}` is on the same line, so this is now a legal node:
  `node {foo;bar;baz}`
* More places allow whitespace (node-spaces, specifically) now. With great
  power comes great responsibility:
  * Inside `(foo)` annotations (so, `( foo )` would be legal (`( f oo )` would
    not be, since it has two identifiers))
  * Between annotations and the thing they're annotating (`(blah) node (thing)
    1 y= (who) 2`)
  * Around `=` for props (`x = 1`)
* The BOM is now only allowed as the first character in a document. It was
  previously treated as generic whitespace.
* Multi-line strings are now automatically dedented, according to the common
  whitespace matching the whitespace prefix of the closing line. Multiline
  strings and raw strings now must have a newline immediately following their
  opening `"`, and a final newline plus whitespace preceding the closing `"`.
* `.1`, `+.1` etc are no longer valid identifiers, to prevent confusion and
  conflicts with numbers.
* Multi-line strings' literal Newline sequences are now normalized to single
  `LF`s.
* `#inf`, `#-inf`, and `#nan` have been added in order to properly support
  IEEE floats for implementations that choose to represent their decimals that
  way.
* Correspondingly, the identifiers `inf`, `-inf`, and `nan` are now syntax
  errors.
* `u128` and `i128` have been added as well-known number type annotations.
* Slashdash (`/-`) -compatible locations adjusted to be more clear and intuitive.
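
The multi-line string rules in this list (leading newline, dedent by the closing line's whitespace prefix, newline normalization) can be sketched roughly as follows in Python. This is an illustration of the changelog wording only, not the spec's algorithm: the function name `dedent_multiline` is hypothetical, only `CRLF`, `CR` and `LF` are treated as newlines, only space and tab count as whitespace, whitespace-only lines are assumed to become empty, and escape handling is ignored entirely.

```python
def dedent_multiline(body: str) -> str:
    """Dedent the text found between the quotes of a multi-line string.

    `body` is everything between the opening and closing quote, so it is
    expected to start with a newline and end with a newline plus the
    closing line's whitespace prefix.
    """
    # Normalize literal newline sequences to single LFs (CRLF first, then lone CR).
    text = body.replace("\r\n", "\n").replace("\r", "\n")
    if not text.startswith("\n"):
        raise ValueError("multi-line string must start with a newline")

    lines = text.split("\n")
    prefix = lines[-1]  # whitespace in front of the closing quote
    if prefix.strip(" \t") != "":
        raise ValueError("closing line may only contain whitespace")

    dedented = []
    # lines[0] is the empty text before the first newline; skip it and the prefix line.
    for line in lines[1:-1]:
        if line.strip(" \t") == "":
            dedented.append("")  # assumption: whitespace-only lines become empty
        elif line.startswith(prefix):
            dedented.append(line[len(prefix):])
        else:
            raise ValueError("line does not start with the closing line's prefix")
    return "\n".join(dedented)
```

For a body whose closing line is indented by four spaces, a content line indented by six spaces keeps its two extra spaces, and a line indented by fewer than four is rejected.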

### KQL

* There's now a _required_ descendant selector (`>>`), instead of using plain
  spaces for that purpose.
* The "any sibling" selector is now `++` instead of `~`, for consistency with
  the new descendant selector.
* Some parsing logic around the grammar has changed.
* Multi- and single-line comments are now supported, as well as line
  continuations with `\`.
* Map operators have been removed entirely.
I'm sorry if this has been explained somewhere before, a 100+ comment discussion can be hard to follow.
Speaking purely from a user's perspective, this specific change feels a bit unnecessary. I assume it's being done to prevent ambiguity, but `null`, `true` and `false` are keywords common enough, that I don't think anyone with the slightest experience writing code would be surprised if they have special meanings unlike normal identifiers. If anything, being a Rust user, I would come into this expecting the exact opposite: that `true` means the boolean and `#true` means the raw identifier, similar to how it works in Rust sans the `r`.
Is there something I'm missing here?
We decided this would be enough of a potential footgun since the change to allow unquoted strings. Programming languages like Rust have other defenses against this kind of confusion, but kdl would need to do something different (like this prefixing) to prevent, say, the kind of things you see happen in plain JavaScript
Thanks for the explanation, and fair enough, if that's the point of balance you've decided on for your project.
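
For readers following this thread, the trade-off can be sketched in a few lines of Python. Once bare identifiers are read as strings, an unprefixed `true` would have to be either the boolean or the string `"true"`; the changelog resolves this by making it a syntax error. The function `classify_bare_value` and both tables are hypothetical, written only from the changelog entries above, and the sketch does not handle numbers or quoted strings.

```python
# Hypothetical tables built from the changelog entries; not a real KDL parser API.
KEYWORDS = {
    "#true": True,
    "#false": False,
    "#null": None,
    "#inf": float("inf"),
    "#-inf": float("-inf"),
    "#nan": float("nan"),
}
BANNED_BARE = {"true", "false", "null", "inf", "-inf", "nan"}


def classify_bare_value(token: str):
    """Classify one unquoted, non-numeric value token under the v2 rules."""
    if token in KEYWORDS:
        return ("keyword", KEYWORDS[token])
    if token in BANNED_BARE:
        # Unprefixed keyword look-alikes are rejected rather than read as strings.
        raise ValueError(f"{token!r} must be written as #{token}")
    if "#" in token:
        raise ValueError("'#' is not a legal identifier character")
    # Any other bare identifier is interpreted as a string value.
    return ("string", token)
```

With this shape, `enabled` is read as a string, `#true` as a boolean, and a stray `true` fails loudly instead of silently becoming the string `"true"`.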