A Proposal for Keys #220
Comments
Of all the proposals I've seen for the "dots in key names" problem, I like this the best so far. One question: why not allow any character except |
You would also have to disallow white space and |
I think this is a pretty graceful way to support more flexible key names, and your examples with the IP addresses and file names are compelling. But I'd like to include @mojombo's addendum. (The addendum makes this an almost-backwards-compatible change. But this would only affect users using @wycats Would you like to submit a PR? I'm happy to do it otherwise. I think it should specifically state that key names may be in the string syntactic category, so that other types of strings may be used if they are added. (And I hope they are, e.g., raw strings.) This keeps the spec simple. |
I can submit a PR, sure. |
Copying my comment from another of the bugs: I'd be strongly in favour of allowing spaces in keys. They are very commonly used in XDG desktop files and the like (example below), and since TOML is somewhat of a superset, this makes the formats implicitly compatible.
|
@jleclanche Keys with spaces in them are allowed in the TOML spec right now. The rules defined for value keys are:
And the rules defined for table keys are:
That leaves open the ability to have spaces in keys. @BurntSushi's |
Ah! Great. Then I'm happy with this. :) |
So the idea is to allow all characters but a few up until the first equals sign, but trimmed for whitespace on the right? |
Yup, that sounds right to me. (But also trimmed for whitespace on the left.)
|
Trimmed for whitespace on both sides, I think. But whitespace is still allowed in the middle of the key. So this key:
...becomes |
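The trimming rule discussed in the last few comments can be sketched as follows. This is only an illustration under stated assumptions: the helper name `split_key_value` is hypothetical, the key is bare (unquoted), and everything before the first equals sign is treated as the key. Quoted keys, comments, and value parsing are ignored.

```python
def split_key_value(line: str) -> tuple[str, str]:
    """Split a `key = value` line at the first '=' and trim the key.

    Hypothetical helper: handles bare keys only, no quoting or comments.
    """
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("no '=' found in line")
    key = key.strip()  # trim both sides; whitespace inside the key is kept
    if not key:
        raise ValueError("empty key")
    return key, value.strip()


print(split_key_value("  abc def   = 1"))  # ('abc def', '1')
```

Leading and trailing whitespace never reaches the key, while the space between `abc` and `def` survives.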
My goal is the principle of least surprise. As such, whitespace and whitespace trimming should act to be as unsurprising as possible. Some examples may serve best:
# key names
abc def = 1 #=> {"abc def": 1}
abc def = 1 #=> {"abc def": 1}
abc def = 1 #=> {"abc def": 1}
# table names
[foo bar] #=> {"foo bar": ...}
[foo bar.baz] #=> {"foo bar": {"baz": ...}}
[foo bar.baz] #=> {"foo bar": {"baz": ...}}
[ foo bar.baz ] #=> {"foo bar": {"baz": ...}}
[ foo ] #=> {"foo": ...}
[foo . bar] #=> {"foo ": {" bar": ...}}
That last one is pretty funky, so sane people will probably use the quoted syntax to clarify:
["foo "." bar"] #=> {"foo ": {" bar": ...}} |
@mojombo This list is great, and super helpful! I do question the last one, though (
[foo . bar] #=> {"foo": {"bar": ...}}
If the goal is to include the whitespace in the keys, then users can fall back on your alternate:
["foo "." bar"] #=> {"foo ": {" bar": ...}} |
@mojombo Some of those definitely aren't clear from the spec, particularly the table names. I don't think the spec mentions anything about whitespace in table names, so, e.g., (Of course, this wouldn't apply to quoted keys.) |
👍 for clarification of table names and having each table name component be trimmed on both sides for whitespace. Since those are the rules for value keys, I think it'll keep it consistent for the TOML user. |
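A rough sketch of the variant favoured in the last few comments, where each bare table name component is trimmed on both sides and only quoted components keep their whitespace. The function name `split_table_name` is hypothetical, and the sketch assumes double-quoted components without escape sequences; it does not reflect any actual spec wording.

```python
def split_table_name(header: str) -> list[str]:
    """Split a `[a.b."c.d"]` header into key components.

    Hypothetical helper: dots inside double quotes do not split, bare
    components are trimmed, quoted components keep their whitespace.
    Escape sequences inside quotes are not handled.
    """
    body = header.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("not a table header")
    body = body[1:-1]

    raw_parts, current, in_quotes = [], [], False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "." and not in_quotes:
            raw_parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quoted key")
    raw_parts.append("".join(current))

    parts = []
    for raw in raw_parts:
        part = raw.strip()  # whitespace around a component is dropped
        if len(part) >= 2 and part[0] == '"' and part[-1] == '"':
            parts.append(part[1:-1])  # quoted: inner whitespace is kept
        elif part:
            parts.append(part)
        else:
            raise ValueError("empty key component")
    return parts


print(split_table_name("[ foo bar.baz ]"))    # ['foo bar', 'baz']
print(split_table_name("[foo . bar]"))        # ['foo', 'bar']
print(split_table_name('["foo "." bar"]'))    # ['foo ', ' bar']
```

Under this reading, `[foo . bar]` and `[foo.bar]` name the same table, and anyone who really wants the whitespace has to quote it.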
👍 This looks fine to me:
["some.host.tld"]
region = "us-east"
["domain.tld".us-west]
nameservers = [
"ns-1.domain.tld",
"ns-2.whatever.tld"
]
[us-west.dc-1a."host.domain.tld".master]
healthcheck = true |
Found this issue looking for a way to do exactly what @lra wants to do: use FQDNs as keys leading to tables. 👍 from me. |
Mostly nice, but two questions:
I guess this is less about name collisions than it is about whether keys can be used as values (being able to extract |
@wycats I like the proposal you make, but I think (like @dhardy) that it is largely a trade-off between TOML's syntactic complexity and the verbosity of a TOML document (in the case you describe). In the following example I show the verbose document that fits your example:
[[dependency]]
packageName = "hammer.rs"
version = "1.0.0"
@lra As with the example of @wycats, I also translated your example to a slightly more verbose document, which keeps the syntax of TOML lean.
[[node]]
hostname = "some.host.tld"
region = "us-east"
[[node]]
hostname = "domain.tld"
region = "us-east"
nameservers = [
"ns-1.domain.tld",
"ns-2.whatever.tld"
]
[us-west.dc-1a."host.domain.tld".master]
healthcheck = true
@mojombo I think that whitespace in table header specifications should best be forbidden. And if it is allowed, I would prefer that it needs to be "string'ed". But as I make my case above, I would rather not deal with the syntactic overhead of difficult keys. I would rather have keys be "easy on the eyes". In my example above I show how easy it is to move difficult values out of the keys/table headers into the values. |
@wycats do you even want to use this syntax now? The manifest lists a somewhat different syntax. |
@dhardy what do you mean? See http://crates.io/manifest.html#the-[dependencies.*]-sections |
@dhardy Somehow I hope that TOML can remain as K.I.S.S.-able as it is in its current shape (or more KISSable, by further restricting "difficult" stuff). |
The KISS approach would be to restrict keys to |
@dhardy Indeed, I would like to keep allowing Unicode. But another way, still quite KISS, would be |
If we want to restrict keys, we should at least allow arbitrary Unicode letters and numbers, since the world isn't English-speaking only. The JavaScript definition of identifiers could serve as an example, except that there is no need to restrict the first character further. Or, to keep it simple: key parts must be comprised of arbitrary sequences of characters belonging to the Unicode Categories Letter (L.), Number (N.) and Mark (M.), as well as |
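The whitelist idea above could look roughly like this. It is a sketch, not a proposed implementation: `is_valid_key_part` is a hypothetical name, and because the list of additional allowed characters is cut off in the comment, the sketch leaves it as an `extra` parameter that defaults to nothing. Normalisation (the topic of #65) is not addressed.

```python
import unicodedata


def is_valid_key_part(part: str, extra: str = "") -> bool:
    """Check a key part against a Letter/Number/Mark whitelist.

    Hypothetical helper. `extra` holds any additional allowed
    characters; which ones those should be is an assumption left
    to the caller.
    """
    if not part:
        return False
    return all(
        # The first letter of the category code is the major class:
        # L = Letter, N = Number, M = Mark.
        unicodedata.category(ch)[0] in "LNM" or ch in extra
        for ch in part
    )


print(is_valid_key_part("clé"))                    # True
print(is_valid_key_part("foo bar"))                # False (space is Zs)
print(is_valid_key_part("foo-bar", extra="-_"))    # True
```

Since the check is a whitelist, any whitespace or punctuation added to Unicode later is rejected by default unless it is passed in `extra`.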
@ChristianSi , if you do that I think you already need normalisation. There's quite a discussion on that in #65. |
@dhardy I would leave that to applications rather than prescribing anything in the spec. JSON, I think, does the same. |
So with current TOML, can I have Unicode characters 0 to 31 in a key name? I'm not familiar enough with TOML yet to guess about
Probably well intentioned, but I'd still favor a whitelist approach. With a blacklist, many webdevs might remember to care about non-breaking space, some who know that JSON is not a JavaScript subset may even remember about U+2028 (line separator) and U+2029 (paragraph separator), and might hope that the Unicode Consortium won't ever add too-fancy new whitespace.
I couldn't find a good enough twin for |
thanks! |
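For reference, the characters mentioned in this comment all fall outside the Letter, Number and Mark categories, which is why a category-based whitelist would exclude them without having to name each one. A small check using Python's standard `unicodedata` module (the `SAMPLES` and `CATEGORIES` names are just for illustration):

```python
import unicodedata

# Characters raised in the comment above, keyed by a descriptive label.
SAMPLES = {
    "NUL (U+0000)": "\u0000",
    "TAB (U+0009)": "\t",
    "NO-BREAK SPACE (U+00A0)": "\u00a0",
    "LINE SEPARATOR (U+2028)": "\u2028",
    "PARAGRAPH SEPARATOR (U+2029)": "\u2029",
}

CATEGORIES = {label: unicodedata.category(ch) for label, ch in SAMPLES.items()}

for label, category in CATEGORIES.items():
    print(f"{label}: {category}")

# Every code point from 0 to 31 is a control character (category Cc).
ALL_C0_ARE_CONTROL = all(unicodedata.category(chr(i)) == "Cc" for i in range(32))
print(ALL_C0_ARE_CONTROL)
```

The control characters report `Cc`, and the three whitespace characters report `Zs`, `Zl` and `Zp` respectively.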
Resolved by #283. |
This is related to #65, #67, #185, #90, #180, #62, #126, #83, and probably others.
Motivation
The TL;DR is that keys are currently slightly ambiguous, but also don't support a number of commonly desired characters.
A couple of examples:
In Cargo, this comes up because we use dependency names as keys:
Proposal
-, _, $, ! or ? characters.
So this would be valid:
I know there have been many discussions on this topic before. I have read them and have tried to include their considerations in this proposal.
One note: I think including - as a valid unadorned identifier character is extremely important.