Add identifier like Unquoted Strings. #62
Allow identifiers as a special low-effort string. It makes the format easier to parse (well, tokenise), with fewer weird edge cases about what can appear in keygroup/key names. It's backwards compatible, and lets people deserialize things where they have been bad and put a . in the name, i.e. ["foo.bar"] isn't [foo.bar].
What about barewords with the same name as an existing key?
Should have the same behaviour as a duplicate key, which afaik is to break. i.e.
is just as broken as
I mean something like this:
[foo]
foo = bar
bar = foo
Barewords was a bad choice of word. I originally called them unquoted strings.
Should be the same as writing
It isn't really anything other than a way to write simple strings without quotes. So, the following should be equivalent too:
Unfortunately this will break my implementation
If it breaks the implementation where you run a regex over it and eval the output, I'm even more for this change. P.S. I've updated the first comment to be clearer and unfuck my markdown errors.
You can indent keys and their values as much as you like. Tabs or spaces. Knock
yourself out. Why, you ask? Because you can have nested hashes. Snap.

Nested hashes are denoted by key groups with dots in them. Name your key groups
whatever crap you please, just don't use a dot. Dot is reserved. OBEY.
Nested hashes are just keygroups with more than one string seperated by a dot (.).
s/seperated/separated/
-1 This feels like creeping yaml-ization. Regarding quoteless strings as values: "There should only be one way to do anything" means that strings should require quotes. Regarding quoted strings in keys: rather than figuring out how to robustly include arbitrary characters in keys, arbitrary characters should be banned. If we say that a key is one or more letters, numbers, and underscores, and must not start with a number; and that a key group is one or more keys, joined by dots, that would also address the ambiguities you mentioned.
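
As a rough illustration of the rule proposed in the comment above, here is a minimal Python sketch. The names `is_key` and `is_key_group` are hypothetical, and the sketch assumes "letters" means ASCII letters, which the comment does not specify.

```python
import re

# Assumption: "letters" means ASCII letters only; the comment does not say.
# A key: letters, digits and underscores, not starting with a digit.
KEY = r"[A-Za-z_][A-Za-z0-9_]*"
KEY_RE = re.compile(KEY)
# A key group: one or more keys joined by single dots.
KEY_GROUP_RE = re.compile(rf"{KEY}(?:\.{KEY})*")


def is_key(text: str) -> bool:
    """Return True if the whole string is a valid key under this rule."""
    return KEY_RE.fullmatch(text) is not None


def is_key_group(text: str) -> bool:
    """Return True if the whole string is keys joined by dots."""
    return KEY_GROUP_RE.fullmatch(text) is not None
```

Under this reading, a name with a space, a leading digit, or an empty segment between dots is simply rejected, so no quoting rules are needed for keys.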
-1, agree with @mrflip
I disagree with this proposition. I think that each type of thing should have at most one representation (with the possible exception of hex floats, which add functionality). Let keys be bare words (just forbid the empty key), and mandate quotes around string values.
@mrflip "There should be only one way to do anything". Does this include translating hashes back into TOML? For example, right now {"foo":1} and {"foo=bar":{}} can be deserialized; {"foo.1":{}} can't, and {"foo=bar":1} can't. With this change, if you can represent it as a string, you can use it as a key name, keygroup name, or string. This removes three different ways to specify a string value depending on position, and replaces them with two simpler ones. Your counter proposal also ends up with two rules for strings, but doesn't indicate how to handle keys with spaces in them (which is currently supported), or key group names with dots in them (currently unsupported). It would add the same complexity to the implementation, but with none of the ability to handle the deserialization cases above, except {"foo":1}. I do agree with your pull request to make key names and key group names follow one set of rules for allowable characters, but I do not think letting keys and strings be interchangeable counts as 'YAMLization'. For that, I think I'd need to be adding special cases rather than getting rid of them.
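
To make the hash-to-TOML point concrete, here is a hedged Python sketch of how a writer might emit key and keygroup names if quoted strings were allowed there. The helpers `format_key` and `format_group` are hypothetical, the bare-name pattern is a conservative assumption, and `json.dumps` is used only as a stand-in for string escaping, not as something the proposal specifies.

```python
import json
import re
from typing import List

# Assumption: a name may be written bare only if it looks like a simple
# identifier; everything else falls back to a quoted string.
BARE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_key(name: str) -> str:
    """Emit a key bare when it is a simple identifier, quoted otherwise."""
    if BARE.fullmatch(name):
        return name
    # Assumption: JSON-style escaping is close enough for illustration.
    return json.dumps(name)


def format_group(path: List[str]) -> str:
    """Emit a keygroup header; dots inside a segment are protected by quotes."""
    return "[" + ".".join(format_key(part) for part in path) + "]"
```

With these helpers, a one-segment path `["foo.1"]` is written as `["foo.1"]`, while the two-segment path `["foo", "bar"]` is written as `[foo.bar]`, so the two can be told apart when read back.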
@tef My proposal hinges on banning keys with spaces, dots or anything special in them. If so, then there's only one rule for strings, because keys are not strings: they can only be
The argument for this hinges on the presumption that this is a configuration file format: primitives must be comprehensive, but the overall data structure should be locked the hell down. My proposal is to use pretty much the same rules Ruby requires with its new-school symbol hash shorthand.
I think we're both in agreement in terms of adding identifiers to the format, and using them for key and key group names, but I'm still a bit of a weenie, in that I think strings should be valid too. (The older issue #27 suggests your presumption about config over interchange may be right, but I don't see much added complexity in letting keys be strings.) I keep saying they're strings, because it's implied by the spec when equivalent JSON snippets are shown alongside. I have a suspicion that in the wild, some strings have dots in them, and they are used as hash keys. Don't get me wrong, I'm not suggesting that this should be a subset of JSON (someone else can argue for that, and that is yamlization...).
I'd prefer to make the types as unambiguous and "one-way" as possible, so I can't support having both quoted and unquoted strings. I do see the point about quoted keys, but I think that would only be valuable if keys could be anything (a hash, or whatever).
-1, totally agree with @mrflip
Thanks for the suggestion! However, I think this adds too much ambiguity to TOML. This can be evidenced by having to say that
@mojombo 👍
Allow identifiers as a special low-effort string. Make foo and "foo" the same. Make key and keygroup names be strings.
i.e.
and
Are the same thing.
Unquoted strings cannot contain dots, square brackets or spaces, and cannot start with a number.
Note: ["foo.bar"] isn't the same as [foo.bar]. Dots inside quotes do not count.
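
A minimal Python sketch of the "dots inside quotes do not count" rule, as it might apply to the text between the square brackets of a keygroup. The function name `split_key_group` is hypothetical, and the sketch assumes double quotes only and no escape sequences, neither of which the proposal spells out.

```python
from typing import List


def split_key_group(name: str) -> List[str]:
    """Split a keygroup name on dots, ignoring dots inside double quotes.

    Assumptions: only double quotes delimit strings, and escape
    sequences inside quotes are not handled.
    """
    parts: List[str] = []
    current: List[str] = []
    in_quotes = False
    for ch in name:
        if ch == '"':
            # Toggle quoted state; the quote itself is not part of the name.
            in_quotes = not in_quotes
        elif ch == "." and not in_quotes:
            # An unquoted dot ends the current segment.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quoted segment")
    parts.append("".join(current))
    return parts
```

Here `split_key_group('"foo.bar"')` gives a single segment, while `split_key_group('foo.bar')` gives two, which is the distinction the note describes.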
Rationale:
Additionally, this eliminates a whole slew of stupid edge cases in the current spec, i.e.
This should make the format easier to write a parser for, and lets people have strings as keys.