Add inline table syntax. #235

Merged
merged 1 commit into from
Feb 7, 2015

Conversation

mojombo
Member

mojombo commented Jul 16, 2014

Concrete proposal for the inline table syntax I first proposed in Issue #219. Here's a quick taste:

# They should look familiar and obvious
name = { first = "Tom", last = "Preston-Werner" }
point = { x = 1, y = 2 }

# They make tuples unnecessary by allowing mixed types (they are just tables, after all).
# Unlike tuples, they are self-documenting via the key names.
address = { proto = "http", ip = "10.0.0.1", port = 8080 }

# Arrays of tables with less hassle
points = [ { x = 1, y = 2, z = 3 },
           { x = 7, y = 8, z = 9 },
           { x = 2, y = 4, z = 8 } ]

This is about as brief as I could think to make it, so if there are specific issues that need clarifying, please bring them up.

@BurntSushi
Member

I think the only thing I'm wondering is whether they can be arbitrarily nested. Perhaps an example clarifying?

Otherwise, LGTM.

@mojombo
Member Author

mojombo commented Jul 17, 2014

@wycats Can you take a look at this? I want to make sure it will meet your needs as stated.

@ChristianSi
Contributor

@mojombo @BurntSushi How about adding a modified sentence and an example from @mojombo 's original proposal to address @BurntSushi 's question?

Inline tables can be nested, as long as the outer table fits in a single line:

person = { name = "Tom", geography = { lat = 1.0, lon = 2.0 } }

@wycats
Contributor

wycats commented Jul 17, 2014

@mojombo I looked it over with @alexcrichton. Everything looks pretty good.

One point worth making: I think people may have a tough time learning how to expand inline tables into sections. Toml parsers should probably provide somewhat detailed error messages when multi-line tables are encountered, along with a suggestion on expansion.

@BurntSushi
Member

@wycats @ChristianSi 👍

@mirhagk

mirhagk commented Aug 5, 2014

I'm worried that this change essentially encapsulates the JSON format within TOML. Obviously it's not exactly JSON, but that's what concerns me, the fact that it's 95% JSON may confuse a lot of users into thinking they can just read JSON with TOML. (I could imagine projects moving from JSON to TOML and thinking they just need a TOML parser now, without actually changing their config file to take advantage of the stuff TOML is good at.)

The decision to disallow newlines to try to encourage users to not use it for complex scenarios seems pretty arbitrary. It's not intuitive at all that you can't put each key on a new line, especially when arrays allow newlines.

I can't see any scenarios where nesting would be a good idea, and it complicates the parsers by a lot to have it. From the discussion in #219 it looks like nesting is allowed merely because JSON/Ruby allow it, in which case why not allow newlines between keys?

Personally I think the proposed alternative of allowing keys on the same line with a different separator than a newline is much more intuitive, and much easier for the implementer and reader. The proposal went too far with allowing , to be elided in arrays, but I think adding the following sentence in the Table section would suffice:

Multiple Keys can be used on the same line by separating them with a ,

[name]
first = "Tom", last = "Preston-Werner"
[point]
x = 1, y = 2

(interestingly I couldn't even see the part that said that it was one key per line)

The given examples look pretty much identical, and both the reader and the implementer can understand them as just replacing the , with a \n, instead of relying on the user having seen programming languages with JSON-like notation.

@ChristianSi
Contributor

@mirhagk Incidentally, that's half of a proposal I made before which wasn't met with universal acclaim. (The other half was to allow newline as alternative element separator in arrays.)

Personally, I still think it wouldn't be a bad idea, though I'm also happy with the more JSON-like syntax proposed by this pull request.

This issue has already taken a lot of discussion, so I think it would be best to just merge this pull request as it is instead of re-starting the discussion from (more or less) scratch.

@mirhagk

mirhagk commented Aug 6, 2014

I think in cases like this, it's much easier to apply changes at a later date than trying to undo something the project no longer wants.

If this proposal must go through, then at least add the ability to put newlines between keys. It's such an arbitrary and non-obvious restriction that it will only cause problems. Not to mention that there are in fact workarounds for people that want it, and the workarounds will be way worse than having the language support it.

For the record, the 2 workarounds I am aware of are:

  1. If a parser supports choosing the newline type (\n, \r, \n\r, \r\n) then choose one that looks for Windows-style newlines, and then use a simple newline for these keys. That is of course an awful idea but it would work.
  2. Have the application have a post-processing step that merges objects in an array into a single object. i.e. the following:
[ {first = "Tom"},
  {last="Preston-Werner"},
]

will be converted into the same as above. Again an awful hack, but if JSON is being allowed, someone will decide to use it, and then decide that they want to separate content on newlines.

The purpose of the newline restriction seems to be so that people don't use complicated inline tables, but IMO this simply makes it harder to read, and relies on the reader and writer to have the same word wrap settings and definitions of what is too long. Plus if inline tables are used commonly, the writer may not even know how to convert it from the inline format.

The much more sensible restriction is to not allow nesting. That prevents users from writing complex objects, rather than simply making it harder for the reader to understand what's going on. The given example of nesting is probably about as short as a nesting example could possibly be, and it's still at the threshold where some would decide it should be on a new line. I really can't see any real world examples where nested inline tables should be encouraged.

@wycats
Contributor

wycats commented Aug 6, 2014

I'm worried that this change essentially encapsulates the JSON format within TOML. Obviously it's not exactly JSON, but that's what concerns me, the fact that it's 95% JSON may confuse a lot of users into thinking they can just read JSON with TOML

I'm extremely confused about how you could think this. Basically no JSON is valid Toml with this proposal: keys are not quoted and the separator between keys and values is = not :.

@mirhagk

mirhagk commented Aug 6, 2014

@wycats Alright yes you couldn't just use JSON verbatim (although at one point in the other discussion the proposal was to use :). That doesn't really change much though, as it still becomes something someone could just copy and paste into a config, just instead of javascript it's now C# object notation.

And honestly even with the JSON completely breaking, I can nearly guarantee someone will just find-and-replace : with = and think that's good enough to be TOML.

@wycats
Contributor

wycats commented Aug 6, 2014

@mirhagk They would also have to unquote all their strings. I really do not understand what you're getting at.

@wycats
Contributor

wycats commented Aug 6, 2014

From the discussion in #219 it looks like nesting is allowed merely because JSON/Ruby allow it, in which case why not allow newlines between keys?

That is not the rationale given in #219. Instead, there was a significant amount of discussion (and back-and-forth with different options) about specific use-cases that are not well-served with the previous syntax. The proposal was motivated by a real-world scenario (Cargo).

@wycats
Contributor

wycats commented Aug 6, 2014

Specifically, @mojombo directly addressed nesting and newlines here and his rationale was not that Ruby and JSON allow it. It was an intentional tradeoff that was reasonably well-motivated in my view.

@mirhagk

mirhagk commented Aug 6, 2014

Can inline tables be nested?

I'm going to say yes. It's the answer that's most obvious

It was said this was chosen since it was the most obvious. Maybe I'm reading into it too much, but to me that's basically another way of saying that if a programmer looked at the inline table syntax, they'd expect you to be able to nest since that's what everything else allows.

Regardless it's definitely non-obvious why you can't have newlines to separate keys, and the reasoning given

If we want to prevent abuse, then this seems like the best way to do it. I think people will naturally avoid long lines, and thus avoid using inline tables for undesirable nesting and other inappropriate uses

Isn't a very compelling reason, because a lot of people use editors with word wrap and actually don't care that much about long lines, especially novice developers or non-developers. It doesn't prevent anyone from using near-JSON notation.

I really do not understand what you're getting at.

What I'm getting at is if we think that the proposed {}-heavy JSON-like notation is better than the alternative more TOML like syntax: (the thing that's TOML by simply replacing , with \n)

[dependencies.hammer]
version = "1.0.0", git = "https://github.com/wycats/hammer", branch = "wip"

Then the project is going back on the very reason it was created (from #2):

{ 'because': { '80': 'percent' }, {'of': 'JSON', 'is': 'brackets' } }

and

"and" : { "json" : { "is" : "prone" , "to" : [ "syntax", "errors" ] } } }

which with minor changes (removing quotes, replacing : with =) is now perfectly valid TOML.

If the decision is that the problem with JSON is that you use : instead of = and need quotes for keys, then why is TOML here in the first place? And if the issue is that JSON forces your whole file to look like that, while TOML only allows it if you want, why not just allow embedding JSON inside of the config file? The JSON spec is pretty tiny, so there wouldn't really be any major issues. Allow the quotes to be removed for keys and you get essentially this pull request along with the benefit that now you can copy and paste JSON, allowing conversion to happen slowly, or allowing the user to choose which syntax they prefer.

@ChristianSi
Contributor

What's the status of this? Wasn't it ready for merge weeks ago?

@mirhagk

mirhagk commented Sep 2, 2014

Well it's a pretty big change, and the syntax and restrictions should be really nailed down before it's merged. Personally I think the alternate syntax of saying that , is replaced with \n should be allowed, perhaps both of them would be allowed, allowing the user to choose which they prefer?

@ChristianSi
Contributor

@mirhagk

Personally I think the alternate syntax of saying that , is replaced with \n should be allowed

I wouldn't mind that one either (I was one of those who proposed it) and it would be a smaller change than allowing fully-fledged and nestable {...} blocks, but the project founders were against it, I think.

@mirhagk

mirhagk commented Sep 2, 2014

But there's no real downside to supporting it other than a tiny bit of additional complexity, and looking at an example of it makes it very obvious what's happening, so even that complexity is minimal. It could at the very least support both.

@ChristianSi
Contributor

Supporting both would violate the goal of minimalism.

@mirhagk

mirhagk commented Sep 2, 2014

Well so would accepting this pull request which essentially encompasses the entire complexity of json within TOML.

@ChristianSi
Contributor

The entire complexity of json? Keep in mind that JSON is, quite purposefully, about the most simple data format ever conceived.

Anyway, I think it's pretty much a bikeshed decision whether to wrap inline tables in {...} or not, and whether to allow nesting or not. So I'm happy letting the founders decide.

But I do think it would be really useful to have an accepted syntax for inline tables (whatever exactly it is), so that implementors can add support for it.

@mirhagk

mirhagk commented Sep 2, 2014

JSON is only really easy to understand for programmers. Show it to non-programmers and they get really confused.

That is the really nice thing about TOML. Anyone, even people who are barely computer literate, gets how to write it. It's extremely intuitive.

@ChristianSi
Contributor

Ah, now I get your point.

@wycats
Contributor

wycats commented Sep 2, 2014

Well so would accepting this pull request which essentially encompasses the entire complexity of json within TOML.

This is just not accurate, for the reasons I described above.

What kinds of cases are you using Toml for?

@BurntSushi
Member

I've thought about this a lot, and honestly, I disagree with this change. It violates the principle of minimalism, which is this project's primary goal. (Of course, all additions could be said to violate minimalism in some way. So what I'm really saying is that I think the benefit of this change doesn't outweigh its cost.) We've already hashed a lot of this out, so I'll be brief.

I mostly agree with @wycats that the inline table syntax is less cumbersome and looks quite a bit better than expanding it into a new section. On the other hand, I just don't think it's worth it. By now, I've written a few Cargo.toml files for Rust projects, and I'm just not at all offended by writing a new section for each dependency.

IMHO, here are the trade offs of this PR as it stands:

Pros: aesthetic appeal. Less burdensome. Makes particular use cases more convenient.

Cons: Complex. Not minimal. Arbitrary (no new lines).

My intention is not to downplay the pros---convenience is really important. My intention is to highlight the project's primary purpose (minimalism) and point out that I think the convenience afforded by this syntax just does not outweigh the cost of an additional syntactic category.

I won't be completely bummed if this gets merged (because I think @wycats has a point---I just don't share his fervor), but I don't think I can give it my blessing either.

@BurntSushi
Member

FWIW, I think the whole "it looks like JSON" is not a technical problem with this proposal, but I do think it's a marketing problem. I can imagine it now: "TOML is supposed to be minimal and obvious, but it contains something that resembles JSON. Except no new lines. WTF."

Whether marketing should play a role in whether this gets merged, I don't know. (I don't think I really care personally.) But I did want to at least address it.

@mirhagk

mirhagk commented Sep 3, 2014

@wycats Well with one of the options suggested in #229 that would allow [.color] you wouldn't have that problem.

@redhotvengeance
Contributor

@wycats It seems like a bit of a stretch to say that:

[dependencies.color] version = "1.0", optional = true
[dependencies.debugging] version = "1.6", optional = true

...is not really sufficient for reducing this:

[dependencies.color]

version = "1.0"
optional = true

[dependencies.debugging]

version = "1.6"
optional = true

...just because you still have to include the key "dependencies" on each line. Unless I'm mistaken, one of the original intents of #219 was to reduce the number of lines needed for repetitive/verbose data. So this:

[dependencies.color] version = "1.0", optional = true
[dependencies.debugging] version = "1.6", optional = true

...is actually an improvement over this:

[dependencies]
color = { version = "1.0", optional = true }
debugging = { version = "1.6", optional = true }

...since it is achievable in two lines rather than three.

I still agree with @BurntSushi in that I don't really mind having more lines, so I'm still not really onboard with inline tables. But if we do add them as a feature, I'd prefer a method that doesn't introduce {} into the syntax.

@mojombo
Member Author

mojombo commented Jan 22, 2015

This feature is the last big thing to make a decision on before TOML 1.0. I've been hesitant to merge it because of the reasonable feedback and because it still doesn't feel quite as elegant as I strive for. That being said, it's still very likely to go in.

It's been a while since this was proposed. @wycats @alexcrichton - Has anything changed from a Cargo perspective? Have the last 6 months given you any more precise insight into whether this proposal still meets your needs well? I'd love an updated comment about the matter.

@alexcrichton
Contributor

I don't think we've necessarily gained much more insight since 6 months ago, but our usage pattern has definitely changed quite a bit. Six months ago most Cargo.toml manifests looked like:

[dependencies.foo]
git = "https://github.com/foo/foo"

[dependencies.bar]
git = "https://github.com/bar/bar"

Most Cargo.toml manifests today look like:

[dependencies]
foo = "0.1"
bar = "0.2.3"

The actual syntax of declaring dependencies hasn't actually changed (for us these differences are just declaring dependencies from different sources). One interesting point we've seen, though, is that in isolation both of these syntaxes are pretty reasonable, but when put together the result is a little jarring:

[dependencies]
foo = "0.1.0"

[dependencies.bar]
git = "https://github.com/bar/bar"

For contrast, the mixed-source dependencies with inline tables would look like:

[dependencies]
foo = "0.1.0"
bar = { git = "https://github.com/bar/bar" }

As a personal opinion, I find the second inline table syntax flows quite nicely as it's easy to scan what the list of dependencies are without having to worry about where they're actually coming from.

So all in all, in terms of insights I don't think we've turned up much new (our syntax hasn't changed much since day 1). I think it'd be summed up by "this would still be quite nice for us" :)

@wycats
Contributor

wycats commented Jan 23, 2015

Just to echo what @alexcrichton said, the migration towards a registry for Cargo has had the expected results here. Mixing in a few git dependencies with a larger number of registry dependencies is awkward (as expected), and this would be very nice for us.

@mwanji
Contributor

mwanji commented Jan 23, 2015

As it looks like this might go in, I'm going to argue against it, for 3 reasons: overloaded tables, complexity and I don't think even Cargo needs it.

Cargo
The driving use case is Cargo, but its own syntax has improved enough that inline tables might not add much benefit while adding a lot of burden to TOML. I've only just read the Cargo manifest definition, but it seems like at least some of the awkwardness could be fixed by making Cargo a bit more flexible (please pardon my backseat programming).

For example:

[dependencies]
foo = "0.1.0"

[dependencies.bar]
git = "https://github.com/bar/bar"

[dependencies.baz]
path = "./path/to/baz"

could be

[dependencies]
foo = "0.1.0"
bar = "https://github.com/bar/bar"
baz = "./path/to/baz"
hammer = "github:wycats/hammer/wip" # I'm making this one up: scheme + repo ID + branch or sha-1

This seems fairly clear and in line with Cargo's evolution, without requiring the entire TOML ecosystem to support inline tables. Further, quoted keys can help provide support for dependency names with dots in them.

Overloaded tables
There are several ways of defining tables (normal, implicit, table arrays). The combinations can already be a bit tricky for a TOML file author to understand/predict, adding another way makes it even trickier.

Complexity
Datetimes are seriously being considered for removal, yet seem far simpler than inline tables.

@kardianos

I have looked into toml for my line-of-business configuration files. I intend to use it, especially once the spec gets nailed down to v1. I'm a fan of how Go is optimized for readability over the absolute simplest way to write something.

I did a comparison with the above examples in my text editor. I think toml's advantage is that it repeats itself. If you don't want to repeat section headers you should use json or xml. The repeated section names were much easier to read.

I'm strongly in favor of not including this.

My use case is primarily application configuration files. Example settings include HTTP ports, database connection strings, host name bindings, and file paths. I don't want the user to see or use "{}" or have someone later decide to "take advantage" of this feature, confusing end users later. That's why I'd choose toml in the first place. Believe it or not, nesting and comma separation are surprisingly confusing to many people.

It sounds like cargo may be better off with json than toml given the arguments. After all, programmers will be using cargo, not users.

-Daniel

@pnathan

pnathan commented Jan 31, 2015

I don't want the user to see or use "{}" or have someone later decide to "take advantage" of this feature, confusing end users later.

+1. A major use case for me is being able to build tools and utilities, then distribute them to people without having to worry that "forgot to put a }" becomes a support case root cause. INI is a revealed preference for many many people; leaning towards that is very nice.

@wycats
Contributor

wycats commented Jan 31, 2015

Encoding all of the options for a dependency in a cryptic string micro-syntax is not an overall simplification.

You can, of course, encode arbitrary data in strings, but when I start going down that path, I always ask myself whether an options hash would be better.

@mwanji
Contributor

mwanji commented Feb 1, 2015

Encoding all of the options for a dependency in a cryptic string micro-syntax is not an overall simplification.

Sure, the last example probably goes overboard. However, I think the other three are in line with Cargo's evolution. In any case, my main point was to try to show that having to list a dependency's properties explicitly could be made a relatively rare situation.

@mojombo
Member Author

mojombo commented Feb 7, 2015

One of the most important things that TOML will ever have is readability. In most cases obviousness and minimalism will align with this goal. But sometimes minimalism must be sacrificed in order to create something that is powerful and expressive enough to handle reality. Cargo represents one very powerful use case, that of a structure like A → B → C, where A is a table of many Bs, and each B is a table of few Cs. This circumstance is not currently handled well by TOML. And that's a shame, because I anticipate that the pain this creates will affect a lot of people. After spending some time trying out various config files with and without this PR, I've decided that TOML needs it. Fortunately, those that don't need this extra bit of power don't need to use it or present it to users. But those that DO need it will find their config files much more delightful and intuitively grouped.

Thank you to everyone that weighed in on this issue. Your insights and experiences have been extremely valuable.

mojombo added a commit that referenced this pull request Feb 7, 2015
mojombo merged commit 24539a5 into master Feb 7, 2015
@mwanji
Contributor

mwanji commented Feb 9, 2015

Do inline tables support trailing commas, like arrays?

@mojombo
Member Author

mojombo commented Feb 10, 2015

@mwanji No, not with the current spec.

@mwanji
Contributor

mwanji commented Feb 10, 2015

Is it necessary to mention that bare & quoted keys can be used within inline tables, or is that too obvious?

@mojombo
Member Author

mojombo commented Feb 11, 2015

@mwanji I think that's suitably addressed by "Key/value pairs take the same form as key/value pairs in standard tables."

@mwanji
Contributor

mwanji commented Feb 14, 2015

Nested tables make seeing which tables have been defined a lot less obvious.

It's an issue that I don't think has been raised and that I came across while adding inline tables to my parser. For example:

a = { a1 = 1, b = { b1 = 2, c = { c1 = 3 }}}

[a.b] # this is now illegal
[a.b.c] # this is now illegal

Are there cases where a nested inline table looks better than the non-nested syntax?
