Add multi-line and literal strings #232

Merged 5 commits on Jul 11, 2014

Conversation

mojombo
Member

@mojombo mojombo commented Jun 30, 2014

Extension of #228.

  • Clarifies some of the language.
  • Allows for """\ as opening delimiter for multi-line basic strings.
  • Renames "raw string" to "literal string".

@BurntSushi
Member

LGTM.

@mojombo
Member Author

mojombo commented Jun 30, 2014

@wycats I'd love your input on this one, if you have a minute. It's a significant (but backwards compatible) addition to the spec.

@wycats
Contributor

wycats commented Jul 3, 2014

@mojombo I took a look and this seems pretty reasonable. I wonder if the difference between single- and double-quoted strings is too subtle, but TOML is a pretty simple language and people can read the docs if they're confused.

> Any Unicode character may be used except those that must be escaped

I assume this doesn't mean to disallow UTF-8 encoded Unicode characters?

> If the first character after the opening delimiter is a newline (0x0A), then it is trimmed

I'm inclined to suggest the same treatment for a trailing newline to enable:

key3 = """
One
Two
"""

> Since there is no escaping, there is no way to write a single quote inside a literal string enclosed by single quotes

An alternative solution to this problem would be to allow two adjacent ' characters to represent an escaped single-quote. I can't think of any obvious downsides to that approach.

> Luckily, TOML supports a mult-line version of literal strings that solves this problem

Typo in mult-line.


One specific use-case worth considering explicitly is shell-escaping in commands embedded in string literals. I think that single-quote raw strings solve this use-case.

@BurntSushi
Copy link
Member

I'm not really a fan of trimming the trailing newline since it isn't usually what you want. If we did, the common case would be:

key3 = """
One
Two

"""

I'm pretty ambivalent about ''. I can definitely see its benefits. But it's also another rule. I'm fine either way I think. (We could opt to leave out the '' escaping for now and add it later. It would be backwards compatible because ' is otherwise outlawed in the current proposal.)
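
Going back to the trailing-newline question, the two behaviours can be compared with a small Python sketch. Both function names are hypothetical: the first mirrors the rule quoted from the proposal, the second is the alternative that was suggested, and the inputs are the text between the delimiters in the two key3 examples above.

```python
def trim_leading_newline(raw_body: str) -> str:
    """Proposal as quoted: drop one newline (0x0A) right after the opening delimiter."""
    return raw_body[1:] if raw_body.startswith("\n") else raw_body


def trim_both_newlines(raw_body: str) -> str:
    """Hypothetical alternative: also drop one newline right before the closing delimiter."""
    body = trim_leading_newline(raw_body)
    return body[:-1] if body.endswith("\n") else body


# Text between the delimiters in the first key3 example.
plain = "\nOne\nTwo\n"
# Text between the delimiters in the second key3 example (extra blank line).
padded = "\nOne\nTwo\n\n"

print(repr(trim_leading_newline(plain)))  # 'One\nTwo\n'
print(repr(trim_both_newlines(plain)))    # 'One\nTwo'
print(repr(trim_both_newlines(padded)))   # 'One\nTwo\n'
```

Under the alternative, getting a value that ends in a newline takes the extra blank line, which is the objection raised above.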

@wycats
Contributor

wycats commented Jul 4, 2014

@BurntSushi Sounds fine re: trailing newline. I don't see a good reason not to allow ' escaping, to avoid requiring ''' just to use a single quote in raw strings.

@ChristianSi
Contributor

I don't think the "double ' to escape it" rule would be very useful. Writing '''... don't ...''' instead of '... don''t ...' shouldn't cause serious inconvenience.

On the other hand, allowing '' in '-strings to mean ', while in '''-strings it would (of course) still mean '' could be confusing, I think.

@ChristianSi
Contributor

One issue that might deserve an explicit clarification is whether, if a triple-quoted string ends with 4 or 5 quotes, the first 1 or 2 of them are part of the content of the string. If yes, the string

"This", he said, "is the most awful thing I've heard in my life."

could be written like this:

""""This", he said, "is the most awful thing I've heard in my life.""""

If not, the final quote must be escaped. That's what Python seems to expect.

I'm not biased one way or the other but think a clarification would be in order to help parser writers.

@mojombo
Member Author

mojombo commented Jul 11, 2014

> > Any Unicode character may be used except those that must be escaped
>
> I assume this doesn't mean to disallow UTF-8 encoded Unicode characters?

Correct. The intention is that any UTF-8 encoded character is ok to use except those explicitly mentioned.

> > Since there is no escaping, there is no way to write a single quote inside a literal string enclosed by single quotes
>
> An alternative solution to this problem would be to allow two adjacent ' characters to represent an escaped single-quote. I can't think of any obvious downsides to that approach.

I'd prefer to keep literal strings 100% literal. It makes it easier to decide which string type to use. It also prevents any ambiguities, for instance, what does '''foo''' represent? With '' escaping, it could mean either 'foo' as a literal string or foo as a multi-line literal string. The current proposal always gives you a good, clear alternative when you run up against a string constraint, and the number of rules is small. I like that.
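
The '''foo''' ambiguity can be shown with a short Python sketch. Both readers are hypothetical and simplified (no newline handling, no validation of stray quotes); the doubling rule in the first one is the suggestion under discussion, not part of the proposal.

```python
def read_single_quoted_with_doubling(src: str) -> str:
    """Hypothetical rule (not in the proposal): inside '...', '' stands for one '."""
    if len(src) < 2 or src[0] != "'" or src[-1] != "'":
        raise ValueError("not a single-quoted string")
    return src[1:-1].replace("''", "'")


def read_multiline_literal(src: str) -> str:
    """Proposal: the text between ''' and ''' is taken verbatim."""
    if len(src) < 6 or not (src.startswith("'''") and src.endswith("'''")):
        raise ValueError("not a multi-line literal string")
    return src[3:-3]


src = "'''foo'''"
print(read_single_quoted_with_doubling(src))  # 'foo'
print(read_multiline_literal(src))            # foo
```

The same seven characters give two different values depending on which rule wins, so a spec with '' escaping would need an extra tie-breaking rule.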

> One issue that might deserve an explicit clarification is whether, if a triple-quoted string ends with 4 or 5 quotes, the first 1 or 2 of them are part of the content of the string. If yes, the string
>
> "This", he said, "is the most awful thing I've heard in my life."
>
> could be written like this:
>
> """"This", he said, "is the most awful thing I've heard in my life.""""
>
> If not, the final quote must be escaped. That's what Python seems to expect.

I think it's reasonable and straightforward to require that quotes be escaped in that scenario. The delimiter is """ and once the parser finds it, that should be it. I'll work on clarifying the text to communicate the fact.
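
For parser writers, the "first delimiter wins" rule might look like the following Python sketch. The function name is hypothetical, escape sequences are skipped but not decoded, and the example sentence is shortened from the one quoted above.

```python
from typing import Tuple

DELIM = '"""'


def scan_multiline_basic(src: str) -> Tuple[str, str]:
    """Return (raw_body, rest), closing at the first unescaped delimiter."""
    if not src.startswith(DELIM):
        raise ValueError("expected opening delimiter")
    i = len(DELIM)
    while i < len(src):
        if src[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if src.startswith(DELIM, i):
            return src[len(DELIM):i], src[i + len(DELIM):]
        i += 1
    raise ValueError("unterminated string")


# Four closing quotes: the first three close the string, one is left over.
body, rest = scan_multiline_basic('""""This", he said.""""')
print(body)  # "This", he said.
print(rest)  # "

# Escaping the final content quote leaves nothing behind the delimiter.
body, rest = scan_multiline_basic('""""This", he said.\\""""')
print(body)        # "This", he said.\"
print(rest == "")  # True
```

A full parser would reject the leftover quote in the first case, which is why the final quote has to be escaped.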

Any other final concerns from anyone? I'd like to merge this soon.

@mojombo
Member Author

mojombo commented Jul 11, 2014

@ChristianSi I just pushed a commit (7f33170) that should address the quotation mark escaping you brought up.

@BurntSushi
Member

@mojombo 👍

@ChristianSi
Contributor

@mojombo 👍

mojombo added a commit that referenced this pull request Jul 11, 2014
Add multi-line and literal strings
@mojombo mojombo merged commit 70b6132 into master Jul 11, 2014
@mojombo
Member Author

mojombo commented Jul 11, 2014

Merged!
