New lines in multi-line literal strings trimmed incorrectly #68

Closed
SergioBenitez opened this issue Dec 2, 2016 · 6 comments

@SergioBenitez

The library parses multi-line literal strings incorrectly: it condenses consecutive newlines in the string into a single one.

Example:

EXPECTED:

import toml
toml.loads("a = '''\nhello\nworld\n\n\nbye'''")
{u'a': u'hello\nworld\n\n\nbye'}

ACTUAL:

import toml
toml.loads("a = '''\nhello\nworld\n\n\nbye'''")
{u'a': u'hello\nworld\nbye'}

We expect all three newlines between world and bye to be preserved, but we get one. According to the specification (emphasis mine):

Multi-line literal strings are surrounded by three single quotes on each side and allow newlines. Like literal strings, there is no escaping whatsoever. A newline immediately following the opening delimiter will be trimmed. All other content between the delimiters is interpreted as-is without modification.

I'll take a look at the code and propose a fix.
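
The rule quoted from the specification can be sketched in a few lines of Python. This is an illustration only, not the library's code: `read_multiline_literal` is a hypothetical helper that assumes its argument is exactly one multi-line literal string, delimiters included, and it ignores the edge case of single quotes directly adjacent to the closing delimiter.

```python
def read_multiline_literal(token: str) -> str:
    """Return the value of one TOML multi-line literal string.

    Hypothetical helper for illustration; `token` must include the
    surrounding ''' delimiters.
    """
    delim = "'''"
    if len(token) < 2 * len(delim):
        raise ValueError("not a multi-line literal string")
    if not (token.startswith(delim) and token.endswith(delim)):
        raise ValueError("not a multi-line literal string")

    body = token[len(delim):-len(delim)]

    # Trim a single newline immediately following the opening delimiter.
    if body.startswith("\r\n"):
        body = body[2:]
    elif body.startswith("\n"):
        body = body[1:]

    # Everything else is returned as-is: no escape processing and no
    # trimming of blank lines or indentation.
    return body


# Blank lines, leading spaces and the backslash all survive untouched.
value = read_multiline_literal("'''\nfirst\n\n\n  indented\\n\n'''")
assert value == "first\n\n\n  indented\\n\n"
```

Only the first newline after the opening delimiter is dropped; any parser that collapses the remaining blank lines or strips indentation is doing more than this rule allows.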

@SergioBenitez
Author

SergioBenitez commented Dec 3, 2016

It looks like there's a test that contradicts the specification in @BurntSushi's tests. In particular, this test with this expected result. @avakar doesn't have a test case for this. I've fixed this issue in the Python script and will submit a PR. I believe @BurntSushi's test case is incorrect.

Edit: Actually, both the equivalent_two and equivalent_three tests are incorrect.

SergioBenitez added a commit to SergioBenitez/toml that referenced this issue Dec 3, 2016
Previously, the library would trim all whitespace in multiline strings, even
when the string was a multiline _literal_ string. This resolves that issue,
leaving all whitespace in multiline literal strings.
SergioBenitez added a commit to SergioBenitez/toml that referenced this issue Dec 3, 2016
Previously, the library would trim all whitespace in multiline strings, even
when the string was a multiline _literal_ string. This resolves that issue,
leaving all whitespace in multiline literal strings.
@noqqe
Contributor

noqqe commented Jan 10, 2017

Indentation is also gone if inserted as a multiline string. See #71

@uiri
Owner

uiri commented Feb 27, 2017

From the spec:

For writing long strings without introducing extraneous whitespace, use a "line ending backslash". When the last non-whitespace character on a line is a \, it will be trimmed along with all whitespace (including newlines) up to the next non-whitespace character or closing delimiter.

Closing as WONTFIX due to compliance with the spec.
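
For reference, the TOML specification states the line-ending backslash rule quoted above in its section on multi-line basic strings (triple double quotes), where escapes are processed. A rough sketch of that trimming follows, assuming the body has already been extracted from its delimiters and that every backslash at the end of a line is unescaped (a real parser would also need to recognise an escaped backslash before the line end). The name `trim_line_ending_backslashes` is hypothetical.

```python
import re

# A backslash, optional trailing spaces/tabs, a newline, and then all
# whitespace (including further newlines) up to the next visible character.
_LINE_ENDING_BACKSLASH = re.compile(r"\\[ \t]*\r?\n[ \t\r\n]*")


def trim_line_ending_backslashes(body: str) -> str:
    """Apply the line-ending backslash rule to a multi-line basic string body."""
    return _LINE_ENDING_BACKSLASH.sub("", body)


trimmed = trim_line_ending_backslashes(
    "The quick brown \\\n\n\n  fox jumps over \\\n    the lazy dog."
)
assert trimmed == "The quick brown fox jumps over the lazy dog."

# Without a backslash, blank lines are left alone.
assert trim_line_ending_backslashes("a\n\n\nb") == "a\n\n\nb"
```

The trimming is triggered only by the backslash, so a run of plain newlines is not affected by this rule.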

@uiri uiri closed this as completed Feb 27, 2017
@SergioBenitez
Author

@uiri That line of the spec is entirely irrelevant to this issue.

@noqqe
Contributor

noqqe commented Feb 27, 2017

@uiri I totally agree with @SergioBenitez

@uiri
Owner

uiri commented Feb 27, 2017

I'm sorry. You're right; it is relevant to the associated PR(s) but not to this issue.

@uiri uiri reopened this Feb 27, 2017
uiri pushed a commit that referenced this issue Apr 9, 2017
Previously, the library would trim all whitespace in multiline strings, even
when the string was a multiline _literal_ string. This resolves that issue,
leaving all whitespace in multiline literal strings.
@uiri uiri added the bug label Apr 12, 2017
@uiri uiri closed this as completed in 65b8086 Apr 12, 2017