Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update valid comment characters in toml v1.1.0 #159

Conversation

ToruNiina
Copy link
Contributor

Hi,

First of all, thank you for maintaining this repository and updating it so frequently. This repository is really helpful for identifying bugs in the parser/serializer and for understanding and explaining the specifications.

Recently, I noticed that some test cases are outdated in v1.1.0.

After the commit ab74958, TOML allows most control characters except for 0x00, 0x0A, 0x0B, 0x0C, and 0x0D.

Thus, although the following cases are invalid in TOML v1.0.0, they become valid after TOML v1.1.0:

  • comment_del.toml: 0x7f (DEL)
  • comment-lf.toml: 0x10 (DLE)
  • comment-us.toml: 0x1f (US)

These characters are also included in allowed-comment-char defined in toml.abnf at the current commit in toml-lang/toml.

https://github.com/toml-lang/toml/blob/main/toml.abnf#L40-L43

So I removed those files from the list of invalid test cases defined in tests/files-toml-1.1.0.

Please check it whenever you have time.
Thank you.

@arp242
Copy link
Collaborator

arp242 commented Jun 15, 2024

Thanks for your effort, but I intentionally left this out because 1.1 isn't released yet and I feel pretty strongly this shouldn't make it to TOML 1.1, so seems pointless to include it. Nothing is really "outdated" because 1.1 is not a released version.

Maybe I should remove the references to it in the README and such; when I added that I felt it should be released "soon", but that's been over a year, and who knows how long it will take...

@ToruNiina
Copy link
Contributor Author

Ah, sorry, I didn't noticed that you have already sent a Pull Request to revert this feature.

I personally have never felt the need to include control characters in my comments. And implementing this feature is not difficult. Therefore, I am fine whether it is included or not. Rejecting this request is acceptable to me.

However, I am a bit concerned that toml-lang/toml-test is in a different state than toml-lang/toml:master, considering it is, in some sense, an official repository. I think it would be clearer to synchronize this repo with toml-lang/toml:master (or the latest release, v1.0.0), even if we later would revert some features.

Of course, those are just my personal opinion and it makes sense that the more active contributor decides whether to add changes, considering the cleanliness of the history and the effort required.

In any case, thank you for your response.

@arp242
Copy link
Collaborator

arp242 commented Jun 18, 2024

Well, we can't actually add proper tests here until the TOML library it uses gets updated, and that is actually somewhat non-trivial due to how the code was written (it's not rocket science, but also more than "just update this list of characters"). This is also why this needs more tests IMO, because it's not always a trivial change for every implementation. So in that sense your PR is incomplete.

Like I said in the revert PR, this entire change doesn't even fix the reported problem of "this is hard to parse". And I think it's very hard to make the case that one variant is objectively better than the other.

If TOML was a new spec I wouldn't really care one way or the other: it's the proverbial bikeshed. But it's not a new spec. And I'm the one having to do the work. I'm not seeing anyone else maintain toml-test. All the people with oh-so-strong opinions on this have basically no contributions here, or anywhere else. Easy to propose changes if you're not the one having to do the work...

(Or we can just remove all these tests – looking at https://arp242.github.io/toml-test-matrix/ these are very common test failures among different libraries, and IMHO the restriction on any control character other than perhaps the null byte is of rather dubious value, as is testing it here – who really cares if some implementation "erroneously" accepts some control character?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants