not a compliant parser #4
Thank you, I intended to do this at some point. Some of the errors seem to come from errors in your code that translates from my lib to a format which your tester understands (e.g. arrays without opening brackets, the UnicodeEncodeError). Some of them come from my parser being too liberal with the input it accepts. Others come from changes I made that weren't adequately tested (if they had been adequately tested they wouldn't cause test failures, obviously). I'm not sure how strictly I ought to enforce homogeneity in arrays, even though it is in the spec, because TOML is primarily for configuration rather than data interchange. |
No problem. I'm trying to get people moving and getting their parsers tested :-)
The only such test is
Why would you release a "parser for TOML" and purposefully not be compliant with the spec in such a big way? Homogeneity isn't just for data exchange, it's to ensure well-typed structured data. This is especially useful for static languages. Python isn't a static language, but that doesn't mean dynamic languages get their own version of the TOML spec. Moreover, true homogeneity is hopefully coming soon. If you won't implement the spec, I suggest that you make that clear somehow. |
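For readers following along, the homogeneity rule being argued about can be sketched in a few lines of Python. This is only an illustration, not code from the parser under discussion: the names `array_type` and `check_homogeneous` are made up here, datetimes are left out, and it assumes the reading of the spec in which nested arrays only have to be homogeneous individually.

```python
def array_type(value):
    """Return a coarse TOML type name for a parsed Python value (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    raise TypeError("unsupported value: %r" % (value,))


def check_homogeneous(items):
    """Raise ValueError if one array mixes types; return True otherwise."""
    kinds = {array_type(item) for item in items}
    if len(kinds) > 1:
        raise ValueError("mixed types in array: %s" % sorted(kinds))
    # Assumption: each nested array is checked on its own, so an array of
    # integers next to an array of strings is accepted.
    for item in items:
        if isinstance(item, list):
            check_homogeneous(item)
    return True
```

An empty array passes trivially, since it has no element types to compare.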
Yes, I realize now that I've been chopping off the opening brackets in arrays. How embarrassing. I fixed that, will push a commit to GitHub in a second.
This isn't some sticking point for me where I disagree with the spec and will refuse to enforce type homogeneity in arrays but rather, due to the way I threw together the parser, adding in enforcement for homogeneous arrays is something which I need to be careful about implementing. If someone is going to beat me to the punch and send in a pull request, I'll gladly take it :). I'll see if I can get around to implementing it this week or next regardless of other people's contributions. Especially if tuples make their way into the spec. |
No longer chop off opening brackets for nested arrays Ignore trailing backslashes
@uiri Ah, fair enough. I misinterpreted your intentions! I've already written an article that I intend to publish if and when true homogeneous arrays/tuples land; it talks about type checking everything (including the elusive empty list). Hopefully that will help. Also, I've updated the original gist with results from the latest pull. I've also fixed the floating point bug (temporarily). Things are looking better :-) |
Treat keynames as keygroups with respect to # Fixed bug with regards to escaping backslashes
I'm fairly certain that duplicate keys and empty keygroups are allowed. For the former, the latter key takes precedence and for the latter, the result is supposed to be an empty hash. The implicit-and-explicit-after seems to me to be a nontrivial fix like keeping track of the type of arrays. I'm not sure why string no close isn't raising an error and key-no-whitespace looks like something which should be allowed but I guess the spec doesn't agree with me on that one. Maybe some clarification is in order… I'll put in some more fixes tomorrow if I find the time. I think I touched on or fixed all the tests I was failing? |
From the spec:
Duplicate keys aren't allowed.
This is possible. The spec doesn't clarify it. I believe there are some open issues regarding it. I might remove the test until it's clarified. But since duplicate keys aren't allowed, there could only be one empty keygroup.
Not sure. You just need to keep track of whether a hash was created implicitly or explicitly. Sounds like an extra dictionary (or even better, a set) to me.
Mostly. There are a few others: string-escapes (something weird is going on there), key-with-pound, key-special-chars, float-no-leading-zero and float-no-trailing-digits. Thanks for your patience and helping me get the kinks worked out of |
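The "extra set" idea for implicit versus explicit keygroups could look roughly like the sketch below. It is an illustration under assumptions, not the fix that went into the parser: `define_table` is a hypothetical helper, and it encodes one interpretation of the spec, namely that a keygroup created implicitly by a child may be defined explicitly later, while defining the same keygroup explicitly twice is an error.

```python
def define_table(root, explicit, path):
    """Create (or reopen) the keygroup at `path` and return its dict.

    root     -- the top-level dict being built
    explicit -- set of key paths (tuples) that had their own [header]
    path     -- list of key names, e.g. ["a", "b", "c"] for [a.b.c]
    """
    path = tuple(path)
    if path in explicit:
        raise ValueError("keygroup [%s] defined twice" % ".".join(path))
    node = root
    for part in path:
        # Parents that do not exist yet are created implicitly.
        child = node.setdefault(part, {})
        if not isinstance(child, dict):
            raise ValueError("key %r is already a plain value" % part)
        node = child
    # Only the full path counts as explicitly defined, not its parents.
    explicit.add(path)
    return node
```

With this bookkeeping, `[a.b.c]` followed by `[a]` is accepted, and a second `[a]` is rejected.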
I was under the impression that what is forbidden by that is strictly that example - overwriting a key with a keygroup. I can't find it now for some reason, but I could have sworn I saw it clarified that in a badly done config file, the latter version of a key overrides the previous version.
Maybe the spec should be edited to clarify this, but see toml-lang/toml#30 (comment)
Yes, another thing to track in a parser which is relatively stateless…
I fixed those except the floats. I need to add some checks in for them, I think, but it doesn't seem to be a big deal.
Thank you for writing it; it is a lot more thorough than my own simple testing. |
See the end of issue #81. Part of the problem with file inclusion was one of keys being overwritten. mojombo seemed to implicitly reinforce this notion.
An empty key group can only be represented as
Part of the problem might be the combination of lexing and parsing into one process. A finite state machine (a lexer) works great for picking out syntactic categories, while a parser can use that information to enforce global constraints like duplicate keys, type restrictions, etc. |
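To make the lexer/parser split concrete, here is a deliberately tiny sketch. It is not how any of the parsers in this thread are written, and it only understands flat `[group]` headers and `key = integer` pairs; `lex` and `parse` are names invented for the example. The point is only that the lexer classifies lines, while the parser owns the global rules such as duplicate keys.

```python
import re

# Assumption: flat keygroup names and integer values only.
HEADER = re.compile(r"^\[([^\[\]]+)\]$")
PAIR = re.compile(r"^([^=\s]+)\s*=\s*(-?\d+)$")


def lex(text):
    """Yield ("header", name) or ("pair", key, value) tokens, one per line."""
    for line in text.splitlines():
        # Safe only because this sketch has no string values containing '#'.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            yield ("header", match.group(1))
            continue
        match = PAIR.match(line)
        if match:
            yield ("pair", match.group(1), int(match.group(2)))
            continue
        raise ValueError("cannot tokenize line: %r" % line)


def parse(tokens):
    """Build a dict from tokens, enforcing the global duplicate rules."""
    root = {}
    table = root
    for token in tokens:
        if token[0] == "header":
            name = token[1]
            if name in root:
                raise ValueError("duplicate keygroup or key: %r" % name)
            table = root[name] = {}
        else:
            _, key, value = token
            if key in table:
                raise ValueError("duplicate key: %r" % key)
            table[key] = value
    return root
```

Because the duplicate checks live in `parse`, tightening or relaxing them does not touch the tokenizing code.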
Not sure. I think it was that the included file would go under a keygroup which would introduce another possibility of overwriting a key with a keygroup.
I think you mean All the tests should now be passing except for the implicit-and-explicit, the mixed arrays and the empty keygroup ones. |
I don't know. My tests check that The latter is absolutely correct according to the current spec. The only point of debate here is whether |
All of the tests except the implicit and explicit after should be passing now... |
Nice work :-) Although this is what I get after a pull:
|
Fixed this just now :)
These are bogus tests. The spec only specifies integer size as 64-bit minimum. In fact, I think these should be in the valid tests to make sure people aren't using 32-bit integers.
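A "valid" test along those lines might check that the 64-bit boundary values survive a round trip through the integer parser. The sketch below is hypothetical and not part of toml-test: `handles_int64` takes any callable that turns text into an integer, and `wrap32` only simulates a parser that stores integers in 32 bits.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def handles_int64(parse):
    """Return True if `parse` preserves both 64-bit boundary values."""
    return all(parse(str(value)) == value for value in (INT64_MIN, INT64_MAX))


def wrap32(text):
    """Simulate a parser that truncates integers to signed 32 bits."""
    value = int(text) & 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value
```

Python's built-in `int` passes this check because its integers are arbitrary precision, while the simulated 32-bit parser fails it.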
I'm just incredulous that equal signs are allowed in key names. It just never occurred to me how wacky |
!!! I missed the word "minimum" in the spec. Very nice catch. Thank you :-) |
Wow nice work! All looks well, except for #3 in toml-test. I've been convinced that key= 1 is allowed. So your parser only fails one last test (which I added today): Keep in mind that the implicit/explicit group stuff is still open to interpretation. You don't have to listen to me. It's just my understanding and it could be wrong. We're still waiting for a clarification. (What we really need is an EBNF, and we can all stop trying to interpret the subtleties of English.) |
This sentence from the spec is starting to look like one of those pictures with two interpretations to me. The answer is, of course, to support both!
I'm not sure… it would seem to make sense that you can define a parent keygroup after it has children. Covering more cases can't be bad, right? |
Emphatically yes. One of the hardest parts of maintaining a project like TOML—and I don't envy @mojombo for this—is refraining from adding too much complexity. Simplicity is the ultimate sophistication. :-) |
The only whitespace needed on a key value pair is between the equal sign and the value
Can you send a pull request with your test script? Or is it OK if I include it? The license is MIT, you should probably be added to the copyright notice if your code is included. |
Nice! You pass all tests :-)
While I didn't explicitly specify a license with this test interface, all of my hobby code is released under the WTFPL. And it's just a 30 line test interface. It was in the gist (although my pull request might have a few tweaks since then). I used a virtually identical interface to test the rest of the Python TOML parsers :-) In short: do whatever you want. My name in the commit logs is good enough for me. |
Closing this since it passes all the tests :) |
I've created an interface for your parser to work with toml-test, which automatically tests your parser with a whole bunch of test cases. (I'm at 50 right now.)
Your parser fails quite a few tests.
(One of the tests I know is bogus because of floating point stuff. I'm trying to figure out how to fix that.)