🐛 JSON Parser: Unterminated string literal can make inner_string_text panic #2357
Comments
Wow, that's surprising!
Some ideas:
Any opinion?
I prefer the first option because I expected the parser to be strict and conformant to the spec, and to leave ambiguous tokens as bogus nodes, as Biome claims to do. But I also think the bogus part should be as small as possible, so that the bogus state does not propagate to the outer tree and parsing can recover as soon as possible. #2606 also reports other failing cases that trigger the error. That is another reason why I think option 2 is not ideal: we would have to evaluate all the possible failing cases and draw a line somewhere to decide whether something is unrecoverably malformed, which I think is error-prone.
Same here. The first approach seems better because I think it is closer to the user's intention.
I have started to work on this. However, I wonder: why is it actually wrong to treat an unterminated string as a string? The lexer already reports an error. Could we just change
Maybe because it can cause some unstable or unexpected behavior? I haven't tried whether this would be an issue, but here is an example I can think of: when a string ends with a newline and no closing quote, it is regarded as an unterminated string, but a formatter will still format it and may remove the newline. The unterminated string may then become something else, so the tree may be changed accidentally?
I am not familiar with the JSON formatter. However, the newline is part of the string, so the formatter should not remove it? EDIT: OK, I was wrong, the lexer doesn't include the newline in the string. See an example in the playground.
Environment information
What happened?
Our JSON parser allows JSON_STRING_LITERAL to include an unterminated string literal:
biome/crates/biome_json_parser/src/lexer/mod.rs, lines 573 to 579 in a5a67ed
biome/crates/biome_json_parser/src/lexer/mod.rs, lines 620 to 628 in a5a67ed
This makes the function inner_string_text panic when the unterminated string literal is just a single double quote ("):
biome/crates/biome_json_syntax/src/lib.rs, lines 118 to 123 in a5a67ed
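To illustrate the failure mode, here is a minimal sketch that works on a plain `&str` rather than on Biome's token types. Both function names are hypothetical stand-ins, not Biome's API. The sketch assumes the panic comes from stripping one delimiter from each end of the token text without first checking that both delimiters exist:

```rust
/// Naive version (hypothetical): assumes the token text always starts and
/// ends with a double quote and slices one byte off each end.
fn inner_text_unchecked(token_text: &str) -> &str {
    // For a lone `"` (length 1) this evaluates `&token_text[1..0]`,
    // which panics because the range start is greater than its end.
    &token_text[1..token_text.len() - 1]
}

/// Defensive version (hypothetical): returns `None` unless an opening and a
/// distinct closing quote are both present.
/// Assumption: escape sequences are not inspected here, so a text ending in
/// an escaped quote (`\"`) would still be accepted.
fn inner_text_checked(token_text: &str) -> Option<&str> {
    token_text.strip_prefix('"')?.strip_suffix('"')
}
```

With the checked variant, a lone `"` yields `None` instead of panicking, and the caller decides how to handle the malformed token. This only guards the call site; it does not settle whether the token should have been a string literal in the first place.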
The function inner_string_text is called in the JSON nursery rule noDuplicateJsonKeys:
biome/crates/biome_json_analyze/src/lint/nursery/no_duplicate_json_keys.rs, line 58 in a5a67ed
This nursery rule is enabled by default when debugging or in a nightly build. So when the VS Code extension's lspBin setting is changed to a local debug build and the extension scans a JSON file containing content like this, Biome will panic and keep reloading until it's killed permanently:
The above erroneous JSON can appear quite frequently when using code completion, so it will kill Biome often.
Expected result
An unterminated string literal in an AnyJsonValue place should be a JsonBogusValue instead of a JsonStringLiteral.
However, I don't know how to treat it when the unterminated string literal appears in the JsonMemberName place. The JSON grammar may need changing, and I don't have enough knowledge to modify the grammar.
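The expected behavior for the value position could be sketched as follows. This is a standalone illustration on `&str`, not Biome's lexer or parser: `ValueKind`, `scan_string`, and `classify_value` are hypothetical names, and the two variants only loosely mirror JsonStringLiteral and JsonBogusValue. It assumes, as noted in the comments above, that the scanner stops before a newline and does not include it in the token:

```rust
#[derive(Debug, PartialEq)]
enum ValueKind {
    StringLiteral, // stands in for JsonStringLiteral
    BogusValue,    // stands in for JsonBogusValue
}

/// Scans a string token that starts at an opening quote.
/// Returns the token length in bytes and whether a closing quote was found.
fn scan_string(input: &str) -> (usize, bool) {
    debug_assert!(input.starts_with('"'));
    let mut chars = input.char_indices().skip(1);
    while let Some((i, c)) = chars.next() {
        match c {
            // Closing quote: the token includes it.
            '"' => return (i + 1, true),
            // Skip the escaped character without validating it.
            '\\' => {
                chars.next();
            }
            // Unterminated: stop before the line break, excluding it.
            '\n' | '\r' => return (i, false),
            _ => {}
        }
    }
    // Reached end of input without a closing quote.
    (input.len(), false)
}

/// Classifies the token as a string literal only when it is terminated.
fn classify_value(input: &str) -> (ValueKind, &str) {
    let (len, terminated) = scan_string(input);
    let kind = if terminated {
        ValueKind::StringLiteral
    } else {
        ValueKind::BogusValue
    };
    (kind, &input[..len])
}
```

Keeping the bogus range limited to the unterminated token itself matches the preference voiced in the comments for making the bogus part as small as possible. How the same idea would apply in the JsonMemberName position is left open here, as in the issue.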