fix parsing of long chars #286

Merged
merged 1 commit into from
Jun 18, 2021

pfitzseb
Member

Fixes #.

For every PR, please check the following:

@@ -1009,6 +1010,7 @@ end""" |> test_expr
@test CSTParser.parse(raw"'\u222ää'").head == :errortoken
@test CSTParser.parse(raw"'\x222ää'").head == :errortoken
@test CSTParser.parse(raw"'\U222ää'").head == :errortoken
@test CSTParser.parse(raw"'\U10000001'").head == :errortoken

Julia's own parser actually rejects this literal as well:

julia> '\U10000001'
ERROR: syntax: invalid escape sequence

The largest valid code point is U+10FFFF.
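
As a quick check of that bound, here is a minimal sketch using only Base Julia (it is not part of this PR). It assumes `isvalid(Char, x)` with a `UInt32` argument, which reports whether the value is a valid Unicode code point for a `Char`.

```julia
# Sketch: probe the upper end of the Unicode code space with Base's isvalid.
max_cp = 0x0010FFFF                    # U+10FFFF as a UInt32

@assert isvalid(Char, max_cp)          # on the boundary: valid
@assert !isvalid(Char, 0x00110000)     # U+110000, one past the limit: invalid
@assert !isvalid(Char, 0x10000001)     # the value used in the new test: invalid
```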

Member Author

Yeah, that's basically what this test tests:

julia> CSTParser.parse(raw"'\U10000001'")
  1:12  errortoken( CSTParser.InvalidChar)
  1:12   CHAR: '\'

julia> CSTParser.parse(raw"'\U00000001'")
  1:12  CHAR: '\U00000001'

Could've chosen a less arbitrary codepoint though, I suppose.

Wouldn't it make sense to test the two values on either side? I.e. test that '\U10FFFF' is valid and that '\U110000' is invalid?
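
The boundary tests suggested here might look like the following sketch. It reuses the `.head == :errortoken` check from the diff above. The testset name is made up for illustration, and the assumption that a valid literal parses to something other than an `:errortoken` is mine rather than something confirmed in this thread.

```julia
using CSTParser, Test

# Hypothetical testset name; not taken from the CSTParser test suite.
@testset "char literal code point bounds" begin
    # U+10FFFF is the largest valid code point, so this should not be an error token.
    @test CSTParser.parse(raw"'\U10FFFF'").head != :errortoken
    # U+110000 is one past the limit and should be rejected.
    @test CSTParser.parse(raw"'\U110000'").head == :errortoken
end
```

Testing both sides of the limit pins the cutoff exactly, whereas a single far-out value like `'\U10000001'` would still pass if the limit were off by one.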

@pfitzseb pfitzseb requested a review from a team June 18, 2021 11:04
@pfitzseb pfitzseb added this to the Next Patch milestone Jun 18, 2021
@davidanthoff davidanthoff merged commit 3a56484 into master Jun 18, 2021
@davidanthoff davidanthoff deleted the sp/fix-long-char-parse branch June 18, 2021 18:30
3 participants