Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 The playground stops working correctly whenever I input code with multiple code points #1385

Closed
1 task done
togami2864 opened this issue Dec 31, 2023 · 6 comments · Fixed by biomejs/website#647
Closed
1 task done
Labels
S-Bug-confirmed Status: report has been confirmed as a valid bug

Comments

@togami2864
Copy link
Contributor

Environment information

The latest version of the biome playground

What happened?

The playground stops working correctly whenever I input code with multiple code points.
The cursor starts glitching, and when I press Enter, it reverts to the original settings because of the error like: RangeError: Position 16 is out of range for changeset of length 15.

issue.mp4

Playground

Expected result

It should work with code including multiple code points

Code of Conduct

  • I agree to follow Biome's Code of Conduct
@Conaclos Conaclos added the S-Bug-confirmed Status: report has been confirmed as a valid bug label Dec 31, 2023
@faultyserver
Copy link
Contributor

faultyserver commented Jan 1, 2024

Noting for future investigation, this error comes from inside of @codemirror/state: https://github.com/codemirror/state/blob/8bc5f63f73722c6019e6248d8338ea46802d157e/src/change.ts#L140.

I don't believe this is a bug in CodeMirror, just in the way that we are using it. But I'm not sure yet. I haven't traced through to see where it's actually being emitted from, but that is the root source of it.

@faultyserver
Copy link
Contributor

Investigated a little bit further: this only happens when there is a diagnostic that spans to the end of the text. Which I think implies that there's a difference in how we handle multi-byte characters versus how CodeMirror does (could be a great case of unicode string length between languages: https://hsivonen.fi/string-length/).

If you use let r = /[Á]/u instead (Playground Link), you'll see that you can add and remove characters at the end, because the diagnostic about using var is no longer there. But once you have another diagnostic on that line that reaches the end (like if you backspace into the regex), it will crash again, I think for the same reason of having a diagnostic past the end of the line.

The solution here might be to just do a bounds check on the diagnostics Biome creates to ensure they don't go too far (or convert from codepoints to visible column count to better match JS maybe?). I believe Biome is doing the correct calculations for byte positions here.

@arendjr
Copy link
Contributor

arendjr commented Mar 11, 2024

I believe Biome is doing the correct calculations for byte positions here.

Thanks, I think that highlights the problem indeed. It probably means that Biome assumes UTF8 encoding, which is a reasonable assumption in terminal environments and also what Rust uses internally. But JavaScript uses UTF16 codepoints instead, which indeed creates some discrepancies.

I think we might best do the conversions in the WASM<->JS layer, since there we might still have the strings in both formats, which I think will ease the conversion (you can use the UTF8 string to convert the byte index into a codepoint offset, then use the UTF16 string to compensate for surrogate pairs). But I haven’t worked with that layer in Biome yet, so I’m not sure to which extent it is generated.

@Sec-ant
Copy link
Member

Sec-ant commented Mar 11, 2024

But as demonstrated in #1941 (biome lint), this seems not a JS only issue. The lexer or parser has trouble dealing with these unicode characters.

@arendjr
Copy link
Contributor

arendjr commented Mar 11, 2024

Thanks, that’s a good observation! I’ll reopen that issue, because I now think there are two separate issues at play.

@Sec-ant
Copy link
Member

Sec-ant commented Mar 31, 2024

This might be helpful: https://stackoverflow.com/a/73096001

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-Bug-confirmed Status: report has been confirmed as a valid bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants