Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(playground): multi-byte character issue #647

Merged
merged 1 commit into from
Jun 22, 2024
Merged

Conversation

Sec-ant
Copy link
Member

@Sec-ant Sec-ant commented Jun 21, 2024

Summary

In Rust we use UTF-8 byte offsets for TextSize, which will be used in TextRange and ultimately be passed to the JavaScript side as the numbers in diagnostic.location.span. However, in JavaScript, we use UTF-16 code unit offsets to access the characters.

So when a character is represented by multiple bytes in UTF-8 encoding and is placed before or is part of the code that triggers a diagnostic, the playground will wrongly interpret its byte offset as a code unit offset, thus causing an out of bounds indexing error when it tries to highlight the range or make a selection of the code that triggers the diagnostic.

To fix this problem, we need to convert the UTF-8 byte offset into a code unit offset that can be used directly by JavaScript String methods like slice. I took the solution here and adapted it to fit our use case.

This conversion will inevitably introduce an O(n) overhead because the byte length of an UTF-8 encoded character is variable and we have to scan from the beginning to calculate the byte offset of a certain character in the middle. However, I think it's acceptable, because the code in the playground is usually not very long, and for it performance is also not our first consideration. The benefit is that we no longer have to worry about playground crashes triggered by non-English characters, especailly for users that use a second IME.

Closes biomejs/biome#1385
Closes biomejs/biome#3250

Copy link

netlify bot commented Jun 21, 2024

Deploy Preview for biomejs ready!

Name Link
🔨 Latest commit 9236a97
🔍 Latest deploy log https://app.netlify.com/sites/biomejs/deploys/6675c91009af390008fe31d3
😎 Deploy Preview https://deploy-preview-647--biomejs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@Sec-ant Sec-ant merged commit 3306509 into main Jun 22, 2024
6 checks passed
@Sec-ant Sec-ant deleted the fix/string-indexing branch June 22, 2024 02:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant