Remove UCS-2LE encoding/decoding? #9
Comments
The purpose of the encoding/decoding is speed optimisation. Because each UTF-8 character can take from 1 to 4 bytes, string functions run much slower on UTF-8 than on a fixed-width two-byte encoding such as UCS-2LE. Unfortunately, emoji do not seem to be covered by UCS-2LE. I'll think about how to handle emoji without losing the speed.
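To make the width argument concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the library's code). It shows that UTF-8 uses 1 to 4 bytes per character, and that a typical emoji has a code point above U+FFFF, so it needs a 4-byte surrogate pair in UTF-16 and cannot be expressed in two-byte UCS-2. The helper names `utf8_len` and `fits_in_ucs2` are hypothetical.

```python
# Sample characters: ASCII, Latin-1 supplement, euro sign, and an emoji (U+1F600).
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_ucs2(text: str) -> bool:
    """True if every code point fits in a single 16-bit unit (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


for ch in samples:
    # UTF-16 width is 2 bytes inside the 16-bit range, 4 bytes (surrogate pair) above it.
    utf16_len = len(ch.encode("utf-16-le"))
    print(f"U+{ord(ch):04X}: utf-8={utf8_len(ch)} bytes, "
          f"utf-16={utf16_len} bytes, fits_in_ucs2={fits_in_ucs2(ch)}")
```

The first three characters stay within 16 bits, which is why a two-byte fixed-width encoding handles them; the emoji does not.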
Thanks for the explanation. I guess one option could also be to try the
I've finally added support for emoji.
I've got the following example:
Which results in this exception:
So it's throwing an exception here when encoding the string to UCS-2LE with iconv:
If I comment this line and the line that encodes back to the original encoding:
then it works fine:
So I'm wondering: what is the purpose of this encoding/decoding? Could it be removed, or could it be skipped when the input string is UTF-8 (which I assume is the most commonly used format)? That would make the lib compatible with emoji and other valid UTF-8 strings.
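One way the "skip it" idea could look, sketched in Python purely for illustration: convert to the two-byte encoding only when every character fits, and otherwise keep the string as UTF-8. This is an assumption about a possible approach, not how the library works; `to_working_encoding` is a hypothetical helper, and Python's `utf-16-le` codec stands in for UCS-2LE (the two produce the same bytes for text without characters above U+FFFF).

```python
def to_working_encoding(text):
    """Return (encoded_bytes, encoding_name) for internal processing.

    Hypothetical helper: uses the fast two-byte form when possible and
    falls back to UTF-8 instead of raising on emoji and similar characters.
    """
    if all(ord(ch) <= 0xFFFF for ch in text):
        # Every character fits in 16 bits, so UTF-16LE here equals UCS-2LE.
        return text.encode("utf-16-le"), "utf-16-le"
    # At least one character needs more than 16 bits: skip the conversion.
    return text.encode("utf-8"), "utf-8"


plain = to_working_encoding("abc")
with_emoji = to_working_encoding("hi \U0001F600")
print(plain[1], with_emoji[1])
```

The trade-off is that strings containing such characters take the slower variable-width path, while everything else keeps the speed benefit described in the maintainer's reply.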