Unicode strings with codepoints at or below \u00ff are encoded as \x?? rather than \u00?? #146

Closed
obi1kenobi opened this issue Nov 18, 2015 · 2 comments

@obi1kenobi
Contributor

This is a different facet of the Unicode encoding issue mentioned in #129, and unfortunately PR #142 does not solve it. There are two underlying problems:

  • Python escapes codepoints at or below \u00ff as \x?? rather than \u00??, and OrientDB does not support the \x?? form.
  • On Python 2, pyorient uses str values internally. These are implicitly encoded as ASCII and therefore cannot correctly represent all of Unicode.
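
The first point can be seen without a database. Below is a minimal Python 3 sketch that uses the standard unicode_escape codec as a stand-in for whatever escaping pyorient applies internally (that equivalence is an assumption, not something confirmed in this thread). widen_hex_escapes is a hypothetical helper written for illustration only; it is not part of pyorient and not necessarily how the fix was done.

```python
def widen_hex_escapes(text):
    """Escape text with unicode_escape, rewriting each \\xNN escape as \\u00NN.

    Hypothetical helper for illustration; not a pyorient function.
    """
    parts = []
    for ch in text:
        # Escape one character at a time so a literal backslash in the
        # input (which becomes a doubled backslash) is never mistaken
        # for the start of a \xNN escape.
        escaped = ch.encode("unicode_escape").decode("ascii")
        if escaped.startswith("\\x"):
            # Python picked the short \xNN form; widen it to \u00NN.
            escaped = "\\u00" + escaped[2:]
        parts.append(escaped)
    return "".join(parts)


a_ring = "\u00c5"
default_escape = a_ring.encode("unicode_escape").decode("ascii")
widened_escape = widen_hex_escapes(a_ring)

print(default_escape)  # \xc5
print(widened_escape)  # \u00c5
```

Codepoints above \u00ff already come out of the codec in \u???? form, so the helper leaves them unchanged.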

To reproduce, take a unicode value that contains the \u00c5 character (LATIN CAPITAL LETTER A WITH RING ABOVE, which is also what the Angstrom sign normalizes to) and attempt to write it to OrientDB.
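
As a rough sketch of how such a test value can be built and checked in Python 3, using only the standard library: the snippet below normalizes the Angstrom sign (\u212b) to \u00c5 and shows that the result cannot be encoded as ASCII. The variable and function names are illustrative assumptions, and the actual write to OrientDB through pyorient is left out because it needs a running server.

```python
import unicodedata

# ANGSTROM SIGN; NFC normalization maps it to U+00C5.
angstrom_sign = "\u212b"
test_value = unicodedata.normalize("NFC", angstrom_sign)


def is_ascii_encodable(text):
    """Return True if text survives encoding with the ascii codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(unicodedata.name(test_value))    # LATIN CAPITAL LETTER A WITH RING ABOVE
print(is_ascii_encodable(test_value))  # False
```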

@Ostico
Collaborator

Ostico commented Apr 11, 2016

Can I close?

@obi1kenobi
Contributor Author

Ah yes, I think this was fixed. Closing.
