Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated date and unicode handling code. #142

Merged
merged 14 commits into from
Dec 9, 2015
Merged

Updated date and unicode handling code. #142

merged 14 commits into from
Dec 9, 2015

Conversation

obi1kenobi
Copy link
Contributor

Partially addresses #129, in that it shows how to avoid the error (see the updated test case) but does not provide a comprehensive solution.

@obi1kenobi
Copy link
Contributor Author

The CI build failed on a transient connection error, not my fault this time :)

@obi1kenobi
Copy link
Contributor Author

To be clear, here's what this PR suggests as a partial solution.

The binary protocol decoder, on python 2, implicitly decodes received strings to UTF-8. This breaks the "what you put in is what you get out" expectation, because a user could put in a unicode object but get it back as a UTF-8 encoded str. The proposed (clunky but working) fix, for python 2, is that whenever the user expects a particular string object to be of unicode type rather than str, they should transform it using UTF-8 manually upon receiving it back from pyorient. This is exactly how the updated Unicode test works.

@obi1kenobi
Copy link
Contributor Author

Here's an alternative which is slightly more drastic, but is perhaps a cleaner solution from a technical standpoint -- and I'd love to hear your feedback on this.

All strings that pyorient receives are considered to be of type unicode. Any str objects are always implicitly converted to unicode using UTF-8. pyorient always returns unicode objects for fields of type String. This would be consistent across python 2 and 3, and not require that the programmer using the library remember to cast to unicode upon receiving the data back.

@TropicalPenguin
Copy link
Collaborator

I like the maintainability this would add, abstracting away differences between python 2 and 3 usage.

@obi1kenobi
Copy link
Contributor Author

The additional upside by moving to all-unicode when using pyorient is that it would let pyorient only use unicode on the inside, thereby fixing a few more problems I discovered. If you'd like, I can prepare a PR that converts all string literals that pyorient uses to unicode. I feel like that would make it much easier to reason about correctness and much less likely to accidentally have different behavior on python 2 vs 3.

@obi1kenobi
Copy link
Contributor Author

Now also fixes #146, and subsumes #141.

@obi1kenobi
Copy link
Contributor Author

Moving all of pyorient to use unicode internally proved to be difficult, as I don't understand much of the code. I decided to leave that to you @TropicalPenguin and @Ostico.

This PR is consistent with that goal, but still requires python 2 users to manually convert any received string data to unicode if it was passed in as unicode. Please review and let me know what you think!

@obi1kenobi
Copy link
Contributor Author

@mogui @Ostico thoughts on merging this soon?

@obi1kenobi
Copy link
Contributor Author

Sorry to keep poking at this, but I'd really appreciate it if we could fix this set of bugs before the holidays start -- they are major concerns on the project I'm working on.

mogui added a commit that referenced this pull request Dec 9, 2015
Updated date and unicode handling code.
@mogui mogui merged commit 6898cde into mogui:develop Dec 9, 2015
@mogui
Copy link
Owner

mogui commented Dec 9, 2015

sorry @obi1kenobi for the delay :P

@obi1kenobi
Copy link
Contributor Author

No worries! Thank you for getting to it :)

@obi1kenobi obi1kenobi deleted the more_unicode_and_date_fixes branch December 9, 2015 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants