Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use binary_type in MongoDBStore.__getitem__ for Python 2 only #401

Merged
merged 2 commits into from
Feb 13, 2019
Merged

Use binary_type in MongoDBStore.__getitem__ for Python 2 only #401

merged 2 commits into from
Feb 13, 2019

Conversation

jakirkham
Copy link
Member

@jakirkham jakirkham commented Feb 12, 2019

Previously there were some test failures on Python 2 and referenced in this comment (and later discussion). Turns out on Python 2 a bson.Binary instance is returned, which subclasses bytes, as explained in the docs. So the coercion to bytes is only needed on Python 2 to convert bson.Binary to bytes. As Python 3 already returns a bytes instance, there is nothing we need to do there. Thus we restrict the binary_type call to Python 2. Should avoid a copy on Python 3.

xref: #372

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • Docs build locally (e.g., run tox -e docs)
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@pep8speaks
Copy link

pep8speaks commented Feb 12, 2019

Hello @jakirkham! Thanks for updating the PR.

Line 4:80: E501 line too long (80 > 79 characters)
Line 8:80: E501 line too long (86 > 79 characters)
Line 9:80: E501 line too long (82 > 79 characters)
Line 10:80: E501 line too long (84 > 79 characters)
Line 11:80: E501 line too long (84 > 79 characters)
Line 12:80: E501 line too long (85 > 79 characters)
Line 13:80: E501 line too long (81 > 79 characters)
Line 14:80: E501 line too long (89 > 79 characters)
Line 43:80: E501 line too long (84 > 79 characters)
Line 44:80: E501 line too long (80 > 79 characters)
Line 102:80: E501 line too long (83 > 79 characters)
Line 125:80: E501 line too long (84 > 79 characters)
Line 151:80: E501 line too long (85 > 79 characters)
Line 152:80: E501 line too long (83 > 79 characters)
Line 164:80: E501 line too long (85 > 79 characters)
Line 210:80: E501 line too long (92 > 79 characters)
Line 281:80: E501 line too long (86 > 79 characters)
Line 308:80: E501 line too long (80 > 79 characters)
Line 317:80: E501 line too long (90 > 79 characters)
Line 326:80: E501 line too long (85 > 79 characters)
Line 327:80: E501 line too long (80 > 79 characters)
Line 377:80: E501 line too long (81 > 79 characters)
Line 380:80: E501 line too long (83 > 79 characters)
Line 382:80: E501 line too long (89 > 79 characters)
Line 406:80: E501 line too long (94 > 79 characters)
Line 467:80: E501 line too long (80 > 79 characters)
Line 673:80: E501 line too long (80 > 79 characters)
Line 705:80: E501 line too long (84 > 79 characters)
Line 706:80: E501 line too long (83 > 79 characters)
Line 754:80: E501 line too long (83 > 79 characters)
Line 765:80: E501 line too long (87 > 79 characters)
Line 922:80: E501 line too long (80 > 79 characters)
Line 937:80: E501 line too long (80 > 79 characters)
Line 1014:80: E501 line too long (85 > 79 characters)
Line 1024:80: E501 line too long (82 > 79 characters)
Line 1025:80: E501 line too long (83 > 79 characters)
Line 1072:80: E501 line too long (80 > 79 characters)
Line 1074:80: E501 line too long (83 > 79 characters)
Line 1085:80: E501 line too long (80 > 79 characters)
Line 1097:80: E501 line too long (80 > 79 characters)
Line 1104:80: E501 line too long (88 > 79 characters)
Line 1108:80: E501 line too long (82 > 79 characters)
Line 1117:80: E501 line too long (88 > 79 characters)
Line 1126:80: E501 line too long (84 > 79 characters)
Line 1127:80: E501 line too long (83 > 79 characters)
Line 1141:80: E501 line too long (87 > 79 characters)
Line 1145:80: E501 line too long (80 > 79 characters)
Line 1159:80: E501 line too long (89 > 79 characters)
Line 1333:80: E501 line too long (85 > 79 characters)
Line 1336:80: E501 line too long (88 > 79 characters)
Line 1346:80: E501 line too long (80 > 79 characters)
Line 1359:80: E501 line too long (80 > 79 characters)
Line 1361:80: E501 line too long (80 > 79 characters)
Line 1365:80: E501 line too long (84 > 79 characters)
Line 1376:80: E501 line too long (80 > 79 characters)
Line 1388:80: E501 line too long (80 > 79 characters)
Line 1390:80: E501 line too long (80 > 79 characters)
Line 1394:80: E501 line too long (82 > 79 characters)
Line 1395:80: E501 line too long (83 > 79 characters)
Line 1418:80: E501 line too long (83 > 79 characters)
Line 1457:80: E501 line too long (83 > 79 characters)
Line 1460:80: E501 line too long (87 > 79 characters)
Line 1527:80: E501 line too long (83 > 79 characters)
Line 1536:80: E501 line too long (87 > 79 characters)
Line 1547:80: E501 line too long (80 > 79 characters)
Line 1560:80: E501 line too long (80 > 79 characters)
Line 1562:80: E501 line too long (80 > 79 characters)
Line 1566:80: E501 line too long (84 > 79 characters)
Line 1572:80: E501 line too long (86 > 79 characters)
Line 1573:80: E501 line too long (90 > 79 characters)
Line 1575:80: E501 line too long (87 > 79 characters)
Line 1576:80: E501 line too long (86 > 79 characters)
Line 1583:80: E501 line too long (89 > 79 characters)
Line 1584:80: E501 line too long (87 > 79 characters)
Line 1591:80: E501 line too long (83 > 79 characters)
Line 1592:80: E501 line too long (84 > 79 characters)
Line 1593:80: E501 line too long (89 > 79 characters)
Line 1597:80: E501 line too long (84 > 79 characters)
Line 1598:80: E501 line too long (87 > 79 characters)
Line 1701:80: E501 line too long (81 > 79 characters)
Line 1702:80: E501 line too long (80 > 79 characters)
Line 1711:80: E501 line too long (87 > 79 characters)
Line 1720:80: E501 line too long (90 > 79 characters)
Line 1727:80: E501 line too long (91 > 79 characters)
Line 1731:80: E501 line too long (91 > 79 characters)
Line 1749:80: E501 line too long (82 > 79 characters)
Line 1750:80: E501 line too long (89 > 79 characters)
Line 1755:80: E501 line too long (82 > 79 characters)
Line 1797:80: E501 line too long (83 > 79 characters)
Line 1813:80: E501 line too long (86 > 79 characters)
Line 1861:80: E501 line too long (86 > 79 characters)
Line 1898:80: E501 line too long (80 > 79 characters)
Line 2110:80: E501 line too long (80 > 79 characters)
Line 2215:80: E501 line too long (80 > 79 characters)
Line 2278:80: E501 line too long (80 > 79 characters)
Line 2292:80: E501 line too long (86 > 79 characters)
Line 2325:80: E501 line too long (85 > 79 characters)

Comment last updated on February 13, 2019 at 06:37 Hours UTC

@jakirkham jakirkham mentioned this pull request Feb 12, 2019
7 tasks
@jhamman
Copy link
Member

jhamman commented Feb 12, 2019

Still seems to be failing. This by the way was why I opened #393.

On both Python 2 and Python 3, PyMongo converts `binary_type` (i.e.
`bytes`) to BSON type 5 (Binary data) with subtype 0 to store in the
MongoDB instance. When reading that data back out, PyMongo handles BSON
type 5 (Binary data) with subtype 0 differently depending on the Python
version.

On Python 2, it creates a `bson.Binary` instance with the data. Normally
we would coerce this to a `bytes` object using `ensure_bytes`. However
that fails as `bson.Binary` is a subclass of `bytes`. So instead we
explicitly force it to `bytes` (i.e.  `binary_type`). On Python 3,
PyMongo automatically converts the data to `bytes`. Thus we don't need
to do anything there.

ref: http://api.mongodb.com/python/current/python3.html#id3
ref: http://api.mongodb.com/python/current/api/bson/binary.html#bson.binary.Binary
@jakirkham jakirkham changed the title Drop binary_type cast in MongoDBStore.__getitem__ Use binary_type in MongoDBStore.__getitem__ for Python 2 only Feb 13, 2019
@jakirkham
Copy link
Member Author

That's right. Just wanted to see if that was still the case after the dust settled.

After doing a little research, it looks like we can safely constrain this coercion to Python 2 only. Have updated the PR to do that. Should make it easier for us to remember to remove when we do drop Python 2.

Could you please give it another look? :)

@jhamman
Copy link
Member

jhamman commented Feb 13, 2019

yep, this looks right. I agree, this will make it easier to undo when PY2 goes away. Thanks.

@jakirkham
Copy link
Member Author

Thanks @jhamman :)

@jakirkham jakirkham merged commit 3c0e0f5 into zarr-developers:master Feb 13, 2019
@jakirkham jakirkham deleted the drop_binary_type_cast branch February 13, 2019 19:31
@jakirkham jakirkham added this to the v2.3 milestone Feb 13, 2019
@alimanfoo
Copy link
Member

Thanks @jakirkham for this, nice to know exactly what's going on 👍

@jakirkham
Copy link
Member Author

Of course. 😄

FWIW I think we can get a better handle on this behavior by adding a flag to the ensure_* functions to handle subclasses of the expected type (e.g. bytes in this case). Opened PR ( zarr-developers/numcodecs#173 ) along these lines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants