Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify text handling in DB Stores #670

Merged
merged 4 commits into from
Dec 3, 2020
Merged

Simplify text handling in DB Stores #670

merged 4 commits into from
Dec 3, 2020

Conversation

jakirkham
Copy link
Member

Fixes #669

LMDBStore had its own way of handling text before ensure_text was added to Numcodecs. With that utility function now available, we can simplify this a bit and avoid some subtleties of the LMDB implementation.

Similarly DBMStore had some of its own text handling issues that can be solved the same way.

Finally inline all str.encode calls to make it clearer what is happening in both LMDBStore and DBMStore.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@jakirkham jakirkham mentioned this pull request Dec 3, 2020
@jakirkham jakirkham changed the title Simp text handling in DB Stores Simplify text handling in DB Stores Dec 3, 2020
@Carreau
Copy link
Contributor

Carreau commented Dec 3, 2020

Would that bump the minimum version of lmdb ?

@codecov
Copy link

codecov bot commented Dec 3, 2020

Codecov Report

Merging #670 (ed1ffca) into master (a5dfc3b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #670   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           27        27           
  Lines        10094     10087    -7     
=========================================
- Hits         10094     10087    -7     
Impacted Files Coverage Δ
zarr/storage.py 100.00% <100.00%> (ø)

@jakirkham
Copy link
Member Author

Would that bump the minimum version of lmdb ?

Shouldn't affect it.

Basically ensure_text leverages the Python Buffer Protocol to coerce the data (bytes or other object) to a NumPy array without copying. Effectively this normalizes the data we are working with so we don't need to know about LMDB Buffer objects or whatever other object might come up ( MongoDB had a similar issue #401 ). Then we perform the decode step on the NumPy array, which we already know to work.

@Carreau Carreau added this to the v2.7 milestone Dec 3, 2020
@Carreau
Copy link
Contributor

Carreau commented Dec 3, 2020

Thanks for the explanation; let's get that in.

@Carreau Carreau merged commit 1464d90 into zarr-developers:master Dec 3, 2020
@jakirkham jakirkham deleted the simp_txt_coding_db_stores branch December 3, 2020 23:23
@jakirkham
Copy link
Member Author

Thanks Matthias! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failures on Python 3.10
2 participants