Chunked writing of h5py.Dataset and zarr.Array #1624
should any of this be configurable?
As things stand, the value depends on both the shape and an arbitrary cutoff in `max`, so given the current implementation we could make 2-3 things configurable, which seems like overkill. Perhaps just `n_rows` should be a setting, with `1000` as the default?
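To make the proposal concrete, here is a minimal sketch of what a single row-count setting could look like. It is not the PR's implementation: `DEFAULT_N_ROWS`, `iter_row_slices`, and `copy_in_row_chunks` are hypothetical names, and the sketch assumes only that the source and target support `len()` and row slicing (as `h5py.Dataset` and `zarr.Array` do).

```python
from collections.abc import Iterator

# Hypothetical setting: the default of 1000 rows is the value proposed above.
DEFAULT_N_ROWS = 1000


def iter_row_slices(total_rows: int, n_rows: int = DEFAULT_N_ROWS) -> Iterator[slice]:
    """Yield consecutive row slices that cover ``range(total_rows)``."""
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    for start in range(0, total_rows, n_rows):
        # The last slice is clipped so it never runs past the end.
        yield slice(start, min(start + n_rows, total_rows))


def copy_in_row_chunks(src, dst, n_rows: int = DEFAULT_N_ROWS) -> None:
    """Copy ``src`` into ``dst`` one block of rows at a time.

    Only ``n_rows`` rows of ``src`` are materialised per step, so the whole
    source never has to be loaded at once.
    """
    for rows in iter_row_slices(len(src), n_rows):
        dst[rows] = src[rows]
```

With one knob like this, the shape-dependent calculation and the cutoff would no longer need to be exposed separately; whether that trade-off is acceptable is the open question in this thread.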
We already have `anndata/src/anndata/_io/h5ad.py`, line 176 in df213f6.
Also `anndata/src/anndata/_io/zarr.py`, line 30 in df213f6.
Let’s not create multiple different/incompatible conventions / features under the same name.
So this is an argument for not calling it `chunk_size`? I wasn't proposing literally calling it `n_rows`, just that that variable be the setting, as opposed to `entry_chunk_size` or the `max` value.
It’s an argument for keeping our terminology consistent when we get around to making this configurable. But we can also not do that for now.