Are partial chunk reads supported, as is the case in zarr-python for datasets using Blosc compression? (see this issue and this merged PR).
We're interested in accessing extremely large public datasets (tens to hundreds of TB) with chunks as large as 100 MB from a web application. Given their size, it's unlikely that we can create new copies with a more web-manageable chunk size (say, 1-2 MB). Any ideas?
Currently it doesn't have any special support for these queries, though it is technically possible, I presume (with an HTTP Range request header to specify which part you want to read). I had a look at the merged PR, and from what I understand it actually "reads" the entire file and then decompresses only part of it. Of course, "reading" means something different locally (one can "open" a local file and then access only part of it); on the web we have to do this through range requests.
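For illustration, here's a minimal sketch of what a range-based partial read could look like in the browser. The helper names (`fetchChunkRange`, `readBloscHeader`) are hypothetical, not part of any existing API, and it assumes the server supports HTTP range requests and that chunks use the standard 16-byte Blosc1 header layout:

```ts
// Sketch: fetch only a byte range of a chunk via an HTTP Range request.
// Assumes the server honors range requests (responds 206 Partial Content).
async function fetchChunkRange(
  url: string,
  start: number,
  end: number, // inclusive, per the Range header convention
): Promise<ArrayBuffer> {
  const response = await fetch(url, {
    headers: { Range: `bytes=${start}-${end}` },
  });
  if (response.status !== 206) {
    throw new Error(`Server did not honor range request: ${response.status}`);
  }
  return response.arrayBuffer();
}

// Example: read just the 16-byte Blosc header of a chunk to learn its
// block size and compressed size before deciding which bytes to fetch.
async function readBloscHeader(url: string) {
  const buf = await fetchChunkRange(url, 0, 15);
  const view = new DataView(buf);
  return {
    typesize: view.getUint8(3),
    nbytes: view.getUint32(4, true),    // uncompressed size
    blocksize: view.getUint32(8, true), // size of each compressed block
    cbytes: view.getUint32(12, true),   // total compressed size
  };
}
```

The actual partial decompression (mapping the requested array slice to Blosc blocks, as the zarr-python PR does in-process) would still need to be implemented on top of this.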
I'm of course happy to accept a PR for this behavior. Otherwise, perhaps the best way to solve this is with an intermediate service on a server that takes requests for a smaller chunk size, translates them to the larger chunks, and serves those partially (it should probably also cache the parts that are accessed often). I hope that makes sense!
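To make that suggestion concrete, here's a rough sketch of such a proxy under stated assumptions: Node 18+ (global `fetch`), a hypothetical upstream store URL, a naive unbounded in-memory cache, and byte-offset slicing only (a real service would also decompress and re-slice along array dimensions, and evict from the cache):

```ts
import http from "node:http";

// Sketch of a rechunking proxy: clients ask for small slices and the proxy
// fetches (and caches) the full upstream chunk, serving byte slices of it.
const UPSTREAM = "https://example.com/data.zarr"; // hypothetical store URL
const cache = new Map<string, ArrayBuffer>();     // naive in-memory cache

http.createServer(async (req, res) => {
  try {
    // Expected path: /<chunk-key>?offset=<bytes>&length=<bytes>
    const url = new URL(req.url ?? "/", "http://localhost");
    const key = url.pathname;
    const offset = Number(url.searchParams.get("offset") ?? 0);
    const length = Number(url.searchParams.get("length") ?? 0);

    let chunk = cache.get(key);
    if (!chunk) {
      const upstream = await fetch(`${UPSTREAM}${key}`);
      chunk = await upstream.arrayBuffer();
      cache.set(key, chunk); // a real service would bound and evict this
    }

    const slice = chunk.slice(offset, offset + length);
    res.writeHead(200, { "Content-Type": "application/octet-stream" });
    res.end(Buffer.from(slice));
  } catch {
    res.writeHead(502);
    res.end();
  }
}).listen(8080);
```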
cc @manzt