Use xarray's encode_cf / decode_cf functions to handle CF conventions #157
Comments
I've been looking at this more, in the context of fixing #189. I think we could probably almost use
Aligning what we're doing in virtualizarr with this would be another reason to implement a virtualizarr xarray backend engine, see #35. cc @ayushnag
I think the idea of a virtualizarr backend is appealing. One way to implement it is to load the actual data file and then also create and store the byte references in the virtual dataset. This way the dataset structure creation and data loading are handled by xarray, and the byte references are just an add-on. This all hinges on whether doing both data loading and reference creation takes much extra time compared to doing just one. As a side note, this might make the reference creation process (#87) simpler as well. Instead of searching for attrs, encoding, dimension names, etc., the "chunk reader" only needs to create a low-level chunk manifest (bytes, offset, path). The rest of the information is retrieved from the netCDF file by xarray. I'm not sure whether that is actually a time-consuming part of reference creation, however. It would also add the ability to easily inline data (#62). Basically the idea is that
Thanks @ayushnag. I think your suggestion is almost the opposite of what I'm suggesting doing in this issue, so I moved it to #221 to discuss there. (Opposite in that I'm suggesting calling just part of xarray's
I think your suggestion makes sense and yes my suggestion was more related to the backend issue. My only concern was that if the solution is to add this to Although if
I completely agree, which is why I've been looking into which parts of xarray we need. I think it's just
Either that or make some small modifications to xarray upstream so that we can import and use
We might be able to replace some logic (especially that implemented in #156) with a call to `xarray.decode_cf`. This function can accept a `Dataset` and return a new dataset with different attributes and encoding etc. I think it's used internally in some of xarray's backends.

Also there is an idea to expose a corresponding `xarray.encode_cf` function in xarray, which we might also be able to use (see pydata/xarray#4412).
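For reference, here is a minimal sketch of what `xarray.decode_cf` does to an ordinary in-memory `Dataset`. The variable names, values, and attributes are made up for illustration, and this is not how virtualizarr currently calls it; it only shows CF attributes moving from `.attrs` into `.encoding` while the values get decoded.

```python
import numpy as np
import xarray as xr

# A "raw" dataset as it might look straight off disk: packed int16 values
# with CF packing attributes, and a time axis stored as plain integers.
# All names and values here are illustrative assumptions.
raw = xr.Dataset(
    {
        "temp": (
            ("time",),
            np.array([0, 10, -999], dtype="int16"),
            {
                "scale_factor": 0.5,
                "add_offset": 10.0,
                "_FillValue": np.int16(-999),
            },
        )
    },
    coords={
        "time": (
            ("time",),
            np.array([0, 1, 2]),
            {"units": "days since 2000-01-01"},
        )
    },
)

# decode_cf returns a new Dataset; `raw` itself is left untouched.
decoded = xr.decode_cf(raw)

# The CF attributes have moved from .attrs to .encoding ...
print(decoded["temp"].attrs)
print(decoded["temp"].encoding)

# ... and the values are unpacked as raw * scale_factor + add_offset,
# with the fill value masked to NaN and the time axis turned into datetimes.
print(decoded["temp"].values)
print(decoded["time"].values)
```

For virtual datasets the interesting part is presumably the attrs-to-encoding bookkeeping rather than the value decoding, since the array values are byte references, not loaded data.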