encoding CESM1-1-CAM5-CMIP5 #5

Closed
aaronspring opened this issue Nov 6, 2019 · 7 comments

Comments

@aaronspring
Contributor

This might be a bit off-topic, but it fits if you regard the package as making CMIP6 work in xarray.

When I try to save CESM1-1-CAM5-CMIP5 data to netCDF with ds.to_netcdf(), I get the following warning:
Variable 'tas' has multiple fill values [1e+20, 1e+20]. Cannot encode data.

I haven't tested this for more variables and models.

@bradyrx

bradyrx commented Nov 6, 2019

Yes I have seen this with CESM2 and am not sure what the deal is. Clearly something with their fill values.

@jbusecke
Owner

jbusecke commented Nov 7, 2019

Thanks for using cmip6_preprocessing!
My guess is that it is an artifact of concatenating the files. Do you get the same issue when you omit the preprocessing argument?

If that is the case, I would raise an issue over at intake-esm just to make sure they are aware of it, and perhaps they can fix this cleanly upstream. If that should not work we could think about a preprocessing function. Have you tried to remove the encoding attrs altogether and see if that fixes things?
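As a rough illustration of the "remove the encoding attrs" suggestion, here is a minimal sketch. The helper name `drop_conflicting_fill_encoding` is hypothetical and not part of xarray or cmip6_preprocessing. It only assumes the object exposes a `.variables` mapping whose values carry an `.encoding` dict, which is how xarray Datasets are laid out.

```python
def drop_conflicting_fill_encoding(ds, keep="_FillValue", drop="missing_value"):
    """Hypothetical helper: where a variable's encoding defines both keys,
    remove `drop` so that only one fill value is left for the writer."""
    changed = []
    for name, var in ds.variables.items():
        encoding = var.encoding
        if keep in encoding and drop in encoding:
            encoding.pop(drop)  # mutates the variable's encoding in place
            changed.append(name)
    return changed
```

With an xarray Dataset this would be called as `drop_conflicting_fill_encoding(ds)` right before `ds.to_netcdf(...)`. Whether dropping `missing_value` rather than `_FillValue` is the right choice for a given file is an assumption of the sketch.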

cc'ing @andersy005

@dcherian

dcherian commented Nov 8, 2019

> Variable 'tas' has multiple fill values [1e+20, 1e+20]. Cannot encode data.

This is a conflict between missing_value and _FillValue. I assume one is float32 and the other is not; or roundoff error is making the equivalent check fail.
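To make the float32 guess concrete, here is a small NumPy sketch. It assumes `_FillValue` was stored as float32 and `missing_value` as float64; the thread does not confirm which dtypes the CESM files actually use.

```python
import numpy as np

# Assumption: the two attributes carry the same nominal value in different dtypes.
fill_value = np.float32(1e20)
missing_value = np.float64(1e20)

# Both have the same short string form, which matches the confusing message.
print(str(fill_value), str(missing_value))

# float32 cannot represent 1e20 exactly, so an exact comparison
# (after promotion to float64) reports a mismatch.
values_match = bool(fill_value == missing_value)
print(values_match)

# The size of the round-off difference, in float64.
difference = float(fill_value) - float(missing_value)
print(difference)
```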

https://github.com/pydata/xarray/blob/3bb0414f1f45890607bfe178f64577c5936d0432/xarray/coding/variables.py#L149-L159

An xarray PR would be very welcome!

@andersy005

@dcherian,

> This is a conflict between missing_value and _FillValue. I assume one is float32 and the other is not; or roundoff error is making the equivalent check fail.

What is the preferred behavior in this case? type casting?

@dcherian

dcherian commented Nov 8, 2019

Maybe. You would still run into round-off error, though, so the proper check may be to use allclose instead of equivalent (maybe). I suggest you open a PR, and people who know more can comment ;)
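A minimal sketch of what a tolerance-based check could look like, using `np.allclose`. The helper name `fill_values_conflict` and the tolerances (NumPy's defaults) are assumptions for illustration, not the check xarray uses.

```python
import numpy as np

def fill_values_conflict(fv, mv, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: report a conflict only if the two fill values
    differ by more than the given tolerance. Two NaNs count as equal."""
    return not np.allclose(fv, mv, rtol=rtol, atol=atol, equal_nan=True)

# float32 vs float64 versions of the same nominal value no longer conflict.
same_nominal = fill_values_conflict(np.float32(1e20), np.float64(1e20))

# Genuinely different fill values are still flagged.
different = fill_values_conflict(np.float64(1e20), np.float64(-999.0))
```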

On a related note, Keith has pointed out that we have a related error on the encode step. We should be casting both _FillValue and missing_value to the same dtype as the data in encode(..). Otherwise xarray can write a file where the dtypes differ for the two and then we fail on the decode step.
EDIT: currently netCDF4 is casting _FillValue for us. I think we should cast both _FillValue and missing_value explicitly before handing off to netCDF4.
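The explicit cast described above could be sketched as follows. `cast_fill_encoding` is a hypothetical helper, not xarray's encode(..) implementation, and it assumes floating-point data (casting 1e20 to an integer dtype would need separate handling).

```python
import numpy as np

def cast_fill_encoding(data, encoding):
    """Hypothetical helper: return a copy of `encoding` in which _FillValue
    and missing_value are cast to the dtype of `data` (assumed floating point)."""
    out = dict(encoding)
    for key in ("_FillValue", "missing_value"):
        if key in out:
            out[key] = data.dtype.type(out[key])
    return out

data = np.zeros(3, dtype="float32")
encoding = {"_FillValue": np.float32(1e20), "missing_value": np.float64(1e20)}
casted = cast_fill_encoding(data, encoding)
```

After the cast both entries are float32 and compare equal, so a written file would carry matching dtypes for the two attributes.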

These should both be easy fixes. The annoying bit will be writing tests but even that should be OK. I highly encourage you to fix these two bugs.

@andersy005

> These should both be easy fixes. The annoying bit will be writing tests but even that should be OK. I highly encourage you to fix these two bugs.

Will definitely look into these some time today.

@dcherian

dcherian commented Nov 8, 2019

Looking forward to that PR
