-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow arbitrary xarray methods to be applied to climpred objects #243
Conversation
So my first goal is to get this to work not inplace. I.e., the application of Notes:
Issues:
Any thoughts @ahuang11, @andersy005, @aaronspring? This is all pretty new to me so I'm having trouble getting this to work as desired. working case: import numpy as np
import xarray as xr
import climpred as cp
fake_dat = np.random.rand(30, 10,)
hind = xr.DataArray(fake_dat, dims=['init', 'lead',])
hind['init'] = np.arange(1950, 1980)
hind['lead'] = np.arange(10)
hind.name = 'SST'
# Only for the demo. Will remove the PredictionEnsemble import
# for the final version
hind = cp.PredictionEnsemble(hind)
# Original dataset
print(hind._datasets)
>>> {'initialized': <xarray.Dataset>
Dimensions: (init: 30, lead: 10)
Coordinates:
* init (init) int64 1950 1951 1952 1953 1954 ... 1975 1976 1977 1978 1979
* lead (lead) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
SST (init, lead) float64 0.7234 0.6834 0.6956 ... 0.2756 0.3057 0.3313}
# Select the first lead dimension
hind = hind.isel(lead=0)
print(hind._datasets)
>>> {'initialized': <xarray.Dataset>
Dimensions: (init: 30)
Coordinates:
* init (init) int64 1950 1951 1952 1953 1954 ... 1975 1976 1977 1978 1979
lead int64 0
Data variables:
SST (init) float64 0.01507 0.01859 0.3588 ... 0.2327 0.02468 0.3506}
# Slice over init
hind = hind.isel(init=slice(0, 10))
print(hind._datasets)
>>> {'initialized': <xarray.Dataset>
Dimensions: (init: 10)
Coordinates:
* init (init) int64 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959
lead int64 0
Data variables:
SST (init) float64 0.01507 0.01859 0.3588 ... 0.8931 0.002805 0.3095} weird issue: # Open up with same data production as above example.
print(hind._datasets)
>>> {'initialized': <xarray.Dataset>
Dimensions: (init: 30, lead: 10)
Coordinates:
* init (init) int64 1950 1951 1952 1953 1954 ... 1975 1976 1977 1978 1979
* lead (lead) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
SST (init, lead) float64 0.1705 0.8209 0.5546 ... 0.774 0.03419 0.1477}
# Cool, inplace doesn't work.
hind.isel(lead=0)
print(hind._datasets)
>>> {'initialized': <xarray.Dataset>
Dimensions: (init: 30, lead: 10)
Coordinates:
* init (init) int64 1950 1951 1952 1953 1954 ... 1975 1976 1977 1978 1979
* lead (lead) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
SST (init, lead) float64 0.1705 0.8209 0.5546 ... 0.774 0.03419 0.1477}
# Now try to do it the right way, and it breaks. Somehow doing it inplace modifies things?
hind = hind.isel(lead=0)
>>> TypeError: 'PredictionEnsemble' object is not callable |
ds = load_dataset('MPI-PM-DP-1D')['tos'].isel(period=-1, area=1)
control = load_dataset('MPI-control-1D')['tos'].isel(period=-1, area=1)
pm = climpred.PerfectModelEnsemble(ds)
pm.add_control(control)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-94c92430995c> in <module>
----> 1 pm.add_control(control)
~/Coding/climpred/climpred/checks.py in wrapper(*args, **kwargs)
53 # to the actual function call rather than showing a simple Exception
54 # (probably IndexError from trying to subselect an empty dec_args list)
---> 55 return func(*args, **kwargs)
56
57 return wrapper
~/Coding/climpred/climpred/classes.py in add_control(self, xobj)
261 if isinstance(xobj, xr.DataArray):
262 xobj = xobj.to_dataset()
--> 263 match_initialized_dims(self.initialized, xobj)
264 match_initialized_vars(self.initialized, xobj)
265 self.control = xobj
~/Coding/climpred/climpred/checks.py in match_initialized_dims(init, ref, uninitialized)
103 # temporarily rename to time.
104 init = init.rename({'init': 'time'})
--> 105 init_dims = list(init.dims)
106 if 'lead' in init_dims:
107 init_dims.remove('lead')
TypeError: 'function' object is not iterable |
Your example seems to work when not overwriting the object. I use a new name. Maybe you have to delete the object first: hind2 = hind.isel(lead=0)
hc2._datasets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a general and smart solution. work for the initialized part. missing so far: add_control/reference
I get why xarray doesnt want inplace operations, but to add a reference to the object, this should be ok, isnt? See https://github.com/bradyrx/climpred/issues/130 |
Yep, these aren't expected to work in its current form. I wanted to solve the inplace issue before I apply it to all objects. But the proposed solution is instead of having attributes with dictionaries (e.g. hindcast.reference = {}, hindcast.uninitialized = {}), we'll just have one large nested dictionry called hindcast._datasets, which makes looping thru and list comprehension much easier.
I get the same issue when not overwriting. The problem is that if you do it interactively first, it breaks. Which is a huge problem to me. I like to check stuff interactively before assigning them variable names, if that makes sense. I can't work out what in the code would cause this and stack exchange isn't helping (or I'm not searching the right thing)
Yeah I get it. It's a design decision, and we should be in line with xarray. I think it's better to be explicit (i.e. not doing it inplace) so that you can check stuff interactively (as I said above) but not permanently change your variable. |
b89926c
to
df27fe7
Compare
Okay I addressed the main comments here. I have two left to address that are just syntax. Need to prioritize documentation and testing for the next push. |
Okay, we're getting close @ahuang11 @aaronspring . I just need to spend an evening writing tests to cover all of the classes module and also want to make a brief docs page on the two objects and point out that you can use |
I'm gonna go ahead and squash commits and merge this since you guys have reviewed a few times and most changes since then were just to docs. This PR is ballooning so big and needs to get in so I am going to address adding testing in another PR so we can keep that contained and will of course wait to make this a formal release until testing is added. I will also open a bunch of small issues for things that need to be modified/added/updated for the classes. |
fix compute_metric for hindcast
refactor HindcastEnsemble and get all functions working refactoring of PM class fix some hindcast pytest errors all tests pass on test hindcastEnsemble all tests for PM object pass
finish refactoring PM with general apply function non inplace working for all but add reference fix hindcast ensemble testing fix PM testing without inplace
add coverage to PM testing rename is_initialized to has_dataset
refactor list comprehension
fix links in documentation remove auto-generated api stuff update API with classes information add api instructions to how to contribute
finish up prediction ensemble docs
74d9be7
to
8b78874
Compare
This is a stale review and comments were addressed
All tests pass, docs build with new features, etc. Decided to merge since we've done extensive review here and didn't want to drag it along too much further. Thanks @aaronspring and @ahuang11 for the comments here. We now have generalized |
Good job. We might need to adapt tiny changes in the future. But I hope users will find these classes and xr calls very conveniently |
Description
This PR allows arbitrary
xarray
functions to be applied toHindcastEnsemble
andPerfectModelEnsemble
objects. This is in the case that someone wants to assemble aclimpred
object and then apply various post-processing methods, such as.isel()
,.sum()
, etc. to our objects.Fixes https://github.com/bradyrx/climpred/issues/102, https://github.com/bradyrx/climpred/issues/130
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
This was tested through example notebooks and updating existing docs notebooks
Checklist (while developing)
pytest
, if necessary.Pre-Merge Checklist (final steps)
make html
on the documents to make sure example notebooks still compile.