-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smoothing #224
Conversation
climpred/smoothing.py
Outdated
else: | ||
was_dataset = False | ||
ds_out = xe.util.grid_global(*boxsize) | ||
regridder = xe.Regridder(ds, ds_out, 'bilinear') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest a keyword for remap type. This could be 'bilinear', 'patch', etc. It will be hard to natively support conservative remapping since you need grid cell vertices for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe make grid_global more flexible for future uses with non-global stuff.
Maybe you can adapt my code snippet
def _regrid_it(da, d_lon, d_lat, **kwargs):
'''
Global 2D rectilinear grid centers and bounds
Parameters
----------
da : xarray DataArray
Contain input and output grid coordinates. Look for variables
``lon``, ``lat``, and optionally ``lon_b``, ``lat_b`` for
conservative method.
Shape can be 1D (Nlon,) and (Nlat,) for rectilinear grids,
or 2D (Ny, Nx) for general curvilinear grids.
Shape of bounds should be (N+1,) or (Ny+1, Nx+1).
d_lon : float
Longitude step size, i.e. grid resolution
d_lat : float
Latitude step size, i.e. grid resolution
Returns
-------
da : xarray DataArray with coordinate values
'''
if not warn_lon_lat_dne(da): # both lon and lat exist
grid_out = {LON: np.arange(da[LON].min(), da[LON].max() + d_lon, d_lon),
LAT: np.arange(da[LAT].min(), da[LAT].max() + d_lat, d_lat)}
regridder = xe.Regridder(da, grid_out, **kwargs)
return regridder(da)
else:
return da
def regrid_it(ds, d_lon=1, d_lat=None, method='bilinear',
periodic=False, filename=None, reuse_weights=True):
'''
Quick regridding
Parameters
----------
ds : xarray DataSet
Contain input and output grid coordinates. Look for variables
``lon``, ``lat``, and optionally ``lon_b``, ``lat_b`` for
conservative method.
Shape can be 1D (Nlon,) and (Nlat,) for rectilinear grids,
or 2D (Ny, Nx) for general curvilinear grids.
Shape of bounds should be (N+1,) or (Ny+1, Nx+1).
d_lon : float, optional
Longitude step size, i.e. grid resolution; if not provided,
will equal 1
d_lat : float, optional
Latitude step size, i.e. grid resolution; if not provided,
will equal d_lon
method : str
Regridding method. Options are
- 'bilinear'
- 'conservative', **need grid corner information**
- 'patch'
- 'nearest_s2d'
- 'nearest_d2s'
periodic : bool, optional
Periodic in longitude? Default to False.
Only useful for global grids with non-conservative regridding.
Will be forced to False for conservative regridding.
filename : str, optional
Name for the weight file. The default naming scheme is::
{method}_{Ny_in}x{Nx_in}_{Ny_out}x{Nx_out}.nc
e.g. bilinear_400x600_300x400.nc
reuse_weights : bool, optional
Whether to read existing weight file to save computing time.
False by default (i.e. re-compute, not reuse).
Returns
-------
ds : xarray DataSet with coordinate values or DataArray
'''
if d_lat is None:
d_lat = d_lon
kwargs = {
'd_lon': d_lon,
'd_lat': d_lat,
'method': method,
'periodic':periodic,
'filename': filename,
'reuse_weights': reuse_weights,
}
if isinstance(ds, xr.Dataset):
ds = xr.merge(_regrid_it(ds[var], **kwargs)
for var in ds.data_vars)
else:
ds = _regrid_it(ds, **kwargs)
return ds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure whether we need to have this long function with documentation in our codebase. couldnt we just import this: from xesmf.utils import regrid_it
. 1x1 deg is default here which is also ok for me. and then for the goddard wrapper I would adapt for 5x5. the difference is then that here it doesnt use dicts but d_lon/lat
keywords with stepsize. temporal_smoothing
can also be adapted for this. its either time
or init
. we dont allow for anything else and this the function can guess. any objections for this implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't import it because it doesn't exist in the codebase (the PR was rejected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok. so is this the only way to have not automatically remapping to global grid?
By the way, thanks for taking the initiative to make all these PRs! |
Changes:
|
I still have to pull this down and see how it works, but I'm really excited about this implementation. Could you clarify the difference between the |
I thought the other way around: Having a Goddard function that works as wrapper of temp. and spatial smoothing (should be xesmf 5x5). |
climpred/classes.py
Outdated
@@ -91,6 +91,16 @@ def __init__(self, xobj): | |||
def __repr__(self): | |||
return _display_metadata(self) | |||
|
|||
def smooth(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just tried to implement it. smooth_dict would need to change from init
to time
. I would like to try to implement this into the main class and not the subclass, because then we will have to do things twice. I actually wanted to loop over all objects in self. didnt work.
now when a user wants to compare DPLE hindcast to some observational data product, the user cannot add the obs if this has different grid information. however, it maybe be useful to be able to regrid hindcast and obs to 5x5 degree. maybe we have the checks later in the compute function and relax them while adding. |
I did a sketch and a few tests how the classes implementation could look like. |
climpred/smoothing.py
Outdated
|
||
Args: | ||
ds (xr.object): input. | ||
smooth_dict (dict): length of smoothing of timesteps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be an int that gets entered? Like smooth=3
? Since we should only expect 'time'
and 'lead'
dimension names for our package we can infer which one it is. And I don't think we should have multiple entries for a given dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could yes. I wanted to have the inputs types aligned for all smoothing funcs. Especially in the classes. What I still aim for is a clever way that the contents of the dict are checked and then all the smoothing operations are applied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have the args for all smoothing funcs consistent as dicts. but sure this could also be infered.
climpred/classes.py
Outdated
if isinstance(smooth_dict, str): | ||
if smooth_dict == 'goddard2013': | ||
smooth_fct = smooth_goddard_2013 | ||
smooth_dict = {'lead': 4} # default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, need to figure out if this holds for non-annual averages. Maybe just have the user manually put this in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once temporal PR is ready and we know the requirements what needs to be coded there, this can be implemented
climpred/classes.py
Outdated
) | ||
elif isinstance(smooth_dict, dict): | ||
print(smooth_dict, type(smooth_dict)) | ||
if 'lon' in smooth_dict or 'lat' in smooth_dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this series of if statements. It only allows one thing to be done -- either spatial smoothing or temporal smoothing. Shouldn't the dictionary let both be done, e.g., if the user says smooth_dict={'lead': 4, 'lat': 5, 'lon': 5}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I still need to implement that all dict items are used to smooth.
climpred/classes.py
Outdated
raise ValueError( | ||
'Please provide kwargs as str or dict and not', type(smooth_dict) | ||
) | ||
self.initialized = smooth_fct(self.initialized, smooth_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're trying to avoid inplace operations and rather use return (see https://github.com/bradyrx/climpred/issues/130)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s logical to first create the prediction ensemble and then smooth and have the same. This new inplace PR should then fix this consistently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are inplace operations always to be avioded? for me it feels itutive if I have a PredictionEnsemble
and I smooth it to have a smoothed PredictionEnsemble
afterwards (inplace).
on the other side, once the xarray.isel and so an can be applied to PredictionEnsemble
this smooth
feature should behave the same way. lets save this for then.
Left a ton of comments, but this looks really great so far @aaronspring. I'm excited to have this implemented. Some thoughts outside of my comments on the code:
Great work on this so far! |
let me know if there are still requested changes. I dont see any for now to be addressed in this PR. |
@aaronspring , if you go to "Files Changed" up top you'll see there's still some comments from myself and @ahuang11 . A lot of mine are documentation stuff, but there are a few places where the if statements are set up improperly so it doesn't work as currently set up. |
sorry. I didnt see those. resolved them and commented on them now. somehow now I get errors in |
Thanks for the patience here @aaronspring! This is a great addition. I'll test/expand functionality with my PR on general apply functions. |
Description
Add smoothing functions. Early PR but I need some input of you. I tried to make it a bit generalized, but currently its quite inconsistent.
Contains:
Still working on:
try: import xesmf
to aviod errorsTo Do later:
PredictionEnsemble.smooth()
Fixes # https://github.com/bradyrx/climpred/issues/76
Type of change
How Has This Been Tested?
Checklist (while developing)
pytest
, if necessary.Pre-Merge Checklist (final steps)
make html
on the documents to make sure example notebooks still compile.References
Goddard et al. 2013