Add grid_bounds_to_polygons #478

Open
wants to merge 15 commits into
base: main
from
75 changes: 75 additions & 0 deletions cf_xarray/geometry.py
@@ -955,3 +955,78 @@ def cf_to_polygons(ds: xr.Dataset):
    return xr.DataArray(
        geoms, dims=node_count.dims, coords=node_count.coords
    ).drop_vars(node_count_name)


def bounds_to_polygons(ds: xr.Dataset) -> xr.DataArray:
    """
    Converts a regular 2D lat/lon grid to a 2D array of shapely polygons.

    Modified from https://notebooksharing.space/view/c6c1f3a7d0c260724115eaa2bf78f3738b275f7f633c1558639e7bbd75b31456.

    Parameters
    ----------
    ds : xr.Dataset
        Dataset with "latitude" and "longitude" variables as well as their bounds variables.
        1D "latitude" and "longitude" variables are supported. This function will automatically
        broadcast them against each other.

    Returns
    -------
    DataArray
        DataArray with shapely polygon per grid cell.
    """
    import shapely

    grid = ds.cf[["latitude", "longitude"]].load()
    bounds = grid.cf.bounds
    coordinates = grid.cf.coordinates
    dims = tuple(grid.cf.sizes)
    bounds_dim = grid.cf.get_bounds_dim_name("latitude")

    if "latitude" in dims or "longitude" in dims:
        # for 1D lat, lon, this allows them to be
        # broadcast against each other
        grid = grid.reset_coords()

    assert "latitude" in bounds
    assert "longitude" in bounds
    (lon,) = coordinates["longitude"]
    (lon_bounds,) = bounds["longitude"]
    (lat,) = coordinates["latitude"]
    (lat_bounds,) = bounds["latitude"]

    with xr.set_options(keep_attrs=True):
        broadcasted = xr.broadcast(
            grid[lon],
            grid[lat],
        ) + xr.broadcast(
            grid[lon_bounds],
            grid[lat_bounds],
        )
    asdict = dict(zip([lon, lat, lon_bounds, lat_bounds], broadcasted))
    # display(asdict)
    points = xr.Dataset(asdict)

    points = points.transpose(..., bounds_dim)
    lonbnd = points[lon_bounds].data
    latbnd = points[lat_bounds].data

    if points.sizes[bounds_dim] == 2:
        lonbnd = lonbnd[..., [0, 0, 1, 1]]
        latbnd = latbnd[..., [0, 1, 1, 0]]

    elif points.sizes[bounds_dim] != 4:
        raise ValueError(
            f"The size of the detected bounds or vertex dimension {bounds_dim} is not 2 or 4."
        )

    # geopandas needs this
    mask = lonbnd[..., 0] >= 180
    lonbnd[mask, :] = lonbnd[mask, :] - 360
Comment on lines +1023 to +1025
Contributor

I think the reason for this is that most geographic data is given in WGS84, where longitudes run from -180 to 180 (I may be wrong here). So if you want to compare the grid to regional polygons, it is probably a good idea to wrap the data. Still, it may be confusing to users.

Maybe this could be optional. However, I am not sure what a good name would be. In regionmask I call it wrap_lon, with the options 180 (which is what you do here), 360, and False. That works, but there may be better names.
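
To make the suggestion concrete, here is a minimal sketch of what an optional wrap could look like on the bounds array alone. The helper name `wrap_lon_bounds` is hypothetical and not part of this PR; the `wrap_lon` values (180, 360, False) are borrowed from the comment above. The 180 branch copies the PR's masking logic, while the 360 branch is an assumed mirror of it rather than regionmask's actual implementation.

```python
import numpy as np


def wrap_lon_bounds(lonbnd, wrap_lon=180):
    """Hypothetical helper: optionally wrap cell-corner longitudes.

    ``lonbnd`` has shape (..., n_corners) with at least two dimensions.
    Whole cells are shifted together, based on their first corner, so a
    cell is never split across the wrap line.
    """
    lonbnd = np.array(lonbnd, dtype=float)  # copy, so the input is not mutated
    if wrap_lon is False:
        return lonbnd
    if wrap_lon == 180:
        # same rule as in the PR: cells starting at or beyond 180 move west
        mask = lonbnd[..., 0] >= 180
        lonbnd[mask, :] -= 360
    elif wrap_lon == 360:
        # assumed mirror rule: cells starting below 0 move east
        mask = lonbnd[..., 0] < 0
        lonbnd[mask, :] += 360
    else:
        raise ValueError("wrap_lon must be 180, 360 or False")
    return lonbnd


wrapped = wrap_lon_bounds([[170.0, 180.0], [180.0, 190.0], [350.0, 360.0]])
```

Working on a copy also avoids modifying the data held by the input dataset, which the in-place assignment in the diff may do.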


    polyarray = shapely.polygons(shapely.linearrings(lonbnd, latbnd))

    # 'geometry' is a blessed name in geopandas.
    boxes = points[lon_bounds][..., 0].copy(data=polyarray).rename("geometry")

    return boxes
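
For readers following the diff, the expansion from two bounds per cell to four corners can be illustrated in isolation. This is a sketch with made-up values for a 1 x 2 grid, using only NumPy; the variable names `lon4`, `lat4`, and `corners` are illustrative and do not appear in the PR.

```python
import numpy as np

# Assumed toy bounds, shape (lat, lon, bounds) with 2 bounds per cell.
lonbnd = np.array([[[0.0, 1.0], [1.0, 2.0]]])
latbnd = np.array([[[10.0, 11.0], [10.0, 11.0]]])

# Same index patterns as in bounds_to_polygons: repeat each bound so that
# pairing lon4[..., k] with lat4[..., k] walks around the cell perimeter.
lon4 = lonbnd[..., [0, 0, 1, 1]]
lat4 = latbnd[..., [0, 1, 1, 0]]

# Stack into (lon, lat) pairs for inspection: shape (1, 2, 4, 2).
corners = np.stack([lon4, lat4], axis=-1)
print(corners[0, 0])
```

For the first cell, the corners come out in the order (0, 10), (0, 11), (1, 11), (1, 10), so consecutive vertices share an edge of the rectangle and the ring passed to `shapely.linearrings` does not cross itself.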
3 changes: 1 addition & 2 deletions pyproject.toml
@@ -140,8 +140,7 @@ module=[
     "pint",
     "matplotlib",
     "pytest",
-    "shapely",
-    "shapely.geometry",
+    "shapely.*",
     "xarray.core.pycompat",
 ]
 ignore_missing_imports = true