Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Un)wrapping of DEM tiles across dateline #50

Merged
merged 43 commits into from
Dec 20, 2022
Merged
Show file tree
Hide file tree
Changes from 37 commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
a8b2979
discuss lat/lon conventions and dateline support
cmarshak Nov 22, 2022
0fc7d80
simple vector type functions for dateline
cmarshak Nov 22, 2022
13a3c90
move tiling functions around
cmarshak Nov 22, 2022
0d84c0b
update exceptions
cmarshak Nov 22, 2022
e8f54da
expose tiling functions
cmarshak Nov 22, 2022
6bb91ca
initial tests for dateline
cmarshak Nov 22, 2022
c5d080a
comments and clarify control flow
cmarshak Nov 22, 2022
0a5d782
update merge arrays api and split extents
cmarshak Nov 23, 2022
b595cb3
finish geoid modifications
cmarshak Nov 23, 2022
72c277d
changelog
cmarshak Nov 23, 2022
7c504ad
fix some notebooks with changed apis
cmarshak Nov 23, 2022
11d3c9a
fix error description
cmarshak Nov 23, 2022
3cdcc1d
fix api on comparison
cmarshak Nov 23, 2022
3a98ee7
fix api on comparison
cmarshak Nov 23, 2022
9c87290
update dateline readme
cmarshak Nov 23, 2022
e177af0
update changelog
cmarshak Nov 28, 2022
673a454
update np.nan description
cmarshak Nov 28, 2022
a7f6593
no pole crossings
cmarshak Nov 28, 2022
85a1299
dateline integration tests
cmarshak Nov 28, 2022
b69f423
window intersection
cmarshak Nov 28, 2022
b1e326e
lint
cmarshak Nov 28, 2022
8d427fa
integration tests for notebooks
cmarshak Nov 29, 2022
6e80401
ensure memory files are closed
cmarshak Nov 29, 2022
3275c18
include missing typing
cmarshak Nov 29, 2022
25d964e
lint
cmarshak Nov 29, 2022
483ef99
Merge branch 'dev' into antimeridian
cmarshak Nov 29, 2022
4db10ca
include papermill import within func
cmarshak Nov 29, 2022
e5300c4
ensure tile profile metadata is preserved
cmarshak Nov 30, 2022
03dc36d
remove examples
cmarshak Nov 30, 2022
ca5cfcf
ensure consistent georeferencing across dateline and better variable …
cmarshak Dec 6, 2022
1896f40
update driver information
cmarshak Dec 6, 2022
b8ab024
ensure correct tagging and update driver warning
cmarshak Dec 6, 2022
68e78fd
clarify polar bounds logic
cmarshak Dec 6, 2022
f250950
include new additions
cmarshak Dec 6, 2022
ab721e3
better test name
cmarshak Dec 6, 2022
e049969
fix typo
cmarshak Dec 6, 2022
d353fe1
driver in changelog
cmarshak Dec 6, 2022
ea859c9
update readme, good idea marin!
cmarshak Dec 8, 2022
329a1c1
more readme updates
cmarshak Dec 8, 2022
48e3094
more readme improvements
cmarshak Dec 9, 2022
b2bece6
more readme improvements 2
cmarshak Dec 9, 2022
2524db3
more readme improvements 3
cmarshak Dec 9, 2022
7685d9c
fix readme final
cmarshak Dec 9, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,7 @@ notebooks/*.kmz
notebooks/*.jpg
notebooks/*.wld
notebooks/*.geojson
notebooks/**/*.tif

# VRT
*.vrt
Expand Down
17 changes: 17 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,23 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [PEP 440](https://www.python.org/dev/peps/pep-0440/)
and uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.4.0]

### Added
- Included feature for extracting DEMs across datelines
- Updated merge apis with more descriptive names for more general usage
- Exceptions to determine valid extents and ensure single dateline crossing
- Added functions for dateline in `dateline.py`
- Tests for added and changed functionality.
- Integration tests for notebooks.
- Clarity about driver keyword in `stitch_dem` in readme, docstrings

### Changed
- Moved functions into more logical python file including merge calls into `merge.py` and tile functions into `datasets.py`
- Renamed internal functions for greater clarity and better description of tasks
- Ensures window reading checks bounds of src raster and does intersection if required to ensure no unexpected rasterio errors. Further, raises error if no overlap.


## [2.3.1]

### Fixed
Expand Down
31 changes: 24 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,17 @@
[![Conda version](https://img.shields.io/conda/vn/conda-forge/dem_stitcher)](https://anaconda.org/conda-forge/dem_stitcher)
[![Conda platforms](https://img.shields.io/conda/pn/conda-forge/dem_stitcher)](https://anaconda.org/conda-forge/dem_stitcher)

This tool aims to (a) provide a continuous raster of Digital Elevation Raster over an area of interest and (b) perform some standard transformations for processing. Such transformations include:
This tool aims to (a) provide a continuous raster of global Digital Elevation Raster over an area of interest and (b) perform some standard transformations for processing. Such transformations include:

+ converting the vertical datum from a reference geoid to the WGS84 ellipsoidal
+ ensuring a coordinate reference system centered at either the upper-left corner (`Area` tag) or center of the pixel (`Point` tag).

We utilize the GIS formats from `rasterio`. The API can be summarized as

```
# as xmin, ymin, xmax, ymax in epsg:4326
bounds = [-119.085, 33.402, -118.984, 35.435]

X, p = stitch_dem(bounds,
dem_name='glo_30',
dst_ellipsoidal_height=False,
Expand All @@ -30,7 +32,7 @@ with rasterio.open('dem.tif', 'w', **p) as ds:
ds.write(X, 1)
ds.update_tags(AREA_OR_POINT='Point')
```

Global DEMs supported are tiled in lat/lon (`epsg:4326`) and the API assumes that bounds are supplied in this format.

# Installation

Expand All @@ -56,6 +58,10 @@ Currently, python 3.7+ is supported.

Although the thrust of using this package is for staging DEMs for InSAR (particularly ISCE2), testing and maintaining suitable environments to use with InSAR processors is beyond the scope of what we are attempting to accomplish here. We provide an example notebook [here](./notebooks/Staging_a_DEM_for_ISCE2.ipynb) that demonstrates how to stage a DEM for ISCE2, which requires additional packages than required for the package on its own and additionally requires python version `<3.10`. For the notebook, we use the environment found in `environment.yml` of the Dockerized TopsApp [repository](https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp/blob/dev/environment.yml), used to generate interferograms (GUNWs) in the cloud.

## About the raster metadata

The creation metadata unrelated to georeferencing (e.g. `compress` key, see [here](https://rasterio.readthedocs.io/en/latest/topics/image_options.html#creation-options)) returned from the `stitch_dem` API is copied from the tiles being used. Although a driver can be specified through this API, we recommend using the default `GTiff`. We caution that the other creation metadata from a dem tile set may not be valid with the various types of drivers. Furthermore, different distributions of `rasterio` support different drivers. These rasterio metadata creation is beyond the scope of this library.

## Credentials

The accessing of NASADEM and SRTM require earthdata login credentials to be put into the `~/.netrc` file. If these are not present, the stitcher will
Expand Down Expand Up @@ -86,6 +92,8 @@ In [1]: from dem_stitcher.datasets import DATASETS; DATASETS
Out[1]: ['srtm_v3', 'nasadem', 'glo_90_missing', 'glo_30', '3dep', 'glo_90', 'ned1']
```

All the tiles are given in lat/lon CRS (i.e. `epsg:4326`). A notable omission is the Artic DEM [here](https://www.pgc.umn.edu/data/arcticdem/), which is suitable for DEMs at the northern pole of the globe due to lat/lon distortion.

1. `glo_30`/`glo_90`: Copernicus GLO-30/GLO-90 DEM. They are the 30 and 90 meter resolution, respectively [[link](https://registry.opendata.aws/copernicus-dem/)].
2. The USGS DEMSs:
- `ned1`: Ned 1 arc-second (deprecated by USGS) [[link](https://cugir.library.cornell.edu/catalog/cugir-009096)]
Expand All @@ -109,12 +117,16 @@ Wherever possible, we do not resample the original DEMs unless specified by the
+ Adjust the geoid to pixel/area coordinates
+ resample the geoids into the DEM reference frame
+ Adjust the vertical datum
5. All DEMs are converted to `float32` and have nodata `np.nan`. Although this can increase data size of certain rasters (SRTM is distributed as integers), this ensures (a) easy comparison across DEMs and (b) no side-effects of the stitcher due to unusual nodata values. Note, this datatype is done in `merge_tiles` in the `stitcher.py`. Other nodata values can be specified outside the stitcher as is frequently done (e.g. ISCE2 requires nodata to be filled as `0`).
5. All DEMs are converted to `float32` and have nodata `np.nan`. Although this can increase data size of certain rasters (SRTM is distributed as integers), this ensures (a) easy comparison across DEMs and (b) no side-effects of the stitcher due to unusual nodata values. Note, this datatype is done in `merge_tile_datasets` in `merge.py`. Other nodata values can be specified outside the stitcher as is frequently done (e.g. ISCE2 requires nodata to be filled as `0`).

There are some [notebooks](notebooks/analysis_and_comparison) that illustrate how tiles are merged by comparing the output of our stitcher with the original tiles.

As a performance note, when merging DEM tiles, we merge the all needed tiles in memory and this process has an associated overhead. An alternative approach would be to download the tiles to disk and use virtual warping. Ultimately, the accuracy of the final DEM is our prime focus and these minor performance tradeoffs are sidelined.

# Dateline support

We assume that the supplied bounds overlap the standard lat/lon CRS grid i.e. longitudes between -/+ 180 longitude and are within -/+ 90 latitude. If there is a single dateline crossing by the supplied bounds, then the tiles are wrapped and translated to provide a continuous raster over the area provided. We assume a maximum of one dateline crossing (if you have multiple dateline crossing, you should not being using the `stitch_dem` API directly). Wrapping around poles (i.e. at -/+ 90 latitude) is *not* supported and an exception will be raised.

# For Development

This is almost identical to normal installation:
Expand All @@ -135,10 +147,15 @@ The former is the more likely. When re-generating tiles, make sure to run all te

# Testing

1. Install `pytest`
2. Run `pytest .`
For unit tests,

1. Install `pytest` via `conda-forge`
2. Run `pytest tests -m 'not integration'`

We also have a number integration test (marked as `integration` within pytest) to ensures all the DEM tile datasets can be downloaded and transformed successfully (though in most cases, correctness is not verified, that is the domain of the non-integration unit tests). Integration tests will quickly indicate if urls for DEM tiles are working correctly. Integration tests are not checked within the github actions currently. Additionally, to ensure the documentation of this repository, in the form of notebooks, utilize the correct functional API, we utilize `papermill`. Thus for integration tests please install `papermill` via:

We have an integration test (marked as `integration`) which ensures all the datasets are downloaded and can be transformed (not validated for correctness at this time). Otherwise, all tests have basic tests with mock data to illustrate how the DEM stitcher is working. The non-integration tests are as github actions via `pytest tests -m "not integration"`.
1. `mamba install -c conda forge papermill`
2. `pytest tests -m 'integration'`

# Contributing

Expand All @@ -158,4 +175,4 @@ We use `flake8` and associated linting packages to ensure some basic code qualit

# Acknowledgements

This tool was developed to support cloud SAR processing using ISCE2 and various research. The early work of this repository was done by Charlie Marshak, David Bekaert, Michael Denbina, and Marc Simard. Since the utilization of this package for GUNW generation (see this [repo](https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp)), a subset of the ACCESS team, including Joseph (Joe) H. Kennedy, Simran Sangha, Grace Bato, Andrew Johnston, and Charlie Marshak, have improved this repository greatly. In particular, Joe Kennedy has lead the inclusion/development of actions, tests, packaging, distribution (including PyPI and `conda-forge`) and all the things to make this package more reliable, accessible, readable, etc. Simran Sangha has helped make sure output rasters are compatible with ISCE2 and other important bug-fixes.
This tool was developed to support cloud SAR processing using ISCE2 and various research projects at JPL. The early work of this repository was done by Charlie Marshak, David Bekaert, Michael Denbina, and Marc Simard. Since the utilization of this package for GUNW generation (see this [repo](https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp)), a subset of the ACCESS team, including Joseph (Joe) H. Kennedy, Simran Sangha, Grace Bato, Andrew Johnston, and Charlie Marshak, have improved this repository greatly. In particular, Joe Kennedy has lead the inclusion/development of actions, tests, packaging, distribution (including PyPI and `conda-forge`) and all the things to make this package more reliable, accessible, readable, etc. Simran Sangha has helped make sure output rasters are compatible with ISCE2 and other important bug-fixes.
3 changes: 3 additions & 0 deletions dem_stitcher/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
# FIXME: Python 3.8+ this should be `from importlib.metadata...`
from importlib_metadata import PackageNotFoundError, version

from .datasets import get_global_dem_tile_extents, get_overlapping_dem_tiles
from .stitcher import stitch_dem

try:
Expand All @@ -15,6 +16,8 @@


__all__ = [
'get_global_dem_tile_extents',
'get_overlapping_dem_tiles',
'stitch_dem',
'__version__',
]
74 changes: 72 additions & 2 deletions dem_stitcher/datasets.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,12 @@
from pathlib import Path

import geopandas as gpd
import pandas as pd
from rasterio.crs import CRS
from shapely.geometry import box

from .dateline import check_4326_bounds, get_dateline_crossing
from .exceptions import DEMNotSupported
from .geojson_io import read_geojson_gzip

DATA_PATH = Path(__file__).parents[0].absolute()/'data'
Expand All @@ -16,10 +20,76 @@ def get_available_datasets():
return DATASETS


def get_dem_tile_extents(dataset: str) -> gpd.GeoDataFrame:
def get_global_dem_tile_extents(dataset: str) -> gpd.GeoDataFrame:
"""Obtains globally avaialable tiles from DEM names supported.

Parameters
----------
dataset : str
A DEM name supported e.g. 'glo_30', 'glo_90', 'nasadem'

Returns
-------
gpd.GeoDataFrame
Columns are `tile_id`, `url`, `geometry` and `dem_name`

Raises
------
DEMNotSupported
Dataset is not supported.
"""
if dataset not in DATASETS:
raise ValueError(f'{dataset} must be in {", ".join(DATASETS)}')
raise DEMNotSupported(f'{dataset} must be in {", ".join(DATASETS)}')
df = read_geojson_gzip(DATA_PATH/f'{dataset}.geojson.zip')
df['dem_name'] = dataset
df.crs = CRS.from_epsg(4326)
return df


def get_overlapping_dem_tiles(bounds: list, dem_name: str) -> gpd.GeoDataFrame:
"""_summary_

Parameters
----------
bounds : list
4326 bounds as xmin, ymin, xmax, ymax
dem_name : str
A DEM name supported e.g. 'glo_30', 'glo_90', 'nasadem'

Returns
-------
gpd.GeoDataFrame
Columns are `tile_id`, `url`, `geometry` and `dem_name`

Raises
------
DEMNotSupported
If not in supported dem Name
"""
check_4326_bounds(bounds)

if dem_name not in DATASETS:
raise DEMNotSupported(f'Please use dem_name in: {", ".join(DATASETS)}')
box_sh = box(*bounds)
df_tiles_all = get_global_dem_tile_extents(dem_name)

crossing = get_dateline_crossing(bounds)
if crossing:
df_tiles_all_translated = df_tiles_all.copy()
x_translation = 2 * crossing
df_tiles_all_translated.geometry = df_tiles_all.geometry.translate(xoff=x_translation)
df_tiles_all = pd.concat([df_tiles_all, df_tiles_all_translated], axis=0).reset_index(drop=True)

index = df_tiles_all.intersects(box_sh)
df_tiles = df_tiles_all[index].copy()

# Merging is order dependent - ensures consistency
df_tiles = df_tiles.sort_values(by='tile_id')
df_tiles = df_tiles.reset_index(drop=True)
return df_tiles


def intersects_missing_glo_30_tiles(extent: list) -> bool:
extent_geo = box(*extent)
df_missing = get_overlapping_dem_tiles(extent, 'glo_90_missing')
return df_missing.intersects(extent_geo).sum() > 0
105 changes: 105 additions & 0 deletions dem_stitcher/dateline.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
from typing import List, Tuple

from shapely.affinity import translate
from shapely.geometry import box

from .exceptions import DoubleDatelineCrossing, Incorrect4326Bounds


def check_4326_bounds(bounds: list) -> bool:

xmin, ymin, xmax, ymax = bounds

if (xmin > xmax) or (ymin > ymax):
raise Incorrect4326Bounds('Ensure xmin <= xmax and ymin <= ymax')

standard_4326_box = box(-180, -90, 180, 90)
bounds_box = box(*bounds)

if not (standard_4326_box.intersects(bounds_box)):
raise Incorrect4326Bounds('Make sure bounds have intersection over standard 4326 CRS i.e. '
'between longitude -180 and 180 and latitude -90 and 90.')

if (ymin < -90) or (ymax > 90):
raise Incorrect4326Bounds('Boxes beyond the North/South Pole at +/- 90 Latitude not supported')

return True


def get_dateline_crossing(bounds: list) -> int:
"""Checks dateline (aka antimeridian) crossing. Returns +/- 180 depending on extents of bounding box provided.
Assumes only 1 dateline can be crossed otherwise exception raised.

Parameters
----------
bounds : list
xmin, ymin, xmax, ymax in EPSG:4326

Returns
-------
int
0 if no dateline crossing, and +/- 180 if there is depending on sign of dateline that is intersected

Raises
------
DoubleDatelineCrossing
If dateline is crossed twice
Incorrect4326Bounds
If there is no intersection with normal lat/long CRS extent i.e. longitude between -180 and 180 and latitude
between -90 and 90.
"""
xmin, _, xmax, _ = bounds

check_4326_bounds(bounds)

# This logic assumes there is intersection within the standard 4326 CRS.
# There are exactly 2 * 2 = 4 conditions
if (xmin > -180) and (xmax < 180):
return 0

elif (xmin <= -180) and (xmax < 180):
return -180

elif (xmin > -180) and (xmax >= 180):
return 180

elif (xmin <= -180) and (xmax >= 180):
raise DoubleDatelineCrossing('Shrink your bounding area')


def split_extent_across_dateline(extent: list) -> Tuple[List]:
"""If extent crosses the dateline, then we return tuple of left and right hemispheres
assuming lat/lon CRS. Otherwise, just returns, extent and empty list

Parameters
----------
extent : list
minx, miny, maxx, maxy

Returns
-------
Tuple[List]
(bounds_of_extent_on_left_hemisphere,
bounds_of_extent_on_right_hemisphere) if corsses dateline

Otherwise,
(extent, [])
"""
crossing = get_dateline_crossing(extent)

if crossing:
left_hemisphere = box(-180, -90, 0, 90)
right_hemisphere = box(0, -90, 180, 90)

extent_box = box(*extent)
translation_x = - crossing * 2

extent_box_t = translate(extent_box, translation_x, 0)
multipolygon = extent_box.union(extent_box_t)

bounds_l = list(multipolygon.intersection(left_hemisphere).bounds)
bounds_r = list(multipolygon.intersection(right_hemisphere).bounds)
return (bounds_l, bounds_r)

else:
return extent, []
8 changes: 8 additions & 0 deletions dem_stitcher/exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,11 @@ class NoDEMCoverage(Exception):

class DEMNotSupported(Exception):
"""Throw this exception if DEM Name is not supported"""


class DoubleDatelineCrossing(Exception):
"""Throw this exception if dateline is crossed twice i.e. at longitude of -180 and 180"""


class Incorrect4326Bounds(Exception):
"""If epsg:4326 xmin, ymin, xmax, ymax does not intersect -180, -90, 180, 90"""
Loading