This repository has been archived by the owner on Jun 30, 2022. It is now read-only.

Return a Dask array when loading Bedmap2 #45

Merged 8 commits on Jun 24, 2019
1 change: 1 addition & 0 deletions doc/install.rst
@@ -23,6 +23,7 @@ Dependencies
* `xarray <https://xarray.pydata.org/>`__
* `pandas <https://pandas.pydata.org>`__
* `rasterio <https://rasterio.readthedocs.io>`__
+ * `dask <https://dask.org/>`__

Most of the examples in the :ref:`gallery` also use:

1 change: 1 addition & 0 deletions environment.yml
@@ -9,6 +9,7 @@ dependencies:
- xarray
- pandas
- rasterio
+ - dask
# Development requirements
- matplotlib
- cmocean
1 change: 1 addition & 0 deletions requirements.txt
@@ -2,3 +2,4 @@ pooch>=0.5
xarray
pandas
rasterio
+ dask
19 changes: 15 additions & 4 deletions rockhound/bedmap2.py
@@ -24,7 +24,7 @@
}


- def fetch_bedmap2(datasets, *, load=True):
+ def fetch_bedmap2(datasets, *, load=True, chunks=1000, **kwargs):
"""
Fetch the Bedmap2 datasets for Antarctica.

@@ -55,8 +55,11 @@ def fetch_bedmap2(datasets, *, load=True):
relative to EIGEN-GL04C geoid (to convert back to WGS84, add this grid)

.. warning ::
- Loading a great number of datasets may require a fair amount of memory that
- could crash your system. We recommend loading only the needed datasets.
+ Loading datasets may require a fair amount of memory.
+ To prevent this, the function loads the datasets as Dask arrays if
+ ``chunks`` is not ``None``.
+ Be careful with operations that load entire datasets into memory,
+ such as plotting or some computations.

.. warning ::
Loading any dataset along with ``thickness_uncertainty_5km`` would modify the
@@ -70,6 +73,14 @@ def fetch_bedmap2(datasets, *, load=True):
Whether to load the data into an :class:`xarray.Dataset` or just return the
path to the downloaded data tiff files. If False, will return a list with the
paths to the files corresponding to *datasets*.
+ chunks : int, tuple or dict
+ Chunk sizes along each dimension. This argument is passed to the
+ :func:`xarray.open_rasterio` function in order to obtain
+ `Dask arrays <https://docs.dask.org/en/latest/array.html>`_ inside the
+ returned :class:`xarray.Dataset`.
+ This helps to read the dataset without loading it entirely into memory.
+ **kwargs
+ Extra parameters passed to the :func:`xarray.open_rasterio` function.

Returns
-------
@@ -88,7 +99,7 @@ def fetch_bedmap2(datasets, *, load=True):
return [get_fname(dataset, fnames) for dataset in datasets]
arrays = []
for dataset in datasets:
- array = xr.open_rasterio(get_fname(dataset, fnames))
+ array = xr.open_rasterio(get_fname(dataset, fnames), chunks=chunks, **kwargs)
# Replace no data values with nans
array = array.where(array != array.nodatavals)
# Remove "band" dimension and coordinate
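
Note on the hunk above: once `chunks` is set, the no-data masking step should stay lazy until something forces a computation, which is what the docstring warning is about. The sketch below mimics that step on a small synthetic array rather than on Bedmap2 itself. The array values, the `nodatavals` attribute, and the chunk sizes are invented for illustration, and it assumes xarray, dask, and numpy are installed.

```python
import dask.array
import numpy as np
import xarray as xr

# Synthetic stand-in for one grid (assumption: not real Bedmap2 data).
grid = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("y", "x"),
    attrs={"nodatavals": (5.0,)},
)

# Splitting into chunks backs the DataArray with a Dask array.
lazy = grid.chunk({"y": 2, "x": 2})

# Masking only builds a task graph; no chunk is evaluated yet.
masked = lazy.where(lazy != lazy.attrs["nodatavals"][0])
is_lazy = isinstance(masked.data, dask.array.Array)

# compute() evaluates every chunk and loads the full result into memory.
loaded = masked.compute()
```

Here `masked` still wraps a Dask array, while `loaded` holds a plain NumPy array. On a large grid, the `compute()` call (or anything that triggers it implicitly, such as plotting) is where the memory cost is paid.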
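
To get a feel for what the default `chunks=1000` implies, the following back-of-the-envelope sketch splits each axis into blocks of at most 1000 cells, which is how Dask treats an integer chunk size. The grid shape and the 8-byte value size are assumptions chosen for illustration and are not taken from this pull request; `chunk_lengths` is a hypothetical helper, not part of rockhound, xarray, or Dask.

```python
def chunk_lengths(size, chunk):
    """Split an axis of `size` cells into blocks of at most `chunk` cells."""
    full, remainder = divmod(size, chunk)
    return (chunk,) * full + ((remainder,) if remainder else ())


# Hypothetical grid shape and value size, chosen only for illustration.
n_rows, n_cols = 6667, 6667
bytes_per_value = 8  # e.g. float64

rows = chunk_lengths(n_rows, 1000)
cols = chunk_lengths(n_cols, 1000)

n_chunks = len(rows) * len(cols)
largest_chunk_mb = max(rows) * max(cols) * bytes_per_value / 1e6
full_grid_mb = n_rows * n_cols * bytes_per_value / 1e6
```

Under these assumptions each axis splits into seven blocks (six of 1000 cells and one of 667), so a single chunk takes about 8 MB while the full grid would take roughly 356 MB. Passing a tuple or dict for `chunks` would change the block shape per dimension.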