Auto chunk (#4064)
* Added chunks='auto' option in dataset.py

* FIX: correct dask array handling in _calc_idxminmax (#3922)

* FIX: correct dask array handling in _calc_idxminmax

* FIX: remove unneeded import, reformat via black

* fix idxmax, idxmin with dask arrays

* FIX: use array[dim].data in `_calc_idxminmax` as per @keewis suggestion, attach dim name to result

* ADD: add dask tests to `idxmin`/`idxmax` dataarray tests

* FIX: add back fixture line removed by accident

* ADD: complete dask handling in `idxmin`/`idxmax` tests in test_dataarray, xfail dask tests for dtype datetime64 (M)

* ADD: add "support dask handling for idxmin/idxmax" in whats-new.rst

* MIN: reintroduce changes added by #3953

* MIN: change if-clause to use `and` instead of `&` as per review-comment

* WIP: remove dask handling entirely for debugging purposes

* Test for dask computes

* WIP: re-add dask handling (map_blocks-approach), add `with raise_if_dask_computes()` context to idxmin-tests

* Use dask indexing instead of map_blocks.

* Better chunk choice.

* Return -1 for _nan_argminmax_object if all NaNs along dim

* Revert "Return -1 for _nan_argminmax_object if all NaNs along dim"

This reverts commit 58901b9.

* Raise error for object arrays

* No error for object arrays. Instead expect 1 compute in tests.

Co-authored-by: dcherian <[email protected]>

* fix the failing flake8 CI (#4057)

* rename d and l to dim and length

* Fixed typo in rasterio docs (#4063)

* Added chunks='auto' option in dataset.py

Added changes to whats-new.rst

* Error fix, catch chunks=None

* Minor reformatting + flake8 changes

* Added isinstance(chunks, (Number, str)) in dataset.py, passing

* format changes

* added auto-chunk test for dataarrays

* Assert chunk sizes equal in auto-chunk test

Co-authored-by: Kai Mühlbauer <[email protected]>
Co-authored-by: dcherian <[email protected]>
Co-authored-by: keewis <[email protected]>
Co-authored-by: clausmichele <[email protected]>
Co-authored-by: Keewis <[email protected]>
6 people authored May 25, 2020
1 parent 3194b3e commit 1de38bc
Showing 3 changed files with 18 additions and 3 deletions.
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -36,6 +36,10 @@ Breaking changes

 New Features
 ~~~~~~~~~~~~
+
+- ``chunks='auto'`` is now supported in the ``chunks`` argument of
+  :py:meth:`Dataset.chunk`. (:issue:`4055`)
+  By `Andrew Williams <https://github.com/AndrewWilliams3142>`_
 - Added :py:func:`xarray.cov` and :py:func:`xarray.corr` (:issue:`3784`, :pull:`3550`, :pull:`4089`).
   By `Andrew Williams <https://github.com/AndrewWilliams3142>`_ and `Robin Beer <https://github.com/r-beer>`_.
 - Added :py:meth:`DataArray.polyfit` and :py:func:`xarray.polyval` for fitting polynomials. (:issue:`3349`)
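
A minimal usage sketch of the feature described in this release note, assuming xarray, NumPy and dask are installed. The dataset, the variable name `temperature` and the array shape are made up for illustration; only `Dataset.chunk("auto")` comes from the change itself.

```python
import numpy as np
import xarray as xr  # Dataset.chunk() needs dask to be installed

# Hypothetical in-memory dataset; the name and shape are illustrative only.
ds = xr.Dataset({"temperature": (("x", "y"), np.zeros((1000, 500)))})

# Let dask choose the chunk sizes along every dimension.
auto = ds.chunk("auto")

# Mapping of dimension name to a tuple of chunk lengths along that dimension.
print(auto.chunks)
```

Whatever sizes dask picks, the chunk lengths along each dimension add up to the length of that dimension.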
9 changes: 6 additions & 3 deletions xarray/core/dataset.py
@@ -1707,7 +1707,10 @@ def chunks(self) -> Mapping[Hashable, Tuple[int, ...]]:
     def chunk(
         self,
         chunks: Union[
-            None, Number, Mapping[Hashable, Union[None, Number, Tuple[Number, ...]]]
+            None,
+            Number,
+            str,
+            Mapping[Hashable, Union[None, Number, str, Tuple[Number, ...]]],
         ] = None,
         name_prefix: str = "xarray-",
         token: str = None,
@@ -1725,7 +1728,7 @@ def chunk(
         Parameters
         ----------
-        chunks : int or mapping, optional
+        chunks : int, 'auto' or mapping, optional
             Chunk sizes along each dimension, e.g., ``5`` or
             ``{'x': 5, 'y': 5}``.
         name_prefix : str, optional
@@ -1742,7 +1745,7 @@ def chunk(
         """
         from dask.base import tokenize

-        if isinstance(chunks, Number):
+        if isinstance(chunks, (Number, str)):
             chunks = dict.fromkeys(self.dims, chunks)

         if chunks is not None:
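
The widened `isinstance` check above can be illustrated in isolation. The following is a standalone sketch, not the xarray implementation: `broadcast_chunk_spec` is a hypothetical helper name, and `dims` stands in for `self.dims`. It shows how a single number or a string such as `'auto'` is expanded into a per-dimension mapping, while mappings and `None` pass through unchanged.

```python
from numbers import Number


def broadcast_chunk_spec(chunks, dims):
    """Hypothetical helper mirroring the check in the diff above.

    A scalar spec (a number, or a string such as 'auto') is applied to
    every dimension; a mapping or None is returned unchanged.
    """
    if isinstance(chunks, (Number, str)):
        return dict.fromkeys(dims, chunks)
    return chunks


dims = ["x", "y"]  # stands in for Dataset.dims
print(broadcast_chunk_spec("auto", dims))    # {'x': 'auto', 'y': 'auto'}
print(broadcast_chunk_spec(5, dims))         # {'x': 5, 'y': 5}
print(broadcast_chunk_spec({"x": 5}, dims))  # {'x': 5}
print(broadcast_chunk_spec(None, dims))      # None
```

Before this change only the `Number` branch existed, so a bare string was never broadcast to the dataset's dimensions.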
8 changes: 8 additions & 0 deletions xarray/tests/test_dask.py
@@ -1035,6 +1035,14 @@ def test_unify_chunks_shallow_copy(obj, transform):
     assert_identical(obj, unified) and obj is not obj.unify_chunks()


+@pytest.mark.parametrize("obj", [make_da()])
+def test_auto_chunk_da(obj):
+    actual = obj.chunk("auto").data
+    expected = obj.data.rechunk("auto")
+    np.testing.assert_array_equal(actual, expected)
+    assert actual.chunks == expected.chunks
+
+
 def test_map_blocks_error(map_da, map_ds):
     def bad_func(darray):
         return (darray * darray.x + 5 * darray.y)[:1, :1]
