Add 'to_iris' and 'from_iris' to methods Dataset #2449

jacobtomlinson · 2018-10-01T09:02:26Z

This PR adds to_iris and from_iris methods to Dataset. I've added them because I frequently find myself writing little list and dictionary comprehensions to pack DataArrays into Datasets and Cubes into Iris CubeLists, and to unpack them again.

Tests added (for all bug fixes or enhancements)
Tests passed (for all non-documentation changes)
Fully documented, including whats-new.rst for all changes and api.rst for new API

shoyer

My only hesitation here is on the name: is it entirely obvious that Dataset.to_iris() should produce a CubeList? Maybe Dataset.to_iris_cubelist()?

I guess Iris doesn't have other high level data structures for multiple cubes, so this is the obvious choice.

shoyer · 2018-10-02T14:43:18Z

xarray/convert.py

+def dataset_from_iris(cubelist):
+    """ Convert an Iris CubeList into a Dataset.
+    """
+    return Dataset({cube.var_name: DataArray.from_iris(cube) for cube in cubelist})


Can we use the name attribute already on DataArray.from_iris(cube)? We have some special logic already for figuring out names in DataArray.from_iris.

jacobtomlinson · 2018-10-02

So cube.name instead of cube.var_name?

shoyer · 2018-10-02

I would say DataArray.from_iris(cube).name, but it probably makes sense to save it in an intermediate variable to avoid converting the cube twice.
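
A minimal sketch of the convert-once approach suggested above. To stay runnable without xarray or Iris installed, it uses hypothetical stand-ins: `FakeCube`, `FakeArray`, `fake_from_iris` and `arrays_by_name` are illustrative names only, and the naming rule inside `fake_from_iris` is an arbitrary assumption, not the logic in `DataArray.from_iris`. In the PR, `convert` would be `DataArray.from_iris` and the resulting mapping would be passed to `Dataset(...)`.

```python
from collections import namedtuple

# Hypothetical stand-ins so the sketch runs without xarray or Iris.
FakeCube = namedtuple("FakeCube", ["var_name", "long_name"])
FakeArray = namedtuple("FakeArray", ["name", "source"])


def fake_from_iris(cube):
    """Stand-in converter with an arbitrary naming rule (assumption)."""
    return FakeArray(name=cube.var_name or cube.long_name or "unknown", source=cube)


def arrays_by_name(cubelist, convert):
    """Convert each cube exactly once and key the result by the converted name."""
    converted = [convert(cube) for cube in cubelist]  # one conversion per cube
    return {array.name: array for array in converted}


cubes = [FakeCube("tas", "Air temperature"), FakeCube(None, "surface pressure")]
mapping = arrays_by_name(cubes, fake_from_iris)
print(sorted(mapping))
```

Keying on the converted object's name means the Dataset key and the DataArray name cannot disagree, and each cube is converted only once.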

jacobtomlinson · 2018-10-02T18:15:28Z

Thanks for the feedback.

I think that when working in Iris you expect multiple cubes to be returned as a CubeList. Likewise, when calling iris.load you expect a CubeList.

I would propose that to_iris is fine; however, if you feel strongly, I would be happy to change it.

shoyer · 2018-10-06T07:49:55Z

I'm happy to defer to Iris users (like you) on what they would expect for converting to/from an xarray.Dataset.

jhamman · 2018-11-05T20:11:23Z

I wonder if we can get @pelson to weigh in here. Like @shoyer said, whatever the Iris users think makes most sense for the naming of these methods is fine by me.

DPeterK · 2019-01-23T11:24:49Z

Apologies about adding my thoughts after a bit of a gap on here...

"My only hesitation here is on the name"

I like the to_iris and from_iris method names suggested here. They're consistent with existing functionality available for DataArray, and I think it's reasonable to expect that as DataArrays map to cubes, so also for Datasets and CubeLists. As it's possible for both Datasets and CubeLists to contain only a single object there could be some extra logic to return a DataArray or cube respectively in such a case, but I think that would add needless complexity as it's not hard from a user perspective to get back to the single item from the Dataset or CubeList.

"We have some special logic already for figuring out names in DataArray.from_iris"

Does this logic include handling multiple cubes of the same name in a single CubeList? Iris will quite happily handle this, but I guess the name:DataArray mapping in Xarray requires unique names. For example:

>>> names = [c.name() for c in cubes]
>>> print(names)
['air_pressure', 'air_pressure', 'air_pressure_at_sea_level', 'air_temperature', 'air_temperature', 'air_temperature', 'air_temperature', 'air_temperature', 'dew_point_temperature', 'geopotential_height', 'relative_humidity', 'relative_humidity', 'specific_humidity', 'surface_air_pressure', 'upward_air_velocity', 'x_wind', 'x_wind', 'x_wind', 'y_wind', 'y_wind', 'y_wind']
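
One way to see which of these names would collide as Dataset keys is to count them. This is a small sketch using only the standard library; `names` is copied from the output above, and `duplicated` is an illustrative variable name.

```python
from collections import Counter

names = [
    'air_pressure', 'air_pressure', 'air_pressure_at_sea_level',
    'air_temperature', 'air_temperature', 'air_temperature',
    'air_temperature', 'air_temperature', 'dew_point_temperature',
    'geopotential_height', 'relative_humidity', 'relative_humidity',
    'specific_humidity', 'surface_air_pressure', 'upward_air_velocity',
    'x_wind', 'x_wind', 'x_wind', 'y_wind', 'y_wind', 'y_wind',
]

# Count occurrences and keep only the names that appear more than once.
counts = Counter(names)
duplicated = {name: n for name, n in counts.items() if n > 1}
print(duplicated)
```

A dict comprehension keyed by name alone would keep only the last cube for each duplicated name.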

These cubes differ in one or more of the following ways:

one cube describes the phenomenon at the surface, and
another cube the phenomenon on height or pressure levels, or
another cube describes the phenomenon after statistical processing, and so on.

If such a case isn't currently handled, it could be handled by using this differing metadata to modify the name used for the key; for example air_temperature --> air_temperature__maximum_1_hr – so long as returning to a CubeList will also return the names to their originals.
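
A rough sketch of the renaming idea above, using plain strings so it runs without Iris or xarray. The `(name, qualifier)` pairs and the helper names `unique_keys` and `original_names` are assumptions for illustration; how a qualifier would actually be derived from cube metadata is left open.

```python
from collections import Counter


def unique_keys(items):
    """Assign a unique key to each (name, qualifier) pair.

    A qualifier is appended only when the name occurs more than once.
    Returns {key: original_name} so the original names can be restored.
    """
    counts = Counter(name for name, _ in items)
    keys = {}
    for name, qualifier in items:
        key = f"{name}__{qualifier}" if counts[name] > 1 else name
        if key in keys:
            # The supplied metadata was not enough to tell two items apart.
            raise ValueError(f"metadata does not disambiguate {key!r}")
        keys[key] = name
    return keys


def original_names(keys):
    """Recover the original names, in insertion order."""
    return list(keys.values())


items = [
    ("air_temperature", "surface"),
    ("air_temperature", "maximum_1_hr"),
    ("x_wind", "pressure_levels"),
]
keys = unique_keys(items)
print(list(keys))
print(original_names(keys))
```

Storing the original name alongside each key is what makes the round trip back to a CubeList lossless in this sketch; names that are already unique are left untouched.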

jhamman · 2023-09-14T02:52:46Z

@DPeterK / @jacobtomlinson - this has grown quite stale. Any interest in finishing this up or should we close this in favor of a new contribution down the line?

trexfeathers · 2023-09-14T09:58:05Z

@pp-mo 👀

jacobtomlinson · 2023-09-18T09:33:53Z

@jhamman this PR is 5 years old and my need for it has long since gone away. If other folks want to pick this up that would be awesome.

Jacob Tomlinson added 5 commits October 1, 2018 09:58

Add 'to_iris' and 'from_iris' to methods Dataset

330de46

Add whats new

9a3d495

Fix import

7dd812a

Fix imports

7d9f72c

Get cube not cubelist

0a9da68

shoyer reviewed Oct 2, 2018

dcherian mentioned this pull request Oct 24, 2018: xarray 0.11 release #2505 (Closed, 5 tasks)

jhamman added the "plan to close" label (May be closeable, needs more eyeballs) Sep 14, 2023

jacobtomlinson closed this Sep 18, 2023
