Minimal Complete Verifiable Example
# minimal setup
import xarray as xr
from copy import copy, deepcopy

class NoDeepCopy:
    def __deepcopy__(self, memo):
        raise TypeError("Class can't be deepcopied")

# check things do what we expect
example = NoDeepCopy()
copy(example)  # works
deepcopy(example)  # Raises TypeError: Class can't be deepcopied

# On to xarray use. All of these work correctly:
da = xr.DataArray(NoDeepCopy())
ds = xr.Dataset({"var": da})
dt1 = xr.DataTree(ds)
dt2 = xr.DataTree.from_dict({"/": ds})

# However, none of these work, they all end up triggering the `__deepcopy__`
# method of the `NoDeepCopy` class
dt3 = xr.DataTree(ds, children={"child": dt1})
dt4 = xr.DataTree.from_dict({"child": ds})
dt5 = xr.DataTree()
dt5.children = {"child": xr.DataTree(ds)}
MVCE confirmation
Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
Complete example — the example is self-contained, including all data and the text of any traceback.
Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
New issue — a search of GitHub Issues suggests this is not a duplicate.
Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 dt4 = xr.DataTree.from_dict({"child": ds})
      2 dt4

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/datatree.py:1198, in DataTree.from_dict(cls, d, name)
   1196     else:
   1197         raise TypeError(f"invalid values: {data}")
-> 1198     obj._set_item(
   1199         path,
   1200         new_node,
   1201         allow_overwrite=False,
   1202         new_nodes_along_path=True,
   1203     )
   1205 # TODO: figure out why mypy is raising an error here, likely something
   1206 # to do with the return type of Dataset.copy()
   1207 return obj

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/treenode.py:652, in TreeNode._set_item(self, path, item, new_nodes_along_path, allow_overwrite)
    650         raise KeyError(f"Already a node object at path {path}")
    651 else:
--> 652     current_node._set(name, item)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/datatree.py:944, in DataTree._set(self, key, val)
    942     new_node = val.copy(deep=False)
    943     new_node.name = key
--> 944     new_node._set_parent(new_parent=self, child_name=key)
    945 else:
    946     if not isinstance(val, DataArray | Variable):
    947         # accommodate other types that can be coerced into Variables

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/treenode.py:115, in TreeNode._set_parent(self, new_parent, child_name)
    113 self._check_loop(new_parent)
    114 self._detach(old_parent)
--> 115 self._attach(new_parent, child_name)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/treenode.py:152, in TreeNode._attach(self, parent, child_name)
    147 if child_name is None:
    148     raise ValueError(
    149         "To directly set parent, child needs a name, but child is unnamed"
    150     )
--> 152 self._pre_attach(parent, child_name)
    153 parentchildren = parent._children
    154 assert not any(
    155     child is self for child in parentchildren
    156 ), "Tree is corrupt."

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/datatree.py:528, in DataTree._pre_attach(self, parent, name)
    526 node_ds = self.to_dataset(inherit=False)
    527 parent_ds = parent._to_dataset_view(rebuild_dims=False, inherit=True)
--> 528 check_alignment(path, node_ds, parent_ds, self.children)
    529 _deduplicate_inherited_coordinates(self, parent)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/datatree.py:149, in check_alignment(path, node_ds, parent_ds, children)
    147 if parent_ds is not None:
    148     try:
--> 149         align(node_ds, parent_ds, join="exact")
    150     except ValueError as e:
    151         node_repr = _indented(_without_header(repr(node_ds)))

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/alignment.py:883, in align(join, copy, indexes, exclude, fill_value, *objects)
    687 """
    688 Given any number of Dataset and/or DataArray objects, returns new
    689 objects with aligned indexes and dimension sizes.
   (...)
    873
    874 """
    875 aligner = Aligner(
    876     objects,
    877     join=join,
   (...)
    881     fill_value=fill_value,
    882 )
--> 883 aligner.align()
    884 return aligner.results

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/alignment.py:583, in Aligner.align(self)
    581     self.results = self.objects
    582 else:
--> 583     self.reindex_all()

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/alignment.py:558, in Aligner.reindex_all(self)
    557 def reindex_all(self) -> None:
--> 558     self.results = tuple(
    559         self._reindex_one(obj, matching_indexes)
    560         for obj, matching_indexes in zip(
    561             self.objects, self.objects_matching_indexes, strict=True
    562         )
    563     )

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/alignment.py:559, in <genexpr>(.0)
    557 def reindex_all(self) -> None:
    558     self.results = tuple(
--> 559         self._reindex_one(obj, matching_indexes)
    560         for obj, matching_indexes in zip(
    561             self.objects, self.objects_matching_indexes, strict=True
    562         )
    563     )

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/alignment.py:547, in Aligner._reindex_one(self, obj, matching_indexes)
    544 new_indexes, new_variables = self._get_indexes_and_vars(obj, matching_indexes)
    545 dim_pos_indexers = self._get_dim_pos_indexers(matching_indexes)
--> 547 return obj._reindex_callback(
    548     self,
    549     dim_pos_indexers,
    550     new_variables,
    551     new_indexes,
    552     self.fill_value,
    553     self.exclude_dims,
    554     self.exclude_vars,
    555 )

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/dataset.py:3569, in Dataset._reindex_callback(self, aligner, dim_pos_indexers, variables, indexes, fill_value, exclude_dims, exclude_vars)
   3567         reindexed = self._overwrite_indexes(new_indexes, new_variables)
   3568     else:
-> 3569         reindexed = self.copy(deep=aligner.copy)
   3570 else:
   3571     to_reindex = {
   3572         k: v
   3573         for k, v in self.variables.items()
   3574         if k not in variables and k not in exclude_vars
   3575     }

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/dataset.py:1374, in Dataset.copy(self, deep, data)
   1277 def copy(self, deep: bool = False, data: DataVars | None = None) -> Self:
   1278     """Returns a copy of this dataset.
   1279
   1280     If `deep=True`, a deep copy is made of each of the component variables.
   (...)
   1372     pandas.DataFrame.copy
   1373     """
-> 1374     return self._copy(deep=deep, data=data)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/dataset.py:1410, in Dataset._copy(self, deep, data, memo)
   1408         variables[k] = index_vars[k]
   1409     else:
-> 1410         variables[k] = v._copy(deep=deep, data=data.get(k), memo=memo)
   1412 attrs = copy.deepcopy(self._attrs, memo) if deep else copy.copy(self._attrs)
   1413 encoding = (
   1414     copy.deepcopy(self._encoding, memo) if deep else copy.copy(self._encoding)
   1415 )

File ~/bin/miniforge3/envs/arviz/lib/python3.11/site-packages/xarray/core/variable.py:940, in Variable._copy(self, deep, data, memo)
    937         ndata = indexing.MemoryCachedArray(data_old.array)  # type: ignore[assignment]
    939     if deep:
--> 940         ndata = copy.deepcopy(ndata, memo)
    942 else:
    943     ndata = as_compatible_data(data)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/copy.py:153, in deepcopy(x, memo, _nil)
    151 copier = getattr(x, "__deepcopy__", None)
    152 if copier is not None:
--> 153     y = copier(memo)
    154 else:
    155     reductor = dispatch_table.get(cls)

File ~/bin/miniforge3/envs/arviz/lib/python3.11/copy.py:153, in deepcopy(x, memo, _nil)
    151 copier = getattr(x, "__deepcopy__", None)
    152 if copier is not None:
--> 153     y = copier(memo)
    154 else:
    155     reductor = dispatch_table.get(cls)

Cell In[2], line 3, in NoDeepCopy.__deepcopy__(self, memo)
      2 def __deepcopy__(self, memo):
----> 3     raise TypeError("Class can't be deepcopied")

TypeError: Class can't be deepcopied
Anything else we need to know?
I added the traceback of the dt4 case, as I was not sure which would be the most informative. The tracebacks do not all start the same way, but they are identical from TreeNode._set_parent onwards: then on to TreeNode._attach, DataTree._pre_attach... until __deepcopy__. Any of the different starts for the tracebacks should be reproducible by copy-pasting the example though.
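The final frames of the traceback can be reproduced without xarray. The following is only a minimal sketch of the copy semantics involved, not of xarray's internals: a plain Python list stands in for the object-dtype array (an assumption made to keep the example free of third-party imports), and `NoDeepCopy` is the class from the example above.

```python
import copy


class NoDeepCopy:
    """Same class as in the example: refuses to be deep-copied."""

    def __deepcopy__(self, memo):
        raise TypeError("Class can't be deepcopied")


# Assumption: a plain list plays the role of the object-dtype array.
container = [NoDeepCopy()]

# A shallow copy creates a new container but reuses the element,
# so the element's __deepcopy__ is never called.
shallow = copy.copy(container)
assert shallow is not container
assert shallow[0] is container[0]

# A deep copy recurses into the container and calls the element's
# __deepcopy__, which raises.
try:
    copy.deepcopy(container)
    deep_copy_failed = False
    message = ""
except TypeError as err:
    deep_copy_failed = True
    message = str(err)

print(deep_copy_failed, message)
```

In the traceback, the deep copy is requested by `Dataset._reindex_callback` through `self.copy(deep=aligner.copy)`, which suggests the alignment check runs with copying enabled rather than the child node itself being deep-copied on attachment.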
What happened?
I have xarray objects that contain object-dtype NumPy arrays whose elements can't be deep-copied. This has never been an issue, not even when using xarray-datatree, and I saw in https://github.com/pydata/xarray/blob/main/DATATREE_MIGRATION_GUIDE.md#api-changes that children objects are shallow copied, but they seem to be deep-copied.
What did you expect to happen?
A DataTree is created without problem even when its child nodes store non-deepcopiable objects as array elements in their variables.
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:53:32) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.21-150500.55.83-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: ca_ES.UTF-8
LOCALE: ('ca_ES', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: None
xarray: 2024.10.0
pandas: 2.2.3
numpy: 1.26.4
scipy: 1.14.1
netCDF4: None
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.16.1
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.5.0
distributed: 2024.5.0
matplotlib: 3.8.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.2
conda: None
pytest: 7.4.3
mypy: None
IPython: 8.18.1
sphinx: 7.2.6