Describe the bug
When a nested list column contains a struct column, extracting a host scalar from it raises an error.
Steps/Code to reproduce bug
In [1]: import cudf

In [2]: s = cudf.Series([[[{'a':1, 'b':2, 'c':10}]]])

In [3]: s[0]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [3], line 1
----> 1 s[0]

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:1167, in Series.__getitem__(self, arg)
   1165     return self.iloc[arg]
   1166 else:
-> 1167     return self.loc[arg]

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:258, in _SeriesLocIndexer.__getitem__(self, arg)
    255 except (TypeError, KeyError, IndexError, ValueError):
    256     raise KeyError(arg)
--> 258 return self._frame.iloc[arg]

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:180, in _SeriesIlocIndexer.__getitem__(self, arg)
    178 if isinstance(arg, tuple):
    179     arg = list(arg)
--> 180 data = self._frame._get_elements_from_column(arg)
    182 if (
    183     isinstance(data, (dict, list))
    184     or _is_scalar_or_zero_d_array(data)
    185     or _is_null_host_scalar(data)
    186 ):
    187     return data

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/single_column_frame.py:385, in SingleColumnFrame._get_elements_from_column(self, arg)
    379 def _get_elements_from_column(self, arg) -> Union[ScalarLike, ColumnBase]:
    380     # A generic method for getting elements from a column that supports a
    381     # wide range of different inputs. This method should only used where
    382     # _absolutely_ necessary, since in almost all cases a more specific
    383     # method can be used e.g. element_indexing or slice.
    384     if _is_scalar_or_zero_d_array(arg):
--> 385         return self._column.element_indexing(int(arg))
    386     elif isinstance(arg, slice):
    387         start, stop, stride = arg.indices(len(self))

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:445, in ColumnBase.element_indexing(self, index)
    442 if idx > len(self) - 1 or idx < 0:
    443     raise IndexError("single positional indexer is out-of-bounds")
--> 445 return libcudf.copying.get_element(self, idx).value

File scalar.pyx:174, in cudf._lib.scalar.DeviceScalar.value.__get__()

File scalar.pyx:146, in cudf._lib.scalar.DeviceScalar._to_host_scalar()

File scalar.pyx:431, in cudf._lib.scalar._get_py_list_from_list()

File interop.pyx:144, in cudf._lib.interop.to_arrow()

RuntimeError: cuDF failure at: /nvme/0/pgali/cudf/cpp/src/interop/to_arrow.cu:280: Number of field names and number of children doesn't match
Expected behavior
In [4]: s[0]
Out[4]: [[{'a': 1, 'b': 2, 'c': 10}]]
Environment overview (please complete the following information)
Fix issue with extracting nested column data & dtype preservation (#1…
866434f
…1671)
This PR:
Fixes: #11670
- [x] Fixes: #11670, by correctly generating the `column_metadata` for nested scenarios.
- [x] Also fixes an issue with dtype mismatch after updating `children` in a `ListColumn`. See the pytest below.
Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Ashwin Srinath (https://github.com/shwina)
- Lawrence Mitchell (https://github.com/wence-)
URL: #11671
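To illustrate why the metadata has to be generated recursively, here is a minimal pure-Python sketch. It does not use cudf, and it is not the code from the PR: `Leaf`, `ListOf`, `StructOf`, `Meta`, `shallow_metadata`, `nested_metadata`, and `names_match` are all hypothetical names made up for this example. It also assumes that a list column carries two children (offsets and elements). The point is only that a metadata tree describing just the top level cannot supply field names for a struct buried inside nested lists, which is the kind of mismatch the error message reports.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Leaf:
    """Hypothetical scalar dtype (not a cudf class)."""


@dataclass
class ListOf:
    """Hypothetical list dtype wrapping one element dtype."""
    element: object


@dataclass
class StructOf:
    """Hypothetical struct dtype mapping field names to dtypes."""
    fields: Dict[str, object]


@dataclass
class Meta:
    """Hypothetical metadata node: a name plus one node per child column."""
    name: str = ""
    children: List["Meta"] = field(default_factory=list)


def shallow_metadata(dtype) -> Meta:
    # Describes only the top-level column; nested children get no metadata.
    return Meta()


def nested_metadata(dtype, name: str = "") -> Meta:
    # Walks the dtype and emits one metadata node per nested column.
    meta = Meta(name)
    if isinstance(dtype, StructOf):
        meta.children = [nested_metadata(t, n) for n, t in dtype.fields.items()]
    elif isinstance(dtype, ListOf):
        # Assumption: child 0 is the offsets column, child 1 the elements.
        meta.children = [Meta("offsets"), nested_metadata(dtype.element, "element")]
    return meta


def names_match(dtype, meta: Meta) -> bool:
    # True only if every nested level has exactly one metadata node per child.
    if isinstance(dtype, StructOf):
        if len(meta.children) != len(dtype.fields):
            return False
        return all(
            names_match(t, m) for t, m in zip(dtype.fields.values(), meta.children)
        )
    if isinstance(dtype, ListOf):
        if len(meta.children) != 2:
            return False
        return names_match(dtype.element, meta.children[1])
    return True


# Same shape as the Series in the report: list<list<struct<a, b, c>>>.
dtype = ListOf(ListOf(StructOf({"a": Leaf(), "b": Leaf(), "c": Leaf()})))

print(names_match(dtype, shallow_metadata(dtype)))
print(names_match(dtype, nested_metadata(dtype)))
```

In this sketch the shallow builder fails the check at the outermost list, while the recursive builder produces a node for every level, including the three struct field names.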