
TYP: __getitem__ method of EA (2nd pass) #37921

Closed

Conversation

simonjayhawkins (Member)

follow-on from #37898

Remove the casts and ignores added in #37898 (removing them requires using a TypeVar in DatetimeLikeArrayMixin in conjunction with an overload for slice, so that slicing returns the correct DatetimeArray/PeriodArray).

Another pass is required to:

- add more TypeVars for the return type of type(self)
- type __getitem__ in the concrete classes (Categorical, IntervalArray and SparseArray)

Notes:

- overloads with Any report overlapping overloads
- NAType resolves to Any since it is implemented in Cython without a stub file
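To make the pattern concrete, here is a minimal, hedged sketch of the TypeVar-plus-overload approach (the class and alias names are simplified stand-ins, not the actual pandas source). Per the notes above, when the scalar alias resolves to Any, mypy reports the two overloads as overlapping, which is why the real code carries a type: ignore[misc] (visible in the diffs further down).

```python
from typing import Any, TypeVar, Union, overload

import numpy as np

# Stand-in alias: in pandas this is roughly a Union of Period/Timestamp/Timedelta
# and NaT, but NAType resolves to Any without a stub file (see the note above).
DTScalarOrNaT = Any

DatetimeLikeArrayT = TypeVar("DatetimeLikeArrayT", bound="DatetimeLikeArraySketch")


class DatetimeLikeArraySketch:  # simplified stand-in for DatetimeLikeArrayMixin
    _ndarray: np.ndarray

    @overload
    def __getitem__(self, key: int) -> DTScalarOrNaT:
        # An integer key returns a scalar (or NaT).
        ...

    @overload
    def __getitem__(
        self: DatetimeLikeArrayT, key: Union[slice, np.ndarray]
    ) -> DatetimeLikeArrayT:
        # Binding self to a TypeVar makes a slice of a subclass come back
        # typed as that same subclass.
        ...

    def __getitem__(self, key):
        # Simplified runtime implementation; the real method dispatches on key.
        raise NotImplementedError


class PeriodArraySketch(DatetimeLikeArraySketch):
    pass


# At call sites, arr[1:] on a PeriodArraySketch is now inferred as
# PeriodArraySketch, which is what allows the casts and ignores from #37898
# to be removed.
```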

simonjayhawkins added the ExtensionArray (Extending pandas with custom dtypes or arrays) and Typing (type annotations, mypy/pyright type checking) labels on Nov 17, 2020
@@ -484,6 +484,7 @@ def split_and_operate(
-------
list of blocks
"""
assert isinstance(self.values, np.ndarray)
simonjayhawkins (Member Author)

need to ensure split_and_operate is not called from EA block (or base method not overridden)

simonjayhawkins (Member Author)

move assert after ndim check

- # TODO: overload DatetimeLikeArrayMixin.__getitem__
- values = cast(PeriodArray, self._data[:0])
- return type(self)._simple_new(values, name=self.name)
+ return type(self)._simple_new(self._data[:0], name=self.name)
simonjayhawkins (Member Author)

reverting change from previous PR
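For context, a small self-contained sketch of why the cast can be dropped (PeriodArraySketch, PeriodIndexSketch and _empty_like are hypothetical stand-ins for illustration, not the pandas source): once the array's __getitem__ is typed so that a slice returns the subclass, self._data[:0] is already narrowed to the PeriodArray type and _simple_new accepts it without a cast.

```python
class PeriodArraySketch:
    # In pandas this typing comes from DatetimeLikeArrayMixin's overloaded
    # __getitem__ (slice -> TypeVar bound to the mixin); a direct annotation
    # is enough for this sketch.
    def __getitem__(self, key: slice) -> "PeriodArraySketch":
        return self


class PeriodIndexSketch:  # hypothetical stand-in for PeriodIndex
    _data: PeriodArraySketch
    name: object

    def _empty_like(self) -> "PeriodIndexSketch":  # hypothetical helper
        # self._data[:0] is inferred as PeriodArraySketch, so no
        # cast(PeriodArray, ...) is needed before calling _simple_new,
        # matching the reverted line in the diff above.
        return type(self)._simple_new(self._data[:0], name=self.name)

    @classmethod
    def _simple_new(
        cls, values: PeriodArraySketch, name: object = None
    ) -> "PeriodIndexSketch":
        obj = cls.__new__(cls)
        obj._data = values
        obj.name = name
        return obj
```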

- first_value = cast(Any, self._values[mask.argmax()])
- result[(locs == 0) & (where._values < first_value)] = -1
+ first = mask.argmax()
+ result[(locs == 0) & (where._values < self._values[first])] = -1
simonjayhawkins (Member Author)

reverting change from previous PR

jreback added this to the 1.2 milestone on Nov 18, 2020
jreback (Contributor) commented Nov 18, 2020

lgtm. @jbrockmendel

@overload
# error: Overloaded function signatures 1 and 2 overlap with incompatible
# return types [misc]
def __getitem__(self, key: int) -> EAScalarOrMissing: # type: ignore[misc]
jbrockmendel (Member)

this can also return NDArrayBackedExtensionArrayT

simonjayhawkins (Member Author)

Can you elaborate? Is this for 2D EA?

jbrockmendel (Member)

NDArrayBackedExtensionArray supports 2D, exactly

simonjayhawkins (Member Author)

So it also supports a tuple indexer? And I guess, from that, NDArrayBackedExtensionArray can't support nested data, as it uses is_scalar checks.

So do we also need to change EAScalarOrMissing = object (both the scalar value and na_value can be any type) to ScalarOrScalarMissing = Scalar (both values and na_value must be scalars)?

jbrockmendel (Member) commented Nov 18, 2020

so it also supports a tuple indexer?

Yes. For that matter I think even 1D will support 1-tuples.

and I guess from that NDArrayBackedExtensionArray can't support nested data as it uses is_scalar checks.

PandasArray can have object dtype, and Categorical can hold tuples. We never see 2D versions of those in practice though (yet).
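A hedged sketch of what the suggestion amounts to (simplified stand-in class and aliases, not the pandas source); the review comment further down makes the same point for the DatetimeLikeArrayMixin overload. Because the array is 2D-capable, an integer key can select a row and return an array of the same type, so the int overload's return type should include the array TypeVar as well as the scalar alias.

```python
from typing import TypeVar, Union, overload

import numpy as np

# In the PR, EAScalarOrMissing is aliased to object because both the scalar
# value and na_value can be any type; narrowing it to a Scalar alias, as
# discussed above, is what would make the Union below more useful to mypy.
EAScalarOrMissing = object

NDArrayBackedT = TypeVar("NDArrayBackedT", bound="NDArrayBackedSketch")


class NDArrayBackedSketch:  # stand-in for NDArrayBackedExtensionArray
    _ndarray: np.ndarray

    @overload
    def __getitem__(
        self: NDArrayBackedT, key: Union[int, tuple]
    ) -> Union[EAScalarOrMissing, NDArrayBackedT]:
        # On a 2D array, arr[0] selects a row and returns a 1D array of the
        # same type; a tuple key such as arr[0, 0] yields a scalar while
        # arr[0, :] yields an array, so neither key type can promise a scalar.
        ...

    @overload
    def __getitem__(
        self: NDArrayBackedT, key: Union[slice, np.ndarray]
    ) -> NDArrayBackedT:
        ...

    def __getitem__(self, key):
        # Simplified runtime implementation for the sketch.
        raise NotImplementedError
```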

@@ -266,9 +267,19 @@ def __array__(self, dtype=None) -> np.ndarray:
return np.array(list(self), dtype=object)
return self._ndarray

@overload
def __getitem__(self, key: int) -> DTScalarOrNaT:
jbrockmendel (Member)

same, can still return DatetimeLikeArrayT

jbrockmendel (Member)

couple of comments, otherwise LGTM

simonjayhawkins (Member Author)

removing this from the 1.2 milestone for now.

Since #35259 is merged, we can now proceed with the outstanding tasks in parallel. The plan is to get all the tasks started (to get review) and then circle back to this.

simonjayhawkins removed this from the 1.2 milestone on Nov 23, 2020
github-actions bot (Contributor)

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

github-actions bot added the Stale label on Dec 24, 2020
simonjayhawkins (Member Author)

closing while we discuss #39513
