TYP: partial typing of masked array #31728
         return self._data.nbytes + self._mask.nbytes

     @classmethod
-    def _concat_same_type(cls, to_concat):
+    def _concat_same_type(cls: Type[BaseMaskedArrayT], to_concat) -> BaseMaskedArrayT:
We need to add the TypeVar to avoid:
pandas\core\arrays\integer.py:117: error: Incompatible return value type (got "BaseMaskedArray", expected "IntegerArray")
pandas\core\arrays\boolean.py:122: error: Incompatible return value type (got "BaseMaskedArray", expected "BooleanArray")
We can't use the unbound TypeVar from pandas._typing here; otherwise we get:
pandas\core\arrays\masked.py:183: error: Too many arguments for "object"
Since the TypeVar is needed here, it is also used for the other methods that return type(self).
As I said in another issue, I have no problem with this if there is no way around it, but then it needs to be documented.
I don't think this is a problem; it's documented in PEP 484
https://www.python.org/dev/peps/pep-0484/#annotating-instance-and-class-methods
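For readers following this thread, here is a minimal sketch of the bound-TypeVar pattern under discussion. It is not the pandas code: `BaseMasked`, `IntegerLike` and `BaseMaskedT` are hypothetical stand-ins, and plain lists replace `np.ndarray` so the example runs without NumPy.

```python
from typing import List, Sequence, Type, TypeVar

# Bound TypeVar: only BaseMasked or one of its subclasses can be substituted,
# so the type checker knows which constructor signature cls(...) has.
BaseMaskedT = TypeVar("BaseMaskedT", bound="BaseMasked")


class BaseMasked:
    """Toy stand-in for BaseMaskedArray (assumption: lists instead of arrays)."""

    def __init__(self, values: List[int], mask: List[bool]) -> None:
        self._data = values
        self._mask = mask

    @classmethod
    def _concat_same_type(
        cls: Type[BaseMaskedT], to_concat: Sequence[BaseMaskedT]
    ) -> BaseMaskedT:
        data = [v for x in to_concat for v in x._data]
        mask = [m for x in to_concat for m in x._mask]
        # cls is the concrete subclass, so the result has the caller's type.
        return cls(data, mask)

    def copy(self: BaseMaskedT) -> BaseMaskedT:
        # Annotating self with the TypeVar ties the return type to the subclass.
        return type(self)(list(self._data), list(self._mask))


class IntegerLike(BaseMasked):
    """Toy stand-in for IntegerArray."""


a = IntegerLike([1, 2], [False, True])
b = IntegerLike([3], [False])
# A type checker infers IntegerLike here rather than BaseMasked.
joined = IntegerLike._concat_same_type([a, b])
```

With an unbound TypeVar the checker would only know that `cls` is a `Type[object]`, which is consistent with the "Too many arguments for "object"" error quoted above.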
     # The value used to fill '_data' to avoid upcasting
     _internal_fill_value: "Scalar"

     def __init__(self, values: np.ndarray, mask: np.ndarray, copy: bool = False):
__init__ needs to be declared in the base class to avoid:
pandas\core\arrays\masked.py:56: error: Too many arguments for "BaseMaskedArray"
pandas\core\arrays\masked.py:181: error: Too many arguments for "BaseMaskedArray"
pandas\core\arrays\masked.py:207: error: Too many arguments for "BaseMaskedArray"
pandas\core\arrays\masked.py:207: error: Unexpected keyword argument "copy" for "BaseMaskedArray"
pandas\core\arrays\masked.py:213: error: Too many arguments for "BaseMaskedArray"
pandas\core\arrays\masked.py:213: error: Unexpected keyword argument "copy" for "BaseMaskedArray"
Declaring it here also ensures that the subclasses have the correct constructor signature to work with __invert__, _concat_same_type, take and copy from the base class.
We could just raise an AbstractMethodError, but I think it makes sense to put the shared functionality here.
BooleanArray checks values.ndim and mask.ndim; IntegerArray does not. It may make sense to have that check here as well, if it is applicable to IntegerArray.
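A rough sketch of the idea in this comment, assuming toy classes rather than the real pandas ones: the base class owns the `(values, mask, copy)` constructor, so shared methods can safely call `type(self)(...)`. The class names and the length check are hypothetical; the length check merely stands in for the ndim validation mentioned above.

```python
from typing import List


class BaseMasked:
    """Toy stand-in for BaseMaskedArray; plain lists replace np.ndarray."""

    def __init__(
        self, values: List[bool], mask: List[bool], copy: bool = False
    ) -> None:
        # Hypothetical shared check, loosely analogous to the ndim checks;
        # the real validation is not shown in this thread.
        if len(values) != len(mask):
            raise ValueError("values and mask must have the same length")
        if copy:
            values = list(values)
            mask = list(mask)
        self._data = values
        self._mask = mask

    def __invert__(self) -> "BaseMasked":
        # Works for every subclass only if each one accepts (values, mask).
        return type(self)([not v for v in self._data], self._mask)


class BooleanLike(BaseMasked):
    """Toy stand-in for BooleanArray."""

    def __init__(
        self, values: List[bool], mask: List[bool], copy: bool = False
    ) -> None:
        # Subclass-specific checks would go here before deferring to the base.
        super().__init__(values, mask, copy=copy)


source = [True, False]
arr = BooleanLike(source, [False, True], copy=True)
inverted = ~arr  # built through the shared base-class constructor
```

Because the signature is declared once on the base class, a type checker can verify calls such as `type(self)(values, mask)` or `cls(data, mask, copy=False)` without knowing the concrete subclass.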
             mask = mask.copy()

         self._data = values
         self._mask = mask
         self._dtype = BooleanDtype()
I think this could be a class attribute. Is there any need to assign it in the constructor?
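To make the suggestion concrete, here is a small sketch contrasting the two options. `BooleanDtypeLike`, `PerInstance` and `PerClass` are hypothetical names; whether a class attribute suits the real array class depends on details not shown in this thread.

```python
class BooleanDtypeLike:
    """Hypothetical stand-in for a parameter-free dtype such as BooleanDtype."""

    name = "boolean"


class PerInstance:
    def __init__(self) -> None:
        # A new dtype object is created for every array instance.
        self._dtype = BooleanDtypeLike()


class PerClass:
    # One dtype object is created at class definition and shared by all instances.
    _dtype = BooleanDtypeLike()


first, second = PerInstance(), PerInstance()
third, fourth = PerClass(), PerClass()
```

The class-attribute form only works when the dtype carries no per-array state, which is the assumption this sketch makes.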
pandas/core/arrays/boolean.py
Outdated
@@ -387,7 +382,7 @@ def __setitem__(self, key, value):
         self._data[key] = value
         self._mask[key] = mask

-    def astype(self, dtype, copy=True):
+    def astype(self, dtype, copy: bool = True) -> Union[np.ndarray, BaseMaskedArray]:
Is there a difference between this and ArrayLike from pandas._typing (save for the TypeVar)?
I guess we could use it, since I assume that astype should eventually support ExtensionArrays other than IntegerArray and BooleanArray.
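As a hedged illustration of the distinction raised here, the sketch below compares a Union return type with a constrained TypeVar. It assumes, as the question implies, that ArrayLike is a TypeVar over an extension array type and an ndarray type; `ExtLike`, `NdLike` and `ArrayLikeT` are hypothetical stand-ins.

```python
from typing import TypeVar, Union


class ExtLike:
    """Toy stand-in for an ExtensionArray."""


class NdLike:
    """Toy stand-in for np.ndarray."""


# Constrained TypeVar: resolves to exactly one of the listed types per call.
ArrayLikeT = TypeVar("ArrayLikeT", ExtLike, NdLike)


def passthrough(arr: ArrayLikeT) -> ArrayLikeT:
    # The return type is tied to the argument type: ExtLike in, ExtLike out.
    return arr


def convert(arr: ExtLike, to_nd: bool) -> Union[ExtLike, NdLike]:
    # With a Union, either type may come back regardless of the input type,
    # which is closer to what an astype-style method does.
    return NdLike() if to_nd else arr


ext = ExtLike()
same = passthrough(ext)
as_nd = convert(ext, to_nd=True)
as_ext = convert(ext, to_nd=False)
```

A TypeVar used only in a return annotation has nothing to bind to, so a Union (or a plain alias of one) is the usual choice when the result type does not depend on the argument type.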
-        self._data = values
-        self._mask = mask
+        super().__init__(values, mask, copy=copy)
Which superclass performs this logic?
BaseMaskedArray?
@@ -317,18 +313,18 @@ def map_string(s):
         scalars = [map_string(x) for x in strings]
         return cls._from_sequence(scalars, dtype, copy)

-    def _values_for_factorize(self) -> Tuple[np.ndarray, Any]:
+    def _values_for_factorize(self) -> Tuple[np.ndarray, int]:
In general, though, the "Any" is correct. I don't know what the typing should do here, but the signature of the base class would use "Any", so any place where _values_for_factorize is called would need to assume "Any".
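The point about callers can be sketched as follows, under the assumption of toy classes: an override may narrow "Any" to "int", but code typed against the base class still sees "Any". `ExtensionLike`, `BooleanLike`, `na_sentinel` and the -1 sentinel are all hypothetical choices made for this example.

```python
from typing import Any, List, Tuple


class ExtensionLike:
    """Toy stand-in for the base extension array class."""

    def _values_for_factorize(self) -> Tuple[List[int], Any]:
        raise NotImplementedError


class BooleanLike(ExtensionLike):
    """Toy stand-in for a masked boolean array."""

    def __init__(self, data: List[int], mask: List[bool]) -> None:
        self._data = data
        self._mask = mask

    def _values_for_factorize(self) -> Tuple[List[int], int]:
        # Narrowing Any to int in an override is accepted by type checkers.
        # The -1 sentinel is an assumption made for this sketch.
        data = [-1 if m else v for v, m in zip(self._data, self._mask)]
        return data, -1


def na_sentinel(arr: ExtensionLike) -> Any:
    # Code typed against the base class still only knows "Any" here.
    _, na_value = arr._values_for_factorize()
    return na_value


masked = BooleanLike([1, 0, 1], [False, False, True])
values, sentinel = masked._values_for_factorize()
```

So the narrower annotation helps only where the concrete subclass is statically known; generic callers gain nothing from it.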
         return self._data.nbytes + self._mask.nbytes

     @classmethod
-    def _concat_same_type(cls, to_concat):
+    def _concat_same_type(cls: Type[BaseMaskedArrayT], to_concat) -> BaseMaskedArrayT:
         data = np.concatenate([x._data for x in to_concat])
         mask = np.concatenate([x._mask for x in to_concat])
         return cls(data, mask)

-    def take(self, indexer, allow_fill=False, fill_value=None):
+    def take(
+        self: BaseMaskedArrayT,
We otherwise don't type self, or do we?
(self is always the type of the class, no?)
lgtm
Thanks @simonjayhawkins