-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve accuracy of six
byte index methods
#9117
Conversation
stdlib/_operator.pyi
Outdated
@@ -1,4 +1,5 @@ | |||
import sys | |||
from builtins import _GetItemIterable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use _typeshed.SupportsGetItem
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot as-is, because SupportsGetItem
defines __contains__
, AND it is defined as def __contains__(self, __x: object) -> bool: ...
.
Not all indexable types (including buffer-like ones) define __contains__
*
bytes defines it as def __contains__(self, __o: SupportsIndex | bytes) -> bool: ...
, and I am uncertain of the change required to make it work. (other than just typing the parameter __x
as Any
).
* On that note, mmap
missing __contains__
seems to be a mistake:
>>> from mmap import mmap
>>> 0 in mmap(0, 5)
False
>>> b'\x00' in mmap(0, 5)
True
I've done what I think are the necessary changes.
stubs/six/six/__init__.pyi
Outdated
def indexbytes(buf: bytes, i: int) -> int: ... | ||
def iterbytes(buf: bytes) -> _Iterator[int]: ... | ||
|
||
byte2int = operator.itemgetter(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not legal in a stub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, didn't catch that. I accidentally ran the tests locally with py
instead of python
(venv fail)
PEP 646 allows that, but mypy doesn't have full support yet. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Interesting note about that last mypy_primer run, is that mypy and pyright give slightly different types, and pyright gives no error: import operator
reveal_type(operator.itemgetter(1).__call__)
test_dict = min({"first": 1, "second": 2}, key=operator.itemgetter(1)) # Ok
test_items = min({"first": 1, "second": 2}.items(), key=operator.itemgetter(1)) # mypy error
reveal_type(test_dict)
reveal_type(test_items) pyright:
mypy:
Yet mypy knows about the correct return type. Is this a mypy bug? |
stdlib/mmap.pyi
Outdated
@@ -68,7 +68,8 @@ class mmap(Iterable[int], Sized): | |||
@overload | |||
def __setitem__(self, __index: slice, __object: ReadableBuffer) -> None: ... | |||
# Doesn't actually exist, but the object is actually iterable because it has __getitem__ and | |||
# __len__, so we claim that there is also an __iter__ to help type checkers. | |||
# __len__, so we claim that there is also an __iter__ and __contains__ to help type checkers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think __contains__
deserves a separate comment, since it's a separate and independent issue. The problem is that in
works on objects that define __getitem__
but not __contains__
, but type checkers (at least mypy and pyright; I just added support for this pattern in pyanalyze in quora/pyanalyze#564) don't know about this. That's a similar but separate issue to how objects with __getitem__
are iterable.
I think this is also why the SupportsGetItem
protocol has __contains__
.
That does look like a mypy bug. I haven't looked too deeply at the mypy-primer changes but we'll have to fix these issues before we can merge the PR. |
Same, but the ones I have looked at stemmed from the same issue. I'm in no hurry to need this, so if it is confirmed to be a mypy issue and can be fixed relatively soon, then I can wait. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Welp. At least |
This comment has been minimized.
This comment has been minimized.
Didn't look closely, but mind opening an issue for the suspected mypy bug? |
stdlib/_operator.pyi
Outdated
@@ -77,11 +78,9 @@ def delitem(__a: MutableSequence[Any], __b: slice) -> None: ... | |||
@overload | |||
def delitem(__a: MutableMapping[_K, Any], __b: _K) -> None: ... | |||
@overload | |||
def getitem(__a: Sequence[_T], __b: SupportsIndex) -> _T: ... | |||
def getitem(__a: SupportsGetItem[_K, _V], __b: _K) -> _V: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this largely mask the slice
overload in practice? E.g. if we call getitem([1, 2, 3], slice(1))
, we should get a Sequence back, but I think this overload will be matched and we'll get a slice instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also worried about this, but I tried it out and it doesn't seem to be the case: https://mypy-play.net/?mypy=latest&python=3.10&gist=184cb82e0026c4bd0e9a6d5fa47eb1a1
Might be worth adding a test case, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a slice will never match the first overload (at least with current type definitions) because:
Argument of type "slice" cannot be assigned to parameter "__b" of type "_K@getitem" in function "getitem"
Type "slice" cannot be assigned to type "SupportsIndex"
"slice" is incompatible with protocol "SupportsIndex"
"__index__" is not present
Still I think it makes sense to put the more precise/least generic overload first. Out of good practice.
stdlib/_operator.pyi
Outdated
@@ -107,16 +106,25 @@ class attrgetter(Generic[_T_co]): | |||
@final | |||
class itemgetter(Generic[_T_co]): | |||
@overload | |||
def __new__(cls, item: Any) -> itemgetter[Any]: ... | |||
def __new__(cls, item: _T_co) -> itemgetter[_T_co]: ... # type: ignore[misc] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment explaining why we need the type ignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did better and actually fixed the overlap
stdlib/_operator.pyi
Outdated
# "tuple[int, int]" is incompatible with protocol "SupportsIndex" | ||
# preventing [_T_co, ...] instead of [Any, ...] | ||
# | ||
# A bug in mypy prevents using [..., _T] instead of [..., Any] here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should link to the bug in the mypy issue tracker.
stdlib/mmap.pyi
Outdated
@@ -67,8 +67,11 @@ class mmap(Iterable[int], Sized): | |||
def __setitem__(self, __index: int, __object: int) -> None: ... | |||
@overload | |||
def __setitem__(self, __index: slice, __object: ReadableBuffer) -> None: ... | |||
# Doesn't actually exist, but the object is actually iterable because it has __getitem__ and | |||
# __len__, so we claim that there is also an __iter__ to help type checkers. | |||
# Doesn't actually exist, but the object is actually iterable because it has __getitem__, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Doesn't actually exist, but the object is actually iterable because it has __getitem__, | |
# Doesn't actually exist, but the object actually supports "in" because it has __getitem__, |
I'll try to get a minimal repro. I'm not too sure where it is that's something was going wrong. Likely some interaction between TypeVar and Callable. There was a handful of types interacting together. |
This comment has been minimized.
This comment has been minimized.
According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉 |
Specifically:
byte2int
,indexbytes
, anditerbytes
While working in a library that uses
six
, I noticed that I was getting false-positive typing issues. The 3 functions above accept more buffer-like types than just bytes.I've also updated
itemgetter
to actually be generic and not useAny
Relates to #9006
Possible improvement: Is there a way to properly unpack
*items
into atuple
of N-length? I feel like I've seen something, but idk if mypy supports it.