Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using length comparison to narrow types for Tuples and Literals #10367

Closed
wants to merge 4 commits into from

Conversation

hatal175
Copy link
Contributor

Description

Fixes #1178.

This PR uses length based comparisons with Literals to narrow down types in branches. It is similar to the way mypy does it with isinstance.
There are three types of narrowing done:

  1. Choosing Tuples from a union according to length
  2. Reducing variant length tuples to specific sized tuples
  3. Choosing Literals according to length

Test Plan

I've added unit tests based on both the use cases and bugs mypy primer found.

@github-actions

This comment has been minimized.

@DevilXD
Copy link

DevilXD commented Apr 26, 2021

As someone who just ran into #1178 myself, this PR is a +1 from me, assuming it's correct and works.

@JukkaL
Copy link
Collaborator

JukkaL commented May 5, 2021

Thanks for the PR! Since we are (hopefully) close to a release, I won't review and merge this PR just yet -- improvements to type inference can have tricky edge cases that may slow down the release.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@MaddyGuthridge
Copy link

Would love to see this get added soon - it'd improve my program's readability massively

@hatal175
Copy link
Contributor Author

hatal175 commented Apr 8, 2022

I've added this pull request almost a year ago and there doesn't seem to be much progress with it. Has it been reviewed? Does it need fixes? What's up?

@DevilXD
Copy link

DevilXD commented Apr 9, 2022

@hatal175 I think that may be because some tests on this one have failed. Here's the output from the Windows x64 test suite:

================================== FAILURES ===================================
___________________ testNarrowingLenBiggerThanVariantTuple ____________________
[gw1] win32 -- Python 3.7.9 D:\a\mypy\mypy\.tox\py37\Scripts\python.EXE
data: D:\a\mypy\mypy\test-data\unit\check-narrowing.test:1274:
D:\a\mypy\mypy\mypy\test\testcheck.py:140: in run_case
    self.run_case_once(testcase)
D:\a\mypy\mypy\mypy\test\testcheck.py:227: in run_case_once
    assert_string_arrays_equal(output, a, msg.format(testcase.file, testcase.line))
E   AssertionError: Unexpected type checker output (D:\a\mypy\mypy\test-data\unit\check-narrowing.test, line 1274)
---------------------------- Captured stderr call -----------------------------
Expected:
  main:10: note: Revealed type is "builtins.tuple[builtins.int]" (diff)
  main:12: note: Revealed type is "builtins.tuple[builtins.int]" (diff)
Actual:
  main:10: note: Revealed type is "builtins.tuple[builtins.int, ...]" (diff)
  main:12: note: Revealed type is "builtins.tuple[builtins.int, ...]" (diff)

Alignment of first line difference:
  E: ...s.tuple[builtins.int]"
  A: ...s.tuple[builtins.int, ...]"
                            ^
_____________________ testNarrowingLenVariantLengthTuple ______________________
[gw0] win32 -- Python 3.7.9 D:\a\mypy\mypy\.tox\py37\Scripts\python.EXE
data: D:\a\mypy\mypy\test-data\unit\check-narrowing.test:1156:
D:\a\mypy\mypy\mypy\test\testcheck.py:140: in run_case
    self.run_case_once(testcase)
D:\a\mypy\mypy\mypy\test\testcheck.py:227: in run_case_once
    assert_string_arrays_equal(output, a, msg.format(testcase.file, testcase.line))
E   AssertionError: Unexpected type checker output (D:\a\mypy\mypy\test-data\unit\check-narrowing.test, line 1156)
---------------------------- Captured stderr call -----------------------------
Expected:
  main:9: note: Revealed type is "Tuple[builtins.int, builtins.int, builtins....
  main:11: note: Revealed type is "builtins.tuple[builtins.int]" (diff)
  main:14: note: Revealed type is "builtins.tuple[builtins.int]" (diff)
  main:16: note: Revealed type is "Tuple[builtins.int, builtins.int, builtins...
Actual:
  main:9: note: Revealed type is "Tuple[builtins.int, builtins.int, builtins....
  main:11: note: Revealed type is "builtins.tuple[builtins.int, ...]" (diff)
  main:14: note: Revealed type is "builtins.tuple[builtins.int, ...]" (diff)
  main:16: note: Revealed type is "Tuple[builtins.int, builtins.int, builtins...

Alignment of first line difference:
  E: ...s.tuple[builtins.int]"
  A: ...s.tuple[builtins.int, ...]"
                            ^
=========================== short test summary info ===========================
FAILED mypy/test/testcheck.py::TypeCheckSuite::check-narrowing.test::testNarrowingLenBiggerThanVariantTuple
FAILED mypy/test/testcheck.py::TypeCheckSuite::check-narrowing.test::testNarrowingLenVariantLengthTuple
===== 2 failed, 9659 passed, 375 skipped, 7 xfailed in 1548.55s (0:25:48) =====
ERROR: InvocationError for command 'D:\a\mypy\mypy\.tox\py37\Scripts\python.EXE' -m pytest (exited with code 1)

In general, this section should be showing that all 14 checks were successful, and the branch needs to not have any conflicts, for a proper review to happen:
obraz
You can re-run the tests by making a push to this PR's branch - please note that a push can consist of several commits, so if you intend on fixing a bunch of things in separate commits and know that the tests will fail in-between them, it's best to hold on with any pushes until after you make all commits that you think will make the test suite pass this time. This just helps to save on unnecessary processing from the test suite side.

Side note: I've been following this PR closely as I've ran into several cases where this would be really helpful for my purposes. Thank you for it 🙂

@DevilXD
Copy link

DevilXD commented Apr 9, 2022

Also, in addition to fixing the cause of the failing tests, I'd suggest taking a look at this specific primer output, since it may signify requiring a change too:

kornia (https://github.com/kornia/kornia)
+ kornia/augmentation/container/image.py:130: error: Value of type <nothing> is not indexable  [index]
+ kornia/augmentation/container/image.py:131: error: Value of type <nothing> is not indexable  [index]

Here's the relevant code from that point in time: https://github.com/kornia/kornia/blob/406f03aa3da709dddeee4a6992a1158d2054fc63/kornia/augmentation/container/image.py#L130-L131

@hatal175
Copy link
Contributor Author

hatal175 commented Apr 9, 2022

Fixed tests and typing due to changes from last rebase done here (nothing ground breaking).
The primer error is an error in the codebase (random_apply should be a union of Tuple[int] as well).

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

github-actions bot commented Apr 9, 2022

Diff from mypy_primer, showing the effect of this PR on open source code:

rich (https://github.com/willmcgugan/rich)
+ rich/padding.py:69: error: Redundant cast to "Tuple[int, int]"  [redundant-cast]
+ rich/padding.py:72: error: Redundant cast to "Tuple[int, int, int, int]"  [redundant-cast]

rotki (https://github.com/rotki/rotki)
+ rotkehlchen/api/server.py:284: error: Unused "type: ignore" comment
+ rotkehlchen/api/server.py:287: error: Unused "type: ignore" comment

anyio (https://github.com/agronholm/anyio)
+ src/anyio/_core/_sockets.py:499: error: Redundant cast to "Tuple[str, int, int, int]"  [redundant-cast]
+ src/anyio/_core/_sockets.py:506: error: Redundant cast to "Tuple[str, int]"  [redundant-cast]

kornia (https://github.com/kornia/kornia)
+ kornia/augmentation/container/image.py:132: error: Value of type <nothing> is not indexable  [index]
+ kornia/augmentation/container/image.py:133: error: Value of type <nothing> is not indexable  [index]

urllib3 (https://github.com/urllib3/urllib3)
+ src/urllib3/fields.py:236: error: Redundant cast to "Tuple[str, Union[str, bytes], str]"  [redundant-cast]
+ src/urllib3/fields.py:240: error: Redundant cast to "Tuple[str, Union[str, bytes]]"  [redundant-cast]

aiohttp (https://github.com/aio-libs/aiohttp)
+ aiohttp/resolver.py:47: error: Unused "type: ignore" comment

black (https://github.com/psf/black)
+ src/blib2to3/pgen2/tokenize.py:253:29: error: Redundant cast to "Tuple[int, str]"
+ src/blib2to3/pgen2/tokenize.py:255:49: error: Redundant cast to "Tuple[int, str, Tuple[int, int], Tuple[int, int], str]"

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, thanks! I did a quick pass and left some comments. Also please resolve the merge conflicts and fix this mypy_primer output:

kornia (https://github.com/kornia/kornia)
+ kornia/augmentation/container/image.py:132: error: Value of type <nothing> is not indexable  [index]
+ kornia/augmentation/container/image.py:133: error: Value of type <nothing> is not indexable  [index]

refers_to_fullname(expr.callee, 'builtins.len') and
len(expr.args) == 1 and
is_narrowable_literal(collapse_walrus(expr.args[0]))
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mypyc generates a slow code for nested functions. Can you move these two out of the class, e.g. put them near len_of_type()?

ind in narrowable_len_operand_index_to_hash
for ind in expr_indices)

if not is_len_check_expression:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I don't like huge if blocks. Can you factor out the corresponding logic in this block into a helper method?

continue

# we already checked that operand[i] is CallExpr since it is narrowable
expr = operands[i].args[0] # type: ignore[attr-defined]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use # type: ignore, use assert isinstance(..., CalExpr) instead.

@@ -5774,6 +5969,17 @@ def conditional_types_to_typemaps(expr: Expression,
return cast(Tuple[TypeMap, TypeMap], tuple(maps))


def len_of_type(typ: Type) -> Optional[int]:
"""Takes a type and returns an int that represents the length
of instances of that type or None if not applicable or variant length"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a single line docstring here, for example Return the length of objects of given type if known statically.

Also please replace "variant length" -> "variable length" everywhere in the PR (including test cases).

if isinstance(proper_type, TupleType):
return len(proper_type.items)
if isinstance(proper_type, LiteralType) and isinstance(proper_type.value, str):
return len(proper_type.value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would propose to leave out literal strings from the initial version of this feature. It is quite niche and should be easy to add if people will ask. You can leave a TODO item here to add support for literal strings in the future.

return typ
proper_type = get_proper_type(typ)
if (isinstance(proper_type, Instance) and proper_type.type.fullname == "builtins.tuple"
and length >= 0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice this excludes zero-length tuples, and it doesn't seem like they're tested. We might want a special case for them, though it's not too likely in code that you'd do anything with the result.

@ilevkivskyi
Copy link
Member

Superseded by #16237 (note I am not including LiteralType support in the initial implementation)

@ilevkivskyi ilevkivskyi closed this Oct 8, 2023
ilevkivskyi added a commit that referenced this pull request Oct 21, 2023
Fixes #1178 
Supersedes #10367 

This is includes implementation for fixed length tuples, homogeneous
tuples, and variadic tuples (and combinations of those). Generally
implementation is straightforward. Some notes:
* Unfortunately, it is necessary to add a new attribute `min_len` to
`TypeVarTupleType`, which is probably fine, as it doesn't have that many
attributes so far.
* Supporting more general use cases (like `>` comparisons for variadic
tuples) can cause quick proliferation of unions. I added two mechanisms
to counteract this: not applying the narrowing if the integer literal in
comparison is itself large, and collapsing unions of tuples into a
single tuple (if possible) after we are done with the narrowing. This
looks a bit arbitrary, but I think it is important to have.
* Main missing feature here is probably not inferring type information
from indirect comparisons like `len(x) > foo() > 1`. Supporting this
kind of things in full generality is cumbersome, and we may add cases
that turn out to be important later.
* Note I am quite careful with indexing "inside" a `TypeVarTuple`, it is
not really needed now, but I wanted to make everything future proof, so
that it will be easy to add support for upper bounds for
`TypeVarTuple`s, like `Nums = TypeVarTuple("Nums", bound=tuple[float,
...])`.
* I also fix couple existing inconsistencies with `Any` handling in type
narrowing. It looks like they stem from the old incorrect logic that
meet of `Any` and `X` should be `X`, while in fact it should be `Any`.
These fixes are not strictly necessary, but otherwise there may be new
false positives, because I introduce a bunch of additional type
narrowing scenarios here.

cc @hatal175, thanks for the test cases, and for your nice first attempt
to implement this!
Co-authored-by: Tal Hayon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unpacking tuples of variable length
8 participants