Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partial overload checks #5476

Merged
merged 13 commits into from
Aug 27, 2018

Conversation

Michael0x2a
Copy link
Collaborator

This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with...

  1. TypedDicts
  2. Tuples
  3. Unions
  4. TypeVars
  5. Generic types containing variations of the above.

It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code.

This pull request builds on top of #5474 and #5475 -- once those two PRs are merged, I'll rebase this diff if necessary.

This pull request also supercedes #5475 -- that PR contains basically the same code as these three PRs, just smushed together.

@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- So, just to summarize the status of all these PRs in one place...

The following PRs are independent from one another and can be reviewed in any order:

  1. Generalize subtype caches #5474 (Generalizing subtype caches)
  2. Refactor reversible operators #5475 (Refactoring reversible operators)
  3. Remove and refactor old overload selection logic #5321 (Clean up old overload logic -- this one is low priority)

This PR builds on top of code from 1 and 2. Once those two PRs are merged in, I'll rebase this PR to shrink the diff size.

I've left #5280, the old PR for adding partial overlaps, open for now since it has some extra context/probably includes some feedback that I've forgotten to incorporate. Feel free to close it whenever though if you want to minimize clutter -- I can re-find it pretty easily.

I've tested this PR on our internal codebases -- so far, I believe this PR introduces one regression that I've as of yet been unable to reproduce and fix. (Basically, the code is asserting some field is None, and mypy gets confused for some reason).

I'll continue working on tracking that final regression and fixing it over the next few days while I'm waiting on your code reviews, but if you have time to take a look/can figure out how to produce a smaller repo, I'd be extremely grateful.

(And sorry for dumping all of this code on you all of a sudden -- if you want me to break up the above PRs into even smaller ones, let me know, and I'll be happy to do it.)

@ilevkivskyi
Copy link
Member

I've left #5280, the old PR for adding partial overlaps, open for now since it has some extra context/probably includes some feedback that I've forgotten to incorporate. Feel free to close it whenever though if you want to minimize clutter -- I can re-find it pretty easily.

I would prefer to close it, just to not get lost :-)

(And sorry for dumping all of this code on you all of a sudden -- if you want me to break up the above PRs into even smaller ones, let me know, and I'll be happy to do it.)

No problem at all, thanks for working on this. The PR sizes are just right for my taste.

(Btw, there is already a merge conflict, you can probably rebase after merging the subtype cache.)

This pull request adds more robust support for detecting partially
overlapping types. Specifically, it detects overlaps with...

1. TypedDicts
2. Tuples
3. Unions
4. TypeVars
5. Generic types containing variations of the above.

It also swaps out the code for detecting overlaps with operators and
removes some associated (and now unused) code.

This pull request builds on top of python#5474
and python#5475 -- once those two PRs are
merged, I'll rebase this diff if necessary.

This pull request also supercedes python#5475 --
that PR contains basically the same code as these three PRs, just
smushed together.
@Michael0x2a Michael0x2a force-pushed the improve-partial-overload-checks branch from b9c8d22 to af2109c Compare August 16, 2018 20:36
@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- the rebase is done now, fyi

@ilevkivskyi
Copy link
Member

OK, thanks! It is a bit late here. I will finish this tomorrow morning.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good! Here are few comments.

mypy/checker.py Outdated
we do not check for partial overlaps. This lets us vary our error messages
depending on the severity of the overlap.

See 'is_unsafe_partially_overlapping_overload_signatures' for the full checks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there two functions only for more tuned error messages? If yes, then I think it is not worth it. This is already complex part of the code and the error messages aren't much cleaner because of this.

mypy/checker.py Outdated
# (We already have a rudimentary implementation of 'is_partially_overlapping', but it only
# attempts to handle the obvious cases -- see its docstring for more info.)
# if "foo" in signature.name or "bar" in signature.name or "chain_call" in signature.name:
# print("in first")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debugging print?

mypy/checker.py Outdated
The caller can then unify on all type variables whether or not the callable is originally
from a class or not."""
type_list = typ.arg_types + [typ.ret_type]
# old_type_list = list(type_list)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray comment again?

mypy/meet.py Outdated
# Note: doing 'return typ.items()' makes mypy
# infer a too-specific return type of List[CallableType]
out = [] # type: List[Type]
out.extend(typ.items())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use return list(typ.iems()) (i.e. make a copy so that mypy is happy about invariance).

mypy/meet.py Outdated

# We should never encounter these types, but if we do, we handle
# them in the same way we handle 'Any'.
illegal_types = (UnboundType, PartialType, ErasedType, DeletedType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartialType should be an assert False, "Something bad about partial types", for other it is OK to be Any

mypy/meet.py Outdated
ignore_promotions: bool = False) -> bool:
"""Returns true if left and right are overlapping tuples.

Precondition: is_tuple(left) and is_tuple(right) are both true."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already have asserts below, better add an assert message there instead.

if out is None:
pass
return out
[builtins fixtures/isinstance.pyi]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I am not sure what these two tests test. Maybe add a note or somehow modify them so that it is more clear what is a potential failure?

def f(self, x): ...

# TODO: This shouldn't trigger an error message.
# Related to testTypeCheckOverloadImplementationTypeVarDifferingUsage2?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please create a follow-up issue for this kind of problems?

mypy/meet.py Outdated
# We ought to have handled every case by now: we conclude the
# two types are not overlapping, either completely or partially.

return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safer to have different defaults for overloads and isinstance() checks. For the former, it is better to have False to avoid false positives. But for isinstance() it is probably better to have True for the cases where it is not clear (to avoid inferring some branches as unreachable, as this may cause problems for mypyc).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I persuade you to keep this a return False? Basically the way is_overlapping_types works is by steadily working through each pair of types and comparing them against each other.

So, if the left and the right are both TypedDict, we compare those two directly and return either True or False, if the left and the right are both Instance, we do the same...

This final return False case is meant to handle the situation where left and right are genuinely two unrelated types -- returning anything other then False here would lead to incorrect results in those cases.

It's possible that there are some overlap cases we've missed (e.g. can CallableType overlap with Type? Can Type overlap with Instance?), but I'd much rather prefer we handle those missing cases directly and fix is_overlapping_types as they come up instead of silently letting the function continue to be imprecise.

to avoid inferring some branches as unreachable, as this may cause problems for mypyc

I think maybe a better way of solving this is to make mypy report branches that are accidentally unreachable -- it's fairly easy to accidentally add an isinstance check that'll never be reached, I think. (Of course, we wouldn't report branches that are intentionally unreachable due to system platform guards or things like 'if False' -- we also probably shouldn't report branches that just contain an assert/just throw an exception).

I'm hoping that such a check could just be enabled by default -- I can't really imagine people intentionally adding logic to a no-op branch.

(I'd be willing to work on adding this sort of check if you think it's valuable enough).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I persuade you to keep this a return False?

Not just me :-)

Your proposal with the additional error may generate false positives. Consider this example:

class A: ...
class B: ...
if isinstance(x, A):
    ...
    if isinstance(x, B):
        # this is not reachable ...or is it?

I think we already discussed some time ago that there are special cases (like multiple inheritance) where we are deliberately unsafe. All such cases should be treated on per-use-case basis. For example, when checking for overloads, we treat A and B as non-overlapping, while when inferring reachability of a branch we might want to do the opposite.

Even if you will not add the default argument for such corner cases, I am quite sure it will be added later (the above example is unreachable on master, but this may be changed soon).

@msullivan what do you think about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok. In that case then, maybe a better approach might be to add a allow_multiple_inheritance flag to this function? And when it's true, we'd skip all of the logic in the if isinstance(left, Instance) and isinstance(right, Instance) check and just directly return True.

(I haven't had a chance to test this on our internal codebases yet, but it doesn't seem like it'd break too many things, based on what happens when I run mypy's own tests.)

As it stands, just swapping the return False line at the very bottom with return default won't correctly handle the multiple inheritance case -- the "left and right are instances" check sometimes returns False.

More broadly, I'd prefer whitelisting specific cases where we want to relax the overlap checks -- I think the code would be more maintainable that way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a better approach might be to add a allow_multiple_inheritance flag to this function?
[snip]
More broadly, I'd prefer whitelisting specific cases where we want to relax the overlap checks -- I think the code would be more maintainable that way

You think it is more maintainable, I think it looks like "premature maintainability optimization" :-)

Currently all edge cases should be considered non-overlapping for overloads, but overlapping for inferring reachability. For example, empty containers: List[int] and List[Type[int]] are considered non-overlapping for overloads but should be overlapping for reachability.

Anyway, it looks like this is an independent issue, so could you please add a TODO item above the final return False, and add a permalink to it in #3603 ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently all edge cases should be considered non-overlapping for overloads, but overlapping for inferring reachability. For example, empty containers: List[int] and List[Type[int]] are considered non-overlapping for overloads but should be overlapping for reachability.

Well, List[int] and List[Type[int]] are both represented by Instance objects underneath -- so I think special-casing the "left and right are both Instances" tactic would work?

In any case, I'll make a TODO -- we can worry about this later once it comes up again :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think special-casing the "left and right are both Instances" tactic would work?

It may be more subtle because it will recurse to checking arguments that are Instance vs TypeType. Also there are other cases if you want non-instances, e.g. Instance vs CallableType should be always overlapping for the purpose of isinstance() and callable() checks (that @msullivan implemented a while ago), but should not be overlapping for overloads.

In any case, I'll make a TODO -- we can worry about this later once it comes up again :)

OK.

mypy/meet.py Outdated
@@ -34,7 +39,7 @@ def narrow_declared_type(declared: Type, narrowed: Type) -> Type:
if isinstance(declared, UnionType):
return UnionType.make_simplified_union([narrow_declared_type(x, narrowed)
for x in declared.relevant_items()])
elif not is_overlapping_types(declared, narrowed, use_promotions=True):
elif not is_overlapping_types(declared, narrowed):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test for this, or you jut noticed the inconsistency?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, what happened here is that that I changed the behavior of is_overlapping_types so it does use promotions by default. So, removing this param isn't actually causing any difference in behavior.

@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- ok, sorry for the delay! I can confirm that this is now (finally) ready for a second look, and can also confirm that it triggers no new errors in our internal codebases.

Specific changes I made:

  1. Removed debug prints and comments
  2. Consolidated the two "unsafe overlapping overload" checks into just a single one, and removed the extraneous whitespace.
  3. Finally tracked down and fixed a bug related to Nones, isinstance checks, and no-strict-optional. In sort, the bug was that I was inconsistently filtering out NoneType when in non-strict-optional mode.
  4. Created an issue (Moving an overloaded generic function into a class can sometimes introduce "incompatible return type" errors #5510) to track the extra error message that we generate when turning a generic overload function into a method.
  5. Added a comment explaining why we don't care about variance in is_overlapping_types.
  6. Added a comment to the two extra test cases I added to check-isinstance.test explaining what they do.

I've also left a few responses to your comments up above.

@ilevkivskyi
Copy link
Member

Thanks for the updates! I will make another pass of review tomorrow (most likely).

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now modulo the TODO comment I mentioned, feel free to merge after adding it.

Btw, there is a merge conflict, could you please double check this is still OK with internal codebases after fixing the conflict (just in case).

@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- it looks like everything checks out internally.

Before I merge though, I do have one final concern: while I can confirm this diff doesn't introduce any new errors, it's still possible that this diff will start making mypy mark previously reachable branches as being unreachable.

How big of a concern is this/do you think this is something we should try testing for first in some way?

@ilevkivskyi
Copy link
Member

How big of a concern is this/do you think this is something we should try testing for first in some way?

I think this is OK. I am not too concerned about rare false negatives, we are already aware of the problem, and the issues are filed. I think you can go ahead and merge this.

@Michael0x2a Michael0x2a merged commit dd70710 into python:master Aug 27, 2018
@Michael0x2a Michael0x2a deleted the improve-partial-overload-checks branch August 27, 2018 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants