-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Initial tests for instance attributes #15474
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
3702bf9
to
462c6ab
Compare
462c6ab
to
bcb7937
Compare
crates/red_knot_python_semantic/resources/mdtest/annotations/unsupported_type_qualifiers.md
Show resolved
Hide resolved
85cc979
to
92e5f00
Compare
92e5f00
to
a7cd7e2
Compare
|
||
#### Variable declared in class body and defined in unrelated method | ||
|
||
We also recognize pure instance variables if they are defined in a method that is not `__init__`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not saying we shouldn't but this will require traversing the entire class body. This could be somewhat expensive (at least it's more expensive than only doing so for __init__
).
- Does that include accesses inside of nested classes, functions or lambdas?
- Does this include accesses where the variable isn't self
class C:
pure_instance_variable: str
def test(self, other: Self, value: str):
other.pure_instance_variable = value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy and pyright both support this, so I think we have to, expensive though it may be. Users will expect their type checker to support this and will complain if it doesn't. It's also unfortunately reasonably common in Python to do things like
class Foo:
def __init__(self):
self.initialize_x_variable()
def initialize_x_variable(self):
# many lines of complicated logic
self.x = 42
Does that include accesses inside of nested classes, functions or lambdas?
Does this include accesses where the variable isn't self
I think we probably don't need to support these, certainly not initially
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also experiment with varying degrees of laziness here. For first-party code that we're actually checking, I think we'll probably need to know all instance attributes of every class. But for third-party/stdlib code that our first-party code is interacting with, it often probably isn't necessary to know what instance attributes the class has and, even if it is, it might not be necessary to materialize the full set of instance attributes the class has. For something like the following, we could plausibly short-circuit if we find the foo
attribute defined in __init__
or __new__
, and not bother analyzing the other methods the SomeClass
defines:
from third_party import SomeClass
x = SomeClass()
print(x.foo + 5) # all we need to know here is whether `SomeClass` has a `foo` attribute
# and what its type is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a full review yet, but gotta go for lunch!
|
||
#### Variable declared in class body and defined in unrelated method | ||
|
||
We also recognize pure instance variables if they are defined in a method that is not `__init__`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy and pyright both support this, so I think we have to, expensive though it may be. Users will expect their type checker to support this and will complain if it doesn't. It's also unfortunately reasonably common in Python to do things like
class Foo:
def __init__(self):
self.initialize_x_variable()
def initialize_x_variable(self):
# many lines of complicated logic
self.x = 42
Does that include accesses inside of nested classes, functions or lambdas?
Does this include accesses where the variable isn't self
I think we probably don't need to support these, certainly not initially
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, this is fantastic! Good call on leaving out Final
for now.
Two notable things that you haven't covered are that we need to infer an instance variable for any declaration in the class body, even if there is no binding for the variable in any method at all. We could have a disabled-by-default rule that warns if you try to access the variable when it's not bound in any method, but pyright's experience with such a rule has been very mixed; it's hard to get it right without lots of false positives:
class Foo:
x: int
and we need to infer an instance variable for a variable that is defined in a non-__init__
variable but not declared in a class body:
class Foo:
def __init__(self):
self.initialize_x()
def initialize_x(self):
self.x = 42 # TODO: should the attribute have `Unknown`, `int`, `Literal[42]` or `Unknown | Literal[42]`
# from the perspective of code from other scopes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great -- though I'd still personally add the cases I mentioned in #15474 (review) prior to landing
Depending on how exhaustive you're looking to be for class Foo:
def __init__(self):
self.x: ClassVar[str] = "x" |
I was planning to, you are too fast Alex 😄. I think those should be covered now. I actually changed a previous test to cover the "not declared in body, only bound in non- |
I'll add it to the list in the PR description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
|
||
c_instance = C(1) | ||
|
||
# TODO: should be `Literal["value set in __init__"]` (or `str` which would probably be more generally useful) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting question. Both pyright and mypy have heuristics to widen literal types when they think it's "probably more useful." This seems a bit unprincipled to me, but I'm open to the possibility that we'll also need to do it. I guess my question is, if the only assignment to self.pure_instance_variable1
that occurs anywhere in the class is self.pure_instance_variable1 = "value set in __init__"
, then why wouldn't you want it typed as Literal["value set in __init__"]
? Just to accommodate external code setting it to some other value? That seems like an edge case, and not too much to ask you to add an explicit : str
if that's what you wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the only assignment to
self.pure_instance_variable1
that occurs anywhere in the class isself.pure_instance_variable1 = "value set in __init__"
, then why wouldn't you want it typed asLiteral["value set in __init__"]
? Just to accommodate external code setting it to some other value?
Subclasses also won't be able to assign a type to the attribute unless the type is assignable to the Literal[...]
type
That seems like an edge case
Hmm, I don't agree. I think it's quite common for code to only assign an instance variable in the __init__
method of the class, but with the intention that it should be possible to reassign the attribute from other scopes.
and not too much to ask you to add an explicit
: str
if that's what you wanted.
Surely that violates the gradual guarantee? And this doesn't seem to tackle at all the problem of typed code interacting with untyped third-party code, which may be unwilling to add type annotations in a timely manner?
I'm leaning towards either Unknown | Literal[...]
or Unknown | str
for cases like these. Haven't made my mind up about which is preferable there (haven't thought too deeply about it yet!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely that violates the gradual guarantee
Yes, but so does widening the literal type to str
. That's what I mean by "unprincipled" -- it doesn't fundamentally change anything, just arbitrarily chooses some wider type because we think it's "probably what you meant."
I like the idea of a union with Unknown
for inferred attribute types! I think that is the principled gradual-guarantee approach here. It will be more forgiving than what people are used to, but I generally like it if we take the gradual guarantee more seriously than existing type checkers: if you want more strictness, annotate.
(I don't see why we would special-case converting it to Unknown | str
instead of Unknown | Literal[...]
if we're unioning with Unknown
-- I think the union already takes care of the problematic aspects of the literal inference, by allowing you to assign anything.)
# mypy and pyright do not show an error here. | ||
reveal_type(c_instance.pure_instance_variable5) # revealed: @Todo(instance attributes) | ||
|
||
c_instance.pure_instance_variable1 = "value set on instance" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on how we answer the above question, arguably this should be an error
# TODO: should ideally be `Literal["value set on instance"]` | ||
# (due to earlier assignment of the attribute from the global scope) | ||
reveal_type(c_instance.pure_instance_variable1) # revealed: @Todo(instance attributes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is some question about the extent to which we'll want to do this narrowing. It's unsound in general (because we can't really say with any certainty what happens to c_instance
between that assignment and this check), and particularly likely to be unsound in global scope. But we'll probably need to do some of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've had this conversation several times now 😆 I still favour the more pragmatic behaviour of mypy and pyright over the stricter behaviour of pyre here. But yeah, we can debate the details at a later stage
|
||
# for a more realistic example, let's actually call the method | ||
c_instance.set_instance_variable() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the even more realistic example would have this method called from __init__
, but if we do that in the test it raises more questions about whether we're trying to detect that call...
## Summary Some clarifications in the instance-attributes tests, mostly regarding type inference behavior following this discussion: #15474 (comment)
Summary
Adds some initial tests for class and instance attributes, mostly to document (and discuss) what we want to support eventually. These tests are not exhaustive yet. The idea is to specify the coarse-grained behavior first.
Things that we'll eventually want to test:
Final
in addition toClassVar
ClassVar
, like making sure that we support things likex: Annotated[ClassVar[int], "metadata"]
__new__
in addition to the tests for__init__
@property
Test Plan
New Markdown tests