-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-74690: typing: Call _get_protocol_attrs
and _callable_members_only
at protocol class creation time, not during isinstance()
checks
#103160
Conversation
…callable_member_only`
Removing request for review from @gvanrossum since he's on vacation :) |
Cc. @posita, for interest :) |
Although I agree that runtime patching of a Protocol is pretty unlikely in practice, I still find this change uncomfortable, because in case any runtime patching does occur, the new behavior is somewhat unpredictable and hard to specify accurately. Effectively the state of the Protocol will be "locked in" whenever the first isinstance check against it occurs. So patches that take effect before that will be respected, and after that won't be. I feel like "runtime patching of a Protocol is not ever respected" would be a defensible behavior, but this inconsistent behavior seems like something we shouldn't adopt. The ideal behavior would probably be "runtime patching of a Protocol is not even possible and errors immediately," (and then we could safely make this optimization) but that's quite difficult or impossible to implement. |
There is some similar caching for ABCs. I'm not too familiar with how it works, but it might be worth looking into how that cache deals with monkeypatching. If we ignore monkeypatching there, it might be fine to do the same for Protocols. |
We could just calculate the |
I would, yeah. It seems like a consistent behavior that we could document, that the state of a runtime-checkable Protocol is locked in when it is defined and runtime patching of the Protocol class has no effect. It also seems unlikely that someone would bother to decorate a Protocol as runtime-checkable and then never runtime check anything against it, so I'm not sure how often the laziness would offer a benefit? It's entirely possible I'm being too conservative here, though -- if @JelleZijlstra is right that there is already similar caching in ABCs with similar issues, that would be a strong signal that we don't need to care here. |
Maybe a library that provides a large number of runtime-checkable protocols as utilities? Doing the calculation of these attributes at class-creation time could plausibly slow down the import of that library, and the user of the library might end up only using the classes for type annotations, never doing any But, doesn't really feel like a big deal.
I also have no idea how the ABC caching works, but will experiment and get back to you! |
Ah yes, here's the ABC caching behaviour: >>> import collections.abc
>>> class Foo:
... def __iter__(self):
... while True: yield 42
...
>>> isinstance(Foo(), collections.abc.Iterable)
True
>>> iter(Foo())
<generator object Foo.__iter__ at 0x000001DBE55B7040>
>>> del Foo.__iter__
>>> isinstance(Foo(), collections.abc.Iterable)
True
>>> iter(Foo())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Foo' object is not iterable Given that behaviour, it could well be reasonable to slap a cache on the whole |
I do actually think it will make the code easier to read and understand, apart from anything else, if we do the calculation at class-creation time, rather than lazily as I currently have it in my PR. So I think I will make that change. |
typing._ProtocolMeta.__instancecheck__
: Cache results of _get_protocol_attrs
and _callable_members_only
_get_protocol_attrs
and _callable_members_only
at protocol class creation time, not during __isinstance()
checks
_get_protocol_attrs
and _callable_members_only
at protocol class creation time, not during __isinstance()
checks_get_protocol_attrs
and _callable_members_only
at protocol class creation time, not during isinstance()
checks
Moving the calls to protocol-class-creation time makes the benchmark even faster (I've updated the numbers in my initial post). This comes at the cost of making protocol class creation about 40% slower. I think that's actually a reasonable trade-off. Creating the protocol class only needs to be done once, whereas |
Lib/typing.py
Outdated
def __init__(cls, *args, **kwargs): | ||
cls.__protocol_attrs__ = _get_protocol_attrs(cls) | ||
# PEP 544 prohibits using issubclass() | ||
# with protocols that have non-method members. | ||
cls.__callable_proto_members_only__ = all( | ||
callable(getattr(cls, attr, None)) for attr in cls.__protocol_attrs__ | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I'm adding this method here rather than doing this work in Protocol.__init_subclass__
is that it seems slightly more backwards-compatible. With this PR, _ProtcolMeta.__instancecheck__
assumes that all classes with _ProtocolMeta
as their metaclass will have a __protocol_attrs__
attribute. Since _ProtocolMeta
is an undocumented implementation detail, it should only be Protocol
and Protocol
subclasses using _ProtocolMeta
as their metaclass, and if we could count on that, then it would be safe to do this work in Protocol.__init_subclass__
. But it's possible users might have been reaching into the internals of typing.py
and creating other classes that use _ProtocolMeta
as their metaclass, and I don't want to risk breaking their code unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed that there's at least a few uses of _ProtocolMeta
out in the wild:
This change is clearly newsworthy, but I'm adding "skip news" for now -- I'll add a NEWS entry (and possibly a note in "What's new in 3.12") after all the performance-related PRs have been decided on (and possibly merged). |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, assuming NEWS is coming later.
I confirmed locally that this still provides a big speedup, even after #103160 being merged in |
…bers_only` at protocol class creation time, not during `isinstance()` checks (python#103160)
…bers_only` at protocol class creation time, not during `isinstance()` checks (python#103160)
This PR proposes caching the results of
_get_protocol_attrs()
and_callable_members_only()
, so that they only need to be computed once for each protocol class. This hugely speeds up callingisinstance()
against runtime-checkable protocols on the "second call", for all kinds of subtypes of a runtime-checkable protocol. There is, however, a small behaviour change:It seems pretty unlikely that anybody would be doing that, though (monkey-patching methods or the
__annotations__
dict on a protocol class itself). Do we care about the behaviour change? Is it worth documenting the behaviour change, if we do decide it's okay?Here's benchmark results on my machine for this PR:
And here's the same benchmark on
main
:(The benchmark is pretty skewed towards showing a good result for caching, since it just calls
isinstance()
500,000 times against the same runtime-checkable protocol.)Benchmark script