gh-74690: typing: Call `_get_protocol_attrs` and `_callable_members_only` at protocol class creation time, not during `isinstance()` checks #103160

AlexWaygood · 2023-04-01T00:04:18Z

This PR proposes caching the results of _get_protocol_attrs() and _callable_members_only(), so that they only need to be computed once for each protocol class. This hugely speeds up calling isinstance() against runtime-checkable protocols on the "second call", for all kinds of subtypes of a runtime-checkable protocol. There is, however, a small behaviour change:

>>> from typing import *
>>> @runtime_checkable
... class Bar(Protocol):
...     x: int
...
>>> class Foo:
...     def __init__(self):
...         self.x = 42
...
>>> isinstance(Foo(), Bar)
True
>>> Bar.__annotations__["y"] = int
>>> isinstance(Foo(), Bar)  # Evaluates to `False` on `main`; `True` with this PR

It seems pretty unlikely that anybody would be doing that, though (monkey-patching methods or the __annotations__ dict on a protocol class itself). Do we care about the behaviour change? Is it worth documenting the behaviour change, if we do decide it's okay?

Here are the benchmark results on my machine for this PR:

Time taken for objects with a property: 1.76
Time taken for objects with a classvar: 1.69
Time taken for objects with an instance var: 2.38
Time taken for objects with no var: 7.28
Time taken for nominal subclass instances: 19.92
Time taken for registered subclass instances: 11.60

And here's the same benchmark on main:

Time taken for objects with a property: 3.14
Time taken for objects with a classvar: 3.14
Time taken for objects with an instance var: 11.57
Time taken for objects with no var: 15.26
Time taken for nominal subclass instances: 24.60
Time taken for registered subclass instances: 21.32

(The benchmark is pretty skewed towards showing a good result for caching, since it just calls isinstance() 500,000 times against the same runtime-checkable protocol.)

Benchmark script

import time
from typing import Protocol, runtime_checkable

@runtime_checkable
class HasX(Protocol):
    x: int

class Foo:
    @property
    def x(self) -> int:
        return 42

class Bar:
    x = 42

class Baz:
    def __init__(self):
        self.x = 42

class Egg: ...

class Nominal(HasX):
    def __init__(self):
        self.x = 42

class Registered: ...

HasX.register(Registered)

num_instances = 500_000
foos = [Foo() for _ in range(num_instances)]
bars = [Bar() for _ in range(num_instances)]
bazzes = [Baz() for _ in range(num_instances)]
basket = [Egg() for _ in range(num_instances)]
nominals = [Nominal() for _ in range(num_instances)]
registereds = [Registered() for _ in range(num_instances)]


def bench(objs, title):
    start_time = time.perf_counter()
    for obj in objs:
        isinstance(obj, HasX)
    elapsed = time.perf_counter() - start_time
    print(f"{title}: {elapsed:.2f}")


bench(foos, "Time taken for objects with a property")
bench(bars, "Time taken for objects with a classvar")
bench(bazzes, "Time taken for objects with an instance var")
bench(basket, "Time taken for objects with no var")
bench(nominals, "Time taken for nominal subclass instances")
bench(registereds, "Time taken for registered subclass instances")

Issue: Performance of typing._ProtocolMeta._get_protocol_attrs and isinstance #74690


AlexWaygood · 2023-04-01T00:05:00Z

Removing request for review from @gvanrossum since he's on vacation :)

AlexWaygood · 2023-04-01T00:24:38Z

Cc. @posita, for interest :)

carljm · 2023-04-01T14:50:06Z

Although I agree that runtime patching of a Protocol is pretty unlikely in practice, I still find this change uncomfortable, because in case any runtime patching does occur, the new behavior is somewhat unpredictable and hard to specify accurately. Effectively the state of the Protocol will be "locked in" whenever the first isinstance check against it occurs. So patches that take effect before that will be respected, and after that won't be.

I feel like "runtime patching of a Protocol is not ever respected" would be a defensible behavior, but this inconsistent behavior seems like something we shouldn't adopt.

The ideal behavior would probably be "runtime patching of a Protocol is not even possible and errors immediately," (and then we could safely make this optimization) but that's quite difficult or impossible to implement.
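To make the "locked in at the first isinstance check" concern concrete, here is a minimal toy model of lazy caching. It is a sketch only: `LazyMeta`, `_members` and the two helper functions are hypothetical names invented for illustration, and it does not use or reproduce the real `typing._ProtocolMeta` code. The toy treats every public class attribute as a required member.

```python
class LazyMeta(type):
    """Toy metaclass (hypothetical): caches member names on the first isinstance() call."""

    def __instancecheck__(cls, instance):
        # Look only in the class's own __dict__ so an inherited cache is ignored.
        if "_members" not in vars(cls):
            # Snapshot the public attribute names the first time a check runs.
            cls._members = frozenset(n for n in vars(cls) if not n.startswith("_"))
        return all(hasattr(instance, n) for n in cls._members)


class Point:
    x = 1


def patched_before_first_check():
    class P(metaclass=LazyMeta):
        x = None

    P.y = None  # patch applied before any isinstance() call: it is picked up
    return isinstance(Point(), P)


def patched_after_first_check():
    class P(metaclass=LazyMeta):
        x = None

    first = isinstance(Point(), P)  # this call freezes the member set to {"x"}
    P.y = None  # patch applied after the first check: silently ignored
    return first, isinstance(Point(), P)


print(patched_before_first_check())  # False
print(patched_after_first_check())   # (True, True)
```

The same patch gives different answers depending only on whether a check happened to run first, which is the inconsistency described above.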

JelleZijlstra · 2023-04-01T14:55:12Z

There is some similar caching for ABCs. I'm not too familiar with how it works, but it might be worth looking into how that cache deals with monkeypatching. If we ignore monkeypatching there, it might be fine to do the same for Protocols.

AlexWaygood · 2023-04-01T15:14:15Z

> I feel like "runtime patching of a Protocol is not ever respected" would be a defensible behavior, but this inconsistent behavior seems like something we shouldn't adopt.

We could just calculate the __protocol_attrs__ and __callable_proto_members_only__ at protocol-class creation time (in Protocol.__init_subclass__ or _ProtocolMeta.__new__) rather than at first-isinstance()-call-time. That would slow down protocol class creation a little, but probably not by much, and protocol class creation is only ever done once. Would you like that behaviour more?

carljm · 2023-04-01T15:19:33Z

> We could just calculate the __protocol_attrs__ and __callable_proto_members_only__ at protocol-class creation time (in __init_subclass__) rather than at first-isinstance()-call-time. That would slow down protocol class creation a little, but probably not by much, and protocol class creation is only ever done once. Would you like that behaviour more?

I would, yeah. It seems like a consistent behavior that we could document, that the state of a runtime-checkable Protocol is locked in when it is defined and runtime patching of the Protocol class has no effect.

It also seems unlikely that someone would bother to decorate a Protocol as runtime-checkable and then never runtime check anything against it, so I'm not sure how often the laziness would offer a benefit?

It's entirely possible I'm being too conservative here, though -- if @JelleZijlstra is right that there is already similar caching in ABCs with similar issues, that would be a strong signal that we don't need to care here.

AlexWaygood · 2023-04-01T15:22:33Z

> It also seems unlikely that someone would bother to decorate a Protocol as runtime-checkable and then never runtime check anything against it, so I'm not sure how often the laziness would offer a benefit?

Maybe a library that provides a large number of runtime-checkable protocols as utilities? Doing the calculation of these attributes at class-creation time could plausibly slow down the import of that library, and the user of the library might end up only using the classes for type annotations, never doing any isinstance() checks.

But, doesn't really feel like a big deal.

> It's entirely possible I'm being too conservative here, though -- if @JelleZijlstra is right that there is already similar caching in ABCs with similar issues, that would be a strong signal that we don't need to care here.

I also have no idea how the ABC caching works, but will experiment and get back to you!

AlexWaygood · 2023-04-01T15:47:32Z

Ah yes, here's the ABC caching behaviour:

>>> import collections.abc
>>> class Foo:
...     def __iter__(self):
...         while True: yield 42
...
>>> isinstance(Foo(), collections.abc.Iterable)
True
>>> iter(Foo())
<generator object Foo.__iter__ at 0x000001DBE55B7040>
>>> del Foo.__iter__
>>> isinstance(Foo(), collections.abc.Iterable)
True
>>> iter(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Foo' object is not iterable

Given that behaviour, it could well be reasonable to slap a cache on the whole _ProtocolMeta.__instancecheck__ call, similar to the way beartype does it... though I still don't much like the idea of such a big behaviour change tbh.
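For comparison, caching the whole `__instancecheck__` result per instance type might look roughly like the sketch below. This is a toy under stated assumptions: `CachingMeta`, `_members` and `_verdicts` are hypothetical names, and this is not how `abc`, beartype or `typing` implement their caches. It also shows why the behaviour change would be larger: the verdict goes stale in the same way as the `collections.abc.Iterable` example above, and a per-type cache would be wrong for attributes that are only set on some instances.

```python
class CachingMeta(type):
    """Toy metaclass (hypothetical): caches isinstance() verdicts per instance type."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Member names are computed once, at class creation.
        cls._members = frozenset(n for n in namespace if not n.startswith("_"))
        # Maps type(instance) -> bool; never invalidated in this sketch.
        cls._verdicts = {}

    def __instancecheck__(cls, instance):
        tp = type(instance)
        try:
            return cls._verdicts[tp]
        except KeyError:
            verdict = all(hasattr(instance, n) for n in cls._members)
            cls._verdicts[tp] = verdict
            return verdict


class SupportsClose(metaclass=CachingMeta):
    close = None


class Resource:
    def close(self):
        pass


before = isinstance(Resource(), SupportsClose)
del Resource.close  # Resource no longer satisfies the structural check...
after = isinstance(Resource(), SupportsClose)  # ...but the cached verdict is reused

print(before, after)  # True True
```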

AlexWaygood · 2023-04-01T15:53:44Z

I do actually think it will make the code easier to read and understand, apart from anything else, if we do the calculation at class-creation time, rather than lazily as I currently have it in my PR. So I think I will make that change.

AlexWaygood · 2023-04-01T16:30:22Z

Moving the calls to protocol-class-creation time makes the benchmark even faster (I've updated the numbers in my initial post).

This comes at the cost of making protocol class creation about 40% slower. I think that's actually a reasonable trade-off. Creating the protocol class only needs to be done once, whereas isinstance() checks against the protocol class may need to be done repeatedly. Moreover, a 40% slowdown sounds severe, but I doubt it will actually lead to import times getting slower anywhere. E.g. typing.py defines seven runtime-checkable protocols, but importing typing doesn't seem to get significantly slower (if at all) with this PR.
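A rough way to get a feel for protocol class creation cost on your own interpreter is sketched below. It does not reproduce the 40% figure (that would need timings from both `main` and the PR branch); it only measures how long creating a small runtime-checkable protocol takes compared with an equivalent plain class. The helper names `make_protocol` and `make_plain_class` are made up for this example.

```python
import timeit
from typing import Protocol, runtime_checkable


def make_protocol():
    # Each call executes the class statement again, so a new protocol class is built.
    @runtime_checkable
    class HasX(Protocol):
        x: int

        def method(self) -> None: ...

    return HasX


def make_plain_class():
    # Same body without Protocol, as a baseline for ordinary class creation.
    class HasX:
        x: int

        def method(self) -> None: ...

    return HasX


n = 2_000
protocol_time = timeit.timeit(make_protocol, number=n)
plain_time = timeit.timeit(make_plain_class, number=n)

print(f"protocol class creation: {protocol_time / n * 1e6:.1f} us each")
print(f"plain class creation:    {plain_time / n * 1e6:.1f} us each")
```

Running the same script under two builds gives a per-class cost that can be multiplied by the number of protocols a library defines to estimate the import-time impact.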

AlexWaygood · 2023-04-01T16:53:23Z

Lib/typing.py

+    def __init__(cls, *args, **kwargs):
+        cls.__protocol_attrs__ = _get_protocol_attrs(cls)
+        # PEP 544 prohibits using issubclass()
+        # with protocols that have non-method members.
+        cls.__callable_proto_members_only__ = all(
+            callable(getattr(cls, attr, None)) for attr in cls.__protocol_attrs__
+        )


The only reason I'm adding this method here rather than doing this work in Protocol.__init_subclass__ is that it seems slightly more backwards-compatible. With this PR, _ProtocolMeta.__instancecheck__ assumes that all classes with _ProtocolMeta as their metaclass will have a __protocol_attrs__ attribute. Since _ProtocolMeta is an undocumented implementation detail, it should only be Protocol and Protocol subclasses using _ProtocolMeta as their metaclass, and if we could count on that, then it would be safe to do this work in Protocol.__init_subclass__. But it's possible users might have been reaching into the internals of typing.py and creating other classes that use _ProtocolMeta as their metaclass, and I don't want to risk breaking their code unnecessarily.

Confirmed that there are at least a few uses of _ProtocolMeta out in the wild:

https://github.com/antonagestam/phantom-types/blob/68ee857af4452a90a48669a8dbc7d1d7110b31d0/src/phantom/sized.py#L71-L82

https://github.com/protolambda/remerkleable/blob/91ed092d08ef0ba5ab076f0a34b0b371623db728/remerkleable/core.py#L18-L22

https://github.com/InterStella0/stella_bot/blob/bf5f5632bcd88670df90be67b888c282c6e83d99/utils/useful.py#L337-L343
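The difference between the two hooks can be seen in a small standalone sketch. The names here (`Meta`, `Base`, `Child`, `Standalone` and the two marker attributes) are hypothetical and stand in for `_ProtocolMeta`, `Protocol`, a normal protocol subclass, and a third-party class that uses the metaclass directly without inheriting from `Protocol`.

```python
class Meta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Runs for every class created with this metaclass.
        cls.set_by_metaclass = True


class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs only for classes that inherit from Base.
        cls.set_by_init_subclass = True


class Child(Base):
    pass


class Standalone(metaclass=Meta):
    # Uses the metaclass directly, without inheriting from Base.
    pass


print(Child.set_by_metaclass, Child.set_by_init_subclass)  # True True
print(Standalone.set_by_metaclass)                         # True
print(hasattr(Standalone, "set_by_init_subclass"))         # False
```

A class like `Standalone` never triggers `Base.__init_subclass__`, so an attribute set only there would be missing when the metaclass's `__instancecheck__` later looked for it.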

AlexWaygood · 2023-04-02T13:30:06Z

This change is clearly newsworthy, but I'm adding "skip news" for now -- I'll add a NEWS entry (and possibly a note in "What's new in 3.12") after all the performance-related PRs have been decided on (and possibly merged).

AlexWaygood · 2023-04-02T13:43:51Z

main and this PR branch are both slower after 6d59c9e, so I've updated the benchmark results in the PR description.

carljm

Looks good to me, assuming NEWS is coming later.

Lib/typing.py

AlexWaygood · 2023-04-05T11:18:17Z

I confirmed locally that this still provides a big speedup, even after #103160 being merged in

pythongh-74690: typing: Cache results of _get_protocol_attrs and `_callable_member_only`

f15c64a

AlexWaygood added the type-feature, performance, DO-NOT-MERGE, stdlib, topic-typing and 3.12 labels Apr 1, 2023

AlexWaygood requested review from carljm and hauntsaninja April 1, 2023 00:04

AlexWaygood requested review from gvanrossum, Fidget-Spinner and JelleZijlstra as code owners April 1, 2023 00:04

bedevere-bot mentioned this pull request Apr 1, 2023

Performance of typing._ProtocolMeta._get_protocol_attrs and isinstance #74690

Closed

bedevere-bot added the awaiting core review label Apr 1, 2023

AlexWaygood removed the request for review from gvanrossum April 1, 2023 00:04

AlexWaygood added 2 commits April 1, 2023 11:17

Simplify and inline

f3ec3db

Use a metaclass instance method, not a classmethod on the class

8e3c6f7

Move to class-creation time

a34f700

Use __init__, not __new__

4d05074

Add missing super().__init__() call, to be safe

d3f9ebe

AlexWaygood mentioned this pull request Apr 2, 2023

gh-102433: Use inspect.getattr_static in typing._ProtocolMeta.__instancecheck__ #103034

Merged

Merge branch 'main' into protocol-attrs-cache-1

e8a6304

AlexWaygood added the skip news label Apr 2, 2023

JelleZijlstra mentioned this pull request Apr 2, 2023

Typing: undocumented behaviour change for protocols decorated with @final and @runtime_checkable in 3.11 #103171

Closed

carljm approved these changes Apr 4, 2023

bedevere-bot added awaiting merge and removed awaiting core review labels Apr 4, 2023

Merge branch 'main' into protocol-attrs-cache-1

4816afa

Update Lib/typing.py

08af615

AlexWaygood removed the DO-NOT-MERGE label Apr 5, 2023

AlexWaygood merged commit 3246688 into python:main Apr 5, 2023

bedevere-bot removed the awaiting merge label Apr 5, 2023

AlexWaygood deleted the protocol-attrs-cache-1 branch April 5, 2023 11:19

AlexWaygood mentioned this pull request Apr 7, 2023

gh-74690: Document changes made to runtime-checkable protocols in 3.12 #103348

Merged

gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request Apr 8, 2023

pythongh-74690: typing: Call _get_protocol_attrs and `_callable_members_only` at protocol class creation time, not during `isinstance()` checks (python#103160)

9b978ee

warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023

pythongh-74690: typing: Call _get_protocol_attrs and `_callable_members_only` at protocol class creation time, not during `isinstance()` checks (python#103160)

f2f5e5d

AlexWaygood mentioned this pull request Apr 12, 2023

Backport performance improvements to runtime-checkable protocols python/typing_extensions#137

Merged

AlexWaygood mentioned this pull request May 18, 2023

gh-74690: Don't set special protocol attributes on non-protocol subclasses of protocols #104622

Merged

JelleZijlstra mentioned this pull request May 24, 2023

Add typing.get_protocol_members and typing.is_protocol #104873

Closed

AlexWaygood mentioned this pull request May 24, 2023

[Bug] typing_extensions.Protocol unsupported under typing_extensions ≥ 4.6.0 beartype/beartype#241

Closed

AlexWaygood mentioned this pull request Nov 16, 2023

datetime.__sub__ overload order python/typeshed#10924

Closed

posita mentioned this pull request Aug 9, 2024

Is the protocol caching still needed? beartype/numerary#20

Open
