Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UP038 rewrites code to make it slower and more verbose #7871

Open
AlexWaygood opened this issue Oct 9, 2023 · 8 comments
Open

UP038 rewrites code to make it slower and more verbose #7871

AlexWaygood opened this issue Oct 9, 2023 · 8 comments
Labels
needs-decision Awaiting a decision from a maintainer

Comments

@AlexWaygood
Copy link
Member

(Apologies for the slightly negative tone in this issue -- I'm a big fan of ruff, and think it's a great tool!)


If I have a file foo.py, like so:

isinstance('foo', (int, bytes, dict, set, list, memoryview, str))

Running ruff foo.py --select=UP --fix --target-version="py310" on this file will make the following change:

-isinstance('foo', (int, bytes, dict, set, list, memoryview, str))
+isinstance('foo', int | bytes | dict | set | list | memoryview | str)

The new code is more verbose. I wouldn't mind that so much, however... if it wasn't also much slower!

>python -m timeit "isinstance('foo', (int, bytes, list, set, dict, str))"
Running PGUpdate|x64 interpreter...
2000000 loops, best of 5: 160 nsec per loop

>python -m timeit "isinstance('foo', int|bytes|list|set|dict|str)"
Running PGUpdate|x64 interpreter...
500000 loops, best of 5: 500 nsec per loop

(These timings are using a ~reasonably fresh PGO-optimised non-debug build of the CPython main branch on Windows.)

The rule can of course be switched off in a config file for ruff. But I'm not sure what advantages this rewrite actually has. The issue that proposed this rule (#2923) doesn't really give any motivations for wanting these isinstance() checks to be rewritten other than "it would be great". I also feel like the comment in #2923 (comment) is overly dismissive of the performance concerns when the writer says that "isinstance() checks... are very unlikely to dominate any real-world workloads". It can absolutely cause real-world issues if an isinstance() check is unexpectedly slow: see python/cpython#74690 (comment) as an example.

My preferred outcome would honestly be if ruff removed this rule. I feel like it's encouraging an anti-pattern, and it means that I can't have select = ["UP"] in my ruff config without also adding ignore = ["UP038"]. If it has to stay, though, I feel like the performance considerations should be loudly called out in the docs -- there's currently no mention of the potential issues here.

Note that this rule doesn't exist in pyupgrade itself, only in ruff: it was specifically rejected for pyupgrade, in part because of the performance issues: asottile/pyupgrade#736.

@konstin konstin added the needs-decision Awaiting a decision from a maintainer label Oct 9, 2023
@konstin
Copy link
Member

konstin commented Oct 9, 2023

Isn't that more like a cpython bug? The pipe version is more consistent with type annotations users will be already familiar with, e.g. for x in f(x: int | str) holds true that isinstance(x, int | str), but from my perspective there's no reason why a UnionType instance should be slower than a tuple[type]. (Even though we have to design the rules according to what cpython currently does)

@AlexWaygood
Copy link
Member Author

AlexWaygood commented Oct 9, 2023

(Possibly relevant context: I'm a CPython core dev and a typing-module maintainer, though I basically stick to the pure-Python parts of the CPython repo :)

Isn't that more like a cpython bug?

Maybe! But we already had a go at optimising isinstance() checks against types.UnionType instances: python/cpython#91603. If you have any other ideas for how we could speed them up, feel free to file an issue ;)

I also think ruff's doing a disservice to its users if its rules are based on an imagined version of CPython that has no bugs. Whether or not it's a CPython bug is relevant to whether an issue should be filed with CPython to improve the performance here. But the CPython performance improvement, if it's doable, won't arrive until Python 3.13. Should ruff really be recommending the slower idiom to its users in the meantime, on the basis of "Well, this is CPython's problem, this code should really be faster"?

The pipe version is more consistent with type annotations users will be already familiar with, e.g. for x in f(x: int | str) holds true that isinstance(x, int | str)

Absolutely, and I believe this is why PEP 604 specified that isinstance() and issubclass() should accept instances of types.UnionType: so that you could cleanly write code like this, which I have no quarrel with:

CustomTypeAlias = str | int

def foo(x: CustomTypeAlias | bytes) -> None:
    if isinstance(x, CustomTypeAlias):
        print('foo')
    else:
        print('bar')

Just because you can do it to make your code more DRY in cases like this, though, doesn't mean that you should change existing code to make it slower just so that it makes use of an idiom that was added in a more recent Python release.

@zanieb
Copy link
Member

zanieb commented Oct 9, 2023

I think the best path forward here may be to remove this rule from the default set when we re-categorize the rules.

makes use of an idiom that was added in a more recent Python release.

Is kind of the point of the upgrade rules, but I agree it should probably be opt-in since it has a performance cost. A comment about this in the documentation for the rule would be good.

@AlexWaygood
Copy link
Member Author

makes use of an idiom that was added in a more recent Python release.

Is kind of the point of the upgrade rules

Well, the original pyupgrade tool definitely upgrades code to use more modern syntax, and to use features introduced in more recent Python versions. But I feel like that's a means to an end than the actual goal, per se. This is just my subjective impression, but I always considered the goal of pyupgrade to be to transform code to make it cleaner and more elegant -- and the way it does that is by upgrading the code to make use of more modern idioms. In my opinion, I don't think this rule really encourages code that's significantly cleaner or more elegant.

@AlexWaygood
Copy link
Member Author

Anyway, thanks for the responses! I can write up a docs PR :)

zanieb pushed a commit that referenced this issue Oct 9, 2023
See discussion in #7871. I tried to use language similar to the existing
performance warnings in the `flake8-use-pathlib` docs, e.g.
https://docs.astral.sh/ruff/rules/os-path-abspath/#os-path-abspath-pth100
@charliermarsh
Copy link
Member

Super interesting, thanks for sharing @AlexWaygood.

I suspect that if I knew this at the time, I probably wouldn't have added UP038 -- though it's hard to say. While the code change has merit (with respect to improved compatibility with annotation syntax elsewhere), it has to be weighed holistically, and (CPython) performance is part of that equation.

Removing it entirely feels off at this point, but I agree with trying to categorize it out of any default ruleset once we recategorize.

@Avasam
Copy link
Contributor

Avasam commented Mar 14, 2024

In the mean time, can rules with fixes that create a known negative performance impact be marked as unsafe?
I can think of this one and https://docs.astral.sh/ruff/rules/suppressible-exception/

@zanieb
Copy link
Member

zanieb commented Mar 15, 2024

I don't think performance fits into our concept of fix safety. I think not raising the violation makes a lot more sense than gating the fix for the violation.

Note you can manually downgrade fixes to unsafe in your configuration.

DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 22, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 22, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue May 23, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools that referenced this issue Jun 28, 2024
DimitriPapadopoulos added a commit to DimitriPapadopoulos/setuptools_scm that referenced this issue Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer
Projects
None yet
Development

No branches or pull requests

5 participants