[red-knot] Remove `Type::None` #14024

sharkdp · 2024-10-31T20:09:33Z

Summary

Removes Type::None in favor of KnownClass::NoneType.to_instance(…).

closes #13670
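For readers unfamiliar with the change, here is a minimal sketch of the idea in Python rather than Rust. It only illustrates "None is an instance of a known class instead of a dedicated type variant". The names `KnownClass`, `ClassType`, `Instance`, `to_instance` and `is_known` mirror identifiers mentioned in this PR, but the bodies below are assumptions and not red-knot's implementation; `is_none` is a hypothetical helper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class KnownClass(Enum):
    """Hypothetical stand-in for a registry of well-known classes."""

    NONE_TYPE = "NoneType"
    INT = "int"

    def to_instance(self) -> "Instance":
        # Wrap the well-known class as an "instance of that class" type.
        return Instance(ClassType(name=self.value, known=self))


@dataclass(frozen=True)
class ClassType:
    name: str
    known: Optional[KnownClass] = None

    def is_known(self, known: KnownClass) -> bool:
        return self.known is known


@dataclass(frozen=True)
class Instance:
    cls: ClassType


def is_none(ty: Instance) -> bool:
    # No dedicated `None` variant: None is simply "an instance of NoneType".
    return ty.cls.is_known(KnownClass.NONE_TYPE)


none_ty = KnownClass.NONE_TYPE.to_instance()
int_ty = KnownClass.INT.to_instance()
```

With this shape, every rule that already applies to instances (attribute lookup, subtyping, and so on) applies to None without a special case, which is the maintainability argument made later in the thread.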

Test Plan

Existing tests pass.

codspeed-hq · 2024-10-31T20:21:16Z

CodSpeed Performance Report

Merging #14024 will degrade performances by 4.32%

Comparing david/remove-type-none (c7d4bdb) with main (e302c2d)

Summary

❌ 1 (👁 1) regressions
✅ 31 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`david/remove-type-none`	Change
👁	`red_knot_check_file[incremental]`	4.4 ms	4.6 ms	-4.32%
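As a sanity check on the numbers in the table, the change can be recomputed from the rounded timings. This is a sketch that assumes the change is defined as `base / head - 1`; the bot's exact formula was not checked, and `relative_change` is a hypothetical helper.

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Percentage change in speed of head relative to base (negative = slower)."""
    # Assumed definition: how much faster (or slower) head is than base.
    return (base_ms / head_ms - 1.0) * 100.0


# Rounded values from the table above: main = 4.4 ms, branch = 4.6 ms.
change = relative_change(4.4, 4.6)
print(f"{change:.2f}%")
```

The rounded inputs give roughly -4.35%, close to the reported -4.32%. The small gap is plausibly explained by the report using unrounded timings.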

github-actions · 2024-10-31T20:29:40Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

AlexWaygood · 2024-10-31T20:32:43Z

crates/red_knot_python_semantic/src/types.rs

+            (Type::Instance(self_class), Type::Instance(target_class))
+                if self_class.is_known(db, KnownClass::NoneType) =>
+            {
+                target_class.is_known(db, KnownClass::NoneType)
+            }


We should be able to remove this branch too once we understand that NoneType is a subclass of object (my MRO PR is coming soon!!)

Note that this is about NoneType <: NoneType. The X <: object case is handled further above.

I'm actually surprised that NoneType <: NoneType doesn't work without this. We have an other == self check in is_subclass_of on the ClassTypes. Is that not working correctly? Is it actually checking for equal classes, or for some other kind of identity, like just comparing salsa IDs?

my MRO PR is coming soon!!

That will definitely help us remove some cases!

self == other is indeed just checking salsa IDs. But Salsa IDs are assigned by Salsa interning, which interns based on struct equality, so that should be the same thing as an Eq check.

It would be good to understand what's happening here. Let me know if you'd like another pair of eyes on it. Maybe it's related to the fact that there are multiple NoneType in typeshed? (The one in _typeshed and the one in types for more recent Python versions.)

Also, I think this branch should go into is_equivalent_to instead of is_subtype_of.
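The interning argument can be sketched with a toy interner. This is not Salsa's API: `Interner` and `ClassDef` are hypothetical, and the two-definitions scenario is only the hypothesis raised above, not a confirmed diagnosis.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ClassDef:
    """Hypothetical interned struct: a class name plus its defining module."""

    name: str
    module: str


class Interner:
    """Toy interner: structurally equal values always receive the same ID."""

    def __init__(self) -> None:
        self._ids: Dict[ClassDef, int] = {}

    def intern(self, value: ClassDef) -> int:
        # The first time a value is seen it gets the next free ID;
        # later lookups of an equal value return that same ID.
        return self._ids.setdefault(value, len(self._ids))


interner = Interner()
a = interner.intern(ClassDef("NoneType", "types"))
b = interner.intern(ClassDef("NoneType", "types"))
c = interner.intern(ClassDef("NoneType", "_typeshed"))


def is_known_none_type(cls: ClassDef) -> bool:
    # A name-based "known class" check accepts either definition.
    return cls.name == "NoneType"
```

Here `a == b`, so comparing IDs agrees with comparing structs. But `a != c`: if two distinct NoneType definitions are in play, an ID comparison fails while an `is_known`-style check still succeeds for both.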

carljm

Looks great!

I wonder how much of the perf regression is due to the fact that, I think, every None currently ends up being a union of the None defined in _typeshed and the one defined in types. This could mean we spend a lot more time in recursive union handling for things like is_subtype_of. If so, this should be largely mitigated by sys.version_info checking.
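To make the cost concern concrete, here is a toy subtype check with the usual union rules (a union is a subtype if all of its elements are; a type is a subtype of a union if it is a subtype of any element). This is a sketch, not red-knot's is_subtype_of; `Instance`, `UnionType` and the call counter are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Instance:
    cls: str


@dataclass(frozen=True)
class UnionType:
    elements: Tuple[Instance, ...]


Ty = Union[Instance, UnionType]


def is_subtype_of(left: Ty, right: Ty, counter: List[int]) -> bool:
    counter[0] += 1  # count every (recursive) call
    if isinstance(left, UnionType):
        return all(is_subtype_of(e, right, counter) for e in left.elements)
    if isinstance(right, UnionType):
        return any(is_subtype_of(left, e, counter) for e in right.elements)
    return left == right


single = Instance("types.NoneType")
doubled = UnionType((Instance("_typeshed.NoneType"), Instance("types.NoneType")))

single_calls = [0]
single_ok = is_subtype_of(single, single, single_calls)

union_calls = [0]
union_ok = is_subtype_of(doubled, doubled, union_calls)
```

In this toy model, checking a single None against itself takes one call, while checking the two-element union against itself takes six. The real ratio in red-knot may differ, but the direction matches the suspicion above.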

carljm · 2024-10-31T21:06:04Z

crates/red_knot_python_semantic/src/types.rs

+            (Type::Instance(self_class), Type::Instance(target_class))
+                if self_class.is_known(db, KnownClass::NoneType) =>
+            {
+                target_class.is_known(db, KnownClass::NoneType)
+            }


self == other is indeed just checking salsa IDs. But Salsa IDs are assigned by Salsa interning, which interns based on struct equality, so that should be the same thing as an Eq check.

It would be good to understand what's happening here. Let me know if you'd like another pair of eyes on it. Maybe it's related to the fact that there are multiple NoneType in typeshed? (The one in _typeshed and the one in types for more recent Python versions.)

Also, I think this branch should go into is_equivalent_to instead of is_subtype_of.

MichaReiser · 2024-11-01T07:35:14Z

I took a look at the perf regression

What I spotted is that this branch now spends 5.6% of total time inside infer_function_definition_statement where we only spent 1.6% before.

Expanding the tree shows that the difference comes from infer_decorators, which isn't present in the profile for main. I suspect that it comes from this function:

from datetime import timedelta, timezone
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_tz(hour_str: str, minute_str: str, sign_str: str) -> timezone:
    sign = 1 if sign_str == "+" else -1
    return timezone(
        timedelta(
            hours=sign * int(hour_str),
            minutes=sign * int(minute_str),
        )
    )

Inside infer_decorators there's now a call to Type::none, which starts semantic indexing and whatnot of lru_cache.

I tried to debug whether we now infer a more accurate type for lru_cache, but both main and this PR infer Todo for:

from functools import lru_cache
from typing import reveal_type

reveal_type(lru_cache(maxsize=None))

The main thing that's puzzling me: why is the lru_cache type not cached? Shouldn't this bail out early instead of calling into semantic_index?

Ohh... never mind. I looked at the warmup phase instead of the incremental path.

This is for the incremental part

The finding is the same: for some reason, KnownClass::none is re-executing semantic indexing of the functools module, which it should not. Not sure why. We should look into this.

[Profile screenshots: incremental check_types on this PR; incremental check_types on main]

MichaReiser · 2024-11-01T07:44:33Z

Looking at the salsa code, this suggests that we don't reuse the same salsa IDs and, because of that, have cache misses...

MichaReiser · 2024-11-01T08:07:41Z

I'm not sure what's happening and/or if this is a bug in the benchmark. I copied tomllib and manually ran the red knot CLI in watch mode, and the salsa events indicate that it uses the memoized values except for _re.py, which changed.

MichaReiser · 2024-11-01T08:50:45Z

Okay, I might still have looked at the wrong profiles. I now created a PR to use named closures for the incremental and warmup phases to avoid this in the future.

I haven't looked at it too closely because salsa's maybe_changed_after profiles are a bit of a pain, but I strongly suspect that the difference mainly comes from the fact that salsa now has to validate the ingredients for the module defining None. I suspect that addressing #13169 could help here as well.

carljm · 2024-11-01T20:55:52Z

the difference mainly comes from that salsa now has to validate the ingredients for the module defining None

Is this speculation, or is there evidence pointing to it? I couldn't really follow which comments here are based on looking at the wrong part of the profile :)

This could be, though. I was thinking we would already have a dependence on the types module (because builtins imports it), but it's possible tomllib didn't actually resolve the type of any builtin depending on the types module, so our lazy per-definition inference might have meant we previously didn't have to semantic-index the types module.

If this is the cause of the regression, it's mostly just a weakness in our benchmark. Real-world codebases are likely to already rely on the types module in some way; adding a core typechecking dependency on it should not really be considered a real-world regression, if the regression just comes from "now we have to index that module."

I still think it's worth seeing whether Salsa-caching the resolution of the type of None makes a difference here. If the regression is entirely due to having all the new ingredients from the types module, then it won't.
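The "now we have to index that module" effect can be illustrated with a toy memoizing database. This is a deliberately simplified sketch and not Salsa's algorithm: `Db`, `semantic_index`, `run_benchmark` and the module lists are all assumptions chosen to mirror the discussion.

```python
from typing import Dict, List, Tuple


class Db:
    """Toy incremental database: one memoized 'semantic index' per module."""

    def __init__(self, sources: Dict[str, str]) -> None:
        self.sources = dict(sources)
        self.revision = 0
        self.changed_at = {name: 0 for name in sources}
        self.memo: Dict[str, Tuple[int, int]] = {}  # module -> (verified_at, index)
        self.log: List[Tuple[str, str]] = []

    def set_source(self, module: str, text: str) -> None:
        self.revision += 1
        self.sources[module] = text
        self.changed_at[module] = self.revision

    def semantic_index(self, module: str) -> int:
        cached = self.memo.get(module)
        if cached is not None:
            verified_at, index = cached
            # Even a cache hit costs a validation step for this input.
            self.log.append(("validate", module))
            if self.changed_at[module] <= verified_at:
                self.memo[module] = (self.revision, index)
                return index
        self.log.append(("index", module))
        index = len(self.sources[module])  # stand-in for real indexing work
        self.memo[module] = (self.revision, index)
        return index


def run_benchmark(deps: List[str]) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    db = Db({name: "pass" for name in deps})
    for module in deps:  # cold run
        db.semantic_index(module)
    cold, db.log = db.log, []
    db.set_source("tomllib", "pass  # edited")
    for module in deps:  # incremental run
        db.semantic_index(module)
    return cold, db.log


main_cold, main_incr = run_benchmark(["tomllib", "builtins"])
branch_cold, branch_incr = run_benchmark(["tomllib", "builtins", "_typeshed"])
```

In this model the extra dependency adds one "index" event to the cold run and one "validate" event to every incremental run, and memoizing the None lookup itself would not remove either. That is consistent with the reasoning above, though it does not prove this is what happens in the real benchmark.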

sharkdp · 2024-11-04T11:38:53Z

Performance investigation

The original version of this PR had a -4% performance regression on the "incremental" as well as the "cold" benchmark (results). The "cold" benchmark was probably just slightly below the warning/error threshold.
Consequently, a rebase to main (after the merge of "Give non-existent files a durability of at least Medium" #14034 by @MichaReiser) did not help with improving performance. Both versions are still at -4% (results).
I then tried the suggestion by @carljm and added explicit salsa caching for the lookup of NoneType in 503aa46. The CI run now passes, but the performance regression is still there with -3% for both versions (results). I'm not sure whether the resolution of these benchmarks is really sub-1%; if so, this is a small improvement.
I checked which modules were resolved when running red-knot on tomllib: on main, we do not load _typeshed; on this branch, we do (due to the explicit lookup of NoneType in typeshed). This alone could probably explain the increase in runtime.

This also explains why I didn't see a performance regression when comparing this branch with main on a larger codebase (black): when running on black, we load _typeshed even on main (for some reason). Results:

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./red_knot_main --current-directory /home/shark/black`	113.7 ± 3.4	105.5	120.0	1.00 ± 0.05
`./red_knot_remove_none --current-directory /home/shark/black`	113.2 ± 4.3	107.3	121.1	1.00
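The "Relative" column can be reproduced from the means and standard deviations in the table. The sketch below uses standard first-order error propagation for a ratio; it is an assumption that the benchmarking tool computes it the same way, and `relative_with_error` is a hypothetical helper.

```python
import math
from typing import Tuple


def relative_with_error(
    mean: float, std: float, ref_mean: float, ref_std: float
) -> Tuple[float, float]:
    """Ratio mean / ref_mean with a first-order propagated uncertainty."""
    ratio = mean / ref_mean
    # Relative errors of independent measurements add in quadrature.
    error = ratio * math.sqrt((std / mean) ** 2 + (ref_std / ref_mean) ** 2)
    return ratio, error


# main: 113.7 ± 3.4 ms; this branch (the faster run, used as reference): 113.2 ± 4.3 ms
ratio, error = relative_with_error(113.7, 3.4, 113.2, 4.3)
print(f"{ratio:.2f} ± {error:.2f}")
```

The difference between the two means (0.5 ms) is far smaller than either standard deviation, so the two builds are indistinguishable on black in this measurement.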

After talking to @MichaReiser, I removed the explicit caching of the NoneType lookup again.

AlexWaygood

I've only skimmed, but this all LGTM, and I agree that a small perf "regression" here is fine given your analysis (and given the significant improvements to maintainability this gives us)

sharkdp added the red-knot Multi-file analysis & type inference label Oct 31, 2024

sharkdp requested review from carljm, MichaReiser and AlexWaygood as code owners October 31, 2024 20:09

sharkdp force-pushed the david/remove-type-none branch from 917ddd1 to 0534f31 Compare October 31, 2024 20:15

AlexWaygood reviewed Oct 31, 2024

AlexWaygood reviewed Oct 31, 2024

sharkdp force-pushed the david/remove-type-none branch from 0ff025a to 1735975 Compare October 31, 2024 20:58

carljm approved these changes Oct 31, 2024

MichaReiser mentioned this pull request Nov 1, 2024

Give non-existent files a durability of at least Medium #14034

Merged

sharkdp added 4 commits November 4, 2024 12:04

[red-knot] Remove Type::None

9cef3a9

Expand TODO comment

77486d1

Remove two identical branches

710d9d2

Handle NoneType == NoneType in is_equivalent_to

c7d4bdb

sharkdp force-pushed the david/remove-type-none branch from f5f010f to c7d4bdb Compare November 4, 2024 11:04

sharkdp force-pushed the david/remove-type-none branch from 503aa46 to c7d4bdb Compare November 4, 2024 12:02

MichaReiser approved these changes Nov 4, 2024

AlexWaygood approved these changes Nov 4, 2024

sharkdp merged commit 88d9bb1 into main Nov 4, 2024
39 checks passed

sharkdp deleted the david/remove-type-none branch November 4, 2024 13:00
