[red-knot] Switch to a handwritten parser for mdtest error assertions #16422

AlexWaygood · 2025-02-27T20:26:40Z

Summary

I keep on making mistakes when writing mdtests that the framework makes hard to debug. The latest iteration here was when I added a test that looked like this while working on #16321:

```py
class TestIter:
    def __next__(self) -> int:
        return 42

class Test:
    def __iter__(self) -> TestIter | int:
        return TestIter()

# error: [not-iterable] "Object of type `Test` may not be iterable because its `__iter__` method returns an object of type `TestIter | int`, which may not have a `__next__` method
for x in Test():
    reveal_type(x)  # revealed: int
```

Notice the mistake there? It took me an annoying amount of time to figure it out. The error in the test is that the asserted error message is missing its closing " character. But mdtest doesn't tell you that if you apply this diff to main:

diff --git a/crates/red_knot_python_semantic/resources/mdtest/loops/for.md b/crates/red_knot_python_semantic/resources/mdtest/loops/for.md
index 394c104b9..c32cbda7e 100644
--- a/crates/red_knot_python_semantic/resources/mdtest/loops/for.md
+++ b/crates/red_knot_python_semantic/resources/mdtest/loops/for.md
@@ -286,7 +286,7 @@ class Test:
     def __iter__(self) -> TestIter | int:
         return TestIter()
 
-# error: [not-iterable] "Object of type `Test` may not be iterable because its `__iter__` method returns an object of type `TestIter | int`, which may not have a `__next__` method"
+# error: [not-iterable] "Object of type `Test` may not be iterable because its `__iter__` method returns an object of type `TestIter | int`, which may not have a `__next__` method
 for x in Test():
     reveal_type(x)  # revealed: int

Instead it says:

failures:

---- mdtest__loops_for stdout ----

for.md - For loops - Union type as iterator where one union element has no `__next__` method

  crates/red_knot_python_semantic/resources/mdtest/loops/for.md:290 unexpected error: [not-iterable] "Object of type `Test` may not be iterable because its `__iter__` method returns an object of type `TestIter | int`, which may not have a `__next__` method"

To rerun this specific test, set the environment variable: MDTEST_TEST_FILTER="for.md - For loops - Union type as iterator where one union element has no `__next__` method"
MDTEST_TEST_FILTER="for.md - For loops - Union type as iterator where one union element has no `__next__` method" cargo test -p red_knot_python_semantic --test mdtest -- mdtest__loops_for

This PR switches our parsing of # error: assertion comments from regexes to a custom parser. This allows us to explicitly reject malformed assertion messages like this rather than misinterpreting them. Helpful messages are now given to the user in this and many other cases when a malformed assertion message is identified. Assertion messages are also now parsed lazily when matching assertion messages to the diagnostics that are emitted: this allows us to print tailored error messages to the terminal about the malformed assertion messages and recover from them, which would be harder if they were parsed eagerly.

Test Plan

I added many unit tests to the red_knot_test crate. I also manually checked that if you apply the diff given above, mdtest now produces this error message:

failures:

---- mdtest__loops_for stdout ----

for.md - For loops - Union type as iterator where one union element has no `__next__` method

  crates/red_knot_python_semantic/resources/mdtest/loops/for.md:290 invalid assertion: expected '"' to be the final character in an assertion with an error message
  crates/red_knot_python_semantic/resources/mdtest/loops/for.md:290 unexpected error: 10 [not-iterable] "Object of type `Test` may not be iterable because its `__iter__` method returns an object of type `TestIter | int`, which may not have a `__next__` method"

To rerun this specific test, set the environment variable: MDTEST_TEST_FILTER="for.md - For loops - Union type as iterator where one union element has no `__next__` method"
MDTEST_TEST_FILTER="for.md - For loops - Union type as iterator where one union element has no `__next__` method" cargo test -p red_knot_python_semantic --test mdtest -- mdtest__loops_for

github-actions · 2025-02-27T20:41:36Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

carljm

👍

carljm · 2025-02-28T00:47:48Z

crates/red_knot_test/src/matcher.rs

+// This is necessary since we only parse assertions lazily,
+// and sometimes we know before parsing any assertions that an assertion will be unmatched,
+// e.g. if we've exhausted all diagnostics but there are still assertions left.


Seems a little odd that you only get assertion-parsing errors if there is at least one diagnostic that could match the assertion, but I think in practice it's fine?

Would it be hard to force-parse unmatched assertions?

I don't think it fits neatly into the current architecture. I'm going to defer it for now.

I'll add a TODO

dhruvmanila · 2025-02-28T06:28:25Z

crates/red_knot_test/src/assertion.rs

+impl<'a> ErrorAssertion<'a> {
+    fn from_str(source: &'a str) -> Result<Self, ErrorAssertionParseError<'a>> {
+        ErrorAssertionParser::new(source).parse()
+    }
+}


nit: you could implement FromStr instead

This was my first thought too, but unfortunately I don't think it works because of the fact that ErrorAssertion is generic over a lifetime :-( Or at least, I can't seem to make it work

MichaReiser

Nice

MichaReiser · 2025-02-28T08:15:28Z

crates/red_knot_test/src/assertion.rs

+        comment
+            .strip_prefix(TYPE_PREFIX)
+            .map(Self::Revealed)
+            .or_else(|| comment.strip_prefix(ERROR_PREFIX).map(Self::Error))


Not necessarily something for this PR but two common errors that I made in the past:

I wrote error[code]: which is wrong

I misspelled error or revealed

Too much or not enough whitespace between # and the pragma keyword`

The second one is probably easy to detect (error on any comment of the form <pragma>: and allowlist legit pragmas (fmt, type, knot)

The second one is probably easy to detect (error on any comment of the form <pragma>: and allowlist legit pragmas (fmt, type, knot)

Hah, you would think so... I've just spent a bit of time playing around with it, and it seems surprisingly hard to shoehorn into our current architecture :-(

I'm making the whitespace handling more lenientm though; that's easy to fix

crates/red_knot_test/src/assertion.rs

MichaReiser · 2025-02-28T08:27:05Z

crates/red_knot_test/src/matcher.rs

+// This is necessary since we only parse assertions lazily,
+// and sometimes we know before parsing any assertions that an assertion will be unmatched,
+// e.g. if we've exhausted all diagnostics but there are still assertions left.


Would it be hard to force-parse unmatched assertions?

* main: [red-knot] Switch to a handwritten parser for mdtest error assertions (#16422) [red-knot] Disallow more invalid type expressions (#16427) Bump version to Ruff 0.9.9 (#16434) Check `LinterSettings::preview` for version-related syntax errors (#16429) Avoid caching files with unsupported syntax errors (#16425) Prioritize "bug" label for changelog sections (#16433) [`flake8-copyright`] Add links to applicable options (`CPY001`) (#16421) Fix string-length limit in documentation for PYI054 (#16432) Show version-related syntax errors in the playground (#16419) Allow passing `ParseOptions` to inline tests (#16357) Bump version to 0.9.8 (#16414) [red-knot] Ignore surrounding whitespace when looking for `` directives in mdtests (#16380) Notify users for invalid client settings (#16361) Avoid indexing the project if `configurationPreference` is `editorOnly` (#16381)

* dcreager/alist-type: Remove unneeded shear override Update property test CI Move alist into red_knot_python_semantic [red-knot] Switch to a handwritten parser for mdtest error assertions (#16422) [red-knot] Disallow more invalid type expressions (#16427) Bump version to Ruff 0.9.9 (#16434) Check `LinterSettings::preview` for version-related syntax errors (#16429) Avoid caching files with unsupported syntax errors (#16425) Prioritize "bug" label for changelog sections (#16433) [`flake8-copyright`] Add links to applicable options (`CPY001`) (#16421) Fix string-length limit in documentation for PYI054 (#16432) Show version-related syntax errors in the playground (#16419) Allow passing `ParseOptions` to inline tests (#16357) Bump version to 0.9.8 (#16414) [red-knot] Ignore surrounding whitespace when looking for `` directives in mdtests (#16380) Notify users for invalid client settings (#16361) Avoid indexing the project if `configurationPreference` is `editorOnly` (#16381)

AlexWaygood added 5 commits February 27, 2025 18:19

Switch to using a parser for error assertions

160e3fb

Use thiserror

a950cf3

Also parse revealed assertions lazily

f6f5e25

more tests

bb23552

more docs

211fc46

AlexWaygood added testing Related to testing Ruff itself red-knot Multi-file analysis & type inference labels Feb 27, 2025

AlexWaygood requested review from carljm, MichaReiser and sharkdp as code owners February 27, 2025 20:26

update doc-comments

918e2dd

carljm approved these changes Feb 28, 2025

View reviewed changes

dhruvmanila reviewed Feb 28, 2025

View reviewed changes

MichaReiser approved these changes Feb 28, 2025

View reviewed changes

AlexWaygood force-pushed the alex/mdtest-error-msg branch 2 times, most recently from e8bf3ad to 9056629 Compare February 28, 2025 11:22

review comments

7826cfd

AlexWaygood force-pushed the alex/mdtest-error-msg branch from 9056629 to 7826cfd Compare February 28, 2025 11:24

add todo

5203341

AlexWaygood merged commit 0c7c001 into main Feb 28, 2025
21 checks passed

AlexWaygood deleted the alex/mdtest-error-msg branch February 28, 2025 11:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] Switch to a handwritten parser for mdtest error assertions #16422

[red-knot] Switch to a handwritten parser for mdtest error assertions #16422

AlexWaygood commented Feb 27, 2025

github-actions bot commented Feb 27, 2025 •

edited

Loading

carljm left a comment

carljm Feb 28, 2025

MichaReiser Feb 28, 2025

AlexWaygood Feb 28, 2025

AlexWaygood Feb 28, 2025

AlexWaygood Feb 28, 2025

dhruvmanila Feb 28, 2025

AlexWaygood Feb 28, 2025

MichaReiser left a comment

MichaReiser Feb 28, 2025

AlexWaygood Feb 28, 2025

MichaReiser Feb 28, 2025

[red-knot] Switch to a handwritten parser for mdtest error assertions #16422

[red-knot] Switch to a handwritten parser for mdtest error assertions #16422

Conversation

AlexWaygood commented Feb 27, 2025

Summary

Test Plan

github-actions bot commented Feb 27, 2025 • edited Loading

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

carljm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MichaReiser left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Feb 27, 2025 •

edited

Loading

`ruff-ecosystem` results