Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for "mypy: ignore" comments #17875

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

wyattscarpenter
Copy link
Contributor

@wyattscarpenter wyattscarpenter commented Oct 4, 2024

mypy: ignore comments are now honored by mypy exactly as though they said type: ignore, as suggested in #12358.
This allows one to suppress type errors from mypy in particular, without suppressing other type checkers,
which can be desirable if mypy has a bug and other type checkers one runs on the same code do not.

The best way to do this would be to simply inspect the comments parsed in the ast. However, it turns out that type: ignore comments are parsed by the parser as a special case, and we don't have a general ability to look at comments for directives. This could be fixed upstream, and probably wouldn't prove too difficult, but we would still have to wait until end-of-life for current pythons before we rely on that feature. Therefore, I did things the second-best way: simply find-and-replace mypy: ignore comments into type: ignore comments before the rest of the file processing happens.

This strategy worked remarkably well. It has no noticeable impact on performance; and, due to the grammar of python, the replacement is not actually ambiguous. The feature works perfectly on current and future python versions, and it's almost like we got it for free.

Support for pythons 3.12.3 through 3.12.7 necessitated reverse-engineering a bug in tokenize; code to reverse this takes up about half of the lines of this small feature. I am grateful to mypy_primer and the fine folks over at cpython for making this discovery and repair process fabulously easy.

Pythons 3.6 through 3.8 also have a bug that would have prevented this feature from working. That one may have been more difficult to reverse the effects of (possibly impossible). My plan for supporting 3.8 was just to copy the tokenize.py file from contemporary cpython (it's a standalone utility) and import it conditionally in this project. However, before I got around to doing that, we dropped support for 3.8, saving me the trouble.

I have added one test (duplicating all the type: ignore tests seemed like overkill), and documented the new feature.

Appendix A: the original text of this pr, before I significantly improved it with the current technique. This change implements, #12358, `# mypy: ignore` comments that are analogous to `# type: ignore` comments, but specific only to mypy (or, one supposes, any other program that decides to honor them. While I was investigating this issue, it occurred to me that the only thing we need to do is make `# mypy: ignore` comments _appear_ to mypy as `# type: ignore` comments. And so, this PR implements that in what I think everyone will agree is "the wrong way", by **simply doing a re.sub on source code** before it is passed to the ast-parser. Because mypy is a typechecker, and therefore operates on the types of python programs instead of their values, it should almost never make a difference that we are making this change in the code in a textual way.

Let us briefly review why this might be a good idea:

  • Currently, mypy uses ast.parse to get the syntax tree of the python program it is analyzing. ast.parse is just a helper function wrapper over the python built-in compile. This means we unfortunately cannot(?) monkey-patch the desired behavior into our parser, which was my first instinct. It also means we can't just copy and modify the parser to suit our needs (or, we would have to use C FFI, or something, to do this, which would be inconvenient), which was my second instinct. So, if mypy wants to continue using ast.parse as its ast-parser, the "right way" to do this is for ast.parse to also tell us about other comments or directives, such as proposed in Add comments parsing flag to ast.parse cpython#101494. I think this is a great idea and should be done. However, even it if was done immediately, it would only get into CPython for 3.13, which I think means we would need to wait until 3.13 was the minimum supported version of mypy for us to use this strategy. Which might be in the year 2028? There must be a better way!
  • Alternatively, we could:
    • Just parse the ast using a different parser, one which also gives us comments or the right comments. This is conceptually simple, and probably even doable, but is probably a lot of work.
    • Parse and apply the # mypy: ignore comments ourselves, in another pass orthogonal to the ast-parsing. For instance, util.get_mypy_comments does something like this (although that problem is easier because you can count on those comments being at the start of the line). Possibly using Python's tokenize. This is a more-realistic "right way" option.
      • You could also do this parsing just to make sure that the find-and-replace on # mypy: ignore from our original strategy is really in a comment, and then enact a compromise of the two strategies.

Some of those alternatives are not ridiculous. But I already have this feature done this way. Actually now that I'm typing this I'm thinking, you know, I should just also do the tokenize thing. Oh, wait, ok, that's actually hard again because the only thing separating a #mypy: ignore from #mypy: ignore_errors is its position at the start of the line. Oh, wait, actually, you can use a (?![-_]), good.

As it stands, this pr

Breaks:
Literals of strings that would be matched by the regex .*#\s*mypy:\s*ignore.*

For instance, now this does not raise a type error, as it should:

from typing import Literal
literal_boy: Literal["# mypy: ignore"] = "# type: ignore" # My one weakness...

it's quite possible none of these have ever been used in the history of python.

(Incidentally, about that regex: I have discovered that there is no arbitrary whitespace allowed between the type and the : in type-ignore comments. So, neither is there in mypy-ignore comments.)

Does not break:
--pretty output. That's handled by another system, so the source code is not changed in that output. Which will make the problem with Literals more of a problem if it ever comes up, come to think of it.

Other downside: presumably a slight penalty hit, since we're doing another scan over each file. This is not a huge concern, I think, given that, eg, I discovered that util.get_mypy_comments does multiple passes even though it isn't strictly necessary.

@wyattscarpenter wyattscarpenter changed the title Implement support for "mypy: ignore" comments 'the wrong way', using one line of code Implement support for "mypy: ignore" comments 'the wrong way' Oct 4, 2024

This comment has been minimized.

@wyattscarpenter
Copy link
Contributor Author

wyattscarpenter commented Oct 4, 2024

@github-actions

https://github.com/awslabs/sockeye/blob/e42fbb30be9bca1f5073f092b687966636370092/sockeye/layers.py#L642

                projected_memory_kv: Optional[pt.Tensor] = None) -> pt.Tensor:  # mypy: ignore

Incredible... looks like this PR already has a consumer :)

The demand of the people for this feature is overwhelming.

OK... more like "1" than "overwhelming"... but anyway....

Note to self: go PR to remove that line over there after this PR, if they don't notice it themselves...

This comment has been minimized.

wyattscarpenter and others added 4 commits October 5, 2024 05:21
…ents

note also that the 'not start of line' constraint had to be removed from the regex, because now each comment is encountered individually and thus they are at the start
@wyattscarpenter wyattscarpenter changed the title Implement support for "mypy: ignore" comments 'the wrong way' Implement support for "mypy: ignore" comments Oct 5, 2024

This comment has been minimized.

@wyattscarpenter
Copy link
Contributor Author

Well... that's not good.

This comment has been minimized.

@wyattscarpenter
Copy link
Contributor Author

We do a lil "finding out there are bugs in tokenize". python/cpython#125008

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@wyattscarpenter
Copy link
Contributor Author

We do a lil "finding out there are bugs in 3.8 tokenize" https://bugs.python.org/issue35107

@wyattscarpenter
Copy link
Contributor Author

I plan to just include a more recent copy of tokenize here; note to self that I will have to mention this in the copyright file (along with the several other files that have special copyright status)

@wyattscarpenter
Copy link
Contributor Author

Looks like 3 weeks ago they dropped support for 3.8, as was under discussion last time I was thinking about this project. I've successfully procrastinated my way out of another constraint!

This comment has been minimized.

@wyattscarpenter
Copy link
Contributor Author

wyattscarpenter commented Jan 13, 2025

Well, that mypy_primer diff seems, I dunno, fine. Time to finalize this adventure.

This comment has been minimized.

This comment has been minimized.

@wyattscarpenter wyattscarpenter marked this pull request as ready for review January 13, 2025 13:57
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/pandas/supported_api_gen.py:398: SyntaxWarning: invalid escape sequence '\_'
+   return func_str[:-1] + "\_"  # noqa: W605

sockeye (https://github.com/awslabs/sockeye)
+ sockeye/layers.py:642: error: Unused "type: ignore" comment  [unused-ignore]

schemathesis (https://github.com/schemathesis/schemathesis)
-   File "/tmp/mypy_primer/old_mypy/venv/bin/mypy", line 8, in <module>
+   File "/tmp/mypy_primer/new_mypy/venv/bin/mypy", line 8, in <module>
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache

@JukkaL
Copy link
Collaborator

JukkaL commented Jan 13, 2025

I'm worried that this approach will have a significant performance impact, especially when using a compiled mypy. It would be nice to see some performance measurements, especially when importing a large 3rd party package such as torch where parsing is a bottleneck.

I have a medium/long term plan to move away from using the ast module for parsing, and this would make it possible to support mypy: ignore without any performance impact.

@wyattscarpenter
Copy link
Contributor Author

wyattscarpenter commented Jan 14, 2025

@JukkaL That's a very reasonable concern. I myself anticipated there might be some performance hit, and was surprised when mypy primer (and also my first-hand experience) noticed no performance degradation. I will try running some benchmarks, especially on torch, and report back my results.

Edit(s), 2025-01-14:

Many confusing benchmark results. running `python misc/perf_compare.py master mypy-ignore` in WSL1 gets me ``` === Results ===

master 15.311s (0.0%)
mypy-ignore 19.242s (+25.7%)

Running it again on regular Windows 10, I got

master 21.164s (0.0%)
mypy-ignore 25.565s (+20.8%)


Running it on Windows 10 on torch after I modified the benchmark script to be able to run on other repos (and print SDs) got me

master 114.325s (0.0%) | stdev 0.979s
mypy-ignore 114.263s (-0.1%) | stdev 0.651s

which is surprising! Worrying I might have messed up the benchmark, I ran it again, and got

master 128.554s (0.0%) | stdev 7.510s
mypy-ignore 146.215s (+13.7%) | stdev 9.628s

which was more expected. However, at this point I began to suspect that maybe my computer was just shifting into low-power mode after a while. I ran it on torch again with --num-runs=1, and got

master 130.198s (0.0%) | stdev 0.000s
mypy-ignore 129.884s (-0.2%) | stdev 0.000s

Doing the mypy self-check with n=2, I again get
```master                    21.839s (0.0%) | stdev 0.178s 
mypy-ignore               26.856s (+23.0%) | stdev 0.206s

Doing the torch check with n=2, I again get

master                    128.539s (0.0%) | stdev 1.846s 
mypy-ignore               137.779s (+7.2%) | stdev 9.269s 

The slowdown is significant. A multiple seconds. The proportion of time it takes seems to be less in torch than in mypy, however. Even though the absolute amount of time it takes increases.

@hauntsaninja
Copy link
Collaborator

There may be a solution that involves mypy.build.State.parse_inline_configuration + modifying Errors.ignored_lines that performs better. It would have slightly different semantics to type ignore since it wouldn't be associated with AST nodes, but maybe that's fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants