Write test to make sure imports stay silenced when using silent_imports + incremental. #2036

Open
Michael0x2a opened this issue Aug 18, 2016 · 7 comments


@Michael0x2a
Collaborator

I recently ran into a bug where silenced modules are sometimes un-silenced and re-checked when combining silent_imports with incremental mode and forcing an SCC to be rechecked (using commit fc03d438d61adf95e6a1aed6fb9fbdf986a90602, the latest version on master as of the time of writing).

While I was able to find a fix and will be submitting a pull request shortly, I wasn't able to write a test case. It would be nice to add one, though doing so is challenging for a few reasons. More details:


This bug manifests after you...

  1. Run mypy with silent_imports and incremental mode enabled, starting from a cold cache.
  2. Re-run mypy to establish a baseline for how long a run against a fully warm cache should take. On large codebases, this step should finish substantially faster than step 1.
  3. Make a harmless change to some file that happens to import files not provided on the command line (e.g. files that are not being typechecked). This file may need to be part of some odd import cycle shenanigans.
  4. Re-run mypy.
  5. Re-run mypy again.

The expected behavior is that step 5 takes about as long as step 2, since nothing changed after step 4 finished. The actual behavior is that step 5 takes about as long as step 4. After examining the logs, it seems this is because in step 4, mypy treats the previously silenced imports from the modified file as unsilenced, which causes them to be rechecked.
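
For concreteness, the sequence might look roughly like this on the command line (file names are invented; --silent-imports and --incremental were the flag spellings at the time):

$ mypy --silent-imports --incremental file1.py file2.py   # step 1: cold cache, slow
$ mypy --silent-imports --incremental file1.py file2.py   # step 2: warm cache, fast baseline
$ echo '# harmless change' >> file1.py                    # step 3: file1.py imports unchecked modules
$ mypy --silent-imports --incremental file1.py file2.py   # step 4: rechecks file1.py's SCC
$ mypy --silent-imports --incremental file1.py file2.py   # step 5: should match step 2, but matches step 4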

Unfortunately, I wasn't able to produce a test case: this problem seems to manifest mainly in larger codebases (with import cycles?) that are hard to reduce, and only after running incremental mode at least three times -- mypy's test suite can currently run incremental mode only twice.

While I was able to reliably reproduce the issue against the large codebase and will be submitting a pull request with a fix momentarily, it would be nice to add an actual test case for this bug.

I'll try to work on adding a test case for this (probably after overhauling parts of the test framework), but I wanted to file an issue here in case this drops off my todo list.
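
For reference, here is a rough sketch of what such a test case might look like in mypy's test-data format, assuming the framework were extended to support a third run via numbered file suffixes (the .2 suffix semantics and three-run support below are hypothetical -- that support is exactly the missing piece):

[case testImportsStaySilencedAcrossRuns]
# flags: --silent-imports --incremental
import a
[file a.py]
import b  # b is never passed on the command line, so it is silenced
[file b.py]
x = 1 + ''  # must never be reported while b stays silenced
[file a.py.2]
# harmless change before the second run; b must stay silenced
import b

The assertion would be that the third run reports no errors and reuses the cache for b, which requires running incremental mode three times.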

@Michael0x2a Michael0x2a self-assigned this Aug 18, 2016
@Michael0x2a Michael0x2a changed the title Imports are sometimes un-silenced when using silent_imports + incremental. Write test to make sure imports stay silenced when using silent_imports + incremental. Aug 18, 2016
@Michael0x2a Michael0x2a added this to the Future milestone Aug 18, 2016
@timabbott

I encountered this issue with -i as well; I can upload a copy of a buggy mypy cache dir in Zulip (open source) if it'd be helpful for debugging.

@gvanrossum
Member

If it's repeatable, please do. Also add the exact revs of zulip and mypy you're using. Repeatable cases have often helped me track down pernicious -i bugs.

@timabbott

mypy 0.4.5, zulip.git 4a4664d2687a42e91ccb81c80e264e7b3023397c; I've attached a tarball containing the mypy cache data:
mypy-cache.tar.gz

The way I got here was basically:

  • run ./tools/run-mypy
  • modify any file (I added a space in zerver/lib/actions.py)
  • run ./tools/run-mypy again

and the output is this:

$ ./tools/run-mypy 
Using mypy from /srv/zulip-py3-venv/bin/mypy
zerver/lib/event_queue.py: note: In member "connect_handler" of class "ClientDescriptor":
zerver/lib/event_queue.py:188: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py: note: In member "disconnect_handler" of class "ClientDescriptor":
zerver/lib/event_queue.py:205: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py: note: In function "setup_event_queue":
zerver/lib/event_queue.py:495: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py:496: error: "module" has no attribute "ioloop"

@gvanrossum
Member

Yay! Got the repro without even needing the tarball. FWIW I'm not sure this is the best issue to report this under, but I'll stick with it since I don't immediately see a better one.

@gvanrossum
Member

@timabbott This is arguably a bug in Zulip. The issue is that event_queue.py has "import tornado" and then uses "tornado.ioloop". But ioloop is a submodule and must explicitly be imported, e.g. by "import tornado.ioloop". (In fact, adding that to event_queue.py makes the problem go away.)
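
Assuming the usage in event_queue.py looks roughly like the following (IOLoop.instance() is just a representative call), the fix is a one-line change to the import:

# Before: only the package is imported; nothing in this file guarantees
# that the ioloop submodule has been loaded.
import tornado
tornado.ioloop.IOLoop.instance()  # error: "module" has no attribute "ioloop"

# After: the submodule is imported explicitly, which also binds the
# "ioloop" attribute on the tornado package.
import tornado.ioloop
tornado.ioloop.IOLoop.instance()  # OK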

The reason you don't get this error when running with a cold cache is that another module contains "import tornado.ioloop", which guarantees that tornado.ioloop exists. But with a warm cache, if none of the modules containing that import are being processed, mypy never sees any module that states "import tornado.ioloop" and hence does not notice its existence.

The same problem exists in Python at runtime -- if I run a small file containing just

import tornado
tornado.ioloop

I get an AttributeError on the second line. But if I first import some other module that contains "import tornado.ioloop", the attribute is present and there is no error.
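
To make that concrete, here is a minimal two-file illustration (helper.py is an invented stand-in for "some other module"; this assumes a tornado version whose package __init__ does not itself import its submodules):

# helper.py
import tornado.ioloop  # side effect: binds the "ioloop" attribute on the tornado package

# main.py
import tornado
# tornado.ioloop       # at this point: AttributeError: module 'tornado' has no attribute 'ioloop'
import helper          # executes "import tornado.ioloop" as a side effect
print(tornado.ioloop)  # now works: <module 'tornado.ioloop' ...>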

Is this explanation and suggested fix reasonable?

@timabbott

Ahh, OK, that makes sense; I've merged that fix in Zulip. I guess that still means there's a bug in mypy, but the bug is arguably that non-incremental mode doesn't complain about this :). Thanks for debugging!

@gvanrossum
Member

OK, filed #2271.

@gvanrossum gvanrossum removed this from the Future milestone Mar 29, 2017
msullivan added a commit that referenced this issue Feb 28, 2018
This has been a bit of an adventure.

Since we are no longer doing a full `load_graph()`, we can't rely on
it fully populating `missing_modules` if it is cleared between
updates. If `missing_modules` isn't fully populated, then semanal will
spuriously reject `from x import y` where `x.y` is a missing module.

We can't just *not* clear `missing_modules`, however, because
`parse_file` uses it to decide that modules are suppressed. But this
is actually wrong! It can lead to an import failure message not being
generated because some other module has already failed to import it
(and perhaps even ignored the error).

`parse_file()` shouldn't actually *need* to compute `suppressed`,
though. If it is called on a file with no cache, `load_graph()` will
handle moving deps to `suppressed`, and if called on a file that has
had cache info loaded, then the dependency information has all been
loaded from the cache.

So we refactor things so that dep information from the cache is used
when available and `parse_file` doesn't need to deal with it.

I strongly suspect that this refactor obviates the need for
`fix_suppressed_dependencies`, but I was not able to quickly produce a
test case from the description in #2036, so I am not comfortable
making that change.
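
The shape of that refactor might be sketched as follows; this is a hypothetical simplification, not mypy's actual code, and the attribute names on state are assumptions:

# Sketch: prefer dependency info loaded from the cache, and only fall back
# to parsing when no cache entry exists, so parse_file() never needs to
# (re)compute `suppressed` itself.
def load_dependencies(state):
    if state.meta is not None:
        # Cache hit: dependency and suppression info comes straight from
        # the cache metadata written by the previous run.
        state.dependencies = list(state.meta.dependencies)
        state.suppressed = list(state.meta.suppressed)
    else:
        # No cache: parse the file; load_graph() is then responsible for
        # moving unresolvable dependencies into `suppressed`.
        state.parse_file()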
msullivan added a commit that referenced this issue Mar 1, 2018
Make fine-grained update use a Graph as the source of truth instead of a SavedCache. We modify load_graph to optionally use an existing graph as a starting point. This allows us to skip the expensive full load_graph and preserve_full_cache operations that essentially only convert between Graph and SavedCache.

This requires a nontrivial change to the loading of dependency information in build.
yedpodtrzitko pushed a commit to kiwicom/mypy that referenced this issue Mar 15, 2018