Write test to make sure imports stay silenced when using silent_imports + incremental. #2036

Open
Michael0x2a opened this issue Aug 18, 2016 · 7 comments


@Michael0x2a
Collaborator

I recently ran into a bug where silenced modules are sometimes un-silenced and re-checked when combining silent_imports with incremental mode and forcing an SCC to be rechecked (using commit fc03d438d61adf95e6a1aed6fb9fbdf986a90602, the latest version on master as of the time of writing).

While I was able to find a fix and will be submitting a pull request shortly, I wasn't able to write a test case. It would be nice to add one, though doing so is challenging for a few reasons. More details:


This bug manifests after you...

  1. Run mypy with silent_imports and incremental mode enabled, starting from a cold cache.
  2. Re-run mypy to establish a baseline for how long a run against a fully warm cache should take. On large codebases, this step should finish substantially faster than step 1.
  3. Make a harmless change to some file that happens to import files not provided on the command line (e.g. files that are not being typechecked). This file may need to be part of some odd import cycle shenanigans.
  4. Re-run mypy.
  5. Re-run mypy again.

The expected behavior is that step 5 takes about as long as step 2, since nothing changed after step 4 finished. The actual behavior is that step 5 takes about as long as step 4. After examining the logs, it seems this is because in step 4, mypy treats the previously silenced imports from the modified file as unsilenced, which causes them to be rechecked.
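
For concreteness, the sequence might look roughly like this on the command line (file names are invented; --silent-imports and --incremental were the flag spellings at the time):

$ mypy --silent-imports --incremental file1.py file2.py   # step 1: cold cache, slow
$ mypy --silent-imports --incremental file1.py file2.py   # step 2: warm cache, fast baseline
$ echo '# harmless change' >> file1.py                    # step 3: file1.py imports unchecked modules
$ mypy --silent-imports --incremental file1.py file2.py   # step 4: rechecks file1.py's SCC
$ mypy --silent-imports --incremental file1.py file2.py   # step 5: should match step 2, but matches step 4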

Unfortunately, I wasn't able to produce a test case: this problem seems to manifest mainly in larger codebases (with import cycles?) that are hard to reduce, and only after running incremental mode at least three times -- mypy's test suite can currently run incremental mode only twice.

While I was able to reliably reproduce the issue against the large codebase and will be submitting a pull request with a fix momentarily, it would be nice to add an actual test case for this bug.

I'll try to work on adding a test case for this (probably after overhauling parts of the test framework), but I wanted to file an issue here in case this drops off my todo list.
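
For reference, here is a rough sketch of what such a test case might look like in mypy's test-data format, assuming the framework were extended to support a third run via numbered file suffixes (the .2 suffix semantics and three-run support below are hypothetical -- that support is exactly the missing piece):

[case testImportsStaySilencedAcrossRuns]
# flags: --silent-imports --incremental
import a
[file a.py]
import b  # b is never passed on the command line, so it is silenced
[file b.py]
x = 1 + ''  # must never be reported while b stays silenced
[file a.py.2]
# harmless change before the second run; b must stay silenced
import b

The assertion would be that the third run reports no errors and reuses the cache for b, which requires running incremental mode three times.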

@Michael0x2a Michael0x2a self-assigned this Aug 18, 2016
@Michael0x2a Michael0x2a changed the title Imports are sometimes un-silenced when using silent_imports + incremental. Write test to make sure imports stay silenced when using silent_imports + incremental. Aug 18, 2016
@Michael0x2a Michael0x2a added this to the Future milestone Aug 18, 2016
@timabbott

I encountered this issue with -i as well; I can upload a copy of a buggy mypy cache dir in Zulip (open source) if it'd be helpful for debugging.

@gvanrossum
Member

If it's repeatable, please do. Also add the exact revs of zulip and mypy you're using. Repeatable cases have often helped me track down pernicious -i bugs.

@timabbott

mypy 0.4.5, zulip.git 4a4664d2687a42e91ccb81c80e264e7b3023397c; I've attached a tarball containing the mypy cache data:
mypy-cache.tar.gz

The way I got here was basically:

  • run ./tools/run-mypy
  • modify any file (I added a space in zerver/lib/actions.py)
  • run ./tools/run-mypy again

and the output is this:

$ ./tools/run-mypy 
Using mypy from /srv/zulip-py3-venv/bin/mypy
zerver/lib/event_queue.py: note: In member "connect_handler" of class "ClientDescriptor":
zerver/lib/event_queue.py:188: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py: note: In member "disconnect_handler" of class "ClientDescriptor":
zerver/lib/event_queue.py:205: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py: note: In function "setup_event_queue":
zerver/lib/event_queue.py:495: error: "module" has no attribute "ioloop"
zerver/lib/event_queue.py:496: error: "module" has no attribute "ioloop"

@gvanrossum
Member

Yay! Got the repro without even needing the tarball. FWIW I'm not sure this is the best issue to report this under, but I'll stick with it since I don't immediately see a better one.

@gvanrossum
Member

@timabbott This is arguably a bug in Zulip. The issue is that event_queue.py has "import tornado" and then uses "tornado.ioloop". But ioloop is a submodule and must explicitly be imported, e.g. by "import tornado.ioloop". (In fact, adding that to event_queue.py makes the problem go away.)
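
Assuming the usage in event_queue.py looks roughly like the following (IOLoop.instance() is just a representative call), the fix is a one-line change to the import:

# Before: only the package is imported; nothing in this file guarantees
# that the ioloop submodule has been loaded.
import tornado
tornado.ioloop.IOLoop.instance()  # error: "module" has no attribute "ioloop"

# After: the submodule is imported explicitly, which also binds the
# "ioloop" attribute on the tornado package.
import tornado.ioloop
tornado.ioloop.IOLoop.instance()  # OK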

The reason you don't get this error when running with a cold cache is that another module contains "import tornado.ioloop", which guarantees that tornado.ioloop exists. But with a warm cache, if none of the modules containing that import are being processed, mypy never sees any module that states "import tornado.ioloop" and hence does not notice its existence.

The same problem exists in Python at runtime -- if I run a small file containing just

import tornado
tornado.ioloop

I get an AttributeError on the second line. But if I first import some other module that contains "import tornado.ioloop", the attribute is present and there is no error.
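
To make that concrete, here is a minimal two-file illustration (helper.py is an invented stand-in for "some other module"; this assumes a tornado version whose package __init__ does not itself import its submodules):

# helper.py
import tornado.ioloop  # side effect: binds the "ioloop" attribute on the tornado package

# main.py
import tornado
# tornado.ioloop       # at this point: AttributeError: module 'tornado' has no attribute 'ioloop'
import helper          # executes "import tornado.ioloop" as a side effect
print(tornado.ioloop)  # now works: <module 'tornado.ioloop' ...>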

Is this explanation and suggested fix reasonable?

@timabbott

Ahh, OK, that makes sense; I've merged that fix in Zulip. I guess that still means there's a bug in mypy, but the bug is arguably that non-incremental mode doesn't complain about this :). Thanks for debugging!

@gvanrossum
Member

OK, filed #2271.

@gvanrossum gvanrossum removed this from the Future milestone Mar 29, 2017
msullivan added a commit that referenced this issue Feb 28, 2018
This has been a bit of an adventure.

Since we are no longer doing a full `load_graph()`, we can't rely on
it fully populating `missing_modules` if it is cleared between
updates. If `missing_modules` isn't fully populated, then semanal will
spuriously reject `from x import y` where `x.y` is a missing module.

We can't just *not* clear `missing_modules`, however, because
`parse_file` uses it to decide that modules are suppressed. But this
is actually wrong! It can lead to an import failure message not being
generated because some other module has already failed to import it
(and perhaps even ignored the error).

`parse_file()` shouldn't actually *need* to compute `suppressed`,
though. If it is called on a file with no cache, `load_graph()` will
handle moving deps to `suppressed`, and if called on a file that has
had cache info loaded, then the dependency information has all been
loaded from the cache.

So we refactor things so that dep information from the cache is used
when available and `parse_file` doesn't need to deal with it.

I strongly suspect that this refactor obviates the need for
`fix_suppressed_dependencies`, but I was not able to quickly produce a
test case from the description in #2036, so I am not comfortable
making that change.
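
The shape of that refactor might be sketched as follows; this is a hypothetical simplification, not mypy's actual code, and the attribute names on state are assumptions:

# Sketch: prefer dependency info loaded from the cache, and only fall back
# to parsing when no cache entry exists, so parse_file() never needs to
# (re)compute `suppressed` itself.
def load_dependencies(state):
    if state.meta is not None:
        # Cache hit: dependency and suppression info comes straight from
        # the cache metadata written by the previous run.
        state.dependencies = list(state.meta.dependencies)
        state.suppressed = list(state.meta.suppressed)
    else:
        # No cache: parse the file; load_graph() is then responsible for
        # moving unresolvable dependencies into `suppressed`.
        state.parse_file()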
msullivan added a commit that referenced this issue Mar 1, 2018
Make fine-grained update use a Graph as the source of truth instead of a SavedCache. We modify load_graph to optionally use an existing graph as a starting point. This allows us to skip the expensive full load_graph and preserve_full_cache operations that essentially only convert between Graph and SavedCache.

This requires a nontrivial change to the loading of dependency information in build.
yedpodtrzitko pushed a commit to kiwicom/mypy that referenced this issue Mar 15, 2018