Run majority of unit tests in CI with V2 Pytest runner #7724
Conversation
Force-pushed from 3e72aab to 359cc48.
Force-pushed from 359cc48 to 512833a.
It turns out that `travis_wait` does not actually watch the command's output to check whether the process is still running. Instead, it means "you have up to N minutes for this process to complete, otherwise it will time out." The Py2 unit tests took 41 minutes to pass last time, so we set it to 50 minutes to avoid flakiness.
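For reference, a minimal sketch of the resulting `.travis.yml` entry (mirroring the diff further down; `travis_wait`'s first argument is the time budget in minutes):

```yaml
script:
  # travis_wait <minutes> <command>: Travis stops enforcing its no-output
  # timeout and instead kills the command only if it exceeds the budget.
  - travis_wait 50 ./build-support/bin/ci.sh -2lp
```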
Let's do it.
Do you know the cause of the 250% perf regression? How much do we think is down to each of:
I haven't quantitatively confirmed anything, but my intuition from working with the V2 test runner over the past few weeks is that it's the result of resolving requirements for each individual test, as opposed to the V1 runner resolving nothing because it just uses what's already in the venv. I'm not sure how we could solve this...copying files from the venv into the subprocess sounds horribly error-prone when we introduce remoting. Precomputing a Pex with every single permutation of dependency combinations would likely take too much memory, and it's not clear how we could get that cache up to Travis.
Interesting... We should be doing one resolve per 3rdparty dependency combination, right? I'd be interested to know how many resolves we're actually doing... From some quick grepping, it looks like we have 38 3rdparty deps from our Python code, and I would be surprised if we were doing more than maybe 30 resolves, and 30 minutes is a long time for 30 resolves to be happening, especially in parallel... I guess the easiest way to find this out right now would be to grab a …
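To make the counting argument concrete, here is a hypothetical Python sketch (not Pants code; all names are illustrative) of why keying resolves on the requirement combination, rather than on each individual test, caps the number of resolves at the number of distinct combinations:

```python
from typing import FrozenSet, Iterable


def distinct_resolve_count(requirement_sets: Iterable[FrozenSet[str]]) -> int:
    """One resolve per unique 3rdparty requirement combination."""
    return len(set(requirement_sets))


# Three test targets but only two unique combinations -> two resolves,
# no matter how many tests share each combination.
targets = [
    frozenset({"requests", "six"}),
    frozenset({"requests", "six"}),
    frozenset({"pytest", "mock"}),
]
print(distinct_resolve_count(targets))  # 2
```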
+1. Also, it's worth noting that caching (either pantsd, local as in #6898, or remote) should completely eliminate the resolve overhead in 99% of cases.
That, or getting the zipkin patch relanded. I suspect that …
…/python/subsystems` (#7793)

### Problem

`backend/python/subsystems/python_native_code.py` was depending on `pants.backend.native.subsystems`, but not declaring the dependency in its `BUILD` file. The naive solution of adding the proper dependency to its `BUILD` results in a dependency cycle, as `backend/native/subsystems/conan.py` already depends on `pants.backend.python.subsystems.python_tool_base`. So, `./pants test tests/python/pants_test/backend/python/tasks:pytest_run` would fail when run directly, but pass in CI because all of the targets would get their sources combined. When changing to the V2 test runner, we no longer allow this sort of leaky dependency usage: every dependency must be properly declared in the appropriate `BUILD` file. So, `./pants test tests/python/pants_test/backend/python/tasks:pytest_run` started failing in #7724.

### Solution

Move `conan.py` into a new subdirectory `backend/native/subsystems/packaging`, as suggested by @cosmicexplorer.
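For illustration, a rough sketch of the BUILD-level shape of the fix (target addresses and contents are hypothetical, not the repo's literal BUILD files): after the move, each side can declare its dependency without forming a cycle:

```python
# src/python/pants/backend/python/subsystems/BUILD (hypothetical sketch)
python_library(
    dependencies=[
        # Previously undeclared; with conan.py moved out, declaring this
        # no longer creates a cycle.
        'src/python/pants/backend/native/subsystems',
    ],
)

# src/python/pants/backend/native/subsystems/packaging/BUILD (hypothetical sketch)
python_library(
    dependencies=[
        # conan.py's existing dependency, now isolated in its own target.
        'src/python/pants/backend/python/subsystems',
    ],
)
```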
@@ -947,23 +947,23 @@ matrix:
       - *py27_linux_test_config_env
       - CACHE_NAME=linuxunittests.py27
     script:
-      - ./build-support/bin/ci.sh -2lp
+      - travis_wait 50 ./build-support/bin/ci.sh -2lp
Frustrating UX of using `travis_wait`: the resulting std{out,err} will be stripped of any color. We normally print those in red upon test failures to make them easier to scan. We won't be able to remove `travis_wait` until improving the performance via #7795 or via remoting.
…build#7724)" This reverts commit c9ea445.
### Problem

Now that we use the V2 test runner as of #7724, unit tests both take much longer (20 minutes -> 40-50 minutes) and have become very flaky (not exclusively because of V2). Because the tests flake so often, it is frustrating to have to wait a whole 50 minutes to rerun the shard.

### Solution

We can't use automated sharding because V2 does not support that yet, but we can introduce our own manual shards, as sketched below. One shard runs the V2 tests; the other runs all blacklisted tests, the contrib tests, and the `pants-plugin` tests.

### Result

Flakes will be slightly less painful: when something flakes, you will not have to run the entire 50 minutes of CI again, just the subset for that specific shard.
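A rough sketch of what the two manual shards might look like in `.travis.yml` (job names and the second shard's script line are placeholders, not the PR's literal configuration):

```yaml
matrix:
  include:
    # Shard 1: the unit tests the V2 runner executes.
    - name: "Unit tests - V2 runner"
      script:
        - travis_wait 50 ./build-support/bin/ci.sh -2lp
    # Shard 2: blacklisted tests, contrib tests, and pants-plugin tests.
    - name: "Unit tests - blacklist, contrib, pants-plugin"
      script:
        - ./build-support/bin/ci.sh  # placeholder: flags for the non-V2 shard
```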
Problem

Beyond wanting to move everything towards V2 in general, this change will allow us to start remoting our unit tests for greater parallelism and less dependence on Travis.

Solution

We must make several changes to land this:
- `travis_wait` to avoid a timeout. See https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received.

Result

CI is overall less useful. The unit tests now take 50 minutes to run, rather than 20. Likewise, the logging is less useful: the use of `travis_wait` means that nothing is logged until the entire process completes, and we currently log `stdout` and `stderr` for successful tests, so there is more noise. However, these costs are deemed worth it to allow us to soon use remoting.