Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Disable multiprocessing parallelism in unit tests
This commit disables the multiprocessing based parallelism when running unittest jobs in CI. We historically have defaulted the use of multiprocessing in environments only where the "fork" start method is available because this has the best performance and has no caveats around how it is used by users (you don't need an `if __name__ == "__main__"` guard). However, the use of the "fork" method isn't always 100% reliable (see https://bugs.python.org/issue40379), which we saw on Python 3.9 Qiskit#6188. In unittest CI (and tox) by default we use stestr which spawns (not using fork) parallel workers to run tests in parallel. With this PR this means in unittest we're now running multiple test runner subprocesses, which are executing parallel dispatched code using multiprocessing's fork start method, which is executing multithreaded rust code. This three layers of nesting is fairly reliably hanging as Python's fork doesn't seem to be able to handle this many layers of nested parallelism. There are 2 ways I've been able to fix this, the first is to change the start method used by `parallel_map()` to either "spawn" or "forkserver" either of these does not suffer from random hanging. However, doing this in the unittest context causes significant overhead and slows down test executing significantly. The other is to just disable the multiprocessing which fixes the hanging and doesn't impact runtime performance signifcantly (and might actually help in CI so we're not oversubscribing the limited resources. As I have not been able to reproduce `parallel_map()` hanging in a standalone context with multithreaded stochastic swap this commit opts for just disabling multiprocessing in CI and documenting the known issue in the release notes as this is the simpler solution. It's unlikely that users will nest parallel processes as it typically hurts performance (and parallel_map() actively guards against it), we only did it in testing previously because the tests which relied on it were a small portion of the test suite (roughly 65 tests) and typically did not have a significant impact on the total throughput of the test suite.
- Loading branch information