Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] parallel test crates #10052

Closed
wants to merge 25 commits into from

Conversation

gilescope
Copy link
Contributor

@gilescope gilescope commented Nov 6, 2021

Fixes: #5609

@rust-highfive
Copy link

r? @Eh2406

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 6, 2021
@bors
Copy link
Contributor

bors commented Nov 7, 2021

☔ The latest upstream changes (presumably #10051) made this pull request unmergeable. Please resolve the merge conflicts.

@gilescope gilescope force-pushed the parallel-test-crates branch from c3e2d69 to 7b5d94b Compare November 7, 2021 07:59
@gilescope
Copy link
Contributor Author

Wow - a green build with parallel turned on. That's super nice.

@gilescope
Copy link
Contributor Author

gilescope commented Nov 7, 2021

So if this goes green, the outstanding issues before this is a MVP:

  • The test output doesn't have color any more - fix.
  • Ideally a way to limit the amount of parallism would be nice.
  • Use existing mechanism to configure number of parallel test crates run.

@gilescope gilescope changed the title [WIP] parallel test crates Parallel test crates Nov 7, 2021
@gilescope
Copy link
Contributor Author

Todo: we can do better than that sleep with some signaling.

@gilescope gilescope force-pushed the parallel-test-crates branch from c4c7c46 to f899d0c Compare November 9, 2021 10:33
@gilescope gilescope changed the title Parallel test crates [WIP] parallel test crates Nov 10, 2021
@gilescope
Copy link
Contributor Author

Wondering how much of the existing jobserver infrastructure I can reuse - jobstate and job I think are ok. I had better stay away from the rest as it worries about dependencies which we don't have to worry about at this point (from what I can see).
The send after free bug is annoying me but there's not much point in finding it if I am going to rewrite this.

@gilescope
Copy link
Contributor Author

@Eh2406 feel free to opine :-)

@gilescope
Copy link
Contributor Author

The end game is probably to use jobserver to balance job tokens between libtest processes... that feels like an extension PR, not a MVP.

@gilescope
Copy link
Contributor Author

Hmm job queue looks a little too tied into rustc. I think I'm better off not trying to tie that all together.

@alexcrichton
Copy link
Member

I would say a few words of caution before implementing this feature. I believe there are two critical features about running subprocesses in parallel which Cargo isn't really in a position to handle today:

  • First is controlling parallelism. While Cargo can execute tasks in parallel internally there is no coordination between these tasks about running other parallel tasks. This can easily blow up into an exponential amount of parallelism that can cause problems. Cargo handles this with rustc by sharing the jobserver today, but tests are "simply executables" which have no protocol with Cargo.
  • Second is controlling the output. We'll want live output from tests but there's no way Cargo can really weave the output of the tests to the user in a meaningful way. We can present everything segmented after-the-fact but that loses the ability to stream, and otherwise weaving outputs together can be racy and confusing.

These points can be solved if Cargo controlled the test harness, but it doesn't in general, so I don't think this is really an implementable feature in Cargo.

Also as a side note it looks like you're using the exising job queueing infrastructure to implement this but there are no dependencies between tests so I don't think that this is necessary, it can be done with something much more lightweight most likely.

@gilescope
Copy link
Contributor Author

Good points. While we can't control the amount of parallelism automatically, people can set --test-jobs and --test-threads manually. I do accept that they multiply and could in theory mean that you get slower overall results. That said I have been using substrate as a benchmark (which is a pretty large project) and this feature makes a huge difference to test runtimes. Given how much time this saves on 32 core boxes (which will become more usual) I don't think this is something large projects would want to be without.

On the output front at the moment it is using the progress bar from compiling which helps give a bit more feedback. It is a little jumpy in that once it has finished tracking a crate's test output it might dump a couple more test outputs that have finished before catching onto another test crate that is in progress, but this way seems as close to the single threaded test experience as possible.

There's two other directions I would like to explore but maybe better as followup prs:

  • after 50 percent of the jobserver tokens are freed up with compilation we can start running some tests in advance. Typically machines are maxed out at the start of a compile but after that initial bulk calms down there's pleanty of cpu on the table left to run some tests - and front-running these tests could be restricted to the current one process at a time so not violating any default assumptions about test crates not running in parallel.
  • a --incremental flag on cargo test that will skip any test crate that is not rebuilt. The idea being that you edit code, run cargo test --inc and then your crate and any affected test crates would be run but test crates that depended on crates that had not rebuilt would be skipped - they will be run by the ci anyhows. Maybe not foolproof but this could prove to be a nice way to improve the coding feedback loop. (Ideally it would run the earliest compiled test crates first as they are most likely to fail as they are the tests nearest the code changes. This would be a break from the alphabetical order currently but with good reason.)

@gilescope
Copy link
Contributor Author

Thinking further on the controlling parallism front:

Can one assume that --test-threads and --list would be understood by a test executable and possibly honored?
If we know how many tests are in a crate ( --list ) then we can constrain the number of test threads proportionally to the various tests in the test processes.

@alexcrichton
Copy link
Member

While I understand this works for your project that's not necessarily enough to land in Cargo itself. The issues I mentioned are pretty fundamental and I think would block inclusion into Cargo. In the meantime for your use case I'd recommend something like cargo test --no-run with json messages from Cargo and then you can manage parallel / incremental execution of binaries externally from Cargo.

@gilescope
Copy link
Contributor Author

If I may, I'd prefer to attempt to solve the general case.

I've added control for parallism - worst case it has the same performance as before. As long as one test can be assumed to take no more than one thread, then parallelism is limited. I suspect that's as much of a guarentee as --test-threads gives (I am assuming that --test-threads doesn't take into account threads spawned inside tests). If --list isn't supported by the test runner then we act conservatively and do not run that crate in parallel with anything else.

On the UI front it does stream test crate results (and buffers the non-current crate so as not to interleave results). If a test takes longer than 60 seconds, that will still be seen in the console because we'd be tracking that test suite if its the last one running. There's also more of an indication of how many crates have been tested (before there was no Progress support). To me the test UI now seems on a par with the compilation UI.

Will take this version for a spin and see how it handles a few different projects.

@matthiaskrgr
Copy link
Member

I'm wondering on the impact on this when running x.py test on rustc itself. 🙂 Have you done some before/after benchmarks by any chance?

@gilescope
Copy link
Contributor Author

gilescope commented Feb 5, 2022 via email

@sunshowers
Copy link
Contributor

Came across this. I'm really close to releasing cargo-nextest which runs test crates in parallel -- it's available at https://nexte.st/ and will be installable from crates.io some time early next week. Hope that helps!

@gilescope
Copy link
Contributor Author

gilescope commented Feb 12, 2022 via email

@sunshowers
Copy link
Contributor

nextest is now generally available at https://nexte.st. For the reasons mentioned in the thread I think it would be pretty hard to do this sort of behavior change in cargo test itself, but nextest provides a good alternative for people who need it.

@bors
Copy link
Contributor

bors commented Feb 20, 2022

☔ The latest upstream changes (presumably #10346) made this pull request unmergeable. Please resolve the merge conflicts.

@gilescope
Copy link
Contributor Author

Let's get behind https://nexte.st/ - closing this for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo test --all should run tests in parallel
7 participants