-
-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run check phase by default #33599
Comments
👎 on this. Checks are typically very unreliable and non-deterministic. So expect massive breakage and perpetually broken channels if we enable checks by default. Also, of course, build times will increase massively. |
…cross compiling I hope this will be a temporary measure. If there is consensus around issue NixOS#33599, then we can follow an explicit `dontCheck`, but default to not checking during cross builds when none is given.
In anticipation of what I outline in NixOS#33599, I only simplify exactly those `doCheck`s which are equal to `hostPlatform != buildPlatform`. I also stick a comment next to them so I can grep for them later.
@Ericson2314 check out #39463. |
In theory, this sounds good but in practice I agree that this will break a lot of things. Especially because most devs intend "make check" to only be run by them in their own environments. Maybe it would be nice to experiment with as an overlay/jobset or something. If you think about it, each dependency listed in Nixpkgs is weird kind of integration test in its own right (if its actually used of course). I would be interested in seeing whether big distros like Debian or Arch do testing in an automatic way at all. I would bet our build farm runs more tests right now than most (of course a large userbase is the ultimate integration test). |
My implementation breaks nothing, it just makes broken things explicit. If you look at #39463, then, firstly, it's enormous, and, secondly, there are at least a couple cases where there's clearly a bug in nixpkgs expression that the testing found (but I wasn't able to fix it, things that I did fix will go into a separate PR). I wanted to implement this for a while, then I saw this issue and @edolstra's comment several months ago and though to myself "well, maybe @edolstra's right, but what if its not 'massive', surely, nobody actually tried" and decided to bite the bullet and actually run the experiment. And running the experiment turned out to be useful. Firstly, I fixed a bunch of bugs, and secondly the blowup turned out not to be really "massive", it's less that 2x for my system, which hurts, I agree, but not "massively" :) My implementation just adds an option |
I like this reasoning… I'm +1 on this…
…On Wed, Apr 25, 2018, 6:47 AM Jan Malakhovski ***@***.***> wrote:
My implementation breaks nothing, it just makes broken things explicit. If
you look at #39463 <#39463>, then,
firstly, it's enormous, and, secondly, there are at least a couple cases
where there's clearly a bug in nixpkgs expression that the testing found
(but I wasn't able to fix it, things that I did fix will go into a separate
PR).
I wanted to implement this for a while, then I saw this issue and
@edolstra <https://github.com/edolstra>'s comment several months ago and
though to myself "well, maybe @edolstra <https://github.com/edolstra>'s
right, but what if its not 'massive', surely, nobody actually tried" and
decided to bite the bullet and actually run the experiment. And running the
experiment turned out to be useful. Firstly, I fixed a bunch of bugs, and
secondly the blowup turned out not to be really "massive", it's less that
2x for my system, which hurts, I agree, but not "massively" :)
My implementation just adds an option config.doCheckByDefault which I
would now keep set when I'm not doing experiments that require long
sequences of mass rebuilds.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#33599 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AADWlsRxAdpu02YRr2jKBOMSO--QNpOoks5tr__fgaJpZM4RV8JQ>
.
|
This is great! Do we have info on how many packages have failing tests? Could we discuss enabling this by default? Unfortunately something we'll run into that I haven't seen addressed in this issue is flakey tests. Sometimes they work, sometimes they don't. Certainly mitigated by the lack of networking but will probably happen anyway. |
Well, I have several more hundred patches that disable and fix those..., and I have not successfully built all of `release.nix` with this enabled yet, but you can grep `doCheck = false` to get an approximation on the current `master` (I build most of the basic stuff and most of non-qt desktop packages regularly with `doCheckByDefault` enabled).
My motivation for doing this was not research into statistics (which could be an interesting paper, I agree) but the fact that SLNOS overrides quite a lot of things (e.g. `libcardiacarrest` instead of `libpulse`) and debugging test failures helped to debug those (e.g. `libcardiacarrest` became absolutely perfect, IMO).
Actually, in my experience, the fact that tests "blink" between package versions is more of a problem. I added some more infra to my nixpkgs to check that derivations with `doCheck = false` actually fail tests and got somewhat depressed with the result. What failed before stops failing (which is good), what didn't fail before starts failing (which is depressing), and these changes are not monotonic between version updates.
With infra currently in `master` the result of enabling `doCheckByDefault` would be that the number of `doCheck = false` lines will grow without bounds until most packages will get them. That would be unfortunate.
In general, IMO, the main problem is that `doCheck` is just one bit of information. In my experience, that just doesn't cut it.
- I now think that `doCheckCross` should be a separate attribute (`false` by default) as I have yet to see a package where cross check succeed, and comparing build and host systems everywhere (e.g. when you need to make `doCheck` conditional on something, e.g. for bootstrap) is really ugly.
- Also, IMO `doCheck = false` should be split into `doCheck = false` (meaning "you can try, but running them will fail") and `doCheck = "never"` (or whatever, meaning "don't ever try to run those", e.g. see the expression for `numactl`, or packages that can't properly install after a failed `make check` (yes, the reality is just absolutely horrible.)).
|
@oxij You're saying the tests for a package will fail for a single version, then pass again on the next version? Do you happen to have an example? |
Yes, and then fail again for the next version, and so on. An example from the top of my head is GHC. The problem with GHC is that different versions seem to be packed by different people or something, and while some packagers include tests in the tarball, others don't, but don't stub out them in the Makefile either.
|
I might be missing something, but I added |
For `nix-build` you need to add `doCheckByDefault` into `~/.config/nixpkgs/config.nix` (or similar, see `pkgs/top-level/impure.nix`).
|
Based on #55425 and NixOS/nix#2691, can we also get a way to selectively disable the checkPhase without changing the derivation/output hash? I have use cases where I sometimes want to run tests, and someones I don't want to run tests via |
This would only be possible with something like @Profpatsch's |
Are there any examples of |
We could try a few tricks like making $out read only for the duration of checkPhase and installCheckPhase.
That will work only for installCheckPhase, checkPhase runs before installPhase so it could influence it anyway.
We can mount unionfs or whatever on top of /build for the duration of checkPhase, though. But, ideally, this should be a Nix feature, i.e. checkpoint and rollback while doing the build.
|
I hear you |
(triage) Closing this issue, as the amount of controversy around it make it sound like it would need an RFC to get people to agree on what should be done anyway. |
Not sure what other distributions do, but skipping check phases by default seems to me to be slightly unseemly to me. I rather be assuming upstream tests are valid/useful than not.
Also, this makes a better solution for #30437. We can make the default
doCheck ? hostPlatform == buildPlatform
so that one someone rightsdoCheck = true;
they indeed test in all cases as requested, but when they leave outdoCheck
altogether, only native builds get their tests run.CC @grahamc @vcunat @globin @fpletz @dezgeg
The text was updated successfully, but these errors were encountered: