-
-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0095] Enable doCheck by default #95
[RFC 0095] Enable doCheck by default #95
Conversation
Tangential, but maybe worth asking: Would it be possible/sensible to separate build and test phases into separate derivations, or something of that kind? Sometimes, running the tests requires some overrides (e.g. |
I think this is high cost, low benefit. Most upstream test suites don't test packaging bugs, so as long as upstream runs their own test suite before release, enabling |
@edolstra I would say all those bad test sweets should just be marked I think that is nonetheless a big improvement because it encourages maintainers to at least try the test suite, and the opt-outs will come with useful comments saying what's wrong. Blacklists are also better to indicate to upstream devs that something is wrong --- which is different and easier than actually fixing the issue ourselves! |
As somebody who uses NixOS as production infrastructructure with user data that must not be lost, I would very welcome a change to tests-by-default, with problematic tests turned off where needed. |
I agree that running test-suites by default would help detect new packaging errors, and therefore would be useful. However, a significant drawback of the proposed approach that this RFC doesn't acknowledge is the one that I reported in #55648 and https://discourse.nixos.org/t/behavior-of-undeclared-broken-packages-is-confusing/2123. The only clear solution to the above problem is an always green hydra (though it doesn't solve the issue of derivations that are not supposed to be built in hydra). Furthermore, when it is possible to test packages in separate derivations (cf. |
Co-authored-by: Niklas Hambüchen <[email protected]>
Co-authored-by: Niklas Hambüchen <[email protected]>
Co-authored-by: Niklas Hambüchen <[email protected]>
Co-authored-by: Niklas Hambüchen <[email protected]>
Co-authored-by: Niklas Hambüchen <[email protected]>
I was thinking about this - maybe one middle ground that is worth considering is running checks only during PR review stage. |
This proposal appeals to my "let's make everything strict" side. At the same time, I don't remember a single time where unit tests uncovered an issue with the packaging. Usually, the issue is something related to the runtime, which isn't exercised by most tests. Are there examples out there where |
I am quite sure that I've seen such cases in Ceph and GlusterFS builds, where the upstram test suites uncovered missing runtime dependencies (e.g. Python libs) or lack of wrapping like I didn't write them down at the time though, as at the time I wasn't aware that |
Missing runtime deps is a common problem that could be caught. I think that default-enable is the right approach. As long as disabling the checks are simple and not frowned upon, lots of software has fast and helpful test suites that wkll catch some common issues. |
I updated the section on Do you guys think that we can take this RFC to the next stages? |
I nominate myself. |
Nice going @Ericson2314 🚀 |
Somewhat tangential, but |
I'm not very experienced in the RFC discussion / calls process yet, so I don't want to be in a position of leadership for it, but I'd certainly like to participate in such things to get into it. |
As far as I can tell, this would also allow us to better separate the build and the check process. For example, at the moment the only way to build packages without also running the tests is to use overrides, which gets messy quickly. |
We need at least one more shepherd for this RFC. |
Co-authored-by: Jörg Thalheim <[email protected]>
Shepherds for this RFC are @Ericson2314, @nh2, and @edolstra @Ericson2314 would you mind being the leader? |
@gytis-ivaskevicius please also update the yaml frontmatter to incorporate the shepherds. |
A thing about language normalization: If I remember well, every phase can be turned on or off by setting attributes such as However, some of these attrs have name How sould we normalize them? Putting all as |
Just so the rest of world knows, I made a matrix channel with the author and shepherds to get the ball rolling. |
@Ericson2314 have you tried scheduling a meeting already? |
Yeah, we tried scheduling a meeting and it was unsuccessful. Will try to align the meeting with everyone later today. |
Meeting happening right now |
Meeting summary:
|
Isn't it possible to reuse/cache such build trees when enabling tests? |
- By default `doCheck` option should be enabled as long as `stdenv.hostPlatform == stdenv.buildPlatform`. | ||
- Non-reproducible test prevention should be implemented. | ||
- All failing packages should be fixed or updated with `doCheck = false;` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guidelines to refrain from enabling tests: | |
- If tests are taking _too_ long. (Tests aren't expected to run longer than build time. In case of quick builds - not more than 10min) | |
- If tests require additional large dependencies. | |
- If tests are flaky. (If tests randomly fail once in a while) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guys, sounds good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guidelines to refrain from enabling tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S.:
s|If|When|
When tests are flaky, unpredictably failing.
**New `doCheck`/`doInstallCheck` semantics:** | ||
In addition to booleans, `doCheck`/`doInstallCheck` should also accept strings. | ||
- String value should be considered as `false` | ||
- It should be used as a place for comment on why the check is disabled. For | ||
example: "Requires X11 server" or "Requires network access". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**New `doCheck`/`doInstallCheck` semantics:** | |
In addition to booleans, `doCheck`/`doInstallCheck` should also accept strings. | |
- String value should be considered as `false` | |
- It should be used as a place for comment on why the check is disabled. For | |
example: "Requires X11 server" or "Requires network access". | |
**New semantics:** | |
- `doCheck`/`doInstallCheck` should default to `null` and work exactly the same as `false` | |
- New options should be introduced: `meta.{checksFlaky,checksLargeDependencies,checksTakeTooLong,checksDisableReason}` |
The first point is so we could evaluate checks that are not set. Derivation paths are not expected to change.
And the second point is so we could evaluate the reason why checks were disabled. Maybe defining additional attrset would be nice? like checks.{flaky,largeDependencies,takeTooLong,disableReason}
? Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment on the source code and a Boolean value shoud suffice. There is no reason to overload this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During RFC call, we thought that it would be a very simple change which would allow us to actually evaluate the reason why tests were disabled and that is definitely handy. I am still in favor of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said, it is preferrable a (short?) string, without true/false semantics attached to it.
After all, it can be useful to use the string to convey a useful information even in the case the tests are mandatory.
Something like
doCheck = false;
checkReason = "I don't wanna fetch X.Org just to verify a blinking box!";
Or even better, a new attrset:
check.enable = true;
check.reason = "This package is critical for bootstrap";
Sorry fellow nix'ers, but these days I don't have time for pretty much anything which is why I am closing this RFC. If anyone wishes to take over this RFC - feel free to do so. Also for the record - it seems that #119 addresses most things that I am interested in |
Rendered