-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure build fails when test command fails #175
Conversation
Hi! Thanks for the contribution. I'm a little confused why it's needed though, since our bash script uses joerick@joerick2 ~/Desktop> cat testsh.bash 1
#!/bin/bash
set -o errexit
set -o xtrace
echo "Starting"
true
(
echo subshell 1
(
echo subshell 2
false
true
)
true
)
echo Complete
joerick@joerick2 ~/Desktop> bash testsh.bash
+ echo Starting
Starting
+ true
+ echo subshell 1
subshell 1
+ echo subshell 2
subshell 2
+ false In this test, the first failing command |
Thanks for checking this out! You're right that your test scripts exits properly, so I'm now also confused why this didn't happen in my tests. Here is an example of a failing build: https://travis-ci.org/alan-turing-institute/CleverCSV/jobs/599248169 (around line 670). The Edit: And here is a build using the proposed fix that errors out at the point of the failing test: https://travis-ci.org/alan-turing-institute/CleverCSV/jobs/599577038 (line 670) |
Here is a build that prints out |
Hmmm, so, can we then just add |
Yeah, that would be much cleaner than my solution. I'll make that change and update this PR when I have it. |
Some trial and error to make sure it works reliably. * remove test_retcode * debug local bash settings * set errexit explicitly in subshell * set errexit also in the subshell in parenthesis * debugging * more debugging (exit code of sh) * is the venv subshell returning exit 1? * what if we remove the additional subshell? * then we need to unquote the command * if statement needs test command quoted * simple if statement * echo command needs to be removed of course * remove debug statements * check if setting errexit here is needed * these settings can be removed At this point ("before") the shell already has errexit set. * setting errexit in the sh invocation is needed This is not inherited, so it's good to specify it. * just double checking that the subshell doesn't exit * use solution from fvue.nl * Solution doesn't seem to work, even popd is run So now I'm putting back the if statement, it seems to be the simplest and most effective solution. * add a comment * prove that errexit is necessary * errexit is indeed needed for chaining
The new commit simplifies the test and seems to work reliably. Bash 4.4 adds the I found that even when I place |
Great, thanks! Are we now just focusing on the |
Good question! Any failure in the subshell for the virtualenv will make the returncode of that subshell nonzero. The problem is only that that does not cause the toplevel shell to register an error. Here is an example where the The links in my previous comment give some more details about the problems of I agree that the if statement isn't a particularly clean solution, but it was the most robust approach I could find (with some trial and error). |
Ha, that's funny. But apparently this whole error-handling is quite fragile in bash, so fair enough. Then you have indeed found a very clean solution! Thanks a lot for your investigation and extensive explanation! Always good to have this for future reference, if we forget the reasons for this change! :) @joerick, I thought I've heard you mention long time ago you'd like to replace the bash script with proper Python? If we ever get to doing it, this can be added to the list of reasons why it's a maybe a good idea ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @GjjvdBurg for catching this!
oh boy, how many bash scripts with errexit
have I written that might have bugs in them! Okay, so I think I understand - the erroring subshell isn't being detected by the parent shell.
But, I am still confused by why my test PR #177 is passing. That test ensures that CIBW_TEST_COMMAND=false
causes the build to fail... and it passes. Any idea why? Could we get a failing test that this PR fixes?
You're not alone to be confused, @joerick. I just tried this, thinking I knew it was going to fail (well, so, fail in this case meaning "not detecting the failure"), but ... nope set -o errexit
set -o xtrace
echo A
(
echo B
sh -c "false"
echo C
)
echo D $ bash test.sh; echo $?
+ echo A
A
+ echo B
B
+ sh -c false
1 |
Thanks for your comments! I'll add a failing test that this PR fixes and get back to you soon. |
Hi @GjjvdBurg, just checking in, any progress on this? |
Apologies for the delay, I'll try to figure it out this week. |
So, essentially, I can no longer reproduce my original problem with the latest version of cibuildwheel. It seems that upgrading the linux images to manylinux2010 has solved it, probably because manylinux2010 comes with an upgraded version of bash (3.2.25 -> 4.1.2). Here is a build for a test package with the version of cibuildwheel before #155 was merged: And here is one with the version after manylinux was upgraded: https://travis-ci.org/GjjvdBurg/TestingTravis/builds/608871534#L184 (which correctly fails). As further proof that the manylinux change solved it, here is a version of cibuildwheel branched off before #155 was merged and with the test from #177 added (in a temporary copy of the repo). With manylinux1 the failing test did indeed fail. So there's no need to merge this PR anymore! The problem has gone away in another way and the test added in #177 ensures it can be detected in the future. Thanks for your time on this @joerick and @YannickJadoul! |
Huh, that's really weird! Thanks for further checking that. @joerick Yet another reason to take some time for #99? Maybe after the next release, there's a bit more time to do these kinds of refactoring. |
Yes, with manylinux1 this fix would still be needed and it can't hurt for the manylinux2010 images. I've updated my branch with the changes requested above and have clarified the comment. I'll reopen the PR so you can decide on whether to merge or not. |
We have a winner! I just ran a test with manylinux1 and indeed it's erroneously passing.
Thank you all for raising this! Very happy we spotted this before a release. |
Thanks, @GjjvdBurg, for all the work and for holding on during the discussion! |
Glad I could help! |
First of all, thank you for creating this project!
I'm running the latest version (master) to test my package in a virtualenv and noticed that on Linux the return code of the test isn't used. This means that the build continues even though the test command fails. This PR fixes this by keeping the return code of the test command and making that the return code of the bash script.
Note that this is not needed on mac or windows because they use
subprocess.check_call
.