-
-
Notifications
You must be signed in to change notification settings - Fork 666
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix hang with ginkgo -p #1192
fix hang with ginkgo -p #1192
Conversation
Trying to run this locally the interceptor binary did not exists. Lets just build this in SynchronizedBeforeSuite() automatically so users do not have to figure this out. Signed-off-by: Paul Holzinger <[email protected]>
Using the same file name for all test cause conflicts when run in parallel, this causes the tests to fail for me. To fix this use a filename suffix with GinkgoParallelProcess() which prevents conflicts in the file name. Signed-off-by: Paul Holzinger <[email protected]>
When you run any child process that stay around longer than the test suite ginkgo currently hangs. This is because the dup stdout/err fds are leaked into the child thus read() will block on it as there is at least one process still having the write pipe open. From ginkgos POV it looks like it is done, you see the ginkgo result output printed but then it just hangs and doe snot exit because of it. To fix it we set FD_CLOEXEC on the dup-ed fds, this prevents them from ever being leaking into other processes that could outlive the suite. There is a dup3() call the could be uses to set the CLOEXEC option directly but this seem linux only and thus is less portable. The fcntl call should be good enough here, we do not need to be worried about the race conditions described in the man page. Ideally we should do some error handling in that function for both the fcntl calls and the existing dup() above, however this seems like a rather big change. I am not so sure about how to test it properly, I added a test which just executes `ginkgo run -p` and a test which only starts `sleep 60`. Ginkgo then should exit right way and keep this process running. Then I just make sure the gingo exits in under 15 seconds. As long as it is below 60s it should fulfil the purpose. Fixes onsi#1191 Signed-off-by: Paul Holzinger <[email protected]>
thanks for pulling this together - it looks like CI failed but i'll merge this in later this morning and take a look at the testing strategy. i think your approach is the right one (spawn a sleep and ensure ginkgo exits in a timely manner) - but i need to think about where the test should live (right now it's in an existing fixture - whereas i think it might be better as a distinct fixture with the test itself living in ginkgo's integration suite). in any event i'll take a look in a couple hours and cut a release soon |
Yeah I wasn't sure about the test structure. Also seems like CI doesn't have access to the ginkgo binary as it seem to be run with go run not sure how to address that. |
no worries - that piece will be resolved when i move the test into the integration tests (they compile their own version of ginkgo to ensure we're testing against the actual code in the repo). testing a testing framework can be... interesting ;) |
looks good now. gonna pull this in. thanks so much! |
@@ -0,0 +1,60 @@ | |||
# Backlog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for merging, did you intent to commit the TODO file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, sigh - i had just renamed it and updated my .gitignore on main but accidentally committed it in the pr. i've cleaned it up now, though it will live on forever in the git history. there's nothing sensitive in there
When you run any child process that stay around longer than the test
suite ginkgo currently hangs. This is because the dup stdout/err fds are
leaked into the child thus read() will block on it as there is at least
one process still having the write pipe open.
From ginkgos POV it looks like it is done, you see the ginkgo result
output printed but then it just hangs and doe snot exit because of it.
To fix it we set FD_CLOEXEC on the dup-ed fds, this prevents them from
ever being leaking into other processes that could outlive the suite.
There is a dup3() call the could be uses to set the CLOEXEC option
directly but this seem linux only and thus is less portable.
The fcntl call should be good enough here, we do not need to be worried
about the race conditions described in the man page.
Ideally we should do some error handling in that function for both the
fcntl calls and the existing dup() above, however this seems like a
rather big change.
I am not so sure about how to test it properly, I added a test which
just executes
ginkgo run -p
and a test which only startssleep 60
.Ginkgo then should exit right way and keep this process running. Then I
just make sure the gingo exits in under 15 seconds. As long as it is
below 60s it should fulfil the purpose.
Fixes #1191
Also see the other two commits for some test fixes I notcied