-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Killing webserver process should use SIGTERM on Linux, with SIGKILL as fallback #18209
Comments
Possibly related: #18041 My current workaround is to manually detect if a process has been started by playwright, then use |
I'm reading docker docs and it seems like This is a known anti-pattern when reliably managing dependent process trees - it is prone to dangling and zombie processes. Any process can die at any moment, you can't rely on SIGTERM solely to process the cleanup code. I.e. we can support graceful termination, but they must ensure hard kill still works. It looks like a docker bug to me. Even with your proposed fix, clicking Ctrl+C twice would allow container processes to survive, there is no guarantee they will die. |
A quick note on Linux signal handling in case someone reading this is unfamiliar with it: processes are allowed to handle most signals however they want, and shutting down when you receive SIGTERM is only a (universally-followed) convention: you could handle SIGTERM and do nothing with it, though that would be considered a bug in nearly every case. But processes cannot handle SIGKILL. When a process is sent SIGKILL, the Linux kernel immediately kills the process, giving it no time to do any cleanup. Which is why the pattern on Linux is usually to send SIGTERM first (which by convention means "do any necessary cleanup, then shut yourself down"), then wait a configurable number of seconds before sending SIGKILL if the process hasn't terminated in a timely fashion. So yes, docker-compose needs to have something in place to handle cases where the docker-compose process dies unexpectedly (or is sent a SIGKILL). But bugs in other software are no excuse for Playwright not doing the right thing, and sending SIGTERM first in order to allow webserver processes to do whatever cleanup they might find necessary. I experienced this issue with docker-compose, but there are many other scenarios where a webserver process might have some cleanup (deleting temporary files, etc.) to do, and sending SIGKILL first without ever attempting SIGTERM doesn't allow that cleanup to ever happen. |
We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes microsoft#18209
We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes microsoft#18209
We now will send `SIGTERM` to the webserver and wait for the `timeout` before sending `SIGKILL` to it. Fixes #18209
This should be reopened, since the initial fix was reverted. |
I'd reopen it, but I don't have permissions to do so on this repo. Any maintainers want to reopen this, or would you prefer I open a new issue linking to this one? |
The fact that the webserver is killed instead of terminating gracefully seems to break any tools that rely on Node V8 coverage of the webserver process, since coverage data is written when the process terminates (but not when it's killed). This breaks code coverage using c8, which is important for us. I don't know why the fix for this issue was correctly implemented and then reverted. SIGKILL should never be the default. |
any update? |
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
Note that this change is also a workaround for: microsoft/playwright#18209
This is a quick comment to keep this issue open, since apparently the Playwright team closes issues without warning when they haven't had much recent activity. Psst, Playwright team: it would be nice if you would drop a comment on stale issues first, say 14 days before closing them, saying "This issue has not had any recent activity and will be closed in 14 days". That way people wouldn't feel the need to add comments like this one, which is essentially spam but I don't know any other way to be sure this issue will be kept open. After all, it isn't fixed yet. If you're closing issues for not having recent activity, then "bump" comments like this are the only way, apparently, to make sure an issue that hasn't been fixed stays open. |
Following this issue! |
Implemented an option to enable this in #34130, it's released as part of the next minor. |
Context:
Config Snippets
Playwright config:
docker-compose.yml:
Describe the bug
We use a Docker container to launch our dev webserver and run Playwright tests against it. The
docker-compose up
command responds to SIGINT and SIGTERM by gracefully terminating the container(s) that it launched, but if it receives SIGKILL then it cannot terminate the containers, and they stay running. At the moment, the WebServer class does not provide any way to gracefully close the launched process, instead using thekillAndWait
function from processLauncher.ts, which sends SIGKILL to the process without ever trying SIGTERM.The result is that after I run
npx playwright test
, theapp-for-e2e
container is still running, because Playwirght never attempted to send SIGTERM to the process and just sent SIGKILL instead. I then have to stop theapp-for-e2e
container manually, since I don't want it running and taking up system resources when I'm not using it.Proposed solution
To solve this bug, I would suggest changing
killProcess
in processLauncher.ts to take a boolean argumentgracefully
, defaulting tofalse
for backwards compatibility. Ifgracefully
is true, send SIGTERM rather than SIGKILL on Unix-like platforms (Linux and MacOS), or on Windows, runtaskkill
without the/F
flag. Add the same parameter tokillProcessAndCleanup
andkillAndWait
. Then change thethis._killProcess()
method in the WebServer class to look something like this:Additional Info
This is #12299 reopened, since #12299 was closed without a good reason for closing it. (The reasons given were "It looks like this issue only has a handful of upvotes, has not been touched recently and/or we lack sufficient feedback to act on it", none of which are good reasons to close a bug without discussion in the issue first. If a bug is going to be closed for any of those reasons, a comment to that effect should be made in the bug itself first. Closing bugs with no discussion like that is bad issue-management practice.)
The text was updated successfully, but these errors were encountered: