Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[turborepo] regression on handling SIGINT #3711

Closed
zanona opened this issue Feb 9, 2023 · 28 comments · Fixed by #4313
Closed

[turborepo] regression on handling SIGINT #3711

zanona opened this issue Feb 9, 2023 · 28 comments · Fixed by #4313
Assignees
Labels
kind: bug Something isn't working

Comments

@zanona
Copy link

zanona commented Feb 9, 2023

What version of Turborepo are you using?

1.7.4

What package manager are you using / does the bug impact?

npm

What operating system are you using?

Linux

Describe the Bug

Referring to #444 and its implementation on #663, it looks like there was a behavioural regression introduced in 1.7.0 where SIGINT is no longer handled properly. The last working version was 1.6.3.

The bug is related to the fact that when interrupting a program (^C), turbo doesn't wait for the program to handle the SIGINT command and immediately terminates the process.

It is worth noting that the SIGINT instruction continues to run in the background, and the terminal becomes interactive again. This poses a problem when running turbo from a Docker container, for example, while Docker waits for the SIGINT instruction to be handled and then kills the container process. Since turbo doesn't wait for SIGINT to complete, docker simply kills the process prematurely. The expected behaviour here would be that of 1.6.3.

Expected Behavior

Turbo should wait for the program to handle the SIGINT instruction and only then exit the running process.

To Reproduce

./apps/bar/package.json

{
   "name": "@foo/bar",
   "scripts": {
       "start": "trap 'echo wait…; sleep 3; echo done!; exit 0' SIGINT; sleep 10",
   }
}

[email protected] (expected behaviour)

$ npx turbo run start --filter=@foo/bar --only
• Packages in scope: @foo/bar
• Running start in 1 packages
• Remote caching disabled
@foo/bar:start: cache miss, executing a0c8f8ecc5ad4cad
@foo/bar:start:
@foo/bar:start: > start
@foo/bar:start: > trap 'echo wait…; sleep 3; echo done!; exit 0' SIGINT; sleep 10
@foo/bar:start:
^C@foo/bar:start: wait…  # <<< SIGINT
@foo/bar:start: done!
$ echo interactive again....

[email protected] (unexpected behaviour)

$ npx turbo run start --filter=@foo/bar --only
• Packages in scope: @foo/bar
• Running start in 1 packages
• Remote caching disabled
@foo/bar:start: cache miss, executing 52f6ea6626d6482f
@foo/bar:start:
@foo/bar:start: > start
@foo/bar:start: > trap 'echo wait…; sleep 3; echo done!; exit 0' SIGINT; sleep 10
@foo/bar:start:
^C@foo/bar:start: wait… # <<< SIGINT
$ echo interactive again
interactive again
$ @foo/bar:start: done! # <<< bg process

Reproduction Repo

No response

@zanona zanona added area: turborepo kind: bug Something isn't working labels Feb 9, 2023
@notaphplover
Copy link

I am actually facing the same issue with version 1.8.3. @mehulkar, I just uploaded a reproduction repo in case it's helpful.

@gajus
Copy link

gajus commented Mar 20, 2023

Tagging @jaredpalmer for visibility

@mehulkar
Copy link
Contributor

mehulkar commented Mar 21, 2023

May be fixed by #4276, could you verify against v1.8.5?

@notaphplover
Copy link

Hey @mehulkar, I might be wrong, but I would say the issue persist. I updated the reproduction repo to use [email protected], but the issue persist

@mehulkar
Copy link
Contributor

Thanks for trying! cc @arlyon @chris-olszewski

chris-olszewski added a commit that referenced this issue Mar 31, 2023
### Description
I believe this fixes #3711

#4196 added graceful shutting down when calling `go-turbo`, this does
the same but for when we need to invoke the correct Rust binary in
`shim.rs`. I extracted the logic added in #4196 to a function that can
be used throughout the codebase.

### Testing Instructions

Currently only doing manual via the reproduction provided in #3711.
Edit: I originally wanted to add integration test to cover these cases,
this has proven to be a challenge. Will still look at trying to
orchestrate a test for this, but considering how much traction the
related issues have, I don't want to block on getting an integration
test.
@notaphplover
Copy link

@chris-olszewski I just saw a new [email protected] version was released. I tried in the reproduction repo with that version and the issue persist:

image

Could you please reopen the issue?

@notaphplover
Copy link

Update: Same with [email protected]:

image

@chris-olszewski
Copy link
Member

@notaphplover, can you confirm the expected outcomes of sending a SIGINT to pnpm run foo:pnpm? On my machine it exits with a 0 same as pnpm run foo:pnpm:trap. I've also checked and it looks like turborepo has had this behavior (exit with code 1 regardless of trap) since 1.5.0, can you confirm this on your machine?

Also, if you need an exit code 0, could you change the trap command to trap 'exit 0' INT TERM; turbo run foo and that seems to work on my machine.

@notaphplover
Copy link

notaphplover commented Apr 24, 2023

Hi @chris-olszewski 😃.

@notaphplover, can you confirm the expected outcomes of sending a SIGINT to pnpm run foo:pnpm? On my machine it exits with a 0 same as pnpm run foo:pnpm:trap. I've also checked and it looks like turborepo has had this behavior (exit with code 1 regardless of trap) since 1.5.0, can you confirm this on your machine?

On my machine pnpm run foo:pnpm exits with non zero code and pnpm run foo:pnpm:trap with zero code. If this is a problem I think I could make an effort and set a gh action reproducing the issue. I could even open a debugging port and allow you to connect through ssh with some magic tricks, but I was expecting any linux machine would behave similar in this case :(. Unlucky us I guess.

Regarding which version introduced this (I think) unexpected behavior, I tried with 1.5.6 and it had the error. It's being a little bit hard to test in other 1.5 and 1.4 versions, the daemon does not manage to start (probably related to #2034). I managed to recreate the right behavior on 1.4.6 so I would hazard to say you are right.

Also, if you need an exit code 0, could you change the trap command to trap 'exit 0' INT TERM; turbo run foo and that seems to work on my machine.

Yeah, I know, the thing is, that's not what I want. I want to exit with code zero if and only if the process is able to exit gracefully with no issues. For that reason I want to trap '' instead, propagating the exit code.

Hope all of this helps. I would prefer not to pass through the pain of setting the remote debugging session, but if you really need it I can go for it in a couple of days.

Edit: I just realized the debugging session wouldn't be of any help 😅, but the ssh connection would allow you to connect to the gh runner and recreate the issue. Probably an overkill since docker seems a much simpler way to go

@chris-olszewski chris-olszewski self-assigned this Apr 24, 2023
@chris-olszewski
Copy link
Member

If this is a problem I think I could make an effort and set a gh action reproducing the issue

No need to, I just wanted to make sure that the description in the repro was correct and my machine was doing something weird.

I managed to recreate the right behavior on 1.4.6 so I would hazard to say you are right.

Thanks for confirming, this narrows the code changes to check quite a lot.

I want to exit with code zero if and only if the process is able to exit gracefully with no issues.

Understand, just wanted to check if that would provide any intermediate relief.

Sorry again for the drop in communication.

@notaphplover
Copy link

Sorry again for the drop in communication.

All good. This is an open source project after all. Love the beautiful work you're doing. Sometimes these issues happen, I simply opened the other issue to avoid losing the tracking.

@chris-olszewski
Copy link
Member

@notaphplover I had some time to delve into this and this is a larger feature request. We currently always exit with exit code 1 if we receive a signal. In order to return the highest exit code we need to start gracefully handling signals where the first SIGINT we receive gets forwarded to the child processes and then if we receive another SIGINT we sent a SIGKILL.

I don't expect this work will get done until we finish porting the codebase to Rust. Hopefully as we port this signal code we can set the groundwork for being more graceful with our signal handling.

@EloB
Copy link

EloB commented Apr 11, 2024

Is this working or will be fixed anytime soon? I'm having some issues that docker doesn't close when I do Ctrl + C. I'm using pnpm and graceful teardown works using pnpm --filter=mypackage run dev but npx turbo run dev --filter=mypackage fails to teardown all child processes.

@giorgiogross
Copy link

giorgiogross commented May 22, 2024

Experiencing the same issue, turbo version 1.13.3; when I hit ctrl+c I get an ERROR run failed: command exited (1). That doesn't happen when I run a command with a single task from root directory and use interactive mode, but when I disable interactive or when there is more than one task I get that error. Oh, and I'm using npm! :)

@makeryi
Copy link

makeryi commented Aug 14, 2024

I found something that might be useful for addressing this issue:
crates/turborepo-lib/src/cli/mod.rs
crates/turborepo-lib/src/commands/run.rs
Not sure if changing Ok(1) to Ok(()) or Ok(0) would fix this issue.

@markkkkas
Copy link

Is this working or will be fixed anytime soon? I'm having some issues that docker doesn't close when I do Ctrl + C. I'm using pnpm and graceful teardown works using pnpm --filter=mypackage run dev but npx turbo run dev --filter=mypackage fails to teardown all child processes.

We are observing same issue :(

@GuskiS
Copy link

GuskiS commented Sep 21, 2024

Seeing same issue where docker fails to teardown when using turbo 2.1.2.

@kota65535
Copy link

Same here in turbo 2.2.3

@koteelok
Copy link

Still broken in 2.3.3. The build system for monorepos is unable to close processes, which is quite strange.

@ccaspanello
Copy link

@notaphplover I had some time to delve into this and this is a larger feature request. We currently always exit with exit code 1 if we receive a signal. In order to return the highest exit code we need to start gracefully handling signals where the first SIGINT we receive gets forwarded to the child processes and then if we receive another SIGINT we sent a SIGKILL.

I don't expect this work will get done until we finish porting the codebase to Rust. Hopefully as we port this signal code we can set the groundwork for being more graceful with our signal handling.

Is there any update on this?

Also this was a very helpful comment:

Is this working or will be fixed anytime soon? I'm having some issues that docker doesn't close when I do Ctrl + C. I'm using pnpm and graceful teardown works using pnpm --filter=mypackage run dev but npx turbo run dev --filter=mypackage fails to teardown all child processes.

But I really want to be able to run pnpm dev and my full stack spins up in dev mode. Starting each one is kind of a pain.

@dir
Copy link

dir commented Dec 30, 2024

Not a great first impression using turborepo only to find out that we... can't trap exit codes... since February 2023? lol

@Scalahansolo
Copy link

@anthonyshew Given that you removed the "ownedby" label a couple of months ago on this, is it possible you could provide some insight if the team that works on Turbo has any intention to fix this?

@anthonyshew
Copy link
Contributor

Hey, @Scalahansolo, I was only removing one of our workstream automations. We don't need to mark Issues as owned by a specific team anymore since Turbopack is no longer in this repository.

We're still working through fixes and features as usual. If you want to express your interest for this Issue, you can help us prioritize by adding your 👍 to the top-level if you haven't yet. Thanks!

@Scalahansolo
Copy link

Is adding 👍🏼 to the top level issue how Turbo / Vercel decide to do prioritization? If so, that's good to know and everyone here should jump on board

@anthonyshew
Copy link
Contributor

anthonyshew commented Jan 6, 2025

@Scalahansolo, it's one of a handful signals we use, yes. We have this GitHub Action and periodically check most-upvoted issues all-time as well (among many other signals).

@dir
Copy link

dir commented Jan 6, 2025

@anthonyshew Thanks for the info, gave it a thumbs up! That's a pretty nifty setup with that action.

@chris-olszewski
Copy link
Member

For those using npm or npx turbo to invoke Turborepo, there was a regression in npm 9.6.7 up until 10.3.0 where underlying processes (such as turbo) would not get a signal passed down from npm. See npm/cli#6684 for more details

@chris-olszewski
Copy link
Member

I am closing this issue in favor of #9694 as this was left open for the specific ask of #9694.

If you are encountering issues with signal handling, please give that issue a 👍 and include the following information which will help us understand what is happening:

  • The output of turbo info
  • The command being used to invoke turbo
  • How the signal is being sent to turbo
  • A reproduction repository would be quite helpful as well

@vercel vercel locked and limited conversation to collaborators Jan 13, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind: bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.