-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nix-daemon process leaking on nix-darwin Mac build machine #3294
Comments
This happened on a Hercules CI agent machine too.
|
I'll try to find more diagnostic info if it happens again, but I'm seeing this behavior (resource limit reached) on two 20.03 NixOS machines. One is a laptop that gets tweaked and rebuilt frequently. The other is a computer managed by NixOps. |
FWIW, here's a small module with an ugly bash hack to expire pids so they don't cause that issue: |
I marked this as stale due to inactivity. → More info |
This changes the representation of the interrupt callback list to be safe to use during interrupt handling. Holding a lock while executing arbitrary functions is something to avoid in general, because of the risk of deadlock. Such a deadlock occurs in NixOS#3294 where ~CurlDownloader tries to deregister its interrupt callback. This happens during what seems to be a triggerInterrupt() by the daemon connection's MonitorFdHup thread. This bit I can not confirm based on the stack trace though; it's based on reading the code, so no absolute certainty.
This changes the representation of the interrupt callback list to be safe to use during interrupt handling. Holding a lock while executing arbitrary functions is something to avoid in general, because of the risk of deadlock. Such a deadlock occurs in NixOS#3294 where ~CurlDownloader tries to deregister its interrupt callback. This happens during what seems to be a triggerInterrupt() by the daemon connection's MonitorFdHup thread. This bit I can not confirm based on the stack trace though; it's based on reading the code, so no absolute certainty.
This changes the representation of the interrupt callback list to be safe to use during interrupt handling. Holding a lock while executing arbitrary functions is something to avoid in general, because of the risk of deadlock. Such a deadlock occurs in NixOS#3294 where ~CurlDownloader tries to deregister its interrupt callback. This happens during what seems to be a triggerInterrupt() by the daemon connection's MonitorFdHup thread. This bit I can not confirm based on the stack trace though; it's based on reading the code, so no absolute certainty.
This changes the representation of the interrupt callback list to be safe to use during interrupt handling. Holding a lock while executing arbitrary functions is something to avoid in general, because of the risk of deadlock. Such a deadlock occurs in NixOS#3294 where ~CurlDownloader tries to deregister its interrupt callback. This happens during what seems to be a triggerInterrupt() by the daemon connection's MonitorFdHup thread. This bit I can not confirm based on the stack trace though; it's based on reading the code, so no absolute certainty, but a smoking gun nonetheless.
…lback-deadlock Fix deadlocked nix-daemon zombies on darwin #3294
With nix-darwin on a mac-build machine, the nix-daemon appears to be spawned repeatedly and not close the process out during builds. After some time, when the ulimit process limits have been hit, the build machine yields resource errors, such as:
Some diagnostic commands below:
The text was updated successfully, but these errors were encountered: