Segfault in evaluation or libstore #4178
Comments
Valgrind warnings in boehmgc are largely meaningless since it's normal behaviour for the collector to scan uninitialized memory. |
I couldn't reproduce this on master:
Since it is doing IFD, it's possible that the bug is on the daemon side. Maybe you can get a stack trace out of the core dump ( |
The daemon's daemon raise
client crash
|
Yeah it looks like the stack frame is corrupted. The daemon crash is a separate (but real) issue: we're not handling an exception in a destructor ( |
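To illustrate the destructor problem mentioned above, here is a minimal sketch. It is not Nix's code: `Connection`, `flush` and `closeAndReport` are hypothetical names. Since C++11, destructors are implicitly `noexcept`, so an exception that escapes one ends in `std::terminate`. Catching inside the destructor avoids that.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical resource whose cleanup step can fail (not taken from Nix).
struct Connection {
    bool failOnFlush;
    std::string* lastError;  // where a swallowed error is recorded

    // Cleanup that may throw, e.g. because the peer went away.
    void flush() {
        if (failOnFlush) throw std::runtime_error("flush failed");
    }

    // If flush() threw out of this destructor, std::terminate would be
    // called. Catch the exception and record it instead.
    ~Connection() {
        try {
            flush();
        } catch (const std::exception& e) {
            if (lastError) *lastError = e.what();
        }
    }
};

// Runs one Connection through its lifetime and returns the recorded error.
std::string closeAndReport(bool failOnFlush) {
    std::string err;
    {
        Connection c{failOnFlush, &err};
    }  // destructor runs here
    return err;
}
```

Whether to log, ignore, or otherwise report the error in the `catch` block is a design choice; the sketch only shows that the exception must not leave the destructor.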
Some observations on another commit (agent fb5e6c4f0752bda6bbd443e3d1eb02e215df3aee):
The segfault occurs while a thread is in
The derivation being evaluated (the pre-commit-check) has a source that produces a stack overflow when an expression equivalent to the actual source input is evaluated separately.
Perhaps the segfault handler for stack overflows doesn't do its job in some cases.
I'll try with tail recursion later. |
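For readers unfamiliar with how a stack-overflow segfault handler is usually set up on POSIX systems, the following is a rough sketch using `sigaltstack` and `sigaction`. It is an illustration under assumptions, not Nix's actual handler: the function names, the message, and the 64 KiB stack size are made up for the example.

```cpp
#include <signal.h>
#include <unistd.h>
#include <cstring>

// Runs on the alternate stack, so it can still execute when the normal
// stack is exhausted. Only async-signal-safe calls are used.
extern "C" void onSegv(int) {
    static const char msg[] = "error: stack overflow (possible infinite recursion)\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void) ignored;
    _exit(1);
}

// Registers an alternate signal stack and a SIGSEGV handler that uses it.
// Returns true on success.
bool installStackOverflowHandler() {
    static char altStack[64 * 1024];  // size is an assumption for this sketch

    stack_t ss;
    std::memset(&ss, 0, sizeof ss);
    ss.ss_sp = altStack;
    ss.ss_size = sizeof altStack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0) return false;

    struct sigaction act;
    std::memset(&act, 0, sizeof act);
    act.sa_handler = onSegv;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_ONSTACK;  // deliver the signal on the alternate stack
    return sigaction(SIGSEGV, &act, nullptr) == 0;
}
```

Note that an alternate stack registered with `sigaltstack` applies only to the calling thread. A thread that never registered one gets the signal on its own, already exhausted, stack. That is one general way such a handler can fail to report an overflow; whether it applies here is not established in this thread.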
That did the trick. It appears that master can't provide as many recursions as earlier versions. Applying a patch that reduces stack usage solves the problem for this instance. |
#4200 is an example that still fails. The stack overflow is not a sufficient explanation, even if it solves one example of the bug. |
Another datapoint that seems related? Can move to another issue if it seems to be a different problem.
$ nix build --impure .
querying info about missing pathsfree(): invalid pointer
[1] 15478 abort (core dumped) nix build . --impure
$ nix --version
nix (Nix) 3.0pre20201020_e0ca98c
`thread apply all bt` (client)
|
Crucially this introduces BoehmGCStackAllocator, but it also adds a bunch of wiring to avoid making libutil depend on bdw-gc. Part of the solutions for NixOS#4178, NixOS#4200
I'm seeing something maybe related while evaluating a NixOS configuration. I tested with NixOS/nixpkgs#102932 and it doesn't fix the issue. Backtrace
|
@lopsided98 It seems like the stack overflow protection page was going to be scanned. This is a likely cause for your trace. I've updated NixOS/nixpkgs#102932 with a patch. Could you test again with the updated PR? |
That doesn't fix it; the backtrace looks pretty much the same to me (the exact crash location is non-deterministic, so a simple diff isn't that useful). Backtrace
|
@lopsided98 I've added a workaround you can try, although it's probably not final: #4264 (comment). |
Thanks, that fixes the problem. |
For others who land on this page: until there's a proper fix, you may try adding
This seems to allow IFD to complete successfully for me. |
If anyone wants a Nix flake-pinned repro (though not minimal):
The |
@willbush FYI you can skip the
More on topic: I could reproduce the error on that flake. |
#4944 solves the problem for my example and I wasn't able to reproduce it in the examples linked by others either. |
Related:
- <NixOS/nix#4178>
- <NixOS/nix#4178 (comment)>
- <NixOS/hydra#1186>
Signed-off-by: Gaoyang Zhang <[email protected]>
Describe the bug
Nix segfaults when evaluating (although bisect suggests it's due to libstore).
Steps To Reproduce
Expected behavior
No segfault; successful evaluation.
Version
master since #4030
system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 3.0pre19700101_dirty, channels(user): "", channels(root): "", nixpkgs: /home/user/system-config/nix/like-nixpkgs
Additional context
I've run this bisect with a `nix-daemon` from master, regardless of the revision under test. The bug might not be triggered when the daemon is older than #4030.
This also happens when nix-daemon is 2.3.7. Also, the protocol changed between commits in that PR, so the commit found by bisect actually reports a protocol incompatibility rather than a crash.
Running it in `gdb` avoided the crash.
`valgrind` finds four "Conditional jump or move depends on uninitialised value(s)" in boehmgc and later crashes itself.
valgrind log
git bisect session