-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIGINT crash during graceful shutdown with memory storage on macOS Mojave #31380
Comments
Thanks @willbuckner. I've been running into something similar today, though for me it immediately crashes the process when it starts. @benesch look what this is:
|
Ok, I think I'm closing in on a small repro (not for this exact crash, but for the one I get locally via gopsutil which I hope amounts to the same thing at the end). I've trimmed down package main
// NB: the C import doesn't make a difference, I just have it here so that
// I don't accidentally drop out of cgo mode.
// static int foo = 0;
import "C"
import (
"context"
_ "github.com/cockroachdb/cockroach/pkg/ccl" // ccl init hooks
// _ "github.com/cockroachdb/cockroach/pkg/ccl/cliccl" // cliccl init hooks
// _ "github.com/cockroachdb/cockroach/pkg/ui/distccl" // ccl web UI init hook
gopsnet "github.com/shirou/gopsutil/net"
)
func main() {
_, _ = gopsnet.IOCountersWithContext(context.TODO(), true)
} |
Amazing! This comes to mind:
|
Yeah, that's what I tracked it down to! Currently trimming the fat. |
No conclusion here yet but I doubt @benesch can resist taking this down tomorrow. The last I know is that the crash only happens when Essentially the repro boils down to func main() {
_ = C.UsesAESNI
cmd := exec.Command("echo")
cmd.Run()
} Very 🍿 to learn what this is caused by. |
Branch is here: https://github.com/tschottdorf/cockroach/tree/delme-repro
|
As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb/cockroach#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb/cockroach#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue.
As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb/cockroach#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb/cockroach#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue.
Bump CryptoPP to pick up a fix for cockroachdb#31380. Details reproduced below. Fix cockroachdb#31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (cockroachdb#31380).
31516: c-deps: bump CryptoPP to avoid SIGTRAP on macOS r=mberhault a=benesch Bump CryptoPP to pick up a fix for #31380. Details reproduced below. Fix #31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: #31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in #31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (#31380). 31517: roachtest: deflake acceptance/bank/zerosum-splits r=andreimatei a=benesch This test requires that the experimental_force_split_at session var be set to force ALTER ... SPLIT AT to work even with the merge queue enabled. gosql.DB's connection pool will occasionally open a new connection which does not have the var set. Set the session var in the same batch of statements as the ALTER ... SPLIT AT command so that the session var is always set in the session that executes the ALTER ... SPLIT AT command. Fix #31510. Release note: None Co-authored-by: Nikhil Benesch <[email protected]>
Bump CryptoPP to pick up a fix for cockroachdb#31380. Details reproduced below. Fix cockroachdb#31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (cockroachdb#31380).
As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb/cockroach#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb/cockroach#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue.
Bump CryptoPP to pick up a fix for cockroachdb#31380. Details reproduced below. Fix cockroachdb#31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (cockroachdb#31380). diff --git a/c-deps/cryptopp b/c-deps/cryptopp index c621ce0532..6d6064445d 160000 --- a/c-deps/cryptopp +++ b/c-deps/cryptopp @@ -1 +1 @@ -Subproject commit c621ce053298fafc1e59191079c33acd76045c26 +Subproject commit 6d6064445ded787c7d6bf0a021fb718fddb63345
When running
cockroach start --store=type=mem,size=30% ...
, pressing Ctrl+C/sendingSIGINT
results in a crash during shutdown. The log below contains both stdout and stderr output.cockroach.log
Please describe the issue you observed, and any steps we can take to reproduce it:
To Reproduce
What did you do? Describe in your own words.
I started
cockroach
2.0.5 using in-memory storage, and it crashes onSIGINT
.Expected behavior
Not crashing--graceful shutdown.
Additional data / screenshots
cockroach.log
Note that it crashes regardless of any SQL being run.
Environment:
Installed from
brew install cockroach
on macOS Mojave.Additional context
The text was updated successfully, but these errors were encountered: