runtime/cgo: can't safely use signal handling on Darwin, missing support from Go runtime #22805
Comments
And @bcmills. |
Well, no. The Go runtime is using direct syscalls, but as I understand it the BSD-style platforms define libc as the supported API, not the direct syscalls—and Probably the most robust solution would be to extend https://golang.org/cl/33142 to the BSD platforms. (If there is any C code in the binary, we really ought to use libc instead of doing our own syscalls.) |
Portable C code cannot safely restore any signal handler in a multi-threaded program anyway: it is possible for some other (For that matter, portable C code cannot safely set any signal handler in a multi-threaded program: instead, it should register handlers before threads have started and change their behavior through atomic variables rather than If we can't fix it easily, it's at least worth documenting, but we should give a more robust recommendation: forward signals explicitly, don't try to restore their handlers. |
@bcmills you're raising a strawman here. The Go runtime calls sigaction once upon runtime initialization. If a Go user wishes to call a C library that plays with signals, knowing that all the calls to that library will be serialized and that there are no other uses of C code using signals in the rest of the program, then saving/restoring the signal configuration is safe with respect to threading. At CockroachDB we ran into this when using an external library to read input in an interactive program. In any REPL the signalling logic will be perfectly single-threaded. I don't think this is an exotic case and I believe the Go runtime should help by providing either some support or using libc's own sigaction as you proposed above. |
I must be missing something. If you know for a fact that no other libraries in the entire program are making use of the signal, why restore the handler at all? |
Pseudo-code:
    for {
        line := C.read()      // <- this temporarily overrides the handler for SIGWINCH then restores it
        doSomethingInGo(line) // this can take some time, during which SIGWINCH can be received
    }
In this code, if Go's trampoline is not restored properly by |
I understand what the example code does. What I don't understand is the underlying problem or use-case it attempts to address. That code is racy unless you somehow know that the signal is only delivered synchronously from the same thread (e.g. via In either of those cases, the program has to be prepared for the signal to arrive at the wrong handler and respond accordingly, presumably by forwarding to the correct handler. And if you're forwarding anyway, saving and restoring the handler adds system calls and potential races for little apparent benefit. |
I think that if we fix #17490 then this problem will go away. I don't think that we should provide Closing this since I think this will wind up getting fixed incidentally, and since I don't think there is anything reasonable to do until that happens. Please comment if you disagree. |
I don't believe that's true -- Go's os/signal does not expose runtime.sigtramp, and does not allow the C code to restore Go's behavior after it completes. Fixing #17490 will only "make this problem go away" if libSystem does not expose the |
What I meant by saying that the C code can cooperate with the os/signal package is that Go's os/signal package can report whenever You're correct that I am assuming that libSystem does not expose |
How can the Go code "report that the signal occurs to the C code"? |
You're right: that case is not going to work. Signal handling in multiple-language programs has many difficulties. In any case, we are not going to add |
Of the various language run-times I've had the chance to work with (ML, Erlang, Python's), Go is the only one that causes difficulty here. I don't know what the best solution is (I'm OK reading you don't like the idea of an extra API), however the issue described by the title of this issue still exists, and the proposed strategy (addressing #17490) does not yet clearly appear to be a solution to this issue. I'd say the resolution here is ... lacking. |
Do you recognize that if I used the title "runtime/cgo: can't safely use signal handling on Darwin" without writing "missing support from the runtime", the issue would still not be resolved? I can perhaps create a new issue that describes the problem at hand, and then we can spell out how exactly having the Go runtime use libSystem directly will address the use case. At least it would help spell out the requirement that Go's signal handlers must still properly run when libc installs its own trampoline ( |
If Go code always installs signal handlers using Let's fix #17490, which we have to fix anyhow, and then see where we are. |
Change https://golang.org/cl/116875 mentions this issue: |
sigaction, sigprocmask, sigaltstack, and raiseproc. Fix bug in mstart_stub where we weren't saving callee-saved registers, so if an m finished the pthread library calling mstart_stub would sometimes fail. Update #17490 Update #22805 Change-Id: Ie297ede0997910aa956834e49e85711b90cdfaa7 Reviewed-on: https://go-review.googlesource.com/116875 Run-TryBot: Ian Lance Taylor <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]>
As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb/cockroach#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb/cockroach#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue.
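A minimal sketch of the conditional-compilation idea described above. This is not CryptoPP's actual code: has_aesni and HAVE_X86_64_CPUID are hypothetical names, and the sketch assumes GCC or Clang, whose <cpuid.h> provides __get_cpuid.

```c
#include <stdbool.h>

/* Only compile the cpuid path on x86-64, where the instruction always exists. */
#if defined(__x86_64__) && defined(__GNUC__)
#include <cpuid.h>
#define HAVE_X86_64_CPUID 1
#else
#define HAVE_X86_64_CPUID 0
#endif

/* Reports AES-NI support without installing any signal handler. */
bool has_aesni(void) {
#if HAVE_X86_64_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;                   /* leaf 1 not available */
    return (ecx & (1u << 25)) != 0;     /* CPUID.01H:ECX bit 25 is AES-NI */
#else
    return false;                       /* cpuid is never executed here */
#endif
}
```

Because the cpuid instruction is only emitted when the target is x86-64, there is no SIGILL to guard against and therefore no reason to touch the handler installed by the Go runtime.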
Bump CryptoPP to pick up a fix for cockroachdb#31380. Details reproduced below. Fix cockroachdb#31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (cockroachdb#31380).
31516: c-deps: bump CryptoPP to avoid SIGTRAP on macOS r=mberhault a=benesch Bump CryptoPP to pick up a fix for #31380. Details reproduced below. Fix #31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: #31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in #31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (#31380). 31517: roachtest: deflake acceptance/bank/zerosum-splits r=andreimatei a=benesch This test requires that the experimental_force_split_at session var be set to force ALTER ... SPLIT AT to work even with the merge queue enabled. gosql.DB's connection pool will occasionally open a new connection which does not have the var set. 
Set the session var in the same batch of statements as the ALTER ... SPLIT AT command so that the session var is always set in the session that executes the ALTER ... SPLIT AT command. Fix #31510. Release note: None Co-authored-by: Nikhil Benesch <[email protected]>
Bump CryptoPP to pick up a fix for cockroachdb#31380. Details reproduced below. Fix cockroachdb#31380. --- As part of its CPU feature detection, CryptoPP installs a SIGILL signal handler before issuing the cpuid instruction. The intent is to gracefully degrade on CPUs that don't support the cpuid instruction. The problem is that it is impossible to safely overwrite a signal handler installed by the Go runtime in go1.10 on macOS (golang/go#22805). This causes CockroachDB 2.0 to crash on macOS Mojave: cockroachdb#31380. The situation has improved on the Go front, as go1.11 makes it possible to safely save and restore signal handlers installed by the Go runtime on macOS. Still, we can do better and support go1.10. There is no need to bother installing a SIGILL handler, as the cpuid instruction is supported by every x86-64 CPU. We can instead use conditional compilation to make sure that we never execute a cpuid instruction on a non x86-64 CPU. Note that CPU feature detection is performed at executable load time (see the attribute(constructor) on DetectX86Features); therefore any reference to function which calls DetectX86Features (notably HasAESNI) corrupts the signal handler. It's not entirely clear why this corruption later leads to the SIGTRAP seen in cockroachdb#31380--is something in macOS or the Go runtime generating a SIGILL and trying to handle it gracefully?--but regardless, not mucking with the signal handler fixes the issue. Release note (bug fix): CockroachDB no longer crashes due to a SIGTRAP error soon after startup on macOS Mojave (cockroachdb#31380). diff --git a/c-deps/cryptopp b/c-deps/cryptopp index c621ce0532..6d6064445d 160000 --- a/c-deps/cryptopp +++ b/c-deps/cryptopp @@ -1 +1 @@ -Subproject commit c621ce053298fafc1e59191079c33acd76045c26 +Subproject commit 6d6064445ded787c7d6bf0a021fb718fddb63345
tl;dr:
Problem: C's standard sigaction on Darwin cannot retrieve the current signal trampoline, therefore cgo code cannot properly restore the Go trampoline.
Solution: document the problem and expose a C symbol for use by cgo code to restore Go's signal configuration.
The following repository contains an example of the problem and an example of the manual, ugly solution (barring a clean solution provided by Go): https://github.com/knz/sigtramp-bug
(Note: this code can only be compiled and run on a darwin system.)
What did you expect to see?
The following code, when called from cgo, should restore Go's signal handler properly:
The final sigaction call in the code above should ensure that the values saved at the beginning are restored, and thus that any change by the C code is invisible to Go when the function returns.
What did you see instead?
See e.g. https://github.com/knz/sigtramp-bug/tree/master/cgo-failure
After the function completes, issuing the signal causes the following panic:
Or sometimes also:
Anyway, investigation reveals the following:
https://opensource.apple.com/source/Libc/Libc-1244.1.7/sys/sigaction.c.auto.html
In other words, sigaction always overrides the trampoline with Apple's own _sigtramp. Go's runtime.sigtramp is lost.
Moreover, the underlying syscall __sigaction can set but not retrieve the signal handler (see declaration in the same file: the oact argument has type struct sigaction, not struct __sigaction).
This is arguably a bug/misfeature on Apple's side.
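The mechanism can be illustrated with a toy model in portable C. Nothing below is Apple's code: every model_* name and both struct types are invented for this sketch. The sketch only imitates the asymmetry described above, where the set side carries a handler plus a trampoline while the get side reports the handler alone.

```c
#include <stddef.h>

typedef void (*handler_fn)(int);
typedef void (*tramp_fn)(void);

/* Toy types: they imitate the shape of the problem, not Apple's headers. */
struct model_kernel_action { handler_fn handler; tramp_fn tramp; }; /* set side */
struct model_user_action   { handler_fn handler; };                 /* get side */

static struct model_kernel_action kernel_slot; /* "kernel" state for one signal */
static int marker;                             /* keeps the functions distinct */

void model_go_tramp(void)      { marker = 1; }
void model_libc_tramp(void)    { marker = 2; }
void model_go_handler(int sig) { (void)sig; marker = 3; }
void model_c_handler(int sig)  { (void)sig; marker = 4; }

/* Raw "syscall": accepts handler + trampoline, reports only the old handler. */
void model_raw_sigaction(const struct model_kernel_action *nsa,
                         struct model_user_action *osa) {
    if (osa) osa->handler = kernel_slot.handler;
    if (nsa) kernel_slot = *nsa;
}

/* libc-style wrapper: always substitutes its own trampoline. */
void model_libc_sigaction(const struct model_user_action *nsa,
                          struct model_user_action *osa) {
    if (nsa) {
        struct model_kernel_action ka = { nsa->handler, model_libc_tramp };
        model_raw_sigaction(&ka, osa);
    } else {
        model_raw_sigaction(NULL, osa);
    }
}

handler_fn model_current_handler(void) { return kernel_slot.handler; }
tramp_fn   model_current_tramp(void)   { return kernel_slot.tramp; }
```

In this model, installing a Go-style action through model_raw_sigaction and then doing a save/restore round trip through model_libc_sigaction leaves the handler intact but the trampoline replaced, which matches the symptom reported here.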
Solution - manual
The manual solution is to re-inject Go's own runtime.sigtramp in the signal configuration, as done here: https://github.com/knz/sigtramp-bug/tree/master/cgo-fix
This is cumbersome because:
- runtime.sigtramp is not exported, so one has to jump through hoops to retrieve it: https://github.com/knz/go-libedit/tree/master/unix/sigtramp
- it relies on the underlying syscall (__sigaction), which is not guaranteed to stay over time.
Solution - desired
- the Go docs about Cgo should say "a program that wishes to temporarily set a signal handler and then restore Go's signal configuration must use a function sigrestore provided by the Go runtime" (or alternatively a pair sigsave/sigrestore)
- the Go runtime should provide the corresponding (sigsave/)sigrestore function for use by Cgo code.