-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/x509: macos 10.14 SIGSEGV in crypto/x509._Cfunc_FetchPEMRoots #28092
Comments
You say that the crash is random. Is it always in CC @randall77 |
@ianlancetaylor when I said random means that it happens at random, not that it happens with random functions. The same code works fine until I just get this stack trace on the process that has been running for quite a bit, all with same compiled binary. I did not compare stack traces so far. I will catch a couple more and report back on this if they all crash in same function or not. |
Sorry for the delay, confirmed that it breaks in same place for me, always same stack trace. This happens when running anything that loads SSL/TLS related things, and as noted before it does not happen consistently, but rather at random. For now I am falling back to 1.10.4 which works fine (I guess something that has not been backported to 1.10 branch is causing this?) |
This traceback looks to me like the C code invoked by crypto/x509 had a bad memory reference. Re-titling to reflect that this is most likely an issue in crypto/x509, not in the runtime. |
@akamensky Can you check if this is still an issue at tip with the new macOS roots code? |
@FiloSottile I currently have no way to verify whether this issue is resolved or not. Meanwhile after doing completely clean install of 1.11.4 (and now 1.11.5) with removing entire $GOPATH/* and then rebuilding everything I had not seen this issue for awhile. Which possibly suggests it may have been caused by using different go compiler version / different standard library version to compile one of dependencies and then linking it with binary(?) Since I have not seen this issue for awhile, I think okay to close (I guess can reopen if starts appearing again?) |
Thank you, indeed reopen if it happens again! |
I'm seeing something very similar:
This happens intermittently when running any Go program that makes network requests (including |
Re-opening. @bitfield, could you provide a little more detail? For example, |
Here's another one:
I'd say this happens about one or two times in ten when running any Go binary that makes network calls. |
@bitfield, is it possible to run this under gdb or lldb to see where the SIGSEGV is happening in the C code? |
Sure, just let me know what command I need to run to do this. |
In gdb, just start your program under gdb with "gdb --args <command...>" and then type "run". It will stop when there's a segfault. At that point the output of running "backtrace" and "info locals" is probably the most useful starting point. I'm not an lldb user, so I'm not sure how to do this with lldb. @dr2chase? |
|
It appears it's basically the same in lldb: lldb -- <command...>
run
# wait for it to segfault
bt |
Thanks. Here's the simplest program that will demonstrate the crash: package main
import (
"fmt"
"net/http"
)
func main() {
res, _ := http.Get("https://postman-echo.com/get")
fmt.Println(res.Status)
} And here's the
|
Also, just like OP, I've had this problem on every 1.11 and 1.12 version, and it was fine before that. My Go installation is Homebrew, and I'm working outside GOPATH. |
All reports, it's good to reconfirm which version of Darwin it is, and maybe which version of XCode, too. I just did 200 repetitions of the tiny test program with no error. Darwin version is 10.14.4 (Apple glyph -> "About this Mac")
|
sigh Somehow objdump failed to symbolize that. The function that's crashing is at PC 0x1320bfa (called from PC 0x100146d), but I don't know what that is. Could you also paste |
Any news on this? It's frustrating to get constant crashes in any Go program that accesses the network. |
I matched the unsybolized listing above to a binary I have symbols for, and that's a symbol stub for CFEqual.
Judging from the calls around it, it's this function.
I'll go look at the docs of |
Yup, we are using it wrong! I'll send a CL. |
Change https://golang.org/cl/178537 mentions this issue: |
This should be now fixed at tip. Please test it with https://golang.org/dl/gotip and report back. If it works, I am going to file this for cherry-picking.
|
Woohoo! It works! Thanks @FiloSottile! |
@gopherbot please open backport issues for https://golang.org/cl/178537. This fixes a crashing bug with no known workaround for certain macOS environments. CL 178537 is very minimal and fit for backporting. (The rest of the chain, and CL 178539 in particular, are more speculative and only fix unrecognized roots for which there is a manual workaround, so let's not backport those.) I feel like we should backport to both 1.11 and 1.12, since without this it's impossible to use 1.11 on certain macOS systems. (Although I guess using the next 1.12 point release could count as a "workaround"?) |
Backport issue(s) opened: #32281 (for 1.11), #32282 (for 1.12). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases. |
Change https://golang.org/cl/179339 mentions this issue: |
Change https://golang.org/cl/179340 mentions this issue: |
Not impossible, just intensely annoying. |
…cy on macOS CFDictionaryGetValueIfPresent does not take ownership of the value, so releasing the properties dictionary before passing the value to CFEqual can crash. Not really clear why this works most of the time. See https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/Ownership.html Fixes #32281 Updates #28092 Updates #30763 Change-Id: I5ee7ca276b753a48abc3aedfb78b8af68b448dd4 Reviewed-on: https://go-review.googlesource.com/c/go/+/178537 Reviewed-by: Adam Langley <[email protected]> (cherry picked from commit a3d4655) Reviewed-on: https://go-review.googlesource.com/c/go/+/179340 Run-TryBot: Dmitri Shuralyov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
…cy on macOS CFDictionaryGetValueIfPresent does not take ownership of the value, so releasing the properties dictionary before passing the value to CFEqual can crash. Not really clear why this works most of the time. See https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/Ownership.html Fixes #32282 Updates #28092 Updates #30763 Change-Id: I5ee7ca276b753a48abc3aedfb78b8af68b448dd4 Reviewed-on: https://go-review.googlesource.com/c/go/+/178537 Reviewed-by: Adam Langley <[email protected]> (cherry picked from commit a3d4655) Reviewed-on: https://go-review.googlesource.com/c/go/+/179339 Run-TryBot: Dmitri Shuralyov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
Change https://golang.org/cl/227037 mentions this issue: |
+----------------------------------------------------------------------+ | Hello, if you are reading this and run macOS, please test this code: | | | | $ GO111MODULE=on go get golang.org/dl/gotip@latest | | $ gotip download | | $ GODEBUG=x509roots=1 gotip test crypto/x509 -v -run TestSystemRoots | +----------------------------------------------------------------------+ We currently have two code paths to extract system roots on macOS: one uses cgo to invoke a maze of Security.framework APIs; the other is a horrible fallback that runs "/usr/bin/security verify-cert" on every root that has custom policies to check if it's trusted for SSL. The fallback is not only terrifying because it shells out to a binary, but also because it lets in certificates that are not trusted roots but are signed by trusted roots, and because it applies some filters (EKUs and expiration) only to roots with custom policies, as the others are not passed to verify-cert. The other code path, of course, requires cgo, so can't be used when cross-compiling and involves a large ball of C. It's all a mess, and it broke oh-so-many times (#14514, #16532, #19436, #20990, #21416, #24437, #24652, #25649, #26073, #27958, #28025, #28092, #29497, #30471, #30672, #30763, #30889, #32891, #38215, #38365, ...). Since macOS does not have a stable syscall ABI, we already dynamically link and invoke libSystem.dylib regardless of cgo availability (#17490). How that works is that functions in package syscall (like syscall.Open) take the address of assembly trampolines (like libc_open_trampoline) that jump to symbols imported with cgo_import_dynamic (like libc_open), and pass them along with arguments to syscall.syscall (which is implemented as runtime.syscall_syscall). syscall_syscall informs the scheduler and profiler, and then uses asmcgocall to switch to a system stack and invoke runtime.syscall. The latter is an assembly trampoline that unpacks the Go ABI arguments passed to syscall.syscall, finally calls the remote function, and puts the return value on the Go stack. (This last bit is the part that cgo compiles from a C wrapper.) We can do something similar to link and invoke Security.framework! The one difference is that runtime.syscall and friends check errors based on the errno convention, which Security doesn't follow, so I added runtime.syscallNoErr which just skips interpreting the return value. We only need a variant with six arguments because the calling convention is register-based, and extra arguments simply zero out some registers. That's plumbed through as crypto/x509/internal/macOS.syscall. The rest of that package is a set of wrappers for Security.framework and Core Foundation functions, like syscall is for libSystem. In theory, as long as macOS respects ABI backwards compatibility (a.k.a. as long as binaries built for a previous OS version keep running) this should be stable, as the final result is not different from what a C compiler would make. (One exception might be dictionary key strings, which we make our own copy of instead of using the dynamic symbol. If they change the value of those strings things might break. But why would they.) Finally, I rewrote the crypto/x509 cgo logic in Go using those wrappers. It works! I tried to make it match 1:1 the old logic, so that root_darwin_amd64.go can be reviewed by comparing it to root_cgo_darwin_amd64.go. The only difference is that we do proper error handling now, and assume that if there is no error the return values are there, while before we'd just check for nil pointers and move on. I kept the cgo logic to help with review and testing, but we should delete it once we are confident the new code works. The nocgo logic is gone and we shall never speak of it again. Fixes #32604 Fixes #19561 Fixes #38365 Awakens Cthulhu Change-Id: Id850962bad667f71e3af594bdfebbbb1edfbcbb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/227037 Reviewed-by: Katie Hockman <[email protected]>
What version of Go are you using (
go version
)?go version go1.11.1 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?What did you do?
Just running (and debugging) the code that was working perfectly fine on 1.10 (since 1.11 release a lot of weird issues like debugger cannot kill process sometimes, and runaway CPU usage randomly etc etc)
What did you expect to see?
No crashing
What did you see instead?
The text was updated successfully, but these errors were encountered: