Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: crash during traceAdvance when collecting call stack for cgo-calling goroutine [1.23 backport] #69087

Closed
gopherbot opened this issue Aug 27, 2024 · 2 comments
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime.
Milestone

Comments

@gopherbot
Copy link
Contributor

@mknyszek requested issue #69085 to be considered for backport to the next 1.23 minor release.

@gopherbot Please open a backport issue for Go 1.23.

This bug is causing random crashes when using the tracer that have no workaround. It is also a regression that only exists in Go 1.23. The fix is also small and simple, but admittedly subtle.

@gopherbot gopherbot added the CherryPickCandidate Used during the release process for point releases label Aug 27, 2024
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 27, 2024
@gopherbot gopherbot added this to the Go1.23.1 milestone Aug 27, 2024
@gopherbot
Copy link
Contributor Author

Change https://go.dev/cl/608835 mentions this issue: [release-branch.go1.23] runtime: store bp on cgocallback as unsafe.Pointer

@prattmic prattmic added the CherryPickApproved Used during the release process for point releases label Aug 28, 2024
@gopherbot gopherbot removed the CherryPickCandidate Used during the release process for point releases label Aug 28, 2024
gopherbot pushed a commit that referenced this issue Aug 28, 2024
…inter

As of CL 580255, the runtime tracks the frame pointer (or base pointer,
bp) when entering syscalls, so that we can use fpTracebackPCs on
goroutines that are sitting in syscalls. That CL mostly got things
right, but missed one very subtle detail.

When calling from Go->C->Go, the goroutine stack performing the calls
when returning to Go is free to move around in memory due to growth,
shrinking, etc. But upon returning back to C, it needs to restore
gp.syscall*, including gp.syscallsp and gp.syscallbp. The way syscallsp
currently gets updated is automagically: it's stored as an
unsafe.Pointer on the stack so that it shows up in a stack map. If the
stack ever moves, it'll get updated correctly. But gp.syscallbp isn't
saved to the stack as an unsafe.Pointer, but rather as a uintptr, so it
never gets updated! As a result, in rare circumstances, fpTracebackPCs
can correctly try to use gp.syscallbp as the starting point for the
traceback, but the value is stale.

This change fixes the problem by just storing gp.syscallbp to the stack
on cgocallback as an unsafe.Pointer, like gp.syscallsp. It also adds a
comment documenting this subtlety; the lack of explanation for the
unsafe.Pointer type on syscallsp meant this detail was missed -- let's
not miss it again in the future.

Now, we have a fix, what about a test? Unfortunately, testing this is
going to be incredibly annoying because the circumstances under which
gp.syscallbp are actually used for traceback are non-deterministic and
hard to arrange, especially from within testprogcgo where we don't have
export_test.go and can't reach into the runtime.

So, instead, add a gp.syscallbp check to reentersyscall and
entersyscallblock that mirrors the gp.syscallbp consistency check. This
probably causes some miniscule slowdown to the syscall path, but it'll
catch the issue without having to actually perform a traceback.

For #69085.
Fixes #69087.

Change-Id: Iaf771758f1666024b854f5fbe2b2c63cbe35b201
Reviewed-on: https://go-review.googlesource.com/c/go/+/608775
Reviewed-by: Nick Ripley <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Michael Pratt <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
(cherry picked from commit 54fe0fd)
Reviewed-on: https://go-review.googlesource.com/c/go/+/608835
@gopherbot
Copy link
Contributor Author

Closed by merging CL 608835 (commit 9c939a1) to release-branch.go1.23.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime.
Projects
None yet
Development

No branches or pull requests

3 participants
@prattmic @gopherbot and others