runtime: "non in-use span in unswept list" flake on darwin-amd64 #22987
Comments
That sweepgen value looks super bogus. I'm thinking it must be software memory corruption. (Hardware memory corruption would probably be just 1-bit off, rather than a completely different value.) There was also #22988, which was also a darwin-amd64 flake. This makes me suspect there's something darwin-amd64-specific going on.
Change https://golang.org/cl/83016 mentions this issue:
Change https://golang.org/cl/83015 mentions this issue:
heapBits.bits is used during bulkBarrierPreWrite via heapBits.isPointer, which means it must not be preempted. If it is preempted, several bad things can happen:

1. This could allow a GC phase change, and the resulting shear between the barriers and the memory writes could result in a lost pointer.

2. Since bulkBarrierPreWrite uses the P's local write barrier buffer, if it also migrates to a different P, it could try to append to the write barrier buffer concurrently with another write barrier. This can result in the buffer's next pointer skipping over its end pointer, which results in a buffer overflow that can corrupt arbitrary other fields in the Ps (or anything in the heap, really, but it'll probably crash from the corrupted P quickly).

Fix this by marking heapBits.bits go:nosplit. This would be the perfect use for a recursive no-preempt annotation (#21314).

This doesn't actually affect any binaries because this function was always inlined anyway. (I discovered it when I was modifying heapBits and made h.bits() no longer inline, which led to rampant crashes from problem 2 above.)

Updates #22987 and #22988 (but doesn't fix because it doesn't actually change the generated code).

Change-Id: I60ebb928b1233b0613361ac3d0558d7b1cb65610
Reviewed-on: https://go-review.googlesource.com/83015
Run-TryBot: Austin Clements <[email protected]>
Reviewed-by: Matthew Dempsky <[email protected]>
Reviewed-by: Rick Hudson <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
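To make problem 2 concrete, here is a toy replay of one interleaving in which two writers share a buffer and "next" steps past "end" without either writer noticing. This is only a sketch: `sharedBuf`, `store`, `isFull`, `interleave`, and `bufEntries` are hypothetical names, the buffer uses indices rather than pointers, and the schedule shown is an assumed example, not necessarily the one the runtime hit.

```go
package main

import "fmt"

const bufEntries = 4 // hypothetical buffer capacity, in entries

// sharedBuf is a toy buffer. mem[:bufEntries] is the buffer proper;
// mem[bufEntries:] stands in for whatever memory follows it.
type sharedBuf struct {
	next int
	mem  [bufEntries * 2]uintptr
}

// store is the first half of an append: write the entry, advance next.
func (b *sharedBuf) store(p uintptr) {
	b.mem[b.next] = p
	b.next++
}

// isFull is the second half: an equality check against the end,
// which only fires if next lands exactly on bufEntries.
func (b *sharedBuf) isFull() bool { return b.next == bufEntries }

// interleave replays one assumed schedule of two writers, A and B,
// and reports the final next index and whether either saw "full".
func interleave() (next int, flushed bool) {
	var b sharedBuf
	for i := 0; i < bufEntries-1; i++ {
		b.store(uintptr(i + 1)) // fill all but the last slot
	}
	b.store(0xA)        // A stores the last entry; next == bufEntries
	b.store(0xB)        // A is "preempted"; B's entry lands past the end
	fullB := b.isFull() // B checks: next is already beyond the end
	fullA := b.isFull() // A resumes and checks: same result
	return b.next, fullA || fullB
}

func main() {
	next, flushed := interleave()
	fmt.Printf("next=%d end=%d flushed=%v\n", next, bufEntries, flushed)
}
```

In this model neither writer ever observes next equal to the end, so no flush is triggered and every later append writes further past the buffer.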
Currently, wbBufFlush does nothing if the goroutine is dying on the assumption that the system is crashing anyway and running the write barrier may crash it even more. However, it fails to reset the buffer's "next" pointer. As a result, if there are later write barriers on the same P, the write barrier will overflow the write barrier buffer and start corrupting other fields in the P or other heap objects. Often, this corrupts fields in the next allocated P since they tend to be together in the heap.

Fix this by always resetting the buffer's "next" pointer, even if we're not doing anything with the pointers in the buffer.

Updates #22987 and #22988. (May fix; it's hard to say.)

Change-Id: I82c11ea2d399e1658531c3e8065445a66b7282b2
Reviewed-on: https://go-review.googlesource.com/83016
Run-TryBot: Austin Clements <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Rick Hudson <[email protected]>
Reviewed-by: Matthew Dempsky <[email protected]>
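The effect of skipping the reset can be seen in a small model. The sketch below is not the runtime's code: `toyBuf`, `put`, `flush`, `simulate`, and `capacity` are hypothetical names, and the "dying" flush path is reduced to a single early return. It compares a flush that returns without resetting "next" against one that always resets it.

```go
package main

import "fmt"

const capacity = 4 // hypothetical buffer size, in entries

// toyBuf models a write barrier buffer followed by unrelated memory.
// mem[:capacity] is the buffer; mem[capacity:] stands in for neighbors.
type toyBuf struct {
	next int
	mem  [capacity * 3]uintptr
}

// put appends one entry and reports whether the buffer just became full.
// The check is an equality test, so it fires only at exactly capacity.
func (b *toyBuf) put(p uintptr) (full bool) {
	b.mem[b.next] = p
	b.next++
	return b.next == capacity
}

// flush models the flush path while the goroutine is dying: the entries
// are discarded. resetWhenDying selects the fixed (true) or buggy (false)
// behavior.
func (b *toyBuf) flush(resetWhenDying bool) {
	if !resetWhenDying {
		return // buggy path: next is left at capacity
	}
	b.next = 0 // fixed path: always reset, even when discarding entries
}

// simulate issues the given number of writes on a dying goroutine and
// returns how many words outside the buffer were overwritten.
func simulate(writes int, resetWhenDying bool) int {
	var b toyBuf
	for i := 0; i < writes; i++ {
		if b.put(uintptr(i + 1)) {
			b.flush(resetWhenDying)
		}
	}
	corrupted := 0
	for _, w := range b.mem[capacity:] {
		if w != 0 {
			corrupted++
		}
	}
	return corrupted
}

func main() {
	fmt.Println("words corrupted without reset:", simulate(8, false))
	fmt.Println("words corrupted with reset:   ", simulate(8, true))
}
```

With eight writes into a four-entry buffer, the no-reset variant overwrites four words past the buffer, while the always-reset variant overwrites none.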
@aclements Do you think this might be fixed, or should we kick it to 1.11?
Actually I'm going to close this as a likely dup of #22988. (The same questions arise on that issue: close or kick?)
https://go-review.googlesource.com/c/go/+/81775 flaked on darwin-amd64 due to:
https://storage.googleapis.com/go-build-log/5cf4a738/darwin-amd64-10_11_e7ffe203.log
/cc @aclements