-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: scanobject can use incorrect span metadata #11267
Comments
I can think of a few possible ways to fix this.
3 seems like the cleanest solution, assuming it doesn't have other ramifications. |
... from a heap object. This code path is also used for scanning stacks, which can of course contain pointers into themselves (and, maybe, in the runtime, to another stack). Don't want to blow up on those. But if we're not scanning a stack, we shouldn't see a stack pointer. |
We already do #2. See stack1.go:620-628. |
I see. Did we just miss the fact that the same logic applies equally to the growing branch of that if? Still, the number of stacks that can shrink during a GC cycle is limited, since the GC is in charge of doing the shrink. These zombie stacks can't last more than the duration of mark termination. If we do the same for growing stacks, user code can leak an arbitrary amount of stack memory during a GC cycle. We don't have a mechanism for pushing back on that like we do for pushing back on allocation. Hence, I'm still leaning toward solution 3. I think that may also eliminate the need for the the delayed free of shrunk stacks. |
I took a stab at 3, and quickly ran into pointers that could point in to either the stack or the heap for deep reasons (like g.sched.ctxt), which throws a monkey-wrench into this solution. But there's a fourth solution that's like delayed freeing of stacks, but I think solves its downsides: free stacks immediately, but delay freeing stack spans. It's perfectly fine to reuse the stacks. It's fine to reuse the stack spans. The danger is only the state transition from stack span to free span (and possibly to heap span after that), so that's all we need to prevent. This looks easy to do for small stacks: just don't immediately free them in stackpoolfree when ref==0. At the end of mark termination, walk the stackpools and free spans with ref==0. For large stacks, I'm not sure what the best way to do this is that also lets us reuse that memory for new stacks. |
I chatted with @rsc about this. The plan is to simply return small stacks to the stack cache without freeing their spans until the end of GC. That way small stacks can be reused, so a program rapidly creating and exiting goroutines won't chew through memory. This should be a fairly small change. For large stacks, for now, we'll simply delay freeing them and not attempt to reuse that memory during a GC cycle. We should be able to reuse the existing delay free mechanism to do this, so that should also be a low risk change. When 1.6 opens, we can reconsider whether and how we could reuse the memory of large stacks. |
CL https://golang.org/cl/11502 mentions this issue. |
When scanobject looks up the span for a pointer it read from the object it's scanning, it's possible for it to get incorrect information for the span, causing it to ignore the pointer, or, much worse, attempt to mark uninitialized memory.
or
This is quite easy to reproduce with a well-placed sleep:
With this sleep,
GOGC=10 ./runtime.test -test.short
crashes reliably for me.@RLH @rsc
The text was updated successfully, but these errors were encountered: