proposal: runtime: GC pacer redesign #44167
Change https://golang.org/cl/290489 mentions this issue: |
By the way: this design feels solid to me, but has not gone through any rounds of feedback yet. In the interest of transparency, I'm hoping to get feedback and work on this here on GitHub going forward. So, given that, I would not be surprised if there are errors in the document. Please take a look when you have a chance!
Do I understand correctly that the forcegcperiod is required because the current pacer does not consider non-heap sources of GC work? Is it necessary to call GC periodically in applications with an effectively zero heap allocation rate in order to collect stacks, etc.? If I understood your proposal correctly, it seems like it should be possible to remove these periodic GC calls, so that applications that don't create new goroutines and don't allocate anything on the heap would never trigger garbage collections, which is a good benefit by itself.
@storozhukBM I believe Anyway, I have to look into this again so don't quote me. My memory is hazy. :) I'll dig into the reasons why next week (I don't see them documented anywhere). |
For golang/go#44167. Change-Id: I468aa78edb8588b4e48008ad44cecc08544a8f48 Reviewed-on: https://go-review.googlesource.com/c/proposal/+/290489 Reviewed-by: Michael Pratt <[email protected]> Reviewed-by: Jeremy Faller <[email protected]>
Change https://golang.org/cl/292789 mentions this issue: |
Change https://golang.org/cl/293790 mentions this issue: |
A couple of the graphs were wrong (from the wrong scenario, that is) because I copied them in manually. Fatal mistake. Regenerate the graphs following the usual pipeline. Because there's a degree of jitter and randomness in these graphs they end up slightly different, but they're all mostly the same. By regenerating these graphs, it also adds a new line to each graph for the live heap size. I think this is nice for readability, so I'll let that get updated too. For golang/go#44167. Change-Id: I097f812ba07ca7fd740d8460e2830de6492b3945 Reviewed-on: https://go-review.googlesource.com/c/proposal/+/293790 Reviewed-by: Michael Pratt <[email protected]>
Change https://golang.org/cl/295509 mentions this issue: |
I realized I neglected to talk about initial conditions, even though all the simulations clearly set *something*. For golang/go#44167. Change-Id: Ia1727d5c068847e9192bf87bc1b6a5f0bb832303 Reviewed-on: https://go-review.googlesource.com/c/proposal/+/295509 Reviewed-by: Michael Pratt <[email protected]>
Change https://golang.org/cl/306605 mentions this issue: |
Change https://golang.org/cl/306603 mentions this issue: |
Change https://golang.org/cl/306599 mentions this issue: |
Change https://golang.org/cl/306600 mentions this issue: |
Change https://golang.org/cl/306596 mentions this issue: |
Change https://golang.org/cl/306597 mentions this issue: |
Change https://golang.org/cl/306602 mentions this issue: |
Change https://golang.org/cl/306604 mentions this issue: |
Change https://golang.org/cl/306601 mentions this issue: |
Change https://golang.org/cl/306598 mentions this issue: |
Change https://golang.org/cl/308690 mentions this issue: |
Change https://golang.org/cl/309274 mentions this issue: |
Change https://golang.org/cl/309273 mentions this issue: |
Change https://golang.org/cl/350429 mentions this issue: |
Change https://golang.org/cl/353353 mentions this issue: |
Change https://golang.org/cl/353354 mentions this issue: |
Currently gcController.gcPercent is read non-atomically by gcControllerState.revise and gcTrigger.test, but these users may execute concurrently with an update to gcPercent. Although revise's results are best-effort, reading it directly in this way is, generally speaking, unsafe. This change makes gcPercent atomically updated for concurrent readers and documents the complete synchronization semantics. Because gcPercent is otherwise only updated with the heap lock held or the world stopped, all other reads can remain unsynchronized. For #44167. Change-Id: If09af103aae84a1e133e2d4fed8ab888d4b8f457 Reviewed-on: https://go-review.googlesource.com/c/go/+/308690 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
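The synchronization pattern described in this commit message can be sketched outside the runtime as follows. This is a minimal illustration, not the runtime's code: `pacerState`, `setGCPercent`, and `goalFor` are hypothetical names, a `sync.Mutex` stands in for the heap lock, and `atomic.Int32` (Go 1.19+) is assumed to be available.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pacerState is a hypothetical stand-in for the runtime's GC controller.
type pacerState struct {
	mu        sync.Mutex   // stands in for the heap lock (assumption)
	gcPercent atomic.Int32 // written under mu; read atomically by concurrent readers
}

// setGCPercent stores a new value while holding the lock and returns the old one.
func (p *pacerState) setGCPercent(v int32) int32 {
	p.mu.Lock()
	defer p.mu.Unlock()
	old := p.gcPercent.Load()
	p.gcPercent.Store(v)
	return old
}

// goalFor is a lock-free reader: it may run concurrently with setGCPercent,
// so it loads gcPercent atomically instead of reading the field directly.
func (p *pacerState) goalFor(live uint64) uint64 {
	pct := p.gcPercent.Load()
	if pct < 0 {
		return ^uint64(0) // a negative value means GC is off: no finite goal
	}
	return live + live*uint64(pct)/100
}

func main() {
	var p pacerState
	p.setGCPercent(100)
	fmt.Println(p.goalFor(1000)) // prints 2000
}
```

In this sketch every read goes through `Load` for simplicity; the commit message notes that readers already holding the lock could stay unsynchronized.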
The sweeper's pacing state is global, so detangle it from the GC pacer's state updates so that the GC pacer can be tested. For #44167. Change-Id: Ibcea989cd435b73c5891f777d9f95f9604e03bd1 Reviewed-on: https://go-review.googlesource.com/c/go/+/309273 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
Currently GC pacer updates are applied somewhat haphazardly via direct field access. To facilitate ease of testing, move these field updates into methods. Further CLs will move more of these updates into methods. For #44167. Change-Id: I25b10d2219ae27b356b5f236d44827546c86578d Reviewed-on: https://go-review.googlesource.com/c/go/+/309274 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
This change moves heapLive and heapScan updates on gcController into a method for better testability. It's also less error-prone because code that updates these fields needs to remember to emit traces and/or call gcController.revise; this method now handles those cases. For #44167. Change-Id: I3d6f2e7abb22def27c93feacff50162b0b074da2 Reviewed-on: https://go-review.googlesource.com/c/go/+/309275 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
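The refactoring idea here, funnelling field updates through one method so the follow-up work cannot be forgotten, can be sketched as below. The `controller` type, its `marking` flag, and the `revisions` counter are hypothetical and single-threaded; calling `revise` only while marking is an assumption made for the illustration.

```go
package main

import "fmt"

// controller is a hypothetical, single-threaded stand-in for the GC controller.
type controller struct {
	heapLive, heapScan int64
	marking            bool // assumed flag: a GC cycle is in progress
	revisions          int  // counts how often revise ran, for illustration
}

// revise stands in for recomputing pacing parameters.
func (c *controller) revise() { c.revisions++ }

// update applies deltas to both fields and performs the follow-up work in
// one place, so callers cannot forget to call revise.
func (c *controller) update(dLive, dScan int64) {
	c.heapLive += dLive
	c.heapScan += dScan
	if c.marking {
		c.revise()
	}
}

func main() {
	c := controller{marking: true}
	c.update(100, 40)
	c.update(-20, 0)
	fmt.Println(c.heapLive, c.heapScan, c.revisions) // prints 80 40 2
}
```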
This change adds two fields to gcControllerState: stackScan, used for pacing decisions, and scannableStackSize, which directly tracks the amount of space allocated for inuse stacks that will be scanned. scannableStackSize is not updated directly, but is instead flushed from each P when at least an 8 KiB delta has accumulated. This helps reduce issues with atomics contention for newly created goroutines. Stack growth paths are largely unaffected.

    name                         old time/op  new time/op  delta
    StackGrowth-48               51.4ns ± 0%  51.4ns ± 0%    ~     (p=0.927 n=10+10)
    StackGrowthDeep-48           6.14µs ± 3%  6.25µs ± 4%    ~     (p=0.090 n=10+9)
    CreateGoroutines-48           273ns ± 1%   273ns ± 1%    ~     (p=0.676 n=9+10)
    CreateGoroutinesParallel-48  65.5ns ± 5%  66.6ns ± 7%    ~     (p=0.340 n=9+9)
    CreateGoroutinesCapture-48   2.06µs ± 1%  2.07µs ± 4%    ~     (p=0.217 n=10+10)
    CreateGoroutinesSingle-48     550ns ± 3%   563ns ± 4%  +2.41%  (p=0.034 n=8+10)

For #44167. Change-Id: Id1800d41d3a6c211b43aeb5681c57c0dc8880daf Reviewed-on: https://go-review.googlesource.com/c/go/+/309589 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
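The batching scheme in this commit message (accumulate locally, flush to a shared counter once the delta reaches 8 KiB) might look roughly like the sketch below. `localStats`, `globalStats`, and `addStack` are hypothetical names; treating a negative delta symmetrically is an assumption, and the real per-P bookkeeping in the runtime differs.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flushThreshold is the 8 KiB delta mentioned in the commit message.
const flushThreshold = 8 << 10

// globalStats holds the shared counter that many Ps would contend on.
type globalStats struct {
	scannableStackSize atomic.Int64
}

// localStats is a hypothetical per-P accumulator; only its owner touches it.
type localStats struct {
	pending int64
}

// addStack records a stack size change locally and touches the shared atomic
// only once the accumulated delta reaches the threshold in either direction.
func (l *localStats) addStack(g *globalStats, n int64) {
	l.pending += n
	if l.pending >= flushThreshold || l.pending <= -flushThreshold {
		g.scannableStackSize.Add(l.pending)
		l.pending = 0
	}
}

func main() {
	var g globalStats
	var l localStats
	l.addStack(&g, 4096)
	fmt.Println(g.scannableStackSize.Load()) // prints 0: still below the threshold
	l.addStack(&g, 4096)
	fmt.Println(g.scannableStackSize.Load()) // prints 8192: flushed
}
```

The trade-off is that the shared value can lag the true total by up to one threshold per P, in exchange for far fewer atomic operations on a hot path.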
For #44167. Change-Id: I2cd13229d88f630451fabd113b0e5a04841e9e79 Reviewed-on: https://go-review.googlesource.com/c/go/+/309590 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
…plicitly This is to facilitate testing of the pacer, since otherwise this is accessing global state, which is impossible to stub out properly. For #44167. Change-Id: I52c3b51fc0ffff38e3bbe534bd66e5761c0003a8 Reviewed-on: https://go-review.googlesource.com/c/go/+/353353 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
This change creates a formal exported interface for the GC pacer and creates tests for it that simulate some series of GC cycles. The tests are completely driven by the real pacer implementation, except for assists, which are idealized (though revise is called repeatedly). For #44167. Change-Id: I0112242b07e7702595ca71001d781ad6c1fddd2d Reviewed-on: https://go-review.googlesource.com/c/go/+/353354 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> Reviewed-by: Michael Pratt <[email protected]> TryBot-Result: Go Bot <[email protected]>
This change implements the GC pacer redesign outlined in #44167 and the accompanying design document, behind a GOEXPERIMENT flag that is on by default. In addition to adding the new pacer, this CL also includes code to track and account for stack and globals scan work in the pacer and in the assist credit system.

The new pacer also deviates slightly from the document in that it increases the bound on the minimum trigger ratio from 0.6 (scaled by GOGC) to 0.7. The logic behind this change is that the new pacer much more consistently hits the goal (good!) leading to slightly less frequent GC cycles, but _longer_ ones (in this case, bad!). It turns out that the cost of having the GC on hurts throughput significantly (per byte of memory used), though tail latencies can improve by up to 10%! To be conservative, this change moves the value to 0.7 where there is a small improvement to both throughput and latency, given the memory use.

Because the new pacer accounts for the two most significant sources of scan work after heap objects, it is now also safer to reduce the minimum heap size without leading to very poor amortization. This change thus decreases the minimum heap size to 512 KiB, which corresponds to the fact that the runtime has around 200 KiB of scannable globals always there, up-front, providing a baseline.

Benchmark results: https://perf.golang.org/search?q=upload:20211001.6

tile38's KNearest benchmark shows a memory increase, but throughput (and latency) per byte of memory used is better.

gopher-lua showed an increase in both CPU time and memory usage, but subsequent attempts to reproduce this behavior are inconsistent. Sometimes the overall performance is better, sometimes it's worse. This suggests that the benchmark is fairly noisy in a way not captured by the benchmarking framework itself.

biogo-igor is the only benchmark to show a significant performance loss. This benchmark exhibits a very high GC rate, with relatively little work to do in each cycle. The idle mark workers are quite active. In the new pacer, mark phases are longer, mark assists are fewer, and some of that time in mark assists has shifted to idle workers. Linux perf indicates that the difference in CPU time can be mostly attributed to write-barrier slow path related calls, which in turn indicates that the write barrier being on for longer is the primary culprit. This also explains the memory increase, as a longer mark phase leads to more memory allocated black, surviving an extra cycle and contributing to the heap goal.

For #44167. Change-Id: I8ac7cfef7d593e4a642c9b2be43fb3591a8ec9c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/309869 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Austin Clements <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
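As a rough illustration of what moving the minimum trigger bound from 0.6 to 0.7 means, the sketch below takes "scaled by GOGC" to mean that a cycle may start no earlier than `marked * (1 + bound*GOGC/100)`. That reading, the `minTrigger` helper, and its percentage-based parameters are assumptions made for this example, not the runtime's code.

```go
package main

import "fmt"

// minTrigger returns the smallest heap size at which a new cycle may start,
// assuming the bound is a fraction of the GOGC-scaled headroom above the
// heap marked live by the previous cycle. boundPct is the bound in percent
// (60 for 0.6, 70 for 0.7) so the arithmetic stays in integers.
func minTrigger(marked, gogc, boundPct uint64) uint64 {
	return marked + marked*gogc*boundPct/10000
}

func main() {
	const mib = 1 << 20
	// With 100 MiB marked live and GOGC=100 the heap goal is 200 MiB.
	fmt.Println(minTrigger(100*mib, 100, 60) / mib) // prints 160
	fmt.Println(minTrigger(100*mib, 100, 70) / mib) // prints 170
}
```

Under this reading the higher bound delays the earliest possible start of a cycle, which shortens the longest mark phase the pacer may schedule.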
Change https://golang.org/cl/368137 mentions this issue: |
The new minimum heap of 512 KiB has been the cause of some build slowdown (~1%) and microbenchmark slowdown (usually ~0%, up to ~50%) for two reasons: 1. Applications with lots of small short-lived processes execute many more GC cycles. 2. Applications with heaps <4 MiB GC up to 8x more often. In many ways these consequences are inevitable given how GOGC works; however, we need to investigate further whether the apparent slowdowns are indeed unavoidable or whether the GC has issues scaling down, and it is too late to do that for this release. Given that this release is already huge, it's OK to push this back. We'll take a closer look at it next cycle, so block it behind a new goexperiment to allow users and ourselves to easily experiment with it. Fixes #49744. Updates #44167. Change-Id: Ibad51f7873de7517490c89802f3c593834e77ff0 Reviewed-on: https://go-review.googlesource.com/c/go/+/368137 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Austin Clements <[email protected]> Reviewed-by: David Chase <[email protected]>
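The "up to 8x" figure follows from the ratio of the two minimum heap sizes (4 MiB / 512 KiB = 8). A toy model makes this concrete. It assumes GOGC=100 and a live heap that stays near zero, so the heap goal is pinned at the minimum heap size and one cycle runs per minimum-heap's worth of allocation; `cyclesFor` is a hypothetical helper.

```go
package main

import "fmt"

// cyclesFor estimates how many GC cycles run while a program allocates
// totalAlloc bytes, under the simplifying assumption that the heap goal
// never rises above minHeap (tiny live heap, GOGC=100).
func cyclesFor(totalAlloc, minHeap uint64) uint64 {
	return totalAlloc / minHeap
}

func main() {
	const kib, mib = 1 << 10, 1 << 20
	fmt.Println(cyclesFor(64*mib, 4*mib))   // prints 16
	fmt.Println(cyclesFor(64*mib, 512*kib)) // prints 128
}
```

Programs with larger live heaps see a smaller effect, since their heap goal is driven by the live heap rather than by the minimum.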
This change updates the GC pacer redesign design document to remove a few inaccuracies and update two points that became apparent after experimentation. Firstly, the inaccuracies were mostly around what was ignored. For instance, goroutines already donate their debt or credit back to the global pool on death. Secondly, the definition of the heap goal included S_n and G_n twice erroneously. It was written that way with _overall_ GC-work-related memory used in mind, but the definition is _just_ for heap memory. Lastly, it turns out that the current pacer does (in its own indirect way) account for idle priority GC in some way, and not accounting for it in the new pacer leads to a performance regression. This change adds a section describing how to account for it. For golang/go#44167. Change-Id: I396bbcb87fc3acd84584b10769e31d7da699fdb9 Reviewed-on: https://go-review.googlesource.com/c/proposal/+/350429 Reviewed-by: Michael Knyszek <[email protected]>
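To make the corrected heap goal definition concrete, the sketch below counts stack and globals scan work in the GOGC-scaled term but not in the base, which is live heap only. This follows the formula as I understand it from Go's GC guide (target heap = live heap + (live heap + GC roots) * GOGC / 100); `heapGoal` and its parameter names are hypothetical.

```go
package main

import "fmt"

// heapGoal computes a heap goal in bytes. Stacks and globals contribute to
// the GOGC-scaled headroom, but the base is live heap memory only, since the
// goal describes heap memory rather than all GC-related memory.
func heapGoal(live, stacks, globals, gogc uint64) uint64 {
	return live + (live+stacks+globals)*gogc/100
}

func main() {
	const mib = 1 << 20
	// 10 MiB live heap, 1 MiB of stacks, 1 MiB of globals, GOGC=100.
	fmt.Println(heapGoal(10*mib, 1*mib, 1*mib, 100) / mib) // prints 22
}
```

Adding the stack and globals sizes to the base as well would give 24 MiB in this example, which is the double counting the commit message describes.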
Change https://go.dev/cl/399300 mentions this issue: |
For #44167. Change-Id: I2dcd13cbe74e88de00e9fc51f9bd86e604a167df Reviewed-on: https://go-review.googlesource.com/c/go/+/399300 Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: Emmanuel Odeke <[email protected]> Run-TryBot: Emmanuel Odeke <[email protected]> Auto-Submit: Emmanuel Odeke <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
GC Pacer Redesign
Author: Michael Knyszek (with lots of input from Austin Clements, David Chase, and Jeremy Faller)
Abstract
Go's tracing garbage collector runs concurrently with the application, and thus requires an algorithm to determine when to start a new cycle. In the runtime, this algorithm is referred to as the pacer. Until now, the garbage collector has framed this process as an optimization problem, utilizing a proportional controller to achieve a desired stopping-point (that is, the cycle completes just as the heap reaches a certain size) as well as a desired CPU utilization. While this approach has served Go well for a long time, the design has accrued many corner cases due to resolved issues, as well as a backlog of unresolved issues.
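For readers unfamiliar with the term, a proportional controller nudges a control variable by a fixed fraction of the measured error on each step. The sketch below is a generic textbook form, not the runtime's actual pacer formula; `step`, `gain`, and the sample values are hypothetical.

```go
package main

import "fmt"

// step applies one proportional-controller update: the control variable
// moves by gain times the error between the setpoint and the measurement.
func step(current, setpoint, measured, gain float64) float64 {
	return current + gain*(setpoint-measured)
}

func main() {
	// The measurement (0.8) fell short of the setpoint (1.0), so the
	// control variable is raised by half of the 0.2 error.
	fmt.Printf("%.2f\n", step(0.5, 1.0, 0.8, 0.5)) // prints 0.60
}
```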
I propose redesigning the garbage collector's pacer from the ground up to capture the things it does well and eliminate the problems that have been discovered.
More specifically, I propose:
(1) will resolve long-standing issues with small heap sizes, allowing the Go garbage collector to scale down and act more predictably in general.
(2) will eliminate offset error present in the current design, will allow turning off mark-assist almost entirely outside of exceptional cases, improving allocation latency, and will enable clearer designs for setting memory limits on Go applications.
(3) will enable smooth and consistent response to large changes in the live heap size with large GOGC values.
Full design
Found here.