runtime: remove idle GC workers for the 2-minute GC cycle #44163
So who's doing the GC work then? Is it all handled during assists? If so, then I worry about an application that stops allocating mid-cycle while the write barrier is active. Then is the write barrier active forever? Is all the floating garbage never collected (and the resulting free pages never returned to the OS)? Or is some of the GC work done during write barriers also? That would eliminate the write-barrier-on-forever problem, and lessen, but maybe not eliminate, the floating garbage one (to still have a problem you'd need an application that runs but never writes a pointer).
@randall77 I only mean idle GC workers (which are dedicated low-priority workers that soak up additional idle Ps, in addition to the 25% dedicated), not the 25% dedicated workers. I think it's very important to keep dedicated workers, for the reasons you mention (and others).
This issue is currently labeled as early-in-cycle for Go 1.18.
Thanks @cagedmantis, I'm preparing a change for this; it's on my radar.
Ah, though I will note I probably won't land it before the tree opens. I don't see a need.
Change https://golang.org/cl/351131 mentions this issue:
Bumping to 1.19 early-in-cycle.
After some experimentation in the last development cycle, I think this is no longer feasible. Not for any direct technical reason -- the experiment above seems to work. The problem is that removing idle GC workers costs about 1% throughput and latency, across the board. Furthermore, I think the new pacer (#44167) accounts for the idle GC workers a little bit better, and I've discovered that the latency cost of scheduling the idle GC workers is dwarfed by the latency cost of having write barriers and goroutine assists on longer (because the GC cycle is longer without the idle GC workers).

I'd like to make an alternative proposal: disable idle GC workers only in the case where they're actively harmful, that is, when the 2-minute GC runs. This is fairly easy to check for and implement.
I updated the original post with all the details of the new proposal.
FWIW, I think that would be well worthwhile. It can be very disconcerting to see this happen when you have a large heap, and it can mean Go is not being a nice citizen of whatever environment it is running in. It also happens to have exacerbated some of the issues cited in that Discord blog (e.g., see #37116). I think in #37116 it might also have been theorized that the idle workers on the 2-minute cycle were causing sporadic latency issues, but I would need to re-read that issue to confirm.
This issue is currently labeled as early-in-cycle for Go 1.19.
Change https://go.dev/cl/393394 mentions this issue:
I have to pivot this issue once more. The idle priority mark workers are currently critical to ensuring GC progress, for two reasons. One is #44313, which is fixable (working on it). The other is more subtle. If we only have a fractional worker, even with a fix for #44313, it's possible that the scheduler will choose not to schedule the fractional worker, for example if the fractional worker's goal utilization has been reached. Consider a case where the fractional worker is the only GC worker and it has already met its goal utilization: nothing is left to drive the cycle forward. I don't think this has a general solution other than fundamentally reworking how fractional workers operate.

The quick hack is to have sysmon force it to get scheduled, but giving yet more responsibility to sysmon isn't great. An alternative is to schedule the fractional worker anyway (allowing it to go over its quota), but make it eager to yield. This means a lot of spinning in and out of the scheduler, but it's better than what we have today.

A third alternative, which I've implemented in the latest version of https://go.dev/cl/393394, is a compromise. Keep idle mark workers, but give them a limit, and set that limit to 1 for periodic GCs. In the grand scheme of things, "idle mark worker" really is the correct designation here: the system is fully idle. I think if #44313 is fixed then it's also OK to set that limit to 0 if we have at least one dedicated worker, because that worker will ensure GC progress.

I plan to first land https://go.dev/cl/393394 (if there are no objections), try to fix #44313 (which is turning out to be a little tricky), then drop the idle GC worker limit from 1 to 0 if at least one dedicated worker is running. Does this sound reasonable, @aclements @prattmic?
What if the problem isn't that we need idle workers to ensure progress, but just that fractional workers need to be less special? If the fractional worker used a timer for when it needed to run again, rather than relying on the scheduler to poll for whether it needs to run, then it would be able to wake a stopped M.
I think timers could work. Two potential problems, though, from experience using timers for the scavenger: (1) timer error accumulating over long stretches of work, and (2) poor timer granularity below roughly 1 ms.
(1) is probably a non-issue, since a GC cycle has a pretty short duration compared to completing all the scavenge work, and I honestly have no idea whether (2) will be a problem in practice. It depends a lot on the length of the GC cycle. The shortest GC cycles can definitely be < 1 ms in length. As a result, I don't think this is just a straightforward "let's do it this way," though I can be convinced. Currently, I think it needs a bit of experimentation and tuning, which I'll fold into my "fundamentally reworking how fractional workers operate" comment above. :)
I agree this would be nice. In fact, it would be nice if dedicated and fractional workers were closer to "normal" goroutines that could simply be scheduled like any other work.

Working with timers will be a bit awkward since there isn't an obvious time we need to run again; we first need to pick a period to use. To Michael's point about timer granularity, perhaps we could have a pretty long period, e.g., fractional workers will run for at least 10ms before sleeping. On the other hand, this means a 25% fractional worker will sleep for 30ms, which, while not technically deadlocked, may be a long time to wait if all other goroutines are blocked waiting for GC to make progress.
I'm not sure precisely what you mean, but this just sounds like a description of #44343.
That's a good point. In theory, if the application is active, this is fine; assists will finish off the GC. Another problem with a long period is something like a fractional worker occupying a P for its full run period even while runnable goroutines are waiting. At first glance this might not seem any worse than idle mark workers eating up an entire scheduling quantum, but idle mark workers only pop up when there's actually nothing to do.
That's correct. (And more precisely, I was referring to the scavenger's controller, which assumes that sleeping longer has some kind of proportional effect on CPU utilization. At sleep increments smaller than 1 ms, the amount actually slept varies wildly enough that error in the controller accumulates in an unbounded manner.)

I think your comments serve my broader point that to "fix" the fractional worker would require (and certainly benefit from) a rethink of how GC scheduling works today. While I think that's worthwhile, I'd also like to offer my preference to not block this issue on that. That being said, if y'all don't like this interim solution of just limiting idle mark workers (turning them down to zero if we have at least one dedicated mark worker), I'm willing to put this aside and find time later to look into making this work better.
I'm OK with it. IMO, the whole discussion above is an argument in favor of having some kind of idle mark workers. At least in the (admittedly rare) case of a program blocked on GC completion, it is difficult to argue that you wouldn't want at least one worker running as soon as the entire program is idle.
I'm fine with the solution of limiting the idle workers, too. I just wanted to make sure we'd thought through the problem.
Change https://go.dev/cl/395634 mentions this issue:
This change reduces the maximum number of idle mark workers during periodic (currently every 2 minutes) GC cycles to 1.

Idle mark workers soak up all available and unused Ps, up to GOMAXPROCS. While this provides some throughput and latency benefit in general, it can cause what appear to be massive CPU utilization spikes in otherwise idle applications. This is mostly an issue for *very* idle applications, ones idle enough to trigger periodic GC cycles. This spike also tends to interact poorly with auto-scaling systems, as the system might assume the load average is very low and suddenly see a massive burst in activity.

The result of this change is not to bring this 100% (of GOMAXPROCS) CPU utilization spike down to 0%, but rather to min(25% + 1/GOMAXPROCS*100%, 100%).

Idle mark workers also incur a small latency penalty, as they must be descheduled for other work that might pop up. Luckily the runtime is pretty good about getting idle mark workers off of Ps, so in general the latency benefit from shorter GC cycles outweighs this cost. But the cost is still non-zero and may be more significant in idle applications that aren't invoking assists and write barriers quite as often.

We can't completely eliminate idle mark workers because they're currently necessary for GC progress in some circumstances. Namely, they're critical for progress when all we have is fractional workers. If a fractional worker meets its quota, and all user goroutines are blocked directly or indirectly on a GC cycle (via runtime.GOMAXPROCS, or runtime.GC), the program may deadlock without GC workers, since the fractional worker will go to sleep with nothing to wake it.

Fixes #37116.
For #44163.

Change-Id: Ib74793bb6b88d1765c52d445831310b0d11ef423
Reviewed-on: https://go-review.googlesource.com/c/go/+/393394
Reviewed-by: Michael Pratt <[email protected]>
Run-TryBot: Michael Knyszek <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
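The new worst-case spike from the commit message can be tabulated with a small sketch. The `spikeCeiling` helper is hypothetical; it just evaluates the min(25% + 1/GOMAXPROCS*100%, 100%) formula for a few machine sizes.

```go
package main

import "fmt"

// spikeCeiling evaluates min(25% + 1/GOMAXPROCS*100%, 100%): the 25%
// dedicated-worker budget plus the single remaining idle mark worker's P,
// capped at the whole machine.
func spikeCeiling(gomaxprocs int) float64 {
	pct := 25.0 + 100.0/float64(gomaxprocs)
	if pct > 100.0 {
		pct = 100.0
	}
	return pct
}

func main() {
	for _, p := range []int{1, 2, 4, 8, 16} {
		fmt.Printf("GOMAXPROCS=%-2d -> %.2f%% peak GC CPU during periodic GC\n",
			p, spikeCeiling(p))
	}
}
```

On a single-P machine the cap still hits 100% (one idle worker is the whole machine), but on larger machines the periodic-GC spike drops sharply, e.g. to 37.5% at GOMAXPROCS=8.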
Background
Today the Go GC will spin up dedicated GC workers which use 25% of GOMAXPROCS resources. But it will also soak up any additional Ps that are idle (up to GOMAXPROCS) and run "idle GC workers" to do so. The idea here is that those cores are idle anyway, so why not use them to speed up the GC cycle? There are negative performance implications to having the GC on longer than necessary (floating garbage accumulation for memory, and the write barrier being enabled for CPU time).
EDIT: To be clear, I'm not proposing removing dedicated GC workers, which use 25% of CPU resources during a GC cycle. Those are very, very important. Just the idle workers that can increase that number to 100% if the application is idle.
Why?
When a Go process is mostly idle, idle enough that the "every-2-minutes" GC kicks in, it's going to go full blast on all CPU cores, resulting in a CPU utilization spike. While this on its own isn't a big deal, it makes a difference in the face of monitoring and auto-scaling systems. The former is surprising: one might see these spikes in their monitoring and spend time investigating them, only to find there's nothing to do about them (today). The latter is dangerous: if an auto-scaling system decides to give an application more resources because it notices this spike, then the Go GC is going to soak up those resources, resulting in a significant (if temporary, and generally bounded) increase in resource usage.
From the earlier proposal to remove idle GC workers (no longer relevant):
Why not?
The application could suddenly become not idle during the GC cycle, and there would be a latency cost in the form of assists (vs. idle GC workers, which tend to introduce very little latency cost). Given that the 2-minute GC has kicked in, however, I really doubt that this will ever be an issue. The application's steady state would have to change mid-GC, and on the time scale of 2 minutes, that's already extremely unlikely.
From the earlier proposal to remove idle GC workers (no longer relevant):
Why not now?
I don't think there's any reason not to do this very targeted fix. It will require some additional testing to make sure everything still works correctly, because the 2-minute GC is a fairly rare case, so bugs here would manifest as surprising and very rare failures.
From the earlier proposal to remove idle GC workers (no longer relevant):
CC @aclements @prattmic