DATAS tuning fixes #105545
Conversation
Tagging subscribers to this area: @dotnet/gc
Force-pushed from 85c1d9c to f6734fc. Commits:

- …f runs; fix to get STRESS_DYNAMIC_HEAP_COUNT compiled
- … recommission; fix for gen1 hitting assert (free_list_space_decrease <= dd_fragmentation (dd))
- …tribute_free and trigger an initial gen2 if needed

Force-pushed from f6734fc to f4fe2d4.
Results from selected aspnet benchmarks - in general I'm seeing good results: the fixes reduce volatility and avoid OOM in certain scenarios. Some benchmarks use more memory but with increased RPS, mostly because we now update the stable size more accurately. Unfortunately this does make PlaintextMvc's RPS lower, albeit with lower memory usage. The continuous gen1 problems due to the incorrect unusable frag are fixed, which is definitely an improvement, but there are other problems that make the baseline's RPS better by chance.

The baseline did a gen1 right after the HC change because the skip ratio is 0 for the new heaps, and that made us do a gen1; this gen1 increased the gen2 size from ~1mb to 3.4mb. The fix did a BGC with a gen0 GC and didn't increase the gen2 obj size. After this the baseline kept doing gen1s due to the unusable frag and kept promoting, while the fix kept not promoting, so its gen2 size stayed small. The unusable frag problem is fixed, but the promotion decision is still a problem, especially in the baseline. We promote because we kept hitting the 1st condition in the promotion check, but since we have >10 heaps, some heap's gen2 is simply very small, so of course the threshold can be more than old_gen_size, like this (68864 is …). In the fix we don't hit this because the gen1 size is big enough, and we don't consume gen1 budget since we don't promote. So this is a bad cycle - the fix should be doing gen1 GCs, but right now it isn't because we never detect that we have too many cards coming from gen1 to gen0. I will be working on these fixes when I'm back from OOF to make them into RC2.
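To make the shape of that promotion check concrete, here is a minimal sketch; the names `should_promote_gen1`, `old_gen_size`, and `threshold` are illustrative placeholders, not the actual gc.cpp identifiers:

```cpp
#include <cstddef>

// Illustrative sketch only - not the actual gc.cpp code.
// With DATAS spreading the heap across many (>10) heaps, a single
// heap's gen2 can be tiny, so the threshold in the first condition
// easily exceeds old_gen_size and we promote on almost every gen1 GC.
static bool should_promote_gen1 (size_t old_gen_size, size_t threshold)
{
    if (threshold > old_gen_size)   // the "1st condition" mentioned above
        return true;                // promote gen1 survivors into gen2
    // ... other promotion conditions elided ...
    return false;
}
```

Promoting on every gen1 then consumes gen1 budget and grows gen2, which is the bad cycle described above; a heap that never promotes (the fix) stays out of it entirely.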
LGTM based on the extensive testing conducted!
This includes the following tuning fixes:

- distribute_free_regions/age_free_regions and setting BCD - what happened was that each time we needed regions for alloc we had to get new ones, because BCD can be >> BCS, and since we don't get regions from the decommit list, we keep accumulating more (see the first sketch after this list). For this I also cleaned up decommit_ephemeral_segment_pages and related methods and only have them defined when needed.
- a fix for hitting assert (free_list_space_decrease <= dd_fragmentation (dd)); in merge_fl_from_other_heaps, because frag simply hasn't been updated after a BGC (see the second sketch after this list). This can also be reproed by checking for frag at the beginning of a GC.
- some instrumentation changes
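As a rough picture of the region-accumulation pattern in the first item, here is a minimal sketch; all names (`region_lists`, `get_region_for_alloc`, `allocate_new_region_from_os`) are hypothetical stand-ins, and the real logic lives in distribute_free_regions/age_free_regions in gc.cpp:

```cpp
#include <vector>
#include <cstdlib>

// Hypothetical stand-ins - not the actual GC data structures.
struct region_lists
{
    std::vector<void*> free_list;      // regions available for allocation
    std::vector<void*> decommit_list;  // regions parked for decommit
};

static void* allocate_new_region_from_os ()
{
    return std::malloc (4 * 1024 * 1024);  // pretend a region is 4 MB
}

// Bug pattern described above: with BCD >> BCS, regions end up parked
// on the decommit list, and since allocation never takes regions back
// from that list, every GC grabs fresh regions from the OS and the
// decommit list keeps accumulating.
static void* get_region_for_alloc (region_lists& lists)
{
    if (!lists.free_list.empty ())
    {
        void* r = lists.free_list.back ();
        lists.free_list.pop_back ();
        return r;
    }
    // Note: decommit_list is never consulted here.
    return allocate_new_region_from_os ();
}
```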
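And a hedged sketch of the assert in the second item; only the assert expression is taken from the PR, while the surrounding function and parameter names are paraphrased placeholders:

```cpp
#include <cassert>
#include <cstddef>

// Paraphrased sketch; only the assert expression comes from the PR.
// If dd_fragmentation is read before it has been refreshed after a
// BGC, a legitimate free_list_space_decrease can appear to exceed the
// recorded fragmentation and trip this assert - which is also why the
// issue repros when frag is checked at the beginning of a GC.
static void check_fl_space (size_t free_list_space_decrease,
                            size_t dd_fragmentation_value)
{
    assert (free_list_space_decrease <= dd_fragmentation_value);
}
```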