-
-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decrease contention in Progress reporting #2357
Conversation
f053f11
to
6c94505
Compare
Nice! I suppose an increase in allocations could be acceptable if we are gaining in throughput due to less contention across CPUs. Did you try to compare eventlogs in tracy or threadscope? |
I am actually seeing some interesting perf gains in Sigma.
|
Looking at this in Threadscope with 40 capabilities is not very enlightening. The Activity tab is a bit less sparse in the lock-less version which confirms the improvement. |
We also use a lot over other MVars as part of the IDE state, including the shake store itself, which seems like it would limit the possible upside of this, since all the threads will need to synchronize on those anyway. |
True. I'll try with a lockless values collection when I find some time |
3ef1028
to
d6ca1da
Compare
b3a9b34
to
33eca72
Compare
I have now removed all or most of the contentious MVars in the ghcide codebase and used For Sigma using the 'edit' experiment:
Using the cabal example from the benchmark suite in a 40core machine:
In the same machine and example, performance regresses for
I'm pretty happy with those results and plan to land it before the next release. |
I am a bit concerned about the new dependencies after a brief look at the Github repos. For instance, this PR has been open and unactioned for >6 months: |
@nikita-volkov are you still maintaining the stm-containers ecosystem? |
@pepeiborra Thanks for the mention. Yeah, it is maintained. I somehow missed all the activity in the related repos. Fixes are applied and released now. |
Awesome! Thank you for confirming @nikita-volkov |
baf6714
to
75428e0
Compare
75428e0
to
e887a50
Compare
bde97a3
to
b8d9081
Compare
b8d9081
to
a5e7798
Compare
The gist of the changes left in this PR after merging all the previous bits is an improvement to the progress reporting code in order to decrease contention. Some stats: BEFORE
AFTER
|
BEFORE ====== ``` STM transaction statistics (2021-12-12 09:30:40.138006 UTC): Transaction Commits Retries Ratio _anonymous_ 15297 118 0.01 action queue - pop 2 2 1.00 actionQueue - done 2 0 0.00 actionQueue - peek 29 0 0.00 actionQueue - push 2 0 0.00 builder 282354 853 0.00 compute 16882 16 0.00 debouncer 6842 195 0.03 define - dirtyKeys 16895 2 0.00 define - read 1 10710 11 0.00 define - read 2 6232 5 0.00 define - write 6225 1 0.00 diagnostics - hidden 6871 9 0.00 diagnostics - publish 4073 188 0.05 diagnostics - read 6886 4 0.00 diagnostics - update 6871 23 0.00 incDatabase 10966 0 0.00 lastValueIO 4 2200 0 0.00 lastValueIO 5 2200 0 0.00 recordProgress 31238 13856 0.44 updateReverseDeps 64994 358 0.01 ``` AFTER ===== ``` STM transaction statistics (2021-12-12 09:24:24.769304 UTC): Transaction Commits Retries Ratio _anonymous_ 15199 134 0.01 action queue - pop 2 2 1.00 actionQueue - done 2 0 0.00 actionQueue - peek 29 0 0.00 actionQueue - push 2 0 0.00 builder 282244 744 0.00 compute 16882 26 0.00 debouncer 6847 220 0.03 define - dirtyKeys 16908 1 0.00 define - read 1 10710 8 0.00 define - read 2 6244 2 0.00 define - write 6236 1 0.00 diagnostics - hidden 6876 18 0.00 diagnostics - publish 3978 184 0.05 diagnostics - read 6886 2 0.00 diagnostics - update 6876 24 0.00 incDatabase 10966 0 0.00 lastValueIO 4 2200 1 0.00 lastValueIO 5 2200 0 0.00 recordProgress 31252 403 0.01 recordProgress2 31252 207 0.01 updateReverseDeps 64994 430 0.01 ```
After ===== ``` STM transaction statistics (2021-12-12 22:11:20.016977 UTC): Transaction Commits Retries Ratio _anonymous_ 15227 116 0.01 action queue - pop 2 2 1.00 actionQueue - done 2 0 0.00 actionQueue - peek 29 0 0.00 actionQueue - push 2 0 0.00 builder 282373 771 0.00 compute 16882 32 0.00 debouncer 6864 215 0.03 define - dirtyKeys 16900 0 0.00 define - read 1 10710 3 0.00 define - read 2 6254 3 0.00 define - write 6248 1 0.00 diagnostics - hidden 6893 10 0.00 diagnostics - publish 4006 200 0.05 diagnostics - read 6901 1 0.00 diagnostics - update 6893 22 0.00 incDatabase 10966 0 0.00 lastValueIO 4 2200 0 0.00 lastValueIO 5 2200 0 0.00 recordProgress 31238 387 0.01 recordProgress2 31238 79 0.00 updateReverseDeps 64994 387 0.01 ```
a5e7798
to
d6643e2
Compare
Unrelated to contention, but is it suspicious that "action queue" appears to be used only twice? Maybe not, it just stood out to me. |
Out of curiosity I put together a lock-less version of hls-graph using the sim-containers package as suggested by @wz1000 and others in IRC a few months ago.
So far I haven't seen any perf. improvement, in fact the amount of work done (allocations) is much higher. Perhaps with more tuning it could be competitive? Leaving it here in case anyone wants to have a play
EDIT: The original PR was broken in chunks, and now that there is only one major change remaining, it's worth reviewing and merging this PR to preserve the discussion instead of extracting it into yet another PR.