-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce allocations in NuGet.DependencyResolver.Tracker #5267
Conversation
This object keeps track of graph items as they are observed. The internal implementation has been optimised to pack data in a more compact fashion, as this type was identified as contributing to GC pauses in VS by GCPauseWatson. Unit tests were added to document the previous behaviour, then the class was rewritten to use less memory. Some of the optimisations: - Don't create an entry for every look up operation. Only create it when some state actually needs to be written (adding a graph item, or marking ambiguous). - Don't allocate the HashSet for every entry in the tracker. Only allocate one once there is more than one item for a given library. - Don't call Contains on the HashSet before calling Add. If the item already exists, Add is a no-op.
This PR has been automatically marked as stale because it has no activity for 7 days. It will be closed if no further activity occurs within another 7 days of this comment. If it is closed, you may reopen it anytime when you're ready again, as long as you don't delete the branch. |
Pinging @NuGet/nuget-client to make sure this PR doesn't get buried. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Few small comments.
test/NuGet.Core.Tests/NuGet.DependencyResolver.Core.Tests/TrackerTests.cs
Show resolved
Hide resolved
/// memory. For large graphs, this can amount to a significant saving in memory, | ||
/// which helps overall performance. | ||
/// </summary> | ||
private object? _storage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given casting has a perf cost, plus object
obscures types, is it worthwhile having two fields instead, private TItem? _single
and private HashSet<TItem>? _collection
instead? It gives more compile time static type analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here was to reduce memory footprint, and a second field would increase size again. Type checking is not free, but it's still very fast. The scenario where this was flagged as contributing to GC, the CPU was spending a huge amount of time on GC. My sense is that reducing GC pressure would offset any overhead from type checking. To write a benchmark you'd need realistic data loads, which would take time to gather. I don't think it'd be worth the effort to investigate given how much low hanging fruit exists throughout NuGet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Drew's assessment here.
Bug
Fixes: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1826391
Fixes: NuGet/Home#12719
Regression? Last working version:
Description
NuGet.DependencyResolver.Tracker
keeps track of graph items as they are observed. The internal implementation has been optimised to pack data in a more compact fashion, as this type was identified as contributing to GC pauses in VS by GCPauseWatson.Unit tests were added to document the previous behaviour, then the class was rewritten to use less memory.
Some of the optimisations:
Entry
for every look up operation. Only create it when some state actually needs to be written (adding a graph item, or marking ambiguous).HashSet
for every entry in the tracker. Only allocate one once there is more than one item for a given library.Contains
on theHashSet
before callingAdd
. If the item already exists,Add
is a no-op.PR Checklist
PR has a meaningful title
PR has a linked issue.
Described changes
Tests
Documentation