Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute PatternMatching result in parallel when refreshing completion list #67249

Merged
merged 4 commits into from
Mar 10, 2023

Conversation

genlu
Copy link
Member

@genlu genlu commented Mar 9, 2023

The change is in 2f70efd. First 3 commits are from #67232, which is needed for this change.

I have compared 4 different implementation for pattern matching locally:

  • Current is the current implmentation
  • PLinq without merging (implementation used in this PR)
                    Enumerable.Range(0, _snapshotData.InitialSortedItemList.Count)
                        .AsParallel()
                        .WithCancellation(cancellationToken)
                        .ForAll(CreateMatchResult);   // using lock to add to shared list as part of `CreateMatchResult`
  • PLinq with merging
                    var results = Enumerable.Range(0, _snapshotData.InitialSortedItemList.Count)
                        .AsParallel()
                        .WithMergeOptions(ParallelMergeOptions.AutoBuffered)
                        .WithCancellation(cancellationToken)
                        .Select(CreateMatchResult);   // no lock, return PatternMatch instead

                    foreach (var result in results)
                    {
                        if (result is not null)
                            list.Add(result.Value);
                    }
  • Linq
                    var results = Enumerable.Range(0, _snapshotData.InitialSortedItemList.Count).Select(CreateMatchResult);
                    foreach (var result in results)
                    {
                        if (result is not null)
                            list.Add(result.Value);
                    }

My machine has 16 logical processors, I have observed all 16 being utilized when PLinq is used. The test scenario has 29k items to process, which is about 97th percentile of item counts based on telemetry. Here's the average result (in milliseconds):

When filter text is not empty (i.e. actually calls PatternMatcher)

implementation pattern matching cost sorting cost
Current 53 9
PLinq without merging 9 10
PLinq with merging 9 10
Linq 56 8

When filter text is empty (i.e. no call to PatternMatcher)

implementation pattern matching cost sorting cost
Current 6 5
PLinq without merging 6 8
PLinq with merging 7 10
Linq 8 5

@genlu genlu requested a review from a team as a code owner March 9, 2023 22:57
@@ -128,20 +132,24 @@ private bool ShouldSelectSuggestionItemWhenNoItemMatchesFilterText
// since the completion list could be long with import completion enabled.
var itemsToBeIncluded = s_listOfMatchResultPool.Allocate();
var cancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
var threadLocalPatternMatchHelper = new ThreadLocal<PatternMatchHelper>(() => new PatternMatchHelper(_filterText), trackAllValues: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does 'trackAllValues' do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the docs, unclear why we're setting that to true. can you doc in the code why?

also, doc in the code why we're using a ThreadLocal here (specifically that PatternMatchHelper is not threadsafe, so we want a dedicated one per thread that is active doing the filtering.

Enumerable.Range(0, _snapshotData.InitialSortedItemList.Count)
.AsParallel()
.WithCancellation(cancellationToken)
.ForAll(CreateMatchResultAndProcessMatchingDefaults);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why .ForAll, instead of .Where. Then we don't need a lock, right?

Copy link
Member Author

@genlu genlu Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I saw in my local measurement ForAll has small (but largely negligible) edge over Where, and since ForAll uses no buffer, I assume there would be less allocation (unconfrimed) That's why I picked ForAll.


// Dispose PatternMatchers
foreach (var helper in threadLocalPatternMatchHelper.Values)
helper.Dispose();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. interesting. you use it for disposal. interesting!

foreach (var helper in threadLocalPatternMatchHelper.Values)
helper.Dispose();

threadLocalPatternMatchHelper.Dispose();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you just place the threadLocalPatternMatchHelper in a using block instead?

roslynInitialTriggerKind, roslynFilterReason, _recentItemsManager.GetRecentItemIndex(itemData.RoslynItem), _highlightMatchingPortions, currentIndex,
out var matchResult))
if (threadLocalPatternMatchHelper.Value!.TryCreateMatchResult(itemData.RoslynItem, roslynInitialTriggerKind, roslynFilterReason,
_recentItemsManager.GetRecentItemIndex(itemData.RoslynItem), _highlightMatchingPortions, index, out var matchResult))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_recentItemsManager.GetRecentItemIndex(itemData.RoslynItem), _highlightMatchingPortions, index, out var matchResult))
_recentItemsManager.GetRecentItemIndex(itemData.RoslynItem), _highlightMatchingPortions, index, out var matchResult))

{
list.Add(matchResult);
lock (_gate)
list.Add(matchResult);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't love that inside the parallel we have to lock just to append to a list. would plinq's Where/Select methods (https://learn.microsoft.com/en-us/dotnet/api/system.linq.parallelenumerable.where?view=net-7.0) method work better? then you can just enumerate over those results that succeeded.

@@ -56,34 +56,39 @@ Namespace Microsoft.CodeAnalysis.Editor.UnitTests.IntelliSense
Dim culture = New CultureInfo("tr-TR", useUserOverride:=False)

Dim workspace = New TestWorkspace
Dim helper = New CompletionHelper(isCaseSensitive:=False)
Dim helper = New PatternMatchHelper(pattern)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using

End Sub

Private Shared Sub TestNotMatches(pattern As String, wordsToNotMatch() As String)
Dim culture = New CultureInfo("tr-TR", useUserOverride:=False)
Dim workspace = New TestWorkspace
Dim helper = New CompletionHelper(isCaseSensitive:=True)
Dim helper = New PatternMatchHelper(pattern)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using?

initialTriggerKind == CompletionTriggerKind.Invoke ||
initialTriggerKind == CompletionTriggerKind.Deletion;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was all this code basically moved unchanged?


public void Dispose()
{
lock (_gate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably don't need this gate... but i'm not going to complain if you keep it.

{
cancellationToken.ThrowIfCancellationRequested();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would not mind in a followup pr if we basically tried to be much clearer about setting up parallel state, and what that state mutated. e.g. by extracting a helper function and passing in things like the collections it is expected to mutate. it's pretty hard to reason about things here given how complex this is and how much state is implicitly captured.

@arunchndr
Copy link
Member

This looks great, thanks for the fix @genlu! Would we be able to tweak completion speedometer prior to reflect the impact?

@genlu
Copy link
Member Author

genlu commented Mar 10, 2023

@CyrusNajmabadi I will address the comments in a follow up PR. I'm planning to add more parallelization for filters and highlighting. Thanks

@genlu genlu enabled auto-merge March 10, 2023 00:18
@genlu genlu merged commit 1bc5258 into dotnet:main Mar 10, 2023
@ghost ghost added this to the Next milestone Mar 10, 2023
@genlu genlu deleted the PLinqItemManager branch March 10, 2023 17:59
@allisonchou allisonchou modified the milestones: Next, 17.6 P3 Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants