Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process projects in parallel in FAR. #45074

Merged
merged 4 commits into from
Jun 11, 2020

Conversation

CyrusNajmabadi
Copy link
Member

In Roslyn.sln:

  1. This has a minor 12% benefit in workstation-gc.
  2. However, in server GC we get a 33% improvement.

We don't get full linear speedup because, while there are opportunities for parallel speedup, there are also several times where things become linear as several projects wait on the compilation for a dependent project.

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner June 11, 2020 05:11
@CyrusNajmabadi CyrusNajmabadi requested a review from dibarbet June 11, 2020 08:23
// have to create all their dependent compilations in order to get their compilation.
// This would be very expensive and would take a lot of time before we got our first
// result.
var connectedProjects = _dependencyGraph.GetDependencySets(_cancellationToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer an issue as we can still get results uickly by processing hte compilations in parallel, thus returning results quickly for the compilations we can get quickly.

connectedProjectSet, projectToDocumentMap).ConfigureAwait(false);
}
foreach (var (project, documentMap) in projectToDocumentMap)
tasks.Add(Task.Run(() => ProcessProjectAsync(project, documentMap), _cancellationToken));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Task.Run here just to pass cancellationToken?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to ensure that the work is truly parallel and the main lol can get to the next item to add to the work queue. If there work doesn't actually ever yield, and if entirely synchronous, then not doing this could have no benefit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to ensure that the work is truly parallel and the main lol can get to the next item to add to the work queue.

I looked back here to read your answer again. But I think that's not what I was asking.

I was wondering why not: tasks.Add(ProcesssProjectAsync(...))? Is cancellationToken the only reason?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why not: tasks.Add(ProcesssProjectAsync(...))? Is cancellationToken the only reason?
@CyrusNajmabadi

No. It's so the work is actually parallel. Adding lots of async work doesn't make anything acutally parallel. Consider something as simple as this:

public Task ExpensiveAsync()
{
    Thread.Sleep(10000);
    return Task.CompletedTask;
}

if we go in a loop 10 times and attempt to add ExpensiveAsync to an array of tasks, it will take 100 seconds. By doing Task.Run, we explicitly move this work to a threadpool thread, and allow the work to operate entirely concurrently, allowing it all to finish in 10 seconds.

This is obviously an extreme case (total thread blocking, no actual asynchrony), but the general idea still holds. Putting a lot of async work into a queue doesn't make it necessary concurrent. Doing the Task.Run does ensure this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is obviously an extreme case (total thread blocking, no actual asynchrony), but the general idea still holds.

Got it. Just so that I understand, it does depend on the ProcesssProjectAsync implementation to be possibly CPU-bound/Blocking, and even for IO-bound scenarios, Task.Run will ensure any "additional work" that is being done as part of it to run in parallel.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. tasks.Add(ProcesssProjectAsync(...)) is possibly concurrent. tasks.Add(Task.Run(() is definitely concurrent.

// Now actually process the project.
await ProcessProjectAsync(project, documentMap).ConfigureAwait(false);
}

private async Task ProcessProjectAsync(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only bit of code in this file, could probably just be moved to FindReferencesSearchEngine.cs now that it's much simpler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. can clean up in my continuing followup PRs :)

@CyrusNajmabadi CyrusNajmabadi merged commit 635bc48 into dotnet:master Jun 11, 2020
@ghost ghost added this to the Next milestone Jun 11, 2020
@CyrusNajmabadi CyrusNajmabadi deleted the farProjectParallel branch June 11, 2020 20:42
@dibarbet dibarbet modified the milestones: Next, 16.7.P4 Jun 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants