Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock contention thread pool issues caused by LoadSettings not passing settingsLoadingContext to LoadSettingsForSpecificConfigs #12737

Closed
lifengl opened this issue Jul 11, 2023 · 7 comments · Fixed by NuGet/NuGet.Client#5314
Labels
Area:Settings NuGet.Config and related issues Priority:2 Issues for the current backlog. Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team Tenet:Performance Performance issues Type:Bug
Milestone

Comments

@lifengl
Copy link

lifengl commented Jul 11, 2023

NuGet Product Used

Other/NA

Product Version

Visual Studio 17.6

Worked before?

No response

Impact

Other

Repro Steps & Context

Thread pool exhaustion issue eventType=threadpool&failureHash=8c6a84c3-cc30-9e23-84e0-7b34342ef23c (PRISM)

I believe the source of this problem is that LoadSettings function shown below hasn't passed settingsLoadingContext to LoadSettingsForSpecificConfigs. This leads the machine level settings file is not cached during project evaluation, so each time it is read again, and when we have multiple evaluations on the same time in the thread pool, it leads bad contention to wait on this to be read and blocks multiple threads at the same time.

It seems important to cache the machine level cache as it is commonly used.

image

Verbose Logs

No response

@marcpopMSFT
Copy link

CC @jeffkl

@erdembayar erdembayar added Tenet:Performance Performance issues Area:Settings NuGet.Config and related issues and removed Triage:Untriaged labels Jul 11, 2023
@nkolev92
Copy link
Member

nkolev92 commented Jul 13, 2023

The SettingsLoadContext is supposed to be used under scoped scenarios where the configs are not expected to change.

Which codepaths in particular are you referring to in this case?

Edit; The failure hash gives me a KeyedLock.Enter issue, is that it?

@nkolev92 nkolev92 added the WaitingForCustomer Applied when a NuGet triage person needs more info from the OP label Jul 13, 2023
@nkolev92
Copy link
Member

It seems like it's the SDK Resolver codepath.
Off the top of my head, given that the resolver is run per project, we would need to be creative to get the config loads to be shared.

cc @jeffkl Any ideas?

@nkolev92 nkolev92 added Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team Priority:2 Issues for the current backlog. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Jul 13, 2023
@lifengl
Copy link
Author

lifengl commented Jul 14, 2023

@nkolev92 : the problem here is that LoadSettings is called with settingLoadingContext, and it should pass to the LoadSettingsForSpecificConfigs (which takes this as the latest optional parameter). Because it missed that parameter, LoadSettingsForSpecificConfigs would be called without the current loading context, and therefor, it will never get any file cache. It will basically keep reading the machine level configuration file repeatedly. In most case, that file is most unlikely changed. This PRISM bug is only a sample, we can see fairly amount of UI delays are blocked on reading settings files. (It turns out QueryStatus on traditional project could land to call into this), it would be better we have an efficient cache to prevent all the extra IO.

For the PRISM issue linked here, we may have large amount of projects to be evaluated at the same time (during branch switching for example), they would land to load setting files (and most likely, they would read the same set setting file). If the file is not in cache, all threads will read them, but it happens to have a file lock to prevent them to do at the same time. They will go through a shared semaphore, read the file one by one. I believe that is what is showing in the PRISM.

We need setting files to be cached in the memory, either by some context, or by timestamp. In any case, it should only be read when it is changed.

@nkolev92
Copy link
Member

That was the 1 min triage :D

Looking at it more, I see your point. We already have a static cache of settings file in the resolver that needs to be refreshed.

In VS we have other means and it's cached there. Regular restore doesn't really read the files on the UI thread either, so it's most likely the SDK resolver.

@jeffkl
Copy link
Contributor

jeffkl commented Jul 14, 2023

Does this pull request address the issue then? NuGet/NuGet.Client#5314

@lifengl
Copy link
Author

lifengl commented Jul 14, 2023

The PR looks good to me. Hopefully, it is the simple reason of quite some perf issues showing in PRISM. Let's get it in soon so we can see whether we can see other issues or it would clear things up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area:Settings NuGet.Config and related issues Priority:2 Issues for the current backlog. Product:MSBuildSDKResolver The NuGet powered SDK resolver. Owned by MSBuild team Tenet:Performance Performance issues Type:Bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants