Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NuGet's dependency graph resolution algorithm should scale better for large graphs #13692

Closed
jeffkl opened this issue Aug 11, 2024 · 1 comment · Fixed by NuGet/NuGet.Client#5963
Assignees
Labels
Area:NewDependencyResolver Issues related to the new dependency graph resolver Functionality:Restore Priority:1 High priority issues that must be resolved in the current sprint. Type:DCR Design Change Request
Milestone

Comments

@jeffkl
Copy link
Contributor

jeffkl commented Aug 11, 2024

NuGet Product(s) Affected

MSBuild.exe, dotnet.exe

Current Behavior

NuGet's current dependency graph resolution algorithm can be rather slow for large, complex graphs. It walks the entire tree and for each level expands out every single node. This creates more of a tree representation but also includes duplicates. Consider a popular package, PackageX, that might show up as a common dependency for lots of packages. NuGet will create a graph in memory like this:

ProjectA
 ├─ PackageA
 │  └─ PackageX
 │     ├─ PackageY
 │     └─ PackageZ
 ├─ PackageB
 │  └─ PackageX
 │     ├─ PackageY
 │     └─ PackageZ
 ├─ PackageC
 │  └─ PackageX
 │     ├─ PackageY
 │     └─ PackageZ
 ├─ PackageD
 │  └─ PackageX
 │     ├─ PackageY
 │     └─ PackageZ
 └─ PackageE
    └─ PackageX
       ├─ PackageY
       └─ PackageZ

In restores for large repos internal to Microsoft, individual projects end up with millions of nodes in the graph.

After create the graph, NuGet then walks through the graph N times looking for versions to eclipse, detecting cycles, detecting version conflicts, and detecting downgrades. Walking millions of nodes several times causes restore times for 2,500 projects to be around 15 minutes.

Desired Behavior

We should rewrite the dependency graph resolution algorithm to be linear by having it only walk each node once, resolving versions as it goes. Doing this in a single pass will greatly reduce the amount of time it takes to resolve complex graphs.

Additional Context

No response

@jeffkl jeffkl added Priority:1 High priority issues that must be resolved in the current sprint. Type:DCR Design Change Request labels Aug 11, 2024
@jeffkl jeffkl self-assigned this Aug 11, 2024
@meokullu
Copy link

If there are more than one to solve, walk each node once, wouldn't it increase the amount of time? If version conflicts are detecting by comparing only versions of packages, I think it should keep looking on background. User will eventually attempt to solve those too.

@nkolev92 nkolev92 added Functionality:Restore Area:NewDependencyResolver Issues related to the new dependency graph resolver labels Oct 16, 2024
@zivkan zivkan added this to the 6.12 milestone Oct 25, 2024
@jeffkl jeffkl changed the title NuGet's dependency graph resolution algorithm does not scale well for large graphs NuGet's dependency graph resolution algorithm should scale better for large graphs Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area:NewDependencyResolver Issues related to the new dependency graph resolver Functionality:Restore Priority:1 High priority issues that must be resolved in the current sprint. Type:DCR Design Change Request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants