Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: KeyNotFoundException in LockFileBuilder.CreateLockFile #11028

Open
KirillOsenkov opened this issue Jul 14, 2021 · 15 comments
Open

[Bug]: KeyNotFoundException in LockFileBuilder.CreateLockFile #11028

KirillOsenkov opened this issue Jul 14, 2021 · 15 comments
Labels
Area:ErrorHandling warnings and errors/log messages & related error codes. Area:RestoreRepeatableBuild The lock file features Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. Triage:NeedsMoreInfo Type:Bug

Comments

@KirillOsenkov
Copy link

NuGet Product Used

dotnet.exe

Product Version

5.0.301

Worked before?

No response

Impact

Other

Repro Steps & Context

I ran dotnet restore on my solution on CI after first installing .NET 5.0.301.

I saw a NullReferenceException and a KeyNotFoundException.

2021-07-14T00:34:23.5238067Z   Determining projects to restore...
2021-07-14T00:34:23.8636897Z C:\a\_tool\dotnet\sdk\5.0.301\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.EolTargetFrameworks.targets(28,5): warning NETSDK1138: The target framework 'netcoreapp3.0' is out of support and will not receive security updates in the future. Please refer to https://aka.ms/dotnet-core-support for more information about the support policy. [C:\a\1\s\src\benchmarks\*********.Protocol.Extensions.Performance\*******.Protocol.Extensions.Performance.csproj]
2021-07-14T00:34:36.1138535Z C:\a\1\s\src\product\*******.Client.Definition\*********.Client.Definition.csproj : error NU3027: Package 'System.Runtime.CompilerServices.Unsafe 4.7.1' from source 'https://********.pkgs.visualstudio.com/_packaging/*******/nuget/v3/index.json': The signature should be timestamped to enable long-term signature validity after the certificate has expired. [C:\a\1\s\src\My.sln]
2021-07-14T00:34:36.1141907Z C:\a\_tool\dotnet\sdk\5.0.301\NuGet.targets(131,5): warning : Unable to delete temporary file 'C:\Users\AzDevOps\.nuget\packages\microsoft.servicehub.analyzers\2.7.339\pu1bfgvw.4q0'. Error: 'Could not find a part of the path 'C:\Users\AzDevOps\.nuget\packages\microsoft.servicehub.analyzers'.'. [C:\a\1\s\src\RemoteLanguage.sln]
2021-07-14T00:34:37.1075558Z C:\a\_tool\dotnet\sdk\5.0.301\NuGet.targets(131,5): error : Object reference not set to an instance of an object. [C:\a\1\s\src\RemoteLanguage.sln]
2021-07-14T00:34:39.6826641Z Restoring packages for C:\a\1\s\src\My.sln...
2021-07-14T00:34:51.0328823Z Acquiring lock for the installation of Microsoft.VisualStudio.Composition 16.4.11
2021-07-14T00:34:51.2187717Z ##[error]System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
   at System.ThrowHelper.ThrowKeyNotFoundException()
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at NuGet.Commands.LockFileBuilder.CreateLockFile(LockFile previousLockFile, PackageSpec project, IEnumerable`1 targetGraphs, IReadOnlyList`1 localRepositories, RemoteWalkContext context)
   at NuGet.Commands.RestoreCommand.<ExecuteAsync>d__33.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<ExecuteAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<ExecuteAndCommitAsync>d__6.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<CompleteTaskAsync>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NuGet.Commands.RestoreRunner.<RunAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<RunAsync>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<RunAsync>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.CommandLine.RestoreCommand.<ExecuteCommandAsync>d__52.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.CommandLine.Command.Execute()
   at NuGet.CommandLine.Program.MainCore(String workingDirectory, String[] args)
C:\a\1\s\init.ps1 : ScriptHalted

Verbose Logs

https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=4976460&view=results
@KirillOsenkov
Copy link
Author

We shouldn't be throwing NullReferenceException or KeyNotFoundException no matter what.

@KirillOsenkov
Copy link
Author

I have skimmed through the source for LockFileBuilder.CreateLockFile and I have to be honest, that method body scares me a little bit. It's too long, it could clearly benefit from careful refactoring, cleanup etc. A lot of invariant checking needs to be added. I don't even know which one of these dictionary accesses is failing, but they all look like they aren't being nearly careful enough.

I'm not surprised this bug happened and my recommendation is to spend time to very carefully refactor the method and add a bunch of checks and cover it with unit-tests.

@heng-liu
Copy link
Contributor

Could you provide more info for us to better understand the scenario? Thanks!
1.Is this running dotnet restore or nuget.exe restore? (I just glanced in the repo and it seems running nuget.exe restore? )
2.Could you provide detailed repro steps and sample project?

@heng-liu heng-liu added Area:RestoreRepeatableBuild The lock file features Area:ErrorHandling warnings and errors/log messages & related error codes. WaitingForCustomer Applied when a NuGet triage person needs more info from the OP Triage:NeedsRepro labels Jul 15, 2021
@KirillOsenkov
Copy link
Author

This is dotnet restore.

I don't have the repro steps or a sample project unfortunately, but glancing at the call stack should give you a starting point to reason about the possible exceptions there.

@ghost ghost added WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Jul 15, 2021
@heng-liu
Copy link
Contributor

heng-liu commented Jul 15, 2021

Hi @KirillOsenkov , thanks for the quick response.
Actually the stack trace indicates it's running NuGet.exe restore (NuGet.CommandLine.Command.Execute()). So I'm confused why it's there if it's running dotnet restore.
Then I checked your repo, and found there were two restore commands: https://devdiv.visualstudio.com/DevDiv/_git/VSLanguageServerClient?version=GBdevelop&path=%2Finit.ps1

    # Restore VS solution dependencies
    Get-ChildItem $searchRoot -rec |? { $_.FullName.EndsWith('.sln') -and $_.FullName -notmatch '\\TestFiles\\?' } |% {
        Write-Host "Restoring packages for $($_.FullName)..." -ForegroundColor $HeaderColor
        dotnet restore $_.FullName
    }

    # Restore VS solution dependencies
    Get-ChildItem $searchRoot -rec |? { $_.FullName.EndsWith('.sln') -and $_.FullName -notmatch '\\TestFiles\\?' } |% {
        Write-Host "Restoring packages for $($_.FullName)..." -ForegroundColor $HeaderColor
        & "$PSScriptRoot\tools\scripts\Restore-NuGetPackages.ps1" -Path $_.FullName -Verbosity "Detailed"
    }

According to the log, it's downloading NuGet.exe 5.8.1. which is from https://devdiv.visualstudio.com/DevDiv/_git/VSLanguageServerClient?version=GBdevelop&path=%2Ftools%2Fscripts%2FRestore-NuGetPackages.ps1
So I think the exception is thrown when running the second restore command, which is NuGet.exe 5.8.1.

The reason why I asked for a repro is, there are many combinations:
tool type(dotnet or nuget or others) * tool version(means checking different branch in the repo) * package management type(package reference or package.config) * many other factors(e.g. the settings in NuGet.Config file). It's difficult for me to repro without knowing all those info. And those info are not able to get from the stack trace.

Now we know that it's running NuGet.exe 5.8.1, could you repro by running the exact same command locally? That might be a good start for a repro.

@heng-liu heng-liu added WaitingForCustomer Applied when a NuGet triage person needs more info from the OP and removed WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. labels Jul 15, 2021
@KirillOsenkov
Copy link
Author

Hi @heng-liu, according to the log, the NullReferenceException happens in dotnet restore:
image

And you're right, the subsequent KeyNotFoundException happens in Nuget.exe.

I'm really doubtful we'll be able to find a solid repro, because this was flakiness on a CI agent, that didn't repro since.

Instead of a repro, we should just look at the code, reason about it and solidify by refactoring, adding checks and assertions and adding unit-tests.

The repro is basically look at the pipeline that the log is from, and do what it does locally (clone VSLanguageServerClient repo and run init.cmd). You can try that of course, but I'm most certain it will work fine for you.

One clue I have is from the failed build that the log is from, you can download the entire bin directory (it's attached as an artifact), and if we're lucky it will still contain the obj directory with the project.assets.json and other NuGet files. I'm expecting one of those files to be malformed or incorrect, causing downstream KeyNotFoundException.

@ghost ghost added WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Jul 15, 2021
@KirillOsenkov
Copy link
Author

The bad news is I just checked and there are no published artifacts from init.cmd.

The good news is I think if we run dotnet nuget locals all --clear followed by init.cmd in a loop, we are bound to catch it. Maybe install a poor network connection simulator that messes with download speed and quality.

@KirillOsenkov
Copy link
Author

Another bug is that the NullReference exception doesn't print the entire stack trace to the Console, so we don't know where exactly it happened.

@KirillOsenkov
Copy link
Author

Our Init.ps1 also has this:

$env:NUGET_PLUGIN_HANDSHAKE_TIMEOUT_IN_SECONDS=20
$env:NUGET_PLUGIN_REQUEST_TIMEOUT_IN_SECONDS=20

We can probably decrease the timeout to 1 second to simulate whatever network fluke we've been seeing.

@aortiz-msft aortiz-msft added Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. and removed WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. labels Jul 22, 2021
@danmoseley
Copy link

I got this just now while running an "update all" in VS 17.8.0 preview 2. (I assume it's the same -- slightly different stack)

I don't know how to make a repro, but posting here to say it's still an issue.

System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
   at System.ThrowHelper.ThrowKeyNotFoundException()
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at NuGet.Commands.LockFileBuilder.<GetLibraryDependenciesForCentralTransitiveDependencies>d__13.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at NuGet.Commands.LockFileBuilder.AddCentralTransitiveDependencyGroupsForPackageReference(PackageSpec project, LockFile lockFile, IEnumerable`1 targetGraphs)
   at NuGet.Commands.LockFileBuilder.CreateLockFile(LockFile previousLockFile, PackageSpec project, IEnumerable`1 targetGraphs, IReadOnlyList`1 localRepositories, RemoteWalkContext context, LockFileBuilderCache lockFileBuilderCache)
   at NuGet.Commands.RestoreCommand.<ExecuteAsync>d__68.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<ExecuteAsync>d__7.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.Commands.RestoreRunner.<CompleteTaskAsync>d__10.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NuGet.Commands.RestoreRunner.<RunWithoutCommit>d__3.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.DependencyGraphRestoreUtility.<PreviewRestoreAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NuGet.PackageManagement.NuGetPackageManager.<PreviewBuildIntegratedProjectsActionsAsync>d__89.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.NuGetPackageManager.<PreviewBuildIntegratedProjectActionsAsync>d__88.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.NuGetPackageManager.<PreviewUpdatePackagesForBuildIntegratedAsync>d__64.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.NuGetPackageManager.<>c__DisplayClass61_1.<<PreviewUpdatePackagesAsync>b__1>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.NuGetPackageManager.<CompleteTaskAsync>d__63`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NuGet.PackageManagement.NuGetPackageManager.<PreviewUpdatePackagesAsync>d__61.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.VisualStudio.NuGetProjectManagerService.<>c__DisplayClass24_0.<<GetUpdateActionsAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at NuGet.PackageManagement.VisualStudio.NuGetProjectManagerService.<CatchAndRethrowExceptionAsync>d__33`1.MoveNext()

@danmoseley
Copy link

					if (libraryIncludeFlags != LibraryIncludeFlags.All)
					{
						yield return new LibraryDependency
						{
							LibraryRange = new LibraryRange(centralPackageVersion.Name, centralPackageVersion.VersionRange, LibraryDependencyTarget.Package),
							ReferenceType = LibraryDependencyReferenceType.Transitive,
							VersionCentrallyManaged = true,
							IncludeType = dictionary[centralPackageVersion.Name], /// <----
							SuppressParent = libraryIncludeFlags
						};

dictionary contains

<title>Document</title>
  Name Value Type
dictionary Count = 4 System.Collections.Generic.Dictionary<string, NuGet.LibraryModel.LibraryIncludeFlags>
  ▶ [0] {[MediatR, All]} System.Collections.Generic.KeyValuePair<string, NuGet.LibraryModel.LibraryIncludeFlags>
  ▶ [1] {[System.Reflection.TypeExtensions, All]} System.Collections.Generic.KeyValuePair<string, NuGet.LibraryModel.LibraryIncludeFlags>
  ▶ [2] {[MediatR.Contracts, Runtime | Compile | Native | BuildTransitive]} System.Collections.Generic.KeyValuePair<string, NuGet.LibraryModel.LibraryIncludeFlags>
  ▶ [3] {[Microsoft.Bcl.AsyncInterfaces, Runtime | Compile | Native | BuildTransitive]} System.Collections.Generic.KeyValuePair<string, NuGet.LibraryModel.LibraryIncludeFlags>
<title>Document</title>
  Name Value Type
centralPackageVersion {Microsoft.Extensions.DependencyInjection.Abstractions [8.0.0-rtm.23468.17, )} NuGet.LibraryModel.CentralPackageVersion
  Name "Microsoft.Extensions.DependencyInjection.Abstractions" string
  ▶ VersionRange {[8.0.0-rtm.23468.17, )} NuGet.Versioning.VersionRange

@heng-liu is this sufficient info? I'm using central package management and may well have configured something wrong, but it presuambly should tell me what. I have a dump file if you want.

@jeffkl
Copy link
Contributor

jeffkl commented Sep 21, 2023

@danmoseley based on your stack trace, I think you're hitting this issue instead: #12464

It should be fixed in the latest 17.8?

@danmoseley
Copy link

@jeffkl you're probably right. It's not fixed in today's latest 8.0.100 .NET SDK. Is that expected?
(I verified the fix is not in the NuGet.DependencyResolver.Core.dll in that SDK)

@jeffkl
Copy link
Contributor

jeffkl commented Nov 1, 2023

@danmoseley I think I see the fix in 8.x based on what I see in ILSpy:

// C:\Program Files\dotnet\sdk\8.0.100-rc.2.23502.2\NuGet.DependencyResolver.Core.dll
// NuGet.DependencyResolver.Core, Version=6.8.0.117, Culture=neutral, PublicKeyToken=31bf3856ad364e35
private static void DetectAndMarkAmbiguousCentralTransitiveDependencies<TItem>(Tracker<TItem> tracker, List<GraphNode<TItem>> centralTransitiveNodes)
{
    int count = centralTransitiveNodes.Count;
    bool flag;
    do
    {
        flag = false;
        for (int i = 0; i < count; i++)
        {
            if (centralTransitiveNodes[i].Disposition == Disposition.Acceptable && !centralTransitiveNodes[i].ParentNodes.Any((GraphNode<TItem> p) => p.Disposition != Disposition.Rejected && !tracker.IsDisputed(p.Item) && !tracker.IsAmbiguous(p.Item)) && !tracker.IsAmbiguous(centralTransitiveNodes[i].Item))
            {
                flag = true;
                centralTransitiveNodes[i].ForEach(delegate(GraphNode<TItem> x)
                {
                    tracker.MarkAmbiguous(x.Item);
                }, (GraphNode<TItem> pn) => tracker.IsAmbiguous(pn.Item));
            }
        }
    }
    while (flag);
}

That sort of looks like our current code and not the previous implementation. You said you're not sure how to repro it before, is that still the case?

@KirillOsenkov
Copy link
Author

Is this the same as #12464?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area:ErrorHandling warnings and errors/log messages & related error codes. Area:RestoreRepeatableBuild The lock file features Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. Triage:NeedsMoreInfo Type:Bug
Projects
None yet
Development

No branches or pull requests

6 participants