Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RC1 regression: dependencies no longer work correctly in source generators across <ProjectReference> #56442

Closed
roji opened this issue Sep 16, 2021 · 30 comments · Fixed by #56662
Assignees
Milestone

Comments

@roji
Copy link
Member

roji commented Sep 16, 2021

When using a source generator via a <ProjectReference> inside a solution, and when the source generator needs to have dependencies, the solution documented by @sharwell here can be used. Unfortunately, while this still worked in 6.0.0-preview.7, it no longer seems to work in 6.0.0-rc.1.

For a minimal repro, see https://github.com/roji/Test/tree/SourceGeneratorRc1. Building that solution works in preview7, but in rc1 I'm getting the following:

CSC : warning CS8785: Generator 'NpgsqlConnectionStringBuilderSourceGenerator' failed to generate source. It will not contribute to the output and compilation errors may occur as a result. Exception was of type 'FileNotFoundException' with message 'Could not load file or assembly 'Scriban.Signed, Version=4.0.1.0, Culture=neutral, PublicKeyToken=5675fb69b15f2433'. The system cannot find the file specified. [/home/roji/projects/test/Test/Test.csproj]
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Sep 16, 2021
@RikkiGibson RikkiGibson self-assigned this Sep 16, 2021
@RikkiGibson
Copy link
Contributor

This could be related to recent changes we made to analyzer assembly loading, since source generators are loaded using the same mechanism.

@RikkiGibson RikkiGibson added this to the 17.0 milestone Sep 16, 2021
@RikkiGibson RikkiGibson added Bug Feature - Source Generators Source Generators and removed untriaged Issues and PRs which have not yet been triaged by a lead labels Sep 16, 2021
@roji
Copy link
Member Author

roji commented Sep 16, 2021

Thanks for looking into this. Some sort of workaround would be really appreciated while this is being fixed, since this is blocking some builds.

@RikkiGibson
Copy link
Contributor

RikkiGibson commented Sep 20, 2021

Hi @roji. I apologize for the delay on this. I was able to reproduce the problem. In the binlog I see the following arguments passed to the compiler:

/analyzer:C:\Users\rikki\.nuget\packages\scriban.signed\4.0.1\lib\netstandard2.0\Scriban.Signed.dll
/analyzer:C:\Users\rikki\src\roji-test\TestSourceGenerator\bin\Debug\netstandard2.0\TestSourceGenerator.dll

(below I use the terms "analyzer" and "source generator" interchangeably because the loading mechanism for them is the same)

The problem here is that we introduced a new AssemblyLoadContext-based analyzer loading mechanism when compiling on .NET Core. Basically, now when we build on .NET Core we make a few deep assumptions:

  1. Any given analyzer has all of its dependencies (aside from framework and Microsoft.CodeAnalysis.* stuff) right next to it in the same directory. This has been a requirement for nuget packaging of analyzers for some time.
  2. Every analyzer and analyzer dependency has its full path passed to the compiler via the /analyzer option.

The design of our ALC-based system makes the assumption about (1) into a requirement, when before it was perhaps just an assumption, at least in the ProjectReference scenario. When a dependency is in a different folder, the analyzer's ALC just won't find it any more.

From talking with @jmarolf I found you can set the msbuild property <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies> in the analyzer project to ensure that its dependencies get copied to the output directory. However this doesn't solve the problem because msbuild still ends up using a path to the dependency which is located within the nuget cache. I think if we could get msbuild to use the paths to the copied dependencies in the output directory then we would be all good. This might be a solution, but I'm not yet certain whether there's a better one.

To fix this scenario in the short term I think we need to figure out what changes should be made to the analyzer project and consuming project to meet the compiler's new invariants. I will follow up when I have an answer for what that change looks like. (It's possible that the compiler's invariants or discovery mechanism should be adjusted, but I'm less certain of that.) cc @chsienki @jaredpar @jkoritzinsky

In the medium term: it seems like ProjectReferences to analyzers are not a well tested scenario in the SDK, and that a bigger investment may be warranted to make the experience much more straightforward and productive.

@roji
Copy link
Member Author

roji commented Sep 21, 2021

@RikkiGibson thanks for the in-depth explanation and for the attention. Yeah, I did try to work around this in various ways and failed, and ended up committing generated sources into the repo for now.

Looking forward to seeing a proper fix and general improvements for PackageReference support!

@jmarolf
Copy link
Contributor

jmarolf commented Sep 21, 2021

I will need to experiment with project references https://github.com/dotnet/msbuild/blob/main/documentation/ProjectReference-Protocol.md but fundamentally this is the take away:

ProjectReferences to analyzers are not a well tested scenario in the SDK, and that a bigger investment may be warranted to make the experience much more straightforward and productive.

I will make an even stronger statement and say that any analyzer that worked via ProjectReferences did so by accident, not design. @chsienki I recommend us removing any examples that lead developers down this path until we can properly invest in making this work.

@roji
Copy link
Member Author

roji commented Sep 22, 2021

I will make an even stronger statement and say that any analyzer that worked via ProjectReferences did so by accident, not design [...]

@jmarolf understood, but I hope that doesn't mean that at least some sort of workaround won't be provided for 6.0; there are various existing samples around the Internet and developers have been using source generators in their solutions without publishing needless nuget packages - so this definitely seems like a regression. It should hopefully be possible to somehow make this work even if the experience isn't good in 6.0.

@RikkiGibson
Copy link
Contributor

RikkiGibson commented Sep 22, 2021

@roji Agreed. On our side we've considered introducing a feature flag for .NET 6.0 which would simply force loading all analyzers and dependencies in a single ALC (maybe using whatever ALC the compiler is loaded into, or maybe one per analyzer loader.) This would fix your scenario but break other scenarios. So the consuming project would probably need to set one or the other value for the flag depending on their constraints. If the previous scheme in preview7 worked in your project then this feature flag would probably get you up and running again.

There are some other alternatives that I haven't fully thought through yet, but at any rate I agree that we should do something here for .NET 6 to ensure people aren't just left out in the cold.

@roji
Copy link
Member Author

roji commented Sep 22, 2021

Sounds great, thanks @RikkiGibson

@PathogenDavid
Copy link
Contributor

This is affecting me too, although in my case the source generator depends on another project reference rather than a package reference. I didn't actually notice until recently either since it only happens on the command line and not with Visual Studio.


@RikkiGibson Thanks for the detailed explanation for what's going on. (It also helped to look through the PR which caused this #55098)

I ended coming up with a workaround sort-of similar to what you mentioned in #56442 (comment). It identifies Analyzer items which came from a project reference and copies them to a project-specific intermediate directory and tells Roslyn to load them from there instead: (This target goes in the analyzer consumer.)

<Target Name="Roslyn56442Workaround" AfterTargets="ResolveProjectReferences" BeforeTargets="CoreCompile">
  <Error Message="IntermediateOutputPath is empty." Condition="'$(IntermediateOutputPath)' == '' and '$(IntermediateAnalyzersPath)' == ''" />
  <PropertyGroup>
    <IntermediateAnalyzersPath Condition="'$(IntermediateAnalyzersPath)' == ''">$(IntermediateOutputPath)collected-analyzers\</IntermediateAnalyzersPath>
    <IntermediateAnalyzersPath Condition="!HasTrailingSlash('$(IntermediateAnalyzersPath)')">$(IntermediateAnalyzersPath)\</IntermediateAnalyzersPath>
  </PropertyGroup>

  <!-- Identify all analyzers which originated from a ProjectReference and remove them -->
  <ItemGroup>
    <_AnalyzerFromProjectReference Include="@(Analyzer)" Condition="'%(Analyzer.ReferenceSourceTarget)' == 'ProjectReference'" />
    <Analyzer Remove="@(_AnalyzerFromProjectReference)" />
  </ItemGroup>
  
  <!-- Copy the analyzers to a common directory -->
  <Copy SourceFiles="@(_AnalyzerFromProjectReference)" DestinationFolder="$(IntermediateAnalyzersPath)" SkipUnchangedFiles="true" />
  
  <!-- Re-add the analyzers using the copied analyzer instead -->
  <ItemGroup>
    <Analyzer Include="@(_AnalyzerFromProjectReference->'$(IntermediateAnalyzersPath)%(Filename)%(Extension)')" OriginalProjectReferenceOutputPath="%(Identity)" />
  </ItemGroup>
</Target>

Another idea I had while looking over the new loader logic that seems like it should be possible: Rather than use a single ALC per directory, maybe the concept of an "analyzer context" can be introduced. If the analyzer context is not explicitly specified, it defaults to the directory which contains the analyzer (or maybe the NuGet package ID which provided the analyzer.) Some MSBuild logic similar to above could make all ProjectReference analyzers default going into their own context.

@RikkiGibson
Copy link
Contributor

Allowing a custom ID for the analyzer context is an interesting idea. I wonder where such a value would be stored/looked up? It seems like we would need a mapping somewhere of each assembly or assembly path to what context it's supposed to be loaded into.

@jmarolf
Copy link
Contributor

jmarolf commented Sep 23, 2021

I believe setting <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies> in your project copies all dependencies to the output directory so you shouldn't need to do that in your workaround @PathogenDavid you just need to remap the analyzers to point to that directory.

@ebekker
Copy link

ebekker commented Sep 23, 2021

I'm curious, why is it working fine in Visual Studio (2022 preview 4.1)? Do analyzers/generators resolve their own dependencies differently?

@RikkiGibson
Copy link
Contributor

@jaredpar @chsienki have suggested another solution which adjusts the DirectoryLoadContext logic:

When DirectoryLoadContext.Load is called, we search for the assembly in the following locations:

  1. First ask the compiler's load context for the assembly. (i.e. whatever ALC we know loaded Microsoft.CodeAnalysis)
  2. Then try to find the assembly in the directory for this DirectoryLoadContext.
  3. Failing that, search through the entire list of /analyzer arguments for a matching assembly and if one is found, load it.

The effect here is that in the event of a dependency conflict we prefer any assembly with the right name that's in the same directory as the dependent assembly. But if we can't find one in that directory, then we fall back to essentially searching everywhere.

@RikkiGibson
Copy link
Contributor

I'm curious, why is it working fine in Visual Studio (2022 preview 4.1)? Do analyzers/generators resolve their own dependencies differently?

I don't know for sure whether 2022 preview 4.1 corresponds to .NET 6 RC1, but to clarify:

Analyzers and generators use the same loading/dependency resolution mechanism. However, we do different things depending on whether the compiler is running on .NET (Core) or .NET Framework. The above issue isn't applicable when the compiler is running on .NET Framework (when you hit "build" in msbuild.exe or in VS or when you're just using the editor). It's only applicable when the compiler is running on .NET/.NET Core (for example when you use 'dotnet build' on the command line to build).

@jaredpar
Copy link
Member

When building and running from the context of Visual Studio you are using the .NET Framework compiler, not the .NET Core one. Those use two different forms of Assembly loading (by necessity). The change we recently made that caused this occurred because we tried to make our .NET Core loading support conflicting dependencies better. There weren't any changes to .NET Framework loading hence those scenarios didn't regress.

@ebekker
Copy link

ebekker commented Sep 23, 2021

Based on marketing messaging and blogs, 2022 preview 4.x release corresponds to 6 RC1 (co-released, and VS 2022 4.x depends on 6 RC1).

In my case it's strictly .NET Standard 2.0 for the Source Generator (as is required) and .NET 6 Web App (TFM = net6.0, a Blazor Server app). Both methods were working fine up to preview 7, but the CLI method broke in RC1 as described above.

@ebekker
Copy link

ebekker commented Sep 23, 2021

When building and running from the context of Visual Studio you are using the .NET Framework compiler, not the .NET Core one. Those use two different forms of Assembly loading (by necessity). The change we recently made that caused this occurred because we tried to make our .NET Core loading support conflicting dependencies better. There weren't any changes to .NET Framework loading hence those scenarios didn't regress.

Ah, OK, that makes sense then. I assume this also explains the requirement for Analyzers/Generators to stick to .NET Standard 2.0?

BTW, is there any way to for CLI compiles to use the .NET Framework tool chain (for now)? Or is that just simply invoking an older MSBuild tool chain? Would it understand and support SDK-style project files same as VS?

@jkoritzinsky
Copy link
Member

@RikkiGibson would it be possible to use an AssemblyDependencyResolver as an additional resolution step before failing in DirectoryLoadContext? That would allow people to have assemblies that live side-by-side with their analyzer assembly and mentioned in the .deps.json (which would work pretty cleanly for analyzer/generator dependencies) that aren't included in the command line as /analyzer: options.

@jmarolf
Copy link
Contributor

jmarolf commented Sep 23, 2021

would it be possible to use an AssemblyDependencyResolver as an additional resolution step before failing in DirectoryLoadContext?

I believe that this is what .NET Interactive needed to do for dependency resolution

@jaredpar
Copy link
Member

That would allow people to have assemblies that live side-by-side with their analyzer assembly and mentioned in the .deps.json

The deps.json itself is a problem because today it is not passed to the compiler. In order for it to be passed to the compiler we would need to change the entire build change. Far too late in the cycle for that.

that aren't included in the command line as /analyzer: options.

That case is explicitly meant to fail. Basically if you don't include a dependency in /analyzer: and it's not something that comes with the runtime then the expectation is that it should result in a compilation diagnostic. The reason why is that the compiler requires that all inputs to compilation be specified on the command line. It's a fundamental requirement for having repeatable builds. Once we start loading dependencies that aren't on the command line then we're essentially introducing non-determinism into the build.

By that I mean the same compilation, with the same command line, referencing the same items on disk can produce different assembly outputs because the compiler ends up loading unstated dependencies. That violates a core tenant of compilation. That is why we don't allow unstated dependencies to load.

I agree it would be ideal to use deps.json to load dependencies but at this point there is no setup available to us that meets our basic requirements.

@jkoritzinsky
Copy link
Member

That would allow people to have assemblies that live side-by-side with their analyzer assembly and mentioned in the .deps.json

The deps.json itself is a problem because today it is not passed to the compiler. In order for it to be passed to the compiler we would need to change the entire build change. Far too late in the cycle for that.

The .deps.json doesn't need to be passed to the compiler for AssemblyDependencyResolver to work. You pass the "plugin dll" path and the deps.json is resolved from there.

that aren't included in the command line as /analyzer: options.

That case is explicitly meant to fail. Basically if you don't include a dependency in /analyzer: and it's not something that comes with the runtime then the expectation is that it should result in a compilation diagnostic. The reason why is that the compiler requires that all inputs to compilation be specified on the command line. It's a fundamental requirement for having repeatable builds. Once we start loading dependencies that aren't on the command line then we're essentially introducing non-determinism into the build.

This requirement effectively causes custom workarounds in MSBuild to be required to get local end-to-end testing working reliably, which is a high bar especially today without docs about this exact problem. For example, for DllImportGenerator, here's the workarounds I needed to add to get ProjectReferences working again:

In DllImportGenerator.csproj (this change would also be required for AssemblyDependencyResolver):

  <ItemGroup>
    <ProjectReference Include="..\Microsoft.Interop.SourceGeneration\Microsoft.Interop.SourceGeneration.csproj" Private="true" />
  </ItemGroup>

In the code that enables the project references, I had to add a custom target to resolve the dependency in the right folder (ProjectReference with OutputItemType=Analyzer doesn't work since it's in a different folder):

  <Target Name="ConfigureDllImportGenerator"
          DependsOnTargets="ResolveReferences">
    <MSBuild Projects="$(LibrariesProjectRoot)System.Runtime.InteropServices/gen/DllImportGenerator/DllImportGenerator.csproj">
      <Output TaskParameter="TargetOutputs" PropertyName="DllImportGeneratorOutputPath" />
    </MSBuild>

    <!-- We add the copy of Microsoft.Interop.SourceGeneration.dll that lives next to Microsoft.Interop.DllImportGenerator.dll
         to work around https://github.com/dotnet/roslyn/issues/56442 -->
    <ItemGroup>
      <Analyzer Include="$([MSBuild]::NormalizePath('$([System.IO.Path]::GetDirectoryName('$(DllImportGeneratorOutputPath)'))', 'Microsoft.Interop.SourceGeneration.dll'))" />
    </ItemGroup>
  </Target>

By that I mean the same compilation, with the same command line, referencing the same items on disk can produce different assembly outputs because the compiler ends up loading unstated dependencies. That violates a core tenant of compilation. That is why we don't allow unstated dependencies to load.

If there was some built-in SDK mechanism to make this experience cleaner/more usable, then I'm all for doing that instead of using AssemblyDependencyResolver. This just happens to be a very good use case for it other than the "must be on the command line" requirement.

Btw, how does Roslyn today ensure that the items on disk are the same between different deterministic runs of the same compilation? That ends up being a requirement for deterministic builds with source generators, so I'm curious how that's handled. Would it be possible to extend that mechanism to log the additionally-loaded dependencies and only use the recorded assemblies during playback of a deterministic compilation?

@jaredpar
Copy link
Member

The .deps.json doesn't need to be passed to the compiler for AssemblyDependencyResolver to work. You pass the "plugin dll" path and the deps.json is resolved from there

As stated later that violates our build principals.

This just happens to be a very good use case for it other than the "must be on the command line" requirement.

Yes. It's a very good solution except it doesn't meet our requirements 😄

Btw, how does Roslyn today ensure that the items on disk are the same between different deterministic runs of the same compilation? That ends up being a requirement for deterministic builds with source generators, so I'm curious how that's handled.

There are different responsibilities here. Concretely the compiler itself must meet the following properties:

  1. Given the same inputs to compilation it must produce byte for byte equivalent output
  2. Given a PE + PDB file the compiler must be able to list all of the inputs that went into it. For items like files that includes uniquely identifiable information like checksums, MVID, etc ...

In scenarios like where they are attempting to repeat builds they are responsible for supplying the artifacts identified in (2) above.

Generators and analyzers do pose a bit of a problem for determinism. We can't fully guarantee they are deterministic. We prescribe that they should be but we have no real way to enforce it. But we can eliminate this as a source of problem when doing repeatable build verification. In that scenario we can use the source code output from the initial compilation. Essentially just skip the generator altogether.

@RikkiGibson
Copy link
Contributor

It feels like .deps.json is something we should look at using in the medium to long term, possibly with adjustments to our build tasks, other asks to the team that owns it, etc.

Because the fix for this probably needs to ship within about 1 week, it likely won't be how we solve the issue for the .NET 6.0.1xx SDK.

@jkoritzinsky
Copy link
Member

For the .NET 6.0.1xx SDK, could we at least document the workarounds mentioned in this issue in the source-generator-cookbook if we can't come up with a fix?

@jaredpar
Copy link
Member

For the .NET 6.0.1xx SDK, could we at least document the workarounds mentioned in this issue in the source-generator-cookbook if we can't come up with a fix?

We've outlined a fix above that we're going to implement. We've been pushing back on your alternative fix because it fails to meet our build requirements. But we're still moving forward with the proposed fix.

@jkoritzinsky
Copy link
Member

@jaredpar is the fix the feature flag to go back to the old loading model? If so, I agree with that fix for .NET 6, and I can keep my workaround for the future if required (since setting a custom flag for that in dotnet/runtime will probably eventually cause some problems since the repo infrastructure is absurdly complex).

If there's no mechanism now or in the future to use my proposed fix, then I'm okay with that and I will present no argument otherwise. I'd like there to be some investigation in .NET 7 to see if there's some way to fulfill the compiler's requirements, provide dependency isolation, and work within how the SDK handles ProjectReferences (or extending the ProjectReference protocol if necessary), but I fully understand if this is low-pri enough that it doesn't get looked at again.

@RikkiGibson
Copy link
Contributor

We are currently running with #56442 (comment)

@jkoritzinsky
Copy link
Member

Oh I like that idea. I totally missed that comment when I was catching up on the thread. Sorry about that.

@PathogenDavid
Copy link
Contributor

Allowing a custom ID for the analyzer context is an interesting idea. I wonder where such a value would be stored/looked up?

@RikkiGibson From an MSBuild perspective I envisioned it as being item metadata, IE: <Analyzer Include="MyCoolAnalyzer.dll" Context="MyCoolContext" />

For command line my initial instinct was /analyzer:MyCoolAnalyzer.dll:MyCoolContext, but /analyzer:MyCoolContext=MyCoolAnalyzer.dll might be more suitable since it's similar to /reference:alias=file or /pathmap.


I believe setting <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies> in your project copies all dependencies to the output directory so you shouldn't need to do that in your workaround @PathogenDavid you just need to remap the analyzers to point to that directory.

@jmarolf I didn't want to write a novel about the different things I tried, but this is actually what I started with. As I satarted to implement it though the solution was looking to be much more complicated or less general-purpose and britle.

(My situation is slightly different from @roji's since my source generator's dependencies are also project references, my source generator csproj doesn't have/need any special logic like his and as such doesn't have paths to directly modify in the first place.)


another solution which adjusts the DirectoryLoadContext logic:

@RikkiGibson I built #56662 locally and can confirm it fixes things for me!

jkoritzinsky added a commit to dotnet/runtime that referenced this issue Sep 24, 2021
This fixes an issue where a project that targets net7.0-windows accidentally ends up getting an output path pointing to the wrong directory for Microsoft.Interop.SourceGeneration.dll, which causes us to hit dotnet/roslyn#56442.
@nikolaidk
Copy link

nikolaidk commented Nov 11, 2021

I had a situation alike in some regards. I have a source generator with a project reference to a utility lib. The utility lib is used in a VSIX extenstion and in a source generator in webapi project. To make this work i had to reference both the utility project and the source generator project as analyser references in the webapi project, and everything works. Now the source generator project picks up the util project without errors.

<ItemGroup>
   ...
    <ProjectReference Include="..\XXX.Utils\XXX.Utils.csproj" OutputItemType="Analyzer" ReferenceOutputAssembly="false"/> 
    <ProjectReference Include="..\XXX\XXX.Generator.csproj" OutputItemType="Analyzer" ReferenceOutputAssembly="false" />
...
  </ItemGroup>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants