Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speclet for task isolation and dependency resolution #4133

Merged
merged 2 commits into from
Feb 12, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 38 additions & 0 deletions documentation/specs/task-isolation-and-dependencies.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# Task isolation
## Problem definition

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the isolation be enforced for all tasks? Or is this an abstraction that will be exposed by MSBuild so that tasks (or groups of tasks) can opt in/out wrt isolation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unclear. I personally think it should be "enforced". Tasks which don't do weird things should not notice. The only downside is performance, the upsides are that things just work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree. We should probably add a global escape hatch switch just in case something does go wrong, but I think the isolation is basically what a user would hope for/expect without having to know the details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the "one version per ALC", I think it's basically broken to not isolate. On framework, each task could at least load its dependencies of different versions (when strong named), even without opting in to app domain isolation. I don't think we should allow an opt out for ALC. We can see if there's a real problem (perf or otherwise) in 3.0 feedback and add opt out if there's a compelling case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this mechanism prevent tasks from seeing static state set by other tasks? I assume not (although that would be a good thing, I am pretty sure there are tasks that use statics for caching)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danmosemsft that depends what assemblies are loaded into the default load context. Currently the assumption is only those needed by MSBuild itself. So if those have static state, that would be shared across all tasks (and visible).
Tasks coming from the same assembly would also see the same static state (the isolation is on assembly level, not task level).
Tasks from different assemblies would not see each others static state, even if they use the same assembly dependency as those would be duplicated/isolated.

Copy link
Contributor

@nguerrera nguerrera Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is basically what we want.

Ignoring hacks, I think there is one proper pattern where task assemblies can store and share static state in MSBuild itself:

https://github.com/Microsoft/msbuild/blob/e03296fe14152b4c84e6610d2fd2fd10f1ecba82/src/Framework/IBuildEngine4.cs#L28-L32

Today you could share say a parsed NewtonSoft.Json tree between task assemblies using this. However, I've really only seen task assemblies put their own data into this without expecting other task assemblies to use it. And I've even seen bugs where people copy and paste code from one task assembly into another and get InvalidCastException trying to pull things out of this API that they think they put in, but in reality it was another task assembly.

We should probably consider isolating this state as well by changing the backing dictionaries to be unique per ALC / task assembly.

Tasks in MSBuild are dynamically loaded assemblies with potentially separate and colliding dependency trees. Currently MSBuild on .NET Core has no isolation between tasks and as such only one version of any given assembly can be loaded. Prime example of this is Newtonsoft.Json which has multiple versions, but all the tasks must agree on it to work.
nguerrera marked this conversation as resolved.
Show resolved Hide resolved
This problem is also described in #1754.

## Possible solution
Use [`AssemblyLoadContext`](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.loader.assemblyloadcontext?view=netcore-2.2) (ALC) to provide binding isolation for task assemblies. Each task assembly would be loaded into its own ALC instance.
* The ALC would resolve all dependencies of the task assemblies (see dependency resolution below)
* ALC would fallback to the Default for dependencies which the assembly doesn't carry with itself (frameworks and so on)
* ALC would probably have to forcefully fallback for MSBuild assemblies since it's possible that tasks will carry these, but the system requires for the MSBuild assemblies to be shared.
We would probably also want to load groups of tasks which belong together into the same ALC (for example based on their location on disk) to improve performance. This will need some care as there's no guarantee that two random tasks have compatible dependency trees.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think MSBuild should carry policy (ex: path) to load tasks into the same ALC. If isolation contexts are exposed, then this can be handled correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potential performance optimization. I would prefer for the isolation to be mostly transparent to tasks. I worry that if tasks can somehow participate in isolation decisions, they will become much less likely to work in random combinations.


## Potential risks
* Has some small probability of causing breaks. Currently all assemblies from all tasks are loaded into the default context and thus are "visible" to everybody. Tasks with following properties might not work:
* Task has a dependency on an assembly, but it doesn't declare this dependency in its .deps.json and this dependency gets loaded through some other task. This is mostly fixable by implementing probing similar to today's behavior.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we'll have to have fallback behavior exactly as we do today to keep current tasks working.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it might be a good idea to add some way to "disable" the isolation as a workaround for "broken" tasks.

* Two tasks from different assemblies which somehow rely on sharing certain types. If the new system decides to load these in isolation they won't share types anymore and might not work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this is possible but difficult in MSBuild and I don't think I've actually seen a case where it's done, so this seems ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better then.

* Performance - task isolation inherently (and by design) leads to loading certain assemblies multiple times. This increases memory pressure and causes additional JITing and other related work.

## Additional consideration
* None of these changes would have any effect on MSBuild on .NET Framework
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dream of being able to produce a netstandard msbuild task assembly that can work on .NET framework or .NET core, but I get that this would be out of scope. The hard problem there is when your dependencies have different implementations for each.

* Task isolation alone could be achieved on existing MSBuild
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an ALC but no .deps.json handling?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice if deps.json were optional to get ALC isolation. Can custom ALC behave as SetAppPaths option in runtimeconfig for default ALC / app?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If deps.json files is missing, the resolver works with the contents of the published directory of the task.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer @rainersigwald question: yes, we could add isolation logic into MSBuild even on .NET Core 2.*. MSBuild seems to have complex enough probing logic to make most tasks work (find their dependencies). The isolation would be about solving problems with tasks which have incompatible dependencies.
.deps.json handling makes the whole thing even more robust and enables additional scenarios, but it requires 3.0 APIs.




# Task dependency resolution
## Problem definition
Tasks with complex and specifically platform specific dependencies don't work out of the box. For example if a task uses [`LibGit2Sharp`](https://www.nuget.org/packages/LibGit2Sharp) package it will not work as is. `LibGit2Sharp` has native dependencies which are platform specific. While the package carries all of them, there's no built in support for the task to load the right ones. For example [source link](https://github.com/dotnet/sourcelink/blob/master/src/Microsoft.Build.Tasks.Git/GitLoaderContext.cs) runs into this problem.

## Possible solution
.NET Core uses `.deps.json` files to describe dependencies of components. It would be natural to treat task assemblies as components and use associated .deps.json file to determine their dependencies. This would make the system work nicely end to end with the .NET Core CLI/SDK and VS integration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the authoring experience for .deps.json for "plugin" DLLs. What will a task author need to do on their end to make this work correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generate one today for all .net core projects. But I filed dotnet/sdk#2662

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @nguerrera mentioned above, today it's enough to dotnet new classlib and dotnet build (or dotnet publish).

In .NET Core 3 there's a new type [`AssemblyDependencyResolver`](https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/src/System/Runtime/Loader/AssemblyDependencyResolver.cs) which implements parsing and processing of a `.deps.json` for a component (or assembly). The usage is to create an instance of the resolver pointing to the assembly (in MSBuild case the task assembly). The resolver parses the `.deps.json` and stores the information. It exposes two methods to resolve managed and native dependencies.
It was designed to be used as the underlying piece to implement custom ALC. So it would work nicely with task isolation above.

## Potential risks
* Small probability of breaking tasks which have `.deps.json` with them and those are not correct. With this change the file would suddenly be used and could cause either load failures or different versions of assemblies to get loaded.

## Additional consideration
* Task dependency resolution requires APIs which are only available in .NET Core 3.0 (no plan to backport), as such MSBuild will have to target netcoreapp3.0 to use these APIs.