Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Break Issue in Net9.0 #10020

Closed
umanagarajan12 opened this issue Oct 30, 2024 · 26 comments
Closed

Break Issue in Net9.0 #10020

umanagarajan12 opened this issue Oct 30, 2024 · 26 comments
Assignees
Labels
Bug Product bug (most likely) 🚧 work in progress

Comments

@umanagarajan12
Copy link

Description

We've updated all our projects to support .NET 9.0. Our projects use a custom theme, which has been causing issues specifically in .NET 9.0, even though the same setup works seamlessly in .NET 8.0 and other frameworks. We've attached sample projects to illustrate this issue.

Replication Procedure

  1. Run the sample.
  2. Click the ComboBox to open the dropdown.

Screenshot

Image

WpfAppNet90.zip
WpfAppNet80.zip

@himgoyalmicro himgoyalmicro added the Investigate Requires further investigation by the WPF team. label Nov 1, 2024
@harshit7962 harshit7962 self-assigned this Nov 1, 2024
@harshit7962
Copy link
Member

@umanagarajan12, the issue is due to one of our changes to optimize the usage of DynamicResource #5610. At this juncture, you can opt-out from this behaviour to avoid the crash. Please include the following in the csproj.

<ItemGroup>
    <RuntimeHostConfigurationOption Include="Switch.System.Windows.Controls.DisableDynamicResourceOptimization" Value="true" />
</ItemGroup>

Hope this resolves the current issue. In the meantime, we will continue to look further, identify the root cause, and ensure that this will be addressed in the next service release of .NET 9.

cc: @batzen

@harshit7962 harshit7962 added Bug Product bug (most likely) 📭 waiting-author-feedback To request more information from author. and removed Investigate Requires further investigation by the WPF team. labels Nov 1, 2024
@batzen
Copy link
Contributor

batzen commented Nov 1, 2024

Ok. I finally found out what's happening here.
@dipeshmsft
After explaining the issue i strongly advise the WPF team to disable the changes i made by default and before shipping .NET 9 until we find a solution for the issues.

The issue is that the code DeferredResourceReference.GetValue freezes the resolved value depending on which valueSource requested the value and after the value has been resolved once the code, which might freeze the value, never runs again as dictionary is null afterwards.
In this one case there is a brush being referenced via StaticResource.
While StaticResource calls GetValue of DeferredResourceReference the supplied value source is "Unknown" thus the value is not frozen.
When the style later tries to retrieve the brush it's not frozen which then causes the code in StyleHelper.GetChildValueHelper to call StyleHelper.GetInstanceValue (because the check for Freezable and IsFrozen tells it to).
But the passed dataField does not have a value set.

So it's either a bug in the StyleHelper, a bug in DeferredResourceReference or my approach to optimize dynamic resource usage is totally flawed.

With my changes there is only one DeferredResourceReference per key, freezing that one only based on the value source passed in seems very random and i didn't pay enough attention when making my changes.
I am not sure what will happen if we just unconditionally freeze the value.

Table of Frozen state based on BaseValueSourceInternal:
Unknown = Not frozen
Default = Not frozen
Inherited = Not frozen
ThemeStyle = Frozen
ThemeStyleTrigger = Frozen
Style = Frozen
TemplateTrigger = Frozen
StyleTrigger = Frozen
ImplicitReference = Not frozen
ParentTemplate = Frozen
ParentTemplateTrigger = Frozen
Local = Not frozen

@batzen
Copy link
Contributor

batzen commented Nov 1, 2024

As i can't compile this repo, again using .\build.cmd -platform x64 -configuration debug because the build complains about

wpf\src\Microsoft.DotNet.Wpf\src\Themes\PresentationFramework.Luna\ref\PresentationFramework.Luna
-ref.csproj : error NU1604: Project dependency Microsoft.Net.Compilers.Toolset.Framework does not contain an inclusive
lower bound. Include a lower bound in the dependency version to ensure consistent restore results. 

i can't test any changes...

@harshit7962
Copy link
Member

@batzen, thank you for looking into the issue. Regarding the build failure, I am not facing this issue on my machine. Can you help me with the configurations that you are using.

Configurations on my machine:
.NET: 10.0.0-alpha.1.24526.1 and 9.0.0 as well
OS: Windows 11, 23H2
using the x64 machine

Will keep things updated on mitigation of the issue in .NET 9.

@batzen
Copy link
Contributor

batzen commented Nov 5, 2024

@harshit7962 after updating main i now get different errors (same in three different places):

wpf\src\Microsoft.DotNet.Wpf\src\System.Printing\CPP\src\PrintSystemPathResolver.cpp(308,50): err
or C5307: 'int System::String::IndexOf(wchar_t,int)': argument (1) converted from 'char' to 'wchar_t'. Missing 'L' enco
ding-prefix for character literal?

@harshit7962
Copy link
Member

@batzen, this can be resolved by adding the char L on the mentioned lines.

Try changing

IndexOf('\\', 3)

to

IndexOf(L'\\', 3)

We are waiting on some confirmation from the runtime before pushing this fix to the main as of now.

@batzen
Copy link
Contributor

batzen commented Nov 9, 2024

@harshit7962 As the 9.0 branch already got it's branding updated to 9.0.1 does that mean you will ship 9.0 with my faulty optimizations enabled by default?
If that's the case, it will be a PR disaster for WPF.

@batzen
Copy link
Contributor

batzen commented Nov 10, 2024

This change in DeferredResourceReference.GetValue should fix the issue we are encountering here:

if (FrameworkAppContextSwitches.DisableDynamicResourceOptimization)
{
    // Freeze if this value originated from a style or template
    bool freezeIfPossible =
        valueSource == BaseValueSourceInternal.ThemeStyle ||
        valueSource == BaseValueSourceInternal.ThemeStyleTrigger ||
        valueSource == BaseValueSourceInternal.Style ||
        valueSource == BaseValueSourceInternal.TemplateTrigger ||
        valueSource == BaseValueSourceInternal.StyleTrigger ||
        valueSource == BaseValueSourceInternal.ParentTemplate ||
        valueSource == BaseValueSourceInternal.ParentTemplateTrigger;

    // This is to freeze values produced by deferred
    // references within styles and templates
    if (freezeIfPossible)
    {
        StyleHelper.SealIfSealable(value);
    }
}
else
{
    StyleHelper.SealIfSealable(value);
}

Not sure what the negative consequences of this change might be, but always freezing freezables, no matter who asked for the resource, feels more logical.

We really need test applications inside this repo. It's so complicated/time consuming to modify various apps to use the locally built WPF framework. At least the gallery application should be part of this repo. Currently there is not a single application that's part of this repo.
It's even more difficult on main while there is no public version of .NET 10 to target applications to.

@miloush
Copy link
Contributor

miloush commented Nov 10, 2024

there is no public version of .NET 10 to target applications to

Daily builds:

@miloush
Copy link
Contributor

miloush commented Nov 10, 2024

It's so complicated/time consuming to modify various apps to use the locally built WPF framework.

Merging the two repos or even adding a submodule wouldn't really help with that, you would have to have to update all the project files during build (which can arguably be done on a separate repo too) - and possibly could lower the value of tests if they don't run on a final installed runtime.

However, I think I would support having an environmental variable pointing to the local WPF repo and updating the target file to pick it up.

@h3xds1nz
Copy link
Contributor

there is no public version of .NET 10 to target applications to

Also, if you need to run it under VS debugger (Use preview for .NET 10 by the way), set this for the env:

VSDebugger_ValidateDotnetDebugLibSignatures=0

You can test either by referencing the libraries directly from the project (using private assets) or by just simply replacing it in the dotnet\shared folder (that's what you need to do when running under separate host, e.g. using BenchmarkDotNet builds).

@batzen
Copy link
Contributor

batzen commented Nov 10, 2024

@miloush @h3xds1nz is there some page where all that is documented?
Pages like https://github.com/dotnet/wpf/blob/main/Documentation/developer-guide.md don't list all that or it's buried between internal Microsoft documentation steps.

@miloush Merging the repos would improve the inner cycle a lot. Now you have to dance around to align everything instead of having just one repo that manages everything. If the tests are also meant to run on just the installed runtime that should be the exception and special cased instead.
My changes for example made the Gallery app non-functional, even though they passed all tests. That doesn't make me trust the tests very much.
It's clearly my fault that my changes weren't correct, but being told that all tests passed and it's ready for merge made me assume my changes don't cause catastrophic failures.

@miloush
Copy link
Contributor

miloush commented Nov 10, 2024

Ah yeah I certainly don't trust the tests that way. I think they are mainly automated UI tests.

is there some page where all that is documented?

In the readme of the .NET Runtime repo, there is a dogfooding link to https://github.com/dotnet/runtime/blob/main/docs/project/dogfooding.md (it seems to be manually maintained though so still has 9.0 links). The desktop/other runtime links I found here: dotnet/installer#11040 (comment)

@h3xds1nz
Copy link
Contributor

@miloush @h3xds1nz is there some page where all that is documented? Pages like https://github.com/dotnet/wpf/blob/main/Documentation/developer-guide.md don't list all that or it's buried between internal Microsoft documentation steps.

Yeah I don't think so, though there's quite some stuff in dotnet/sdk though but if you don't know what exactly you're looking for, it ain't gonna give you much. There's no step-by-step straightforward guide on how to deal with all this.

As for the testing, I basically selected few of my apps everytime, or wrote targeted tests / created targeted app where applicable because yeah, the tests can't be really trusted and since there's no real way to improve them atm, well, that's the best effort I could do.

By the way, @batzen, I only just noticed that the WinDbg team used ControlzEx for the "new" GUI, so don't break it, alright? They're close, on .NET 8 at the moment, so it would be good to still be able to use it, I don't wanna go back 20 years again. 😀

@harshit7962
Copy link
Member

As the 9.0 branch already got it's branding updated to 9.0.1 does that mean you will ship 9.0 with my faulty optimizations enabled by default?

Yes, the .NET 9 is going in with the optimizations enabled. The code closure for the same was a couple of weeks back, prior to our knowledge of the issue. We are in talks to fast forward the next service release to make amends in this regards.

@harshit7962
Copy link
Member

We really need test applications inside this repo. It's so complicated/time consuming to modify various apps to use the locally built WPF framework. At least the gallery application should be part of this repo.

@batzen, we have quite a few tests here. Agreed that there are some gaps in test infrastructure and we would like to cover them up as much as possible. We are also working to improve our unit tests in this repository itself.

The README in test repository offers initial steps to get started with the testing. Please let us know if any more information is required on testing front. (There are possibly some gaps in documentations, will try to update them by this week).

cc: @miloush, @h3xds1nz

@umanagarajan12, Removing the waiting... label as there is a confirmation about this particular issue getting resolved with the switch. Don't want the issue to be/get closed for better tracking of the resolution.

@harshit7962 harshit7962 removed the 📭 waiting-author-feedback To request more information from author. label Nov 11, 2024
@harshit7962
Copy link
Member

harshit7962 commented Nov 11, 2024

@batzen, regarding the fix, wouldn't we unconditionally seal the value via StyleHelper.SealIfSealable() even if, lets say, valueSource is BaseValueSourceInternal.Unknown, which is very often the case, specially when we validate the DeferredResourceReference?

My changes for example made the Gallery app non-functional, even though they passed all tests. That doesn't make me trust the tests very much.

Also, Can you help us with the issues you are encountering here?

@batzen
Copy link
Contributor

batzen commented Nov 11, 2024

@batzen, regarding the fix, wouldn't we unconditionally seal the value via StyleHelper.SealIfSealable() even if, lets say, valueSource is BaseValueSourceInternal.Unknown, which is very often the case, specially when we validate the DeferredResourceReference?

Yes, that's exactly what my changes are about.
Without my changes there was a new DeferredResourceReference with nearly every call to FetchResource, some of those were sealed and some of those weren't, depending on which source was passed.
With my changes there, most of the time, only one DeferredResourceReference with many calls to FetchResource. Those resources are then only sealed if requested by a style etc.. If, for example, something other than a style requests the resource first the retrieved value is not sealed. If, on the other side, a style requests the resource first the retrieved value is sealed. That's not really deterministic and the reason why i suggest always sealing it.
Doing it that way fixes the issue reported here.
We still should do more testing on various apps to see if my change has any negative effect.
But always sealing the value seems the right approach.
If we can't always seal the value, if it's sealable of course, we will have to roll back all my changes or replace them with a completely different approach.

My changes for example made the Gallery app non-functional, even though they passed all tests. That doesn't make me trust the tests very much.

Also, Can you help us with the issues you are encountering here?

Well, as far as i remember the gallery app crashed with my changes, even though not a single test seems have to have failed during the test runs.

@lindexi
Copy link
Member

lindexi commented Nov 13, 2024

@batzen, this can be resolved by adding the char L on the mentioned lines.

Try changing

IndexOf('\', 3)
to

IndexOf(L'\', 3)
We are waiting on some confirmation from the runtime before pushing this fix to the main as of now.

Seems fixed in #10063

@harshit7962
Copy link
Member

@batzen, we had the suggested fix tested and found multiple failures. I have created a sample repro from one of those failures, please find it here. On starting the test you could see the following failure

System.InvalidOperationException: 'Specified value of type 'System.Windows.Media.Animation.Storyboard' must have IsFrozen set to false to modify.'

which I believe clearly suggests that the fix is not enough.

Since we are closing in on the next service release, we would like to discuss the mitigation of this failure first before looking for the cause and solution. We are inclining towards removal of this feature for .NET 9 completely given the unknowns, either by opt-in or reverting the changes altogether. Thoughts?

@batzen
Copy link
Contributor

batzen commented Nov 15, 2024

Will have a look at the repo during the weekend.

My suggestion is: Disable my changes by default, so make it opt-in.
That way people who are willing to try it and want the performance gains, knowing the risks, can benefit from it.

For .NET 10 or, depending on the amount of required changes, a .NET 9 servicing release I can try to fix the issues we encounter or try a different approach to improve the performance.
The performance gains are so massive in this area that it's worth investing more of my time. I benchmarked again and the gains are from 15 seconds with the opt-out switch, to 0.150 without opt-out, when unloading controls. That's a factor of 100.

@miloush
Copy link
Contributor

miloush commented Nov 15, 2024

Making something opt in has non-trivial costs. One of them is to actually do the opt in, but then also documenting it and the risks so that they can be known, and given the history of opt ins, it is forever.

@batzen
Copy link
Contributor

batzen commented Nov 15, 2024

@miloush The switch which enables opt-in/opt-out would be removed/it's default changed as soon as we know it's working properly.

@batzen
Copy link
Contributor

batzen commented Nov 15, 2024

@harshit7962 I found a completely different approach to solve the majority of the performance issues i tried to solve with my initial changes.
The main issue is that WeakReferenceList.FindWeakReference gets slower the more entries are added.
We can counter that by not using one WeakReferenceList per dictionary, but one WeakReferenceList per resource key instead.
This would require a change to DeferredResourceReference though.
Currently DeferredResourceReference only uses one field _keyOrValue to store the key or value, depending on the state.
Instead we would have to keep that information in dedicated fields _key and _value.

My tests didn't show any significant increase in memory consumption by doing so, most likely because the use of dedicated lists per resource key also decreases the "empty" space consumed by them because there are only a few entries per list and their grow, in regards to reserved capacity, is way lower than before.

I could create a PR for these changes and replace my original approach with the new approach.
What do you think?

@harshit7962
Copy link
Member

Changing the overall approach is unlikely to be approved for the service release. Additionally, we don't have much of a time left to get it fully tested before the service release code freeze. Would you be able to make that contribution for .NET 10? We would love to have the optimization in this critical path.

On mitigation of the issue in .NET 9: I believe that using the opt-in switch doesn't cause much harm, but it may not fully serve the overall purpose. Developers might hesitate to opt-in for the optimization, especially if we mention that it is unstable. This would also add logistical challenges of documentation and acquaintance of the feature, as @miloush suggested earlier. Since we are planning to replace the current approach altogether, it doesn't seem very beneficial to test the current approach. We are inclined to revert the change.

@harshit7962
Copy link
Member

@umanagarajan12, we have made this feature an opt-in for .NET 9. Similar to the radio button issue, the fix should be available by mid January, at which point the opt-out switch won't be needed.

We have still kept-it as an opt-out for .NET 10 currently to get any more such reports of failures, and get a better understanding of the magnitude of the fix.

cc:/ @batzen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Product bug (most likely) 🚧 work in progress
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants