Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzers are slow compared to 2.4 #2594

Closed
siegfriedpammer opened this issue Dec 31, 2021 Discussed in #2592 · 16 comments
Closed

Analyzers are slow compared to 2.4 #2594

siegfriedpammer opened this issue Dec 31, 2021 Discussed in #2592 · 16 comments

Comments

@siegfriedpammer
Copy link
Member

Discussed in #2592

Originally posted by Doug-Becker December 30, 2021
I had been happily using ILSpy version 2.4 for several years and never felt a strong reason to upgrade. Recently we changed the compiler we use to build our .NET Framework 4.8 assemblies and ILSpy 2.4 throws exceptions and crashes when decompiling certain assemblies.

So I upgraded to ILSpy 7.1, and while it seems to solve the crashing problems, it is incredibly slow when running an analysis. On ILSpy 2.4, analyses (for example, a "used by" analysis on a type or method) would complete in 1-2 seconds. On IlSpy 7.1, they are taking tens of minutes (not an exaggeration). This makes ILSpy effectively useless for me.

CPU usage during this wait is low (<10%) when running an analysis. There seems to be no major disk activity. There seems to be only one thread using CPU during this time (but even if it used all cores, the performance would not be close to the ILSpy 2.4)

Possibly a factor for this is that I have 6500+ assemblies loaded (as our system is huge). But ILSpy 2.4 handled this without a problem.

I haven't found any complaints about this in any web search, so I'm a bit puzzled why I am seeing this. I would have thought that such a dramatic loss of performance would have been noted by somebody. I realize that it's been a very long time since ILSpy 2.4, and the code has changed significantly since then, so I don't have much hope that anyone could easily point to why this has happened. But I'm hoping that someone could suggest a possible mitigation or at least have a theory to what might be causing this.

@siegfriedpammer
Copy link
Member Author

First analysis on my computer:

633 Assemblies
[mscorlib]System.Object.Equals(object) Used By
ILSpy 2.4: 5.42s 1788 results
ILSpy master (7.2): 88.86s 1472 results

will investigate further

siegfriedpammer added a commit that referenced this issue Dec 31, 2021
….GetRuntimePackAsync, PEFile.Name, PEFile.FullName to improve performance of assembly resolving. This improves performance of the analysis by a factor of 2.
@Doug-Becker
Copy link

Thanks for looking into this.
In email, you asked if this were true for the list of assemblies in the GAC. I tried with the 1239 assemblies in the GAC on my machine. A "used by" analysis for a type with one result on ILSpy 2.4 takes less than a second, almost instantly. On ILSpy 7.1, it takes 39s. For a type with 965 results, on ILSpy 2.4 it takes about a second. On ILSpy 7.1 it takes 85s.

I see you have a commit to help this issue. Should I attempt to download and build the latest branch, or would it be better to wait?

@siegfriedpammer
Copy link
Member Author

I see you have a commit to help this issue. Should I attempt to download and build the latest branch, or would it be better to wait?

You can always try the latest version and see how much it improves. Thanks!

@Doug-Becker
Copy link

I did a little informal testing with the latest branch including the commit, 7.1, and 2.4:

All tests below are "used by" analysis starting with a type. The timings for 7.1 and 2.4 were done by stopwatch, and may not be exact. The number of results for 2.4 were counted by hand.

GAC
System.Diagnostics.SymbolStore.SymBinder
ILSpy latest: 6481ms, 1 result
ILSpy 7.1: 40s, 1 result
ILSpy 2.4: less than a second, 1 result
System.Data.DataSet
ILSpy latest: 3910ms, 306 results
ILSpy 7.1: 90s, 965 results
ILSpy 2.4: 1-3 seconds,306 results
System.Data.DataTable
ILSpy latest: 3773ms, 857 results
IlSpy 7.1: 86s, 2658 results
ILSpy 2.4: 1-3 seconds. 523 results
(my project with more than 6500 assemblies)
(a particular type)
ILSpy latest: 240525ms, 13 results
ILSpy 7.1: too long to retry (at least 20 minutes, last time I checked)
ILSpy 2.4: about 1-2 seconds, 13 results

The performance of the latest is definitely improved, but still not near the IL2.4 performance for my large project. The different number of results for some cases is concerning.

My wild, uninformed speculation is that this analysis in 2.4 was O(n) in the number of assemblies (or maybe types? members?) while the latest versions are O(n^2) or worse, somehow.

Is there something I can do to help find out where the time is being taken?

@dgrunwald
Copy link
Member

dgrunwald commented Jan 28, 2022

So there's a bunch of things coming together here.

  1. ILSpy 2.4 did not search everything. For example searching typeof() in attributes requires decoding attribute signatures, which ILSpy 2.4 did not do (false negatives) but ILSpy 7 does, so that takes some extra time.
  2. ILSpy 7 has a feature where compile-generated methods (e.g. when the type is "used by" the body of a lambda expression) are mapped back to the method containing the lambda. ILSpy 2.4 did not have this feature and instead directly listed the compiler-generated method.
    Creating this mapping is somewhat expensive -- and it turns out we currently do this for all types, not just for those where we found any results. So there's plenty of optimization potential here.
  3. ILSpy 2.4 used Mono.Cecil, which is a somewhat low-level model of the .NET metadata. Modern ILSpy uses System.Reflection.Metadata, which is even lower-level and more efficient, but tricky to use. The decompiler needs a higher-level view of the type system and thus has its own abstractions on top. In ILSpy 2.4 the analyzers directly used Mono.Cecil; in 7 they use the high-level type-system because SRM is missing some functionality. This makes the analyzers significantly more expensive. We could fix this by re-implementing the missing functionality with SRM, bypassing the type-system.
  4. In the new type system, for every assembly we create a type-system of the referenced assemblies. So if your 6500 assemblies all depend on each other (=dependency graph being a huge linked list), the type-system construction may end up taking quadratic time.
    4b) It turns out the cache for resolving assembly references did not work as expected; so each assembly referencing "System.dll" would trigger a separate file-system search for that dll.

siegfriedpammer added a commit that referenced this issue Jan 28, 2022
… actually have a usage of the analyzed method.
siegfriedpammer added a commit that referenced this issue Jan 29, 2022
…ve LoadedAssembly.CreateLoadedAssemblyLookupAsync calls
siegfriedpammer added a commit that referenced this issue Jan 29, 2022
…ve LoadedAssembly.CreateLoadedAssemblyLookupAsync calls
@Doug-Becker
Copy link

Before I noticed that you had made some commits on this, I had tried making some caching optimizations of my own. See https://github.com/Doug-Becker/ILSpy/tree/issue2594 for the changes. Hopefully these will show some benefit.

I had made a global cache (per assembly list) in LoadedAssembly.MyAssemblyResolver for assemblies to address your issue 4. I also added some caching of the method analysis in TypeUsedByAnalyzer. Finally, looking at the ILSpy 2.4 code, it appears to me that ILSpy 2.4 managed to do parallel analysis. I've added PLINQ to TypeUsedByAnalyzer to get some speedup in this way.
The improvements from my changes are dramatic (mostly from the assembly cache to avoid the quadratic search). Still the results are 2-3x slower than ILSpy 2.4 on my large project, but they are now what I would call usable:

GAC

  • System.Diagnostics.SymbolStore.SymBinder

ILSpy latest: 6481ms, 1 result
... with LoadedAssembly.MyAssemblyResolver, caching results: 3257ms, 1 result
... also with other optimizations: 1069ms, 1 result
System.Data.DataSet

  • ILSpy latest: 3910ms, 306 results

... with LoadedAssembly.MyAssemblyResolver, caching results: 767ms, 306 results
... also with other optimizations: 548ms, 306 results
System.Data.DataTable

  • ILSpy latest: 3773ms, 857 results

... with LoadedAssembly.MyAssemblyResolver, caching results: 333ms, 857 results
... also with other optimizations: 612ms, 857 results

(my project with more than 6500 assemblies)

  • (type "A")

ILSpy latest: 240525ms, 13 results
... with LoadedAssembly.MyAssemblyResolver, caching results: 5531ms, 13 results
... also with other optimizations: 4773ms, 13 results

  • (type "B")

ILSpy latest: 274690ms, 1252 results
... with LoadedAssembly.MyAssemblyResolver, caching results: 19543ms, 1252 results
... also with other optimizations: 5698ms, 1252 results

  • (type "C")

ILSpy latest: 266660ms, 597 results
... with LoadedAssembly.MyAssemblyResolver, caching results: 8600ms, 597 results
... also with other optimizations: 4537ms, 597 results

"ILSpy latest" above is before your last set of changes. At some point I will try to measure with your changes instead.

@siegfriedpammer
Copy link
Member Author

@Doug-Becker I have pushed some optimizations, which unfortunately are primarily targeted at MethodUsedByAnalyzer (unfortunately, I misread/misremembered the situation you stated in the original discussion). However, most of the changes will also benefit the other analyzers.

Could you please try https://github.com/icsharpcode/ILSpy/tree/analyzer-opt with your large assembly list? Thanks!

@dgrunwald I just took a look at the old analyzer implementation and it was indeed using PLINQ to speed up the whole process. However, think we first should focus on improving the analyzer code in single-threaded mode. Parallelization can always be added to get a "free" boost. What do you think?

@siegfriedpammer
Copy link
Member Author

(my project with more than 6500 assemblies)

  • (type "A")

ILSpy latest: 240525ms, 13 results ... with LoadedAssembly.MyAssemblyResolver, caching results: 5531ms, 13 results ... also with other optimizations: 4773ms, 13 results

  • (type "B")

ILSpy latest: 274690ms, 1252 results ... with LoadedAssembly.MyAssemblyResolver, caching results: 19543ms, 1252 results ... also with other optimizations: 5698ms, 1252 results

  • (type "C")

ILSpy latest: 266660ms, 597 results ... with LoadedAssembly.MyAssemblyResolver, caching results: 8600ms, 597 results ... also with other optimizations: 4537ms, 597 results

@Doug-Becker This sounds like our AssemblyListSnapshot optimization will provide a vast improvement in speed. I would be very interested to see a comparison. Thanks!

@Doug-Becker
Copy link

@siegfriedpammer: I've tried the https://github.com/icsharpcode/ILSpy/tree/analyzer-opt branch on my project:

  • (type "A")
    ILSpy analyzer-opt branch: 153670ms, 13 results
  • (type "B")
    ILSpy analyzer-opt branch: 168889ms, 1252 results
  • (type "C")
    ILSpy analyzer-opt branch: 146673ms, 597 results

An improvement, but not as much as I had hoped.

@siegfriedpammer
Copy link
Member Author

Thank you for your feedback, I have pushed some optimizations for TypeUsedByAnalyzer, which now finds 416 uses of System.Data.DataSet in my GAC in about 2.4 seconds (before that it took 7.4 seconds). I think there is still some potential for optimization, but I would like to hear your opinion, whether we're getting to the point where you would call the analyzer usable for your use-case. Thanks!

I am not sure if you are familiar with the Visual Studio performance profiler. If you want, you can try profiling the one analyzer run on your machine and then send the results to me. For short analysis sessions the size should only be about ~30 MB which I think is manageable. To keep file size down, make sure to start the profiler with data collection disabled and only activate it right before you expand the "Used By" analyzer tree node and stop profiling after it is done.

If you could provide profiling data this would be really nice and might help us understand your use-case better.

@siegfriedpammer
Copy link
Member Author

I added another set of optimizations. fyi

@dgrunwald
Copy link
Member

Today we looked some more into optimizations with focus on the "Find uses of type" case for System.Data.DataSet in the whole GAC.
Explaining some of the optimizations we did:

  • Last week we added AssemblyListSnapshot which maintains a lookup table for loaded assemblies. This turns "use existing assembly if already loaded" from an O(N) loop to O(1).
  • Last week we changed MethodUsedByAnalyzer and TypeUsedByAnalyzer: before resolving a tokens in a method body, we added a quick name check whether name for the token matches the name of the analyzed method. This avoids unnecessarily performing overload resolution for everything.
  • Last week we fixed MethodUsedByAnalyzer to avoid computing CodeMappingInfo for types where we did not find any search results.
  • Today we fixed a bug that caused ILSpy treat internal classes as if they were public and search unnecessarily many assemblies.
  • Today we applied the same "quick name check" change to signatures checked by the TypeUsedByAnalyzer
  • TypeUsedByAnalyzer now bypasses the type system when searching custom attributes for typeof().
  • Finally, we re-enabled parallelization (so far only in TypeUsedByAnalyzer)

Together these changes bring "Find uses of type" for System.Data.DataSet in the whole GAC from initially 7.4s (as measured by @siegfriedpammer) to 2.4s last week, to 750ms now (with parallelization).
I don't know the exact time for ILSpy 2.4 but I believe we're close or already faster on this particular test case.

Ideas for further improvements:

  • For all analyzers: Assembly resolving is still somewhat slow -- we optimized the happy path (cache hit); but cache misses (for missing assemblies) can still be quite expensive. some additional caching could work wonders here.
  • For TypeUsedByAnalyzer: there's the potential to use bitsets to mark type references that we already checked and which were not interesting. This could avoid repeated string comparisons (or even avoid resolving when the quick name check results in false positives).
  • TypeUsedByAnalyzer: I think we still need to (re-)add support for CodeMappingInfo?
  • MethodUsedByAnalyzer: we should at least add parallelization.
  • Analyzers we did not optimize yet, but which we should apply the same optimizations to: FieldAccessAnalyzer, MethodVirtualUsedByAnalyzer, TypeInstantiatedByAnalyzer

siegfriedpammer added a commit that referenced this issue Feb 5, 2022
… actually have a usage of the analyzed method.
siegfriedpammer added a commit that referenced this issue Feb 5, 2022
…ve LoadedAssembly.CreateLoadedAssemblyLookupAsync calls
siegfriedpammer added a commit that referenced this issue Feb 5, 2022
siegfriedpammer added a commit that referenced this issue Feb 5, 2022
@siegfriedpammer
Copy link
Member Author

Quick comparison update:

Comparing results for System.Data.DataSet using my GAC:

This is 2.4 (modified to include result count and stopwatch output):

image

This is 7.2 (analyzer-opt branch):

image

Both release builds measured on my ThinkPad E590 (Intel Core i7 8th Gen + an SSD)

@Doug-Becker
Copy link

First, let me apologize. I haven't been able to spend much time reacting to your changes. (And I'm sorry to say that's probably going to be true going forward as well. I apologize again.)
Second, let me thank you again for all the work you have put into looking into this issue. You've given this issue more attention than I have any right to expect. And of course, it's very difficult for you to test my scenario. Even if in the end I have to give up on ILSpy, I think you've been very helpful.

But the last set of changes... "disappointing" doesn't cover it.

And one more apology from me (my last), my project codebase has moved on since I gave the last set of results; the numbers are not entirely apples-to-apples. And for some reason, everything is a bit slower. With my optimized branch that was previously giving me results in 4-6s, I'm now seeing 12-13s. I have no idea why; the project changes should not be so dramatic. Possibly something else is happening on my computer to slow it down, but I can't guess what it could be.

But with all those caveats, with the latest changes, I'm seeing:

  • type "A"
    576899ms, 13 results

  • type "B"
    570171ms, 1234 results

  • type "C"
    578261ms, 586 results

(The different number of results are also seen on my branch, I don't believe this indicates any problem.)

So these results are roughly 50x slower than my optimized branch. And I cannot use the tool with this sort of performance.

I have not used the Visual Studio performance profiler before, but I did give it a try. But if there is a way to save the results so I can send to you, it seems to be well hidden. And it doesn't seem to be discoverable with a Google search. (I did find instructions for VS2017 that no longer apply to VS2022.) Maybe there are different types of profiling and I did the wrong one? If you can point me to some documentation, I might be able to send you some results.

On the other hand, I did use a profiler (YourKit) to guide my optimizations. Might I suggest that it could be worth looking at them? At least it shows that it's possible to deliver the results much, much faster. Possibly there is something in my changes that is unsuitable for the actual product, but I'd hope that looking at my changes should give a sense of what could be causing over an order of magnitude of unnecessary work for analysis with many interdependent assemblies.

And one last minor note, it's not only "type used by" analysis that is very slow; at least "method used by" is as well. But I think you have started to look into that as well.

Thank you again for looking into this.

@siegfriedpammer
Copy link
Member Author

siegfriedpammer commented Feb 6, 2022

Can you please send an email to [email protected]. Thanks!

siegfriedpammer added a commit that referenced this issue Mar 7, 2022
… actually have a usage of the analyzed method.
siegfriedpammer added a commit that referenced this issue Mar 7, 2022
…ve LoadedAssembly.CreateLoadedAssemblyLookupAsync calls
siegfriedpammer added a commit that referenced this issue Mar 7, 2022
siegfriedpammer added a commit that referenced this issue Mar 7, 2022
@Doug-Becker
Copy link

With the latest master branch, I have seen a dramatic improvement in performance on my large project with over 6500 assemblies. I consider the analyzers useable on my project.

Measurements below:

Type “A”
ILSpy 2.4: about 1-2s, 13 results
ILSpy (master as of 2022-01-27): 240525ms, 13 results
ILSpy (analyzer-opt as of 2022-02-05): 576899ms, 13 results
ILSpy (master as of 2022-04-29): 4477ms, 13 results

Type “B”
ILSpy 2.4: (threw exception)
ILSpy (master as of 2022-01-27): 274690ms, 1252 results
ILSpy (analyzer-opt as of 2022-02-05): 570171ms, 1234 results
ILSpy (master as of 2022-04-29): 4724ms, 1234 results

Type “C”
ILSpy 2.4: about 1-2s, 412 results
ILSpy (master as of 2022-01-27): 266660ms, 597 results
ILSpy (analyzer-opt as of 2022-02-05): 578261ms, 586 results
ILSpy (master as of 2022-04-29): 5218ms, 586 results

Comparisons are not perfect apples-to-apples since the codebase changed somewhat in between analyses, but should be comparable.

While these numbers are 2-4x slower than ILSpy 2.4, I think these are acceptable performance numbers. I think this issue can be closed.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants