Fix to avoid stalling the process when ETW is doing a rundown #8357

vancem · 2016-11-29T20:35:15Z

This only matters when there are MANY JIT-compiled methods, but Bing operates
in exactly this mode, and so its process stalls for several seconds while rundown completes.

This change does not eliminate the problem, but it makes the stall MUCH less likely, and
it is a trivial, safe fix. The problem is that as part of a GC, we clean up any removed
JIT code. To do this we take the JIT code manager lock, but the JIT code iterator takes
the same lock, and that iterator is used during ETW rundown. Thus rundown blocks GCs.

Almost all the time, we DON'T have JIT code manager cleanup to do, so we just avoid
taking the lock in that case, and this makes the stall MUCH less likely.
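
As a rough illustration of the idea (not the actual `EEJitManager` code), the quick out can be sketched with a toy class. `ToyJitManager`, `HeapNode`, `QueueForCleanup`, and `LockAcquisitions` are hypothetical names invented for this sketch; only `m_cleanupList` and `CleanupCodeHeaps` come from the diff. The sketch also assumes that nothing modifies the list concurrently while it is read without the lock.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an entry on the cleanup list.
struct HeapNode { HeapNode* next; };

// Toy model of a code manager whose lock is shared with an iterator.
class ToyJitManager {
public:
    // Stand-in for the GC-time cleanup; returns how many nodes it freed.
    int CleanupCodeHeaps() {
        // Quick out: nothing is queued, so never touch the lock.
        // Assumption: no other thread modifies m_cleanupList right now.
        if (m_cleanupList == nullptr)
            return 0;

        std::lock_guard<std::mutex> hold(m_lock);
        ++m_lockAcquisitions;
        int freed = 0;
        while (m_cleanupList != nullptr) {
            HeapNode* next = m_cleanupList->next;
            delete m_cleanupList;
            m_cleanupList = next;
            ++freed;
        }
        return freed;
    }

    // Queue one (empty) node so the next cleanup has work to do.
    void QueueForCleanup() {
        std::lock_guard<std::mutex> hold(m_lock);
        m_cleanupList = new HeapNode{m_cleanupList};
    }

    // How many times CleanupCodeHeaps actually took the lock.
    int LockAcquisitions() const { return m_lockAcquisitions; }

private:
    std::mutex m_lock;                  // shared with the (not shown) iterator
    HeapNode*  m_cleanupList = nullptr; // pending cleanup work
    int        m_lockAcquisitions = 0;  // instrumentation for this sketch only
};
```

In this model, a call to `CleanupCodeHeaps()` with an empty list leaves `LockAcquisitions()` at zero, so a long-running holder of `m_lock` (such as a rundown iterator) would not delay it. A call with queued work still takes the lock and can still stall, which is why the fix only makes the problem less likely.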

mjsabby

LGTM

jkotas · 2016-11-30T02:50:35Z

src/vm/codeman.cpp

@@ -3445,6 +3445,16 @@ void EEJitManager::CleanupCodeHeaps()
    }
    CONTRACTL_END;

+	// Quick out, don't even take the lock if we have not cleanup to do.


Could you please convert tabs to spaces?

jkotas · 2016-11-30T02:51:31Z

src/vm/codeman.cpp

+	// blocked while ETW is doing rundown.   By not taking the lock we avoid
+	// this stall most of the time since cleanup is rare, and ETW rundown is rare
+	// the likelihood of both is very very rare.   
+	if (m_cleanupList == NULL)


It may be better to do this after the assert - so that the condition is still validated.

vancem · 2016-11-30T15:48:58Z

I have moved the early out after the assert.

@dotnet-bot test this please
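
For readers following the review, the difference between the two orderings can be sketched as below. This is a toy model only: `ToyCleanupState`, `CheckPrecondition`, and both function names are hypothetical, and `callerIsAllowed` is a placeholder because the condition checked by the real assert is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical state for the sketch; counters exist only for illustration.
struct ToyCleanupState {
    bool  callerIsAllowed    = true;    // placeholder for the asserted precondition
    void* cleanupList        = nullptr; // non-null means there is cleanup work
    int   preconditionChecks = 0;
    int   lockAcquisitions   = 0;
};

// Stand-in for the assert in the real function.
inline void CheckPrecondition(ToyCleanupState& s) {
    ++s.preconditionChecks;
    assert(s.callerIsAllowed);
}

// Early out before the check: calls with an empty list skip validation.
inline void CleanupEarlyOutFirst(ToyCleanupState& s) {
    if (s.cleanupList == nullptr)
        return;
    CheckPrecondition(s);
    ++s.lockAcquisitions;    // stands in for taking the lock and cleaning up
    s.cleanupList = nullptr;
}

// Early out after the check: every call is validated, and the lock is
// still skipped when there is nothing to clean up.
inline void CleanupAssertFirst(ToyCleanupState& s) {
    CheckPrecondition(s);
    if (s.cleanupList == nullptr)
        return;
    ++s.lockAcquisitions;
    s.cleanupList = nullptr;
}
```

With an empty list, `CleanupEarlyOutFirst` performs zero precondition checks, while `CleanupAssertFirst` performs one and still never takes the lock. Since the empty-list case is the common one, checking first keeps the validation meaningful.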

vancem · 2016-11-30T17:01:42Z

I have confirmed that with this change, in a common example, managed code continues to run (allocating GC objects) while ETW is performing a rundown. The change has the desired effect. (I was worried that other locks would become the critical issue once we cleared this one, but that is not the case.)

vancem · 2018-02-15T00:19:08Z

This fix has also been ported to the .NET Desktop framework.

See Bug 409381

https://devdiv.visualstudio.com/DefaultCollection/DevDiv/_workitems?id=409381&_a=edit

Resolved on 4/18/2017 (should be in 4.7.1, but I have not confirmed).

…/coreclr#8357) Commit migrated from dotnet/coreclr@e26e355

dnfclas added the cla-already-signed label Nov 29, 2016

mjsabby approved these changes Nov 29, 2016

jkotas reviewed Nov 30, 2016

vancem closed this Nov 30, 2016

vancem force-pushed the ETWRundownFix.11-19-16 branch from 37a678f to 6549b7a Compare November 30, 2016 15:39

vancem reopened this Nov 30, 2016

dnfclas added the cla-already-signed label Nov 30, 2016

vancem force-pushed the ETWRundownFix.11-19-16 branch from 2994c33 to d052e8c Compare November 30, 2016 15:46

jkotas approved these changes Nov 30, 2016

jkotas merged commit e26e355 into dotnet:master Nov 30, 2016

vancem deleted the ETWRundownFix.11-19-16 branch November 30, 2016 19:46

karelz modified the milestone: 2.0.0 Aug 28, 2017

mjsabby mentioned this pull request Sep 4, 2020

Why does the GC not suspend a thread doing ETW rundown? dotnet/runtime#41857

Closed

brianrob mentioned this pull request May 30, 2024

Very Long JIT Times During ETW Rundown dotnet/runtime#102858

Open
