Fix to avoid stalling the process when ETW is doing a rundown #8357
Conversation
LGTM
@@ -3445,6 +3445,16 @@ void EEJitManager::CleanupCodeHeaps()
    }
    CONTRACTL_END;

    // Quick out, don't even take the lock if we have no cleanup to do.
Could you please convert tabs to spaces?
    // blocked while ETW is doing rundown. By not taking the lock we avoid
    // this stall most of the time: cleanup is rare and ETW rundown is rare,
    // so the likelihood of both is very, very rare.
    if (m_cleanupList == NULL)
It may be better to do this after the assert, so that the condition is still validated.
37a678f to 6549b7a (Compare)
This only matters when there are MANY JIT-compiled methods, but Bing operates in exactly this mode, and thus it stalls for several seconds while rundown completes. This change does not fix the problem completely, but it makes the stall MUCH less likely, and it is a trivial, safe fix.

The problem is that as part of a GC we clean up any removed JIT code. To do this we take a JIT code manager lock, but this is the same lock that the JIT code iterator takes during ETW rundown, so rundown blocks GCs. Almost all the time we DON'T have any JIT code manager cleanup to do, so we simply avoid taking the lock in that case, which makes the stall MUCH less likely.
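A minimal, self-contained sketch of the scenario described above, using std::mutex and invented names rather than the actual EEJitManager code: one thread holds the code manager lock for a long rundown-style enumeration, and the cleanup path only contends for that lock when its cleanup list is non-empty, which is rare.

    // Illustrative only: the lock, list, and function names below are
    // stand-ins for the runtime's code manager state, not coreclr types.
    #include <chrono>
    #include <iostream>
    #include <mutex>
    #include <thread>

    std::mutex codeManagerLock;          // stand-in for the JIT code manager lock
    void*      cleanupList = nullptr;    // stand-in for the pending-cleanup list

    void CleanupCodeHeapsSketch()
    {
        // Quick out: nothing to clean up, so don't even touch the lock.
        if (cleanupList == nullptr)
            return;

        // Rare path: there is work, so take the lock and do the cleanup.
        std::lock_guard<std::mutex> hold(codeManagerLock);
        // ... free retired code heaps here ...
        cleanupList = nullptr;
    }

    int main()
    {
        // "Rundown" thread: enumerates JIT-compiled methods under the lock.
        std::thread rundown([] {
            std::lock_guard<std::mutex> hold(codeManagerLock);
            std::this_thread::sleep_for(std::chrono::seconds(2));   // long enumeration
        });

        std::this_thread::sleep_for(std::chrono::milliseconds(100));

        // "GC" thread: with the quick out this returns immediately, because
        // the cleanup list is empty; without it, it would stall behind rundown.
        auto start = std::chrono::steady_clock::now();
        CleanupCodeHeapsSketch();
        auto elapsed = std::chrono::steady_clock::now() - start;
        std::cout << "cleanup took "
                  << std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count()
                  << " ms\n";

        rundown.join();
        return 0;
    }

With the quick out in place the printed time stays near zero; remove it and the call blocks for the full two seconds the "rundown" thread holds the lock, which is the stall the change is avoiding.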
2994c33 to d052e8c (Compare)
I have moved the early out after the assert. @dotnet-bot test this please
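For illustration, a simplified, self-contained sketch of the resulting ordering (standard C++ with placeholder names; the real precondition, lock holder, and cleanup worker are the runtime's, not these): the assert still runs on every call, the lock-free quick out follows it, and only the rare path takes the lock.

    #include <cassert>
    #include <mutex>

    struct CleanupItem { CleanupItem* next; };

    static std::mutex   s_codeManagerLock;        // stand-in for the JIT code manager lock
    static CleanupItem* s_cleanupList = nullptr;  // stand-in for m_cleanupList
    static bool         s_isGCThread  = true;     // stand-in for the real precondition

    void CleanupCodeHeapsSketch()
    {
        // The precondition is still validated on every call, even when we
        // return early without taking the lock (the point raised in review).
        assert(s_isGCThread);

        // Quick out: no cleanup pending, so never touch the lock.
        if (s_cleanupList == nullptr)
            return;

        // Rare case: there is work, so take the lock and drain the list.
        std::lock_guard<std::mutex> hold(s_codeManagerLock);
        while (s_cleanupList != nullptr)
        {
            CleanupItem* next = s_cleanupList->next;
            delete s_cleanupList;                 // stand-in for freeing a retired code heap
            s_cleanupList = next;
        }
    }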
I have confirmed that, with this change, in a common example managed code continues to run (allocating GC objects) while ETW is performing a rundown. It has the desired effect. (I was worried that other locks would end up becoming the critical issue after we cleared this one, but that is not the case.)
This fix has also been ported to the .NET Desktop Framework. See Bug 409381 https://devdiv.visualstudio.com/DefaultCollection/DevDiv/_workitems?id=409381&_a=edit, resolved on 4/18/2017 (it should be in 4.7.1, but I have not confirmed that).
Fix to avoid stalling the process when ETW is doing a rundown (…/coreclr#8357). Commit migrated from dotnet/coreclr@e26e355.