This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fix to avoid stalling the process when ETW is doing a rundown #8357

Merged 1 commit on Nov 30, 2016

@vancem vancem commented Nov 29, 2016

This only matters when there are MANY JIT compiled methods, but Bing operates
in exactly this mode, and thus it stalls for several seconds while rundown completes.

This fix does not fix the problem completely, but it makes it MUCH less likely, and is
a trivial, safe fix. The problem is that as part of a GC, we do cleanup of any removed
JIT code. To do this we take a JIT code manager lock, but this is also a lock that the
JIT code iterator takes and is used during ETW rundown. Thus rundown blocks GCs.

Almost all the time, we DON'T have JIT code manager cleanup to do, so we just avoid
taking the lock in that case, and this makes the stall MUCH less likely.
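
The early-out pattern described above can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions, not the CoreCLR code: `CodeManagerSketch`, `PendingCleanup`, `QueueCleanup` and `IteratorLock` are hypothetical names, and the sketch uses `std::atomic` for the unlocked read because it has no runtime suspension to rely on.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a code heap entry that is waiting to be freed.
struct PendingCleanup {
    PendingCleanup* next = nullptr;
};

// Hypothetical stand-in for the JIT code manager; not the real EEJitManager.
class CodeManagerSketch {
public:
    // Queue an item for later cleanup (done under the lock).
    void QueueCleanup(PendingCleanup* item) {
        std::lock_guard<std::mutex> hold(m_lock);
        item->next = m_cleanupList.load(std::memory_order_relaxed);
        m_cleanupList.store(item, std::memory_order_release);
    }

    // Called as part of a "GC". Returns how many items were cleaned up.
    int CleanupCodeHeaps() {
        // Quick out: if there is nothing to clean up, never touch the lock,
        // so a long-running iterator that holds it cannot stall this call.
        if (m_cleanupList.load(std::memory_order_acquire) == nullptr) {
            return 0;
        }

        std::lock_guard<std::mutex> hold(m_lock);
        int cleaned = 0;
        PendingCleanup* item = m_cleanupList.exchange(nullptr);
        while (item != nullptr) {
            PendingCleanup* next = item->next;
            item->next = nullptr;  // a real manager would release the code heap here
            ++cleaned;
            item = next;
        }
        return cleaned;
    }

    // The same lock a code iterator (for example during rundown) would hold.
    std::mutex& IteratorLock() { return m_lock; }

private:
    std::mutex m_lock;
    std::atomic<PendingCleanup*> m_cleanupList{nullptr};
};
```

In this sketch, a caller that holds `IteratorLock()` for a long time only delays `CleanupCodeHeaps()` when the cleanup list is non-empty, which mirrors why the fix makes the stall much less likely rather than impossible.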


@mjsabby mjsabby left a comment

LGTM

@@ -3445,6 +3445,16 @@ void EEJitManager::CleanupCodeHeaps()
}
CONTRACTL_END;

// Quick out, don't even take the lock if we have no cleanup to do.
Member

Could you please convert tabs to spaces?

// blocked while ETW is doing rundown. By not taking the lock we avoid
// this stall most of the time: cleanup is rare and ETW rundown is rare, so
// the likelihood of both at once is very, very rare.
if (m_cleanupList == NULL)
Member

It may be better to do this after the assert - so that the condition is still validated.
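
The ordering suggested in this comment can be sketched as below: validate the precondition first, then take the early out. This is a hedged illustration only. `CleanupState`, `CleanupSketch`, `callerIsGcThread` and `preconditionChecks` are hypothetical names, and the flag merely stands in for whatever condition the real assert checks.

```cpp
#include <cassert>

// Hypothetical state; the flag stands in for the condition the real assert validates.
struct CleanupState {
    bool  callerIsGcThread = false;
    void* cleanupList = nullptr;
    int   preconditionChecks = 0;  // counts how often the precondition was evaluated
};

// Returns true when cleanup work was actually performed.
bool CleanupSketch(CleanupState& s) {
    // 1. Validate the precondition on every call, including the common
    //    "nothing to do" case, so misuse is still caught in debug builds.
    ++s.preconditionChecks;
    assert(s.callerIsGcThread);

    // 2. Early out before any lock would be taken.
    if (s.cleanupList == nullptr) {
        return false;
    }

    // 3. A real implementation would take the lock here and walk the list.
    s.cleanupList = nullptr;
    return true;
}
```

Placing the early out above the assert would skip the check on almost every call, since the cleanup list is usually empty.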

@vancem vancem closed this Nov 30, 2016
@vancem vancem force-pushed the ETWRundownFix.11-19-16 branch from 37a678f to 6549b7a Compare November 30, 2016 15:39
@vancem vancem reopened this Nov 30, 2016
@vancem vancem force-pushed the ETWRundownFix.11-19-16 branch from 2994c33 to d052e8c Compare November 30, 2016 15:46
vancem commented Nov 30, 2016

I have moved the early out after the assert.

@dotnet-bot test this please

vancem commented Nov 30, 2016

I have confirmed that, with this change, managed code in a common example continues to run (allocating GC objects) while ETW is performing a rundown. The change has the desired effect. (I was worried that other locks would become the critical issue once we cleared this one, but that is not the case.)

@jkotas jkotas merged commit e26e355 into dotnet:master Nov 30, 2016
@vancem vancem deleted the ETWRundownFix.11-19-16 branch November 30, 2016 19:46
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
vancem commented Feb 15, 2018

This fix has also been ported to the .NET Desktop framework.

See Bug 409381

https://devdiv.visualstudio.com/DefaultCollection/DevDiv/_workitems?id=409381&_a=edit

Resolved on 4/18/2017 (should be in 4.7.1, but I have not confirmed).

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…/coreclr#8357)

Commit migrated from dotnet/coreclr@e26e355