-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use CopyContext to restore saved context on X86 #65490
Conversation
Any chance we can switch to |
@VSadov I have built your PR and I can't repro the issue anymore (AMD, x86, Win11) 👍 |
I've backported the fix on top of release/6.0 branch and built it locally. I can also confirm that the issue no longer reproduces. |
src/coreclr/vm/threadsuspend.cpp
Outdated
// | ||
// REVIEW: CopyContext may fail. in theory. How should we handle that? FailFast? | ||
CONTEXT* pTarget = pExcepPtrs->ContextRecord; | ||
CopyContext(pTarget, pTarget->ContextFlags, pCtx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems it would make sense to modify the ReplaceExceptionContextRecord to use CopyContext instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not sure if ReplaceExceptionContextRecord
can be used outside of Windows. Can it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not readily obvious but it's always guarded by HOST_WINDOWS
and/or !TARGET_UNIX
so it looks Windows-only. CopyContext
is available on Windows 7 SP1+ which is the minimum target at the moment so that should be fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder if ReplaceExceptionContextRecord
needs to handle extended context at all if suspension resume switches completely to the RtlRestoreContext
plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not sure if ReplaceExceptionContextRecord can be used outside of Windows. Can it?
I have tried moving the fix into ReplaceExceptionContextRecord
and that did not build on Linux, so that answers this.
I think it is worth trying. The reason for current approach could be compatibility with no longer interesting platforms. (Win95?) Edit: |
Seems reasonable going forward but I assume that this will need servicing for 6.0 and it looks like too big of a change for backporting? |
I have looked into unifying on use of The level of API that we are targeting does not include I think we might want to take the current fix first though - to unblock clean CI, since the issue causes random failures. |
CC: @davidwrighton |
src/coreclr/vm/threadsuspend.cpp
Outdated
// these contexts may contain extended registers and may have different format | ||
// for reasons such as alignment or context compaction | ||
// | ||
// REVIEW: CopyContext may fail. in theory. How should we handle that? FailFast? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, FailFast is appropriate for failure here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, be careful calling new functions from this function. If they have destructors and the compiler chooses to inline them, the runtime will start crashing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of code like:
if (!CopyContext(pTarget, pTarget->ContextFlags, pCtx))
{
STRESS_LOG1(LF_SYNC, LL_ERROR, "ERROR: Could not set context record, lastError = 0x%x\n", GetLastError());
ThrowLastError();
}
Would that fit here?
It is highly unlikely that copy would fail, but it'd be nice to have some diagnostics if that start happenning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a quick fix to unblock testing, I approve of this, but I'm not confident this is the right long term fix. IMO, we should change to RtlRestoreContext
on operating systems that support it, and keep the existing horrifying logic for the older OS. Then of course, once we have that, we should backport the fix to servicing.
PR feedback update looks good to me. |
src/coreclr/vm/threadsuspend.cpp
Outdated
if (!CopyContext(pTarget, pTarget->ContextFlags, pCtx)) | ||
{ | ||
STRESS_LOG1(LF_SYNC, LL_ERROR, "ERROR: Could not set context record, lastError = 0x%x\n", GetLastError()); | ||
ThrowLastError(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThrowLastError(); | |
EEPOLICY_HANDLE_FATAL_ERROR(COR_E_EXECUTIONENGINE); |
The caller does not expect exceptions. This needs to be failfast. Throwing exception is likely going to lead to hard to diagnose crash otherwise.
src/coreclr/vm/threadsuspend.cpp
Outdated
@@ -2509,6 +2509,18 @@ void RedirectedThreadFrame::ExceptionUnwind() | |||
#ifndef TARGET_UNIX | |||
|
|||
#ifdef TARGET_X86 | |||
|
|||
// a helper to make sure these calls are not inlined into a filter function. | |||
// in particular we do not want a possibility of inlined destructor calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be a concern once you change it to EEPOLICY_HANDLE_FATAL_ERROR
. EEPOLICY_HANDLE_FATAL_ERROR
is non-inlineable.
+1. It is likely that the OS treatment of context will evolving over time that is going to break the exception filter hack again. |
Agree on this as well. |
Thanks!!! |
* Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback
* Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback
…#66120) * Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
* Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Vladimir Sadov <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
Fixes: #65292