-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not copy XState other than AVX when redirecting for GC stress #65825
Conversation
@janvorli - could you check that the fix actually resolves the issue? I could not reproduce it. I think the issue requires your other changes. |
It doesn't work, the CopyContext still fails with this change. |
|
Ah, yes, since it is picking whatever is dirty, it is possible that there is XState, but without AVX. I need to apply the mask unconditionally. |
But what would it copy then when the source context doesn't contain AVX? And it would drop the AVX512 state possibly resulting in floating point state corruption, wouldn't it? It still seems to me that allocating the context with everything is a better way. |
Since IP is in managed code, only AVX could have live data. If that is not present, we will only copy the legacy part. The AVX2 vs. AVX512 is confusing. I keep mixing them up. I believe managed code may use AVX2 (i.e. AVX256), but that is included in AVX feature in XState, but not AVX512. When AVX512 is supported we will have to make the context larger and start saving/restoring AVX512 feature as well. |
@@ -1997,15 +1997,6 @@ CONTEXT* AllocateOSContextHelper(BYTE** contextBuffer) | |||
pfnInitializeContext2(buffer, context, &pOSContext, &contextSize, xStateCompactionMask): | |||
InitializeContext(buffer, context, &pOSContext, &contextSize); | |||
|
|||
// if AVX is supported set the appropriate features mask in the context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this removed, the code above that sets the supportAVX based on the FeatureMask can be removed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, even before supportAVX
was redundant, since SetXStateFeaturesMask
performs the same check - to filter out what is not supported.
I can confirm that the current state of the change works. |
// This should not normally fail. | ||
// The system silently ignores any feature specified in the FeatureMask | ||
// which is not enabled on the processor. | ||
bRes &= SetXStateFeaturesMask(pCtx, XSTATE_MASK_AVX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only have one nit - I would prefer failures / asserts on each of the failure instead of combining the results together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetXStateFeaturesMask
and SetXStateFeaturesMask
basically cannot fail. They are very tolerant to the input.
To get failures you'd need something like pass context flags for a wrong architecture (like arm64). As-is that would not compile.
Realistically only CopyContext
is an API with complex requirements and validation that could fail here due to some mistake.
I think in this particular case individual asserts/returns would not add a lot of value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just found those would fail also when passed context without any XSTATE present. It seems like something that could possibly happen for the usage in Thread::RedirectCurrentThreadAtHandledJITCase
where we pass it a general context that we get from the vectored exception handler. So it seems that it would be nice to actually make that work in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, my read on that was that Set(Get) would be a noop, but I think you are right, if no XState is set it may fail. Arguably a bug in the validation of Set not accepting 0.
Yes, we should make this case work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
linux DAC build failures happen in other PRs as well. |
…net#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback
…net#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback
…#66120) * Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
* Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Vladimir Sadov <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
GC stress mechanism relies on exceptions thrown from random locations in managed code.
The context that OS passes for the exception location may contain whatever XState that happens to be dirty. However only AVX matters in managed code.
Besides, the redirection context that stores the state may not have space to save other XState features.
To avoid dealing with the state that we do not expect, we will indicate via the feature mask that only AVX state is valid in the provided context.
Fixes: #65776