-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test failure: JIT/Regression/JitBlue/Runtime_63942/Runtime_63942/Runtime_63942.cmd #98971
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsFailed in: runtime-coreclr gcstress-extra 20240225.1 Failed tests:
Error message:
Stack trace:
|
This is a JIT assert; changing area to codegen |
This is failing in runtime-coreclr jitstress, too. |
@TIHan, PTAL. It is blocking clean ci optional. It is asserting in Generate Code. |
Will do. |
Did a bisection, and #98623 I believe is what allowed this assertion to appear. I'm not sure yet as to the cause. This could have been a bug for a while, but only manifested itself with the PR. This only occurs with:
|
Minimal repo: [MethodImpl(MethodImplOptions.NoInlining)]
static void Consume(Vector<byte> v) { }
static unsafe void Main()
{
int tmp = 1;
Vector<byte> vector = Unsafe.As<int, Vector256<byte>>(ref tmp).AsVector();
Consume(vector);
} When run:
|
Using the minimal repo, I can hit the same assert with a commit from Dec 2023, so #98623 is not the cause, but it may have allowed this issue to manifest. |
@EgorBo @tannergooding From the original issue, the assert was:
It invoked else if (sizeof(T) == 32)
{
if (Vector<byte>.Count == 32)
{
vector = Unsafe.As<T, Vector256<byte>>(ref tmp).AsVector();
}
else
{
Debug.Fail("Vector<T> isn't 256 bits in size?");
goto CannotVectorize;
}
} Because we turned off HWIntrinsics and TieredCompilation and forced MinOpts, the JIT is going to try to emit that path even if it never gets ran. |
The code is unreachable and so while it is unsafe, it isn't "incorrect". This is in part a weird aspect of generic code where some paths are nonsensical if they were to be actually taken, but the guards prevent that, so its fine. We couldn't properly write generic code if this kind of logic weren't allowed. It looks like we hit this because we have an EBP offset that is not negative and we expect it will always be negative. In this case it isn't since we're reading more bytes than exist in the frame. It's possible we should detect this case and not allow an EBP based frame or should otherwise ensure that this type of mismatch is handled. It's also possible a more trivial fix is to make this use |
@BruceForstall @jakobbotsch I'm not sure what the best way to handle this is. Based on the minimal repo, the code is technically I had a conversation with @tannergooding , and he had a few suggestions:
|
The reason why 3 may be difficult/not possible is because
|
If the assert hits, we're going to generate code this is (probably) going to corrupt the stack (in this case, read garbage). In this case it looks like it's because we're unrolling a Vector256 store and the second instruction has a displacement that makes the ebp offset positive. One way to fix this is to make all the asserts in |
Notably we're generating the code here, but it will never execute in this particular scenario because we have guards that ensure the path is only taken when the read would be valid. It's mostly getting generated because A user could certainly write such code themselves without the guards however, in which case we will generate the code and not hit any assert (since it will use the production JIT). I expect the assert here was mostly to catch places where the JIT computes an incorrect offset, but that of course doesn't account for users doing it themselves (either "safely" due to the code being unreachable like we have, or "unsafely" because the user has written some incorrect code). Hence why I suggested 3 as an alternative fix, basically don't allow |
I tried this, but the assert now gets hit a lot in normal cases. The change was this: /* Get the frame offset of the (non-temp) variable */
offs = emitComp->lvaFrameAddress(var, &EBPbased);
if (EBPbased)
{
#if defined(TARGET_AMD64) && !defined(UNIX_AMD64_ABI)
// If localloc is not used, then ebp chaining is done and hence
// offset of locals will be at negative offsets, Otherwise offsets
// will be positive. In future, when RBP gets positioned in the
// middle of the frame so as to optimize instruction encoding size,
// the below asserts needs to be modified appropriately.
// However, for Unix platforms, we always do frame pointer chaining,
// so offsets from the frame pointer will always be negative.
if (emitComp->compLocallocUsed || emitComp->opts.compDbgEnC)
{
noway_assert((int)offs >= 0);
}
else
#endif
{
// Dev10 804810 - failing this assert can lead to bad codegen and runtime crashes
CLANG_FORMAT_COMMENT_ANCHOR;
#ifdef UNIX_AMD64_ABI
const LclVarDsc* varDsc = emitComp->lvaGetDesc(var);
bool isRegPassedArg = varDsc->lvIsParam && varDsc->lvIsRegArg;
// Register passed args could have a stack offset of 0.
noway_assert((int)offs < 0 || isRegPassedArg || emitComp->opts.IsOSR());
#else // !UNIX_AMD64_ABI
// OSR transitioning to RBP frame currently can have mid-frame FP
noway_assert(((int)offs < 0) || emitComp->opts.IsOSR());
#endif // !UNIX_AMD64_ABI
}
}
offs += dsp; |
Except for these are Without this assert we might generate bad code, if we don't calculate the instruction size properly in the code that follows.
That's basically what I was thinking. |
What do you recommend we do then? |
Any reason we don't just ensure we compute the size correctly in this case? I would've thought it already mostly just works since the logic for That is, any non-zero offset requires the |
I am working on a change which will resolve this by not containing out-of-bounds access to locals, which is how we've been handling this problem so far. |
Failed in: runtime-coreclr gcstress-extra 20240302.2 Failed tests:
Error message:
Stack trace:
|
Failed in: runtime-coreclr gcstress-extra 20240225.1
Failed tests:
Error message:
Stack trace:
The text was updated successfully, but these errors were encountered: