Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/kernel/fatal/exception/sentinel test is failing for various nrf platforms #30554

Closed
ioannisg opened this issue Dec 9, 2020 · 6 comments
Assignees
Labels
bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx priority: medium Medium impact/importance bug

Comments

@ioannisg
Copy link
Member

ioannisg commented Dec 9, 2020

Describe the bug
tests/kernel/fatal/exception/ stack_sentinel test variant is failing for various nRF platforms
tested on nRF5340 and nRF9160.

To Reproduce

On current master:
Build flash and run stack sentinel test variant of tests/kernel/fatal/exception

Expected behavior
Test should not assert.

Impact
kernel test failing

Logs and console output

ASSERTION FAIL [chan && chan < (0 + 1)] @ WEST_TOPDIR/zephyr/drivers/timer/nrf_rtc_timer.c:214
E: r0/a1:  0x00000004  r1/a2:  0x000000d6  r2/a3:  0x00000000
E: r3/a4:  0x00006f31 r12/ip:  0x00000000 r14/lr:  0x00001e77
E:  xpsr:  0x49000025
E: Faulting instruction address (r15/pc): 0x00006436
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Fault during interrupt handling
E: Current thread: 0x200002e8 (main)
Caught system error -- reason 4
Was not expecting a crash
@ioannisg ioannisg added bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug platform: nRF Nordic nRFx labels Dec 9, 2020
@ioannisg
Copy link
Member Author

ioannisg commented Dec 9, 2020

@nordic-krch I suspect a driver issue here, but might be wrong.

@ioannisg
Copy link
Member Author

ioannisg commented Dec 9, 2020

SImilar as reported in #30526

@jcyrax
Copy link

jcyrax commented Jan 10, 2021

This is a bug due to adjusting stack start and size for threads using floating point after sentinel is written to stack. This results in sentinel at wrong location and so it fails on first test.

You can see that in kernel/thread.c setup_thread_stack() sentinel is written to address stack_buf_start = Z_THREAD_STACK_BUFFER(stack); and right after that in arch_new_thread() arch/arm/core/aarch32/thread.c stack start is adjusted due to larger guard area used with floating points, but sentinel is never written to this location.

#if FP_GUARD_EXTRA_SIZE > 0
	if ((thread->base.user_options & K_FP_REGS) != 0) {
		/* Larger guard needed due to lazy stacking of FP regs may
		 * overshoot the guard area without writing anything. We
		 * carve it out of the stack buffer as-needed instead of
		 * unconditionally reserving it.
		 */
		thread->stack_info.start += FP_GUARD_EXTRA_SIZE;
		thread->stack_info.size -= FP_GUARD_EXTRA_SIZE;
	}
#endif /* FP_GUARD_EXTRA_SIZE */

@carlescufi carlescufi assigned ioannisg and unassigned nordic-krch Jan 26, 2021
@ioannisg
Copy link
Member Author

ioannisg commented Jan 29, 2021

Cannot reproduce this any more. Tried nrf52, nrf53, nrf91, with or without FPU. Closing this.
@PerMac if the test fails for nRF platforms, pls, re-open, with Medium priority.

@ioannisg
Copy link
Member Author

This is a bug due to adjusting stack start and size for threads using floating point after sentinel is written to stack. This results in sentinel at wrong location and so it fails on first test.

You can see that in kernel/thread.c setup_thread_stack() sentinel is written to address stack_buf_start = Z_THREAD_STACK_BUFFER(stack); and right after that in arch_new_thread() arch/arm/core/aarch32/thread.c stack start is adjusted due to larger guard area used with floating points, but sentinel is never written to this location.

#if FP_GUARD_EXTRA_SIZE > 0
	if ((thread->base.user_options & K_FP_REGS) != 0) {
		/* Larger guard needed due to lazy stacking of FP regs may
		 * overshoot the guard area without writing anything. We
		 * carve it out of the stack buffer as-needed instead of
		 * unconditionally reserving it.
		 */
		thread->stack_info.start += FP_GUARD_EXTRA_SIZE;
		thread->stack_info.size -= FP_GUARD_EXTRA_SIZE;
	}
#endif /* FP_GUARD_EXTRA_SIZE */

Sentinel should not be used together with MPU stack guards, so there's no stack adjustment of stack_info.start that applies here, as far as I understand.

@PerMac
Copy link
Member

PerMac commented Jan 29, 2021

@ioannisg I haven't seen it for a while. I think it was the same root cause as in #30526 and was fixed by #30487.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx priority: medium Medium impact/importance bug
Projects
None yet
Development

No branches or pull requests

4 participants