Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k_thread_resume can cause k_sem_take with K_FOREVER to return -EAGAIN and crash #29244

Closed
laurenmurphyx64 opened this issue Oct 15, 2020 · 2 comments · Fixed by #29299
Closed
Assignees
Labels
bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug

Comments

@laurenmurphyx64
Copy link
Contributor

laurenmurphyx64 commented Oct 15, 2020

Describe the bug
When one thread is blocked on k_sem_take with K_FOREVER on a semaphore whose count is set to 0 and another thread calls k_thread_resume on it, the thread will resume and the k_sem_take call will return EAGAIN, an error code that is normallly reserved for the waiting period timing out. Zephyr then seems to crash.

To Reproduce
Steps to reproduce the behavior:

  1. cd samples/hello_world
  2. Replace samples/hello_world/src/main.c with the following:
#include <zephyr.h>
#include <sys/printk.h>

K_SEM_DEFINE(BLOCKSEMA, 0, 1);

void block_thread(void)
{
	printk("Pre sema take\n");
	
	int code = k_sem_take(&BLOCKSEMA, K_FOREVER);
	
	printk("Post sema take %i\n", code);
}

K_THREAD_DEFINE(block_thread_id, 512, (k_thread_entry_t)block_thread, NULL, NULL,
		NULL, 1, 0, 0);

void main(void)
{ 
	printk("Main\n");

	k_thread_resume(block_thread_id);
}
  1. west build -b qemu_x86 -t run .
  2. See error

Expected behavior
Either k_sem_take should continue blocking since k_sem_give wasn't called, or it should produce a different error code per the documentation. It should also not crash.

Impact
Unexpected behavior / crash confusing to users.

Logs and console output
Output:

$ west build -b qemu_x86 -t run .
...
SeaBIOS (version rel-1.12.1-0-ga5cab58-dirty-20200625_115407-9426dddc0a1f-zephyr
)
Booting from ROM..*** Booting Zephyr OS build zephyr-v2.4.0-634-g53cc090c39a5  ***
Main
Pre sema take
Post sema take -11
FAILED: zephyr/CMakeFiles/run 
cd /home/herbert/zephyr-external/zephyr/samples/hello_world/build && /home/herbert/zephyr-sdk-0.11.4/sysroots/x86_64-pokysdk-linux/usr/bin/qemu-system-i386 -m 9 -cpu qemu32,+nx,+pae -device isa-debug-exit,iobase=0xf4,iosize=0x04 -no-reboot -nographic -net none -pidfile qemu.pid -chardev stdio,id=con,mux=on -serial chardev:con -mon chardev=con,mode=readline -icount shift=5,align=off,sleep=off -rtc clock=vm -kernel /home/herbert/zephyr-external/zephyr/samples/hello_world/build/zephyr/zephyr.elf
ninja: build stopped: subcommand failed.
FATAL ERROR: command exited with status 1: /home/herbert/.local/bin/cmake --build /home/herbert/zephyr-external/zephyr/samples/hello_world/build --target run

Environment:

  • OS: Ubuntu 18.04
  • Board: qemu_x86
  • Toolchain: Zephyr SDK 0.11.4
  • Commit 53cc090
@laurenmurphyx64 laurenmurphyx64 added the bug The issue is a bug, or the PR is fixing a bug label Oct 15, 2020
@laurenmurphyx64 laurenmurphyx64 changed the title k_thread_resume() can cause k_sem_take with K_FOREVER to return -EAGAIN k_thread_resume() can cause k_sem_take with K_FOREVER to return -EAGAIN and fail Oct 15, 2020
@laurenmurphyx64 laurenmurphyx64 changed the title k_thread_resume() can cause k_sem_take with K_FOREVER to return -EAGAIN and fail k_thread_resume can cause k_sem_take with K_FOREVER to return -EAGAIN and fail Oct 15, 2020
@laurenmurphyx64 laurenmurphyx64 changed the title k_thread_resume can cause k_sem_take with K_FOREVER to return -EAGAIN and fail k_thread_resume can cause k_sem_take with K_FOREVER to return -EAGAIN and crash Oct 15, 2020
@nashif
Copy link
Member

nashif commented Oct 16, 2020

now that i read through this I think it is still a valid bug.

@laurenmurphyx64 sorry, yesterday i was focused on the coverity issue and what k_sem_take returns and did not realize this is a completely unrelated bug.

@nashif nashif reopened this Oct 16, 2020
@nashif nashif added the priority: high High impact/importance bug label Oct 16, 2020
@laurenmurphyx64
Copy link
Contributor Author

laurenmurphyx64 commented Oct 16, 2020

(Like Anas said, the same crash will occur regardless of whether the semaphore call is there or not.)

This code will show a page fault when logging is turned on.

prj.conf:

CONFIG_LOG=y

Output:

SeaBIOS (version rel-1.12.1-0-ga5cab58-dirty-20200625_115407-9426dddc0a1f-zephyr
)
Booting from ROM..*** Booting Zephyr OS build zephyr-v2.4.0-634-g53cc090c39a5  ***
Main
[00:00:00.000,000] <err> os: Page fault at address 0x89fe2383 (error code 0x0)
[00:00:00.000,000] <err> os: Linear address not present in page tables
[00:00:00.000,000] <err> os: Access violation: supervisor thread not allowed to read
[00:00:00.000,000] <err> os: PDE: not present
[00:00:00.000,000] <err> os: EAX: 0x89fe2383, EBX: 0x00000002, ECX: 0x00100c14, EDX: 0x00107000
[00:00:00.000,000] <err> os: ESI: 0x00117de0, EDI: 0x00107268, EBP: 0x00117e1c, ESP: 0x00117de4
[00:00:00.000,000] <err> os: EFLAGS: 0x00000002 CS: 0x0008 CR3: 0x0011c000
[00:00:00.000,000] <err> os: call trace:
[00:00:00.000,000] <err> os: EIP: 0x00100c54
[00:00:00.000,000] <err> os:      0x00103ec0 (0x107268)
[00:00:00.000,000] <err> os:      0x00103fcc (0x107268)
[00:00:00.000,000] <err> os:      0x00104048 (0x0)
[00:00:00.000,000] <err> os:      0x001003e7 (0x107000)
[00:00:00.000,000] <err> os:      0x00103a73 (0x117e88)
[00:00:00.000,000] <err> os:      0x00100f2c (0x0)
[00:00:00.000,000] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:00.000,000] <err> os: Current thread: 0x001071e0 (unknown)
[00:00:00.000,000] <err> os: Halting system

@nashif nashif self-assigned this Oct 20, 2020
nashif added a commit to nashif/zephyr that referenced this issue Oct 21, 2020
Do not add a thread to the run queue if it was already added.

Fixes zephyrproject-rtos#29244

Signed-off-by: Anas Nashif <[email protected]>
nashif added a commit that referenced this issue Oct 22, 2020
Do not add a thread to the run queue if it was already added.

Fixes #29244

Signed-off-by: Anas Nashif <[email protected]>
sjg20 pushed a commit to sjg20/zephyr that referenced this issue Oct 23, 2020
Do not add a thread to the run queue if it was already added.

Fixes zephyrproject-rtos#29244

Signed-off-by: Anas Nashif <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants