Kernel usage fault when using semaphore with multi-threading #41963

Closed
RafaelLeeImg opened this issue Jan 19, 2022 · 8 comments
Labels
area: ARM ARM (32-bit) Architecture area: I2C area: Kernel bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@RafaelLeeImg
Contributor

Describe the bug
The kernel enters a usage fault when the I2C driver releases a semaphore.

This is a demo project that uses LVGL with an SSD1306 0.96-inch OLED screen on a NUCLEO_F411RE board.

https://github.com/RafaelLeeImg/zephyr_lvgl_nucleo_f411re.git
The gdbscript I use is here:
https://github.com/RafaelLeeImg/zephyr_lvgl_nucleo_f411re/blob/main/gdbscript

Please also mention any information which could help others to understand
the problem you're facing:

  • STM32F411, Cortex-M4, aarch32
  • gdb-multiarch -x gdbscript

To Reproduce
Steps to reproduce the behavior:

  1. Compile the project and flash it onto the nucleo_f411re board.
  2. Run gdb-multiarch -x gdbscript
  3. The MCU runs to the line just before the fault.
  4. Type these commands in gdb:
  5. c
  6. p/x *0xE0001004
  7. The MCU stops at z_arm_usage_fault.

Expected behavior
After the steps above, the MCU gets stuck at z_arm_usage_fault.

Impact
The STM32 MCU gets stuck when using an I2C device.

Logs and console output
#0 0x08003724 in arch_irq_unlock (key=0x0) at west/zephyr/include/arch/arm/aarch32/asm_inline_gcc.h:95
#1 arch_swap (key=0x0) at west/zephyr/arch/arm/core/aarch32/swap.c:44
#2 0x08022fe6 in z_swap_irqlock (key=0x0) at west/zephyr/kernel/include/kswap.h:184
#3 0x08023238 in z_swap (key=..., lock=0x200014b4 <announce_remaining>) at west/zephyr/kernel/include/kswap.h:195
#4 z_reschedule (lock=0x200014b4 <announce_remaining>, key=...) at west/zephyr/kernel/sched.c:874
#5 0x0800f3c0 in z_impl_k_sem_give (sem=0x20001170 <i2c_stm32_dev_data_i2c3>) at west/zephyr/kernel/sem.c:103
#6 0x08014720 in k_sem_give (sem=0x20001170 <i2c_stm32_dev_data_i2c3>) at /dev/shm/d/build/alonzo_lvgl/zephyr/include/generated/syscalls/kernel.h:1043
#7 0x08014a3c in i2c_stm32_transfer (dev=0x8023b4c <__device_dts_ord_66>, msg=0x200015d0 <z_main_stack+264>, num_msgs=0xff, slave=0xfff5) at west/zephyr/drivers/i2c/i2c_ll_stm32.c:167
#8 0x08012df8 in z_impl_i2c_transfer (dev=0x8023b4c <__device_dts_ord_66>, msgs=0x200015d0 <z_main_stack+264>, num_msgs=0x2, addr=0x3c) at west/zephyr/include/drivers/i2c.h:589
#9 0x08012ea6 in i2c_transfer (dev=0x8023b4c <__device_dts_ord_66>, msgs=0x200015d0 <z_main_stack+264>, num_msgs=0x2, addr=0x3c) at /dev/shm/d/build/alonzo_lvgl/zephyr/include/generated/syscalls/i2c.h:90
#10 0x08012e4c in i2c_burst_write (dev=0x8023b4c <__device_dts_ord_66>, dev_addr=0x3c, start_addr=0x0, buf=0x20001934 <z_main_stack+1132> " ", num_bytes=0x8) at west/zephyr/include/drivers/i2c.h:997
#11 0x08012e7a in i2c_burst_write_dt (spec=0x80241ac <ssd1306_config>, start_addr=0x0, buf=0x20001934 <z_main_stack+1132> " ", num_bytes=0x8) at west/zephyr/include/drivers/i2c.h:1019
#12 0x08012efc in ssd1306_write_bus (dev=0x8023b64 <__device_dts_ord_67>, buf=0x20001934 <z_main_stack+1132> " ", len=0x8, command=0x1) at west/zephyr/drivers/display/ssd1306.c:79
#13 0x080130fe in ssd1306_write (dev=0x8023b64 <__device_dts_ord_67>, x=0x10, y=0x18, desc=0x2000199c <z_main_stack+1236>, buf=0x2000032c ) at west/zephyr/drivers/display/ssd1306.c:241
#14 0x08011e40 in display_write (dev=0x8023b64 <__device_dts_ord_67>, x=0x10, y=0x18, desc=0x2000199c <z_main_stack+1236>, buf=0x2000032c ) at west/zephyr/include/drivers/display.h:232
#15 0x08011ef2 in lvgl_flush_cb_mono (disp_drv=0x20002984 <kheap.system_heap+892>, area=0x20000318 <disp_buf+16>, color_p=0x2000032c ) at west/zephyr/lib/gui/lvgl/lvgl_display_mono.c:26
#16 0x0800871e in lv_refr_vdb_flush () at west/modules/lib/gui/lvgl/src/lv_core/lv_refr.c:751
#17 0x080085cc in lv_refr_area_part (area_p=0x200029de <kheap.system_heap+982>) at west/modules/lib/gui/lvgl/src/lv_core/lv_refr.c:559
#18 0x0800835c in lv_refr_area (area_p=0x200029de <kheap.system_heap+982>) at west/modules/lib/gui/lvgl/src/lv_core/lv_refr.c:460
#19 0x0800812a in lv_refr_areas () at west/modules/lib/gui/lvgl/src/lv_core/lv_refr.c:382
#20 0x08007d32 in _lv_disp_refr_task (task=0x20002864 <kheap.system_heap+604>) at west/modules/lib/gui/lvgl/src/lv_core/lv_refr.c:199
#21 0x0800c07a in lv_task_exec (task=0x20002864 <kheap.system_heap+604>) at west/modules/lib/gui/lvgl/src/lv_misc/lv_task.c:409
#22 0x0800bcf8 in lv_task_handler () at west/modules/lib/gui/lvgl/src/lv_misc/lv_task.c:142
#23 0x080018f2 in main () at /dev/shm/d/proj/alonzo_lvgl/src/main.c:154

After the MCU halts:
#0 z_arm_usage_fault () at west/zephyr/arch/arm/core/aarch32/cortex_m/fault_s.S:80
#1
Prologue scan stopped at 0x8003770
#2 z_arm_pendsv () at west/zephyr/arch/arm/core/aarch32/swap_helper.S:346
#3
#4 arch_irq_unlock (key=0x10) at west/zephyr/include/arch/arm/aarch32/asm_inline_gcc.h:109
#5 arch_swap (key=0x8000000) at west/zephyr/arch/arm/core/aarch32/swap.c:44
#6 0x08022fe6 in z_swap_irqlock (key=0x8003db1) at west/zephyr/kernel/include/kswap.h:184
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Environment (please complete the following information):

  • OS: Debian Linux
  • Toolchain : arm-none-eabi-gcc (15:10.3-2021.07-4) 10.3.1 20210621 (release)
  • Commit SHA: 3b235e3
    Author: Peter Mitsis
    Date: Tue Oct 19 12:35:25 2021 -0400

Additional context
Single-stepping through the code does not trigger the error; when the code runs without stepping, the MCU gets stuck.

@RafaelLeeImg RafaelLeeImg added the bug The issue is a bug, or the PR is fixing a bug label Jan 19, 2022
@RafaelLeeImg
Contributor Author

The problem is located in the code below.
In the error condition, the fault is triggered at the stmia instruction; the value of r0 is already wrong before that instruction executes.

// zephyr/arch/arm/core/aarch32/swap_helper.S
SECTION_FUNC(TEXT, z_arm_pendsv)
...
stmia r0, {v1-v8, ip}

@RafaelLeeImg
Contributor Author

In the error condition, r2 points to a nonexistent location at address 0x1000xxxx, which is not valid.
r2 is meant to hold _kernel.cpu.current.
west/zephyr/arch/arm/core/aarch32/swap_helper.S

    ldr r2, [r1, #_kernel_offset_to_current]
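
A quick sanity check on a suspicious register value is to see which region of the Cortex-M default memory map it falls in. The sketch below is a host-side illustration only, not Zephyr code: the names cm_classify and looks_like_thread_ptr are hypothetical, and the assumption that a valid thread pointer must be a word-aligned SRAM address is a heuristic, since the exact SRAM bounds depend on the part.

```c
#include <stdint.h>

/* Coarse regions of the ARMv7-M default memory map. */
enum cm_region {
    CM_CODE,       /* 0x00000000-0x1FFFFFFF */
    CM_SRAM,       /* 0x20000000-0x3FFFFFFF */
    CM_PERIPHERAL, /* 0x40000000-0x5FFFFFFF */
    CM_EXTERNAL,   /* 0x60000000-0xDFFFFFFF (external RAM and devices) */
    CM_SYSTEM      /* 0xE0000000-0xFFFFFFFF (private peripheral bus, vendor) */
};

/* Map an address to its architectural region. */
static enum cm_region cm_classify(uint32_t addr)
{
    if (addr < 0x20000000u) return CM_CODE;
    if (addr < 0x40000000u) return CM_SRAM;
    if (addr < 0x60000000u) return CM_PERIPHERAL;
    if (addr < 0xE0000000u) return CM_EXTERNAL;
    return CM_SYSTEM;
}

/* Hypothetical heuristic: a thread pointer should be a word-aligned
 * address inside the SRAM region. Returns 1 if plausible, 0 otherwise. */
static int looks_like_thread_ptr(uint32_t addr)
{
    return cm_classify(addr) == CM_SRAM && (addr & 0x3u) == 0;
}
```

With this check, an address of the form 0x1000xxxx is classified as the code region rather than SRAM, so it fails the heuristic, which matches the observation that r2 no longer holds a usable thread pointer.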

@RafaelLeeImg
Contributor Author

In the fault condition, r0 holds a non-writable address or a peripheral address such as 0xE000ED00; when the code tries to write to the address in r0, the usage fault is triggered.

...
SECTION_FUNC(TEXT, z_arm_pendsv)
...
    stmia r0, {v1-v8, ip}
...
    ldmia r0, {v1-v8, ip}
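
To narrow down why a usage fault fired, the Configurable Fault Status Register (CFSR, at 0xE000ED28 on ARMv7-M) can be read in gdb with p/x *0xE000ED28 and its UsageFault bits decoded. The issue does not record the CFSR value, so the following is only a sketch of the decoding step; the function name ufsr_decode is hypothetical, while the bit positions are the ARMv7-M ones. One case worth checking is UNALIGNED, since a multi-register load or store (ldmia/stmia) to a non-word-aligned address raises it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* UsageFault status bits as they appear in the 32-bit CFSR (ARMv7-M). */
static const struct {
    uint32_t mask;
    const char *name;
} ufsr_bits[] = {
    { 1u << 16, "UNDEFINSTR" }, /* undefined instruction */
    { 1u << 17, "INVSTATE"   }, /* invalid EPSR state, e.g. Thumb bit clear */
    { 1u << 18, "INVPC"      }, /* invalid EXC_RETURN or PC load */
    { 1u << 19, "NOCP"       }, /* coprocessor access not permitted */
    { 1u << 24, "UNALIGNED"  }, /* unaligned access */
    { 1u << 25, "DIVBYZERO"  }, /* divide by zero with trapping enabled */
};

/* Write the comma-separated names of the set UsageFault bits into buf.
 * Returns the number of names written; stops early if buf is too small. */
static int ufsr_decode(uint32_t cfsr, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;

    if (len == 0) {
        return 0;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(ufsr_bits) / sizeof(ufsr_bits[0]); i++) {
        if ((cfsr & ufsr_bits[i].mask) == 0) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? "," : "", ufsr_bits[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break; /* output was truncated */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Passing the value printed by gdb to ufsr_decode gives a readable list of the fault causes, which helps distinguish a bad pointer in r0 from, for example, a corrupted return address.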

@henrikbrixandersen henrikbrixandersen added area: Kernel area: ARM ARM (32-bit) Architecture labels Jan 24, 2022
@dkalowsk dkalowsk added area: I2C platform: STM32 ST Micro STM32 priority: low Low impact/importance bug labels Jan 25, 2022
@erwango erwango removed the platform: STM32 ST Micro STM32 label Jan 25, 2022
@erwango erwango removed their assignment Jan 25, 2022
@erwango
Member

erwango commented Jan 25, 2022

@dkalowsk would you mind assigning this to the ARM maintainer, as it is not STM32-specific according to the analysis?

@dkalowsk
Contributor

@erwango done. You were suggested at the bug scrub. The STM label was applied due to the reported platform.

@RafaelLeeImg
Contributor Author

This is not a bug. The problem is caused by insufficient stack size.
Setting the main stack to 4 KiB solves it.

CONFIG_MAIN_STACK_SIZE=4096

I'll update the details soon.
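
A common way to confirm that a stack is too small is to fill it with a known pattern before use and later count how many bytes at the far end still hold that pattern. The sketch below simulates this on a plain byte array on the host; it is not Zephyr code, the function names are hypothetical, and the 0xAA fill value is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xAAu /* assumed fill pattern for untouched stack bytes */

/* Paint the whole stack area before the thread starts running. */
static void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL, size);
}

/* On Cortex-M the stack grows downward, so index 0 (the lowest address)
 * is the last byte to be used. Count the bytes that were never touched. */
static size_t stack_unused(const uint8_t *stack, size_t size)
{
    size_t n = 0;

    while (n < size && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Simulation helper: pretend the thread used the top `used` bytes. */
static void stack_simulate_use(uint8_t *stack, size_t size, size_t used)
{
    if (used > size) {
        used = size;
    }
    memset(stack + (size - used), 0x00, used);
}
```

An unused count of zero means the stack was filled completely and has very likely overflowed into neighbouring data. In Zephyr itself, k_thread_stack_space_get() reports the same kind of figure when CONFIG_INIT_STACKS and CONFIG_THREAD_STACK_INFO are enabled, which is a convenient way to size CONFIG_MAIN_STACK_SIZE.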

@erwango
Copy link
Member

erwango commented Jan 28, 2022

@RafaelLeeImg Thanks for the heads-up. Don't hesitate to close when ready.

@nashif
Member

nashif commented Feb 10, 2022

closing a non-bug

@nashif nashif closed this as completed Feb 10, 2022