Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging (deferred v2) with a lot of output causes MPU fault #46986

Closed
nordicjm opened this issue Jun 28, 2022 · 6 comments
Closed

Logging (deferred v2) with a lot of output causes MPU fault #46986

nordicjm opened this issue Jun 28, 2022 · 6 comments
Assignees
Labels
area: Logging bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug Stale

Comments

@nordicjm
Copy link
Collaborator

Describe the bug
A device with an accelerometer which outputs a message on the sensor being triggered can be shaken vigorously to generate a lot of messages, upon doing so, it can be observed that the logging thread faults.

To Reproduce
Have not attempted to reproduce with a sample application because of the requirement of a motion sensor on the unit we are testing, but essentially the application outputs a small message using LOG_ERR() when the sensor trigger callback is used, a 1000ms logging sleep delay is used with 4096 byte logging stack size and 4096 byte logging buffer size, performance over size option is also selected. The drop oldest messages on full option is selected too.

Expected behavior
Logging to work, and if there is not enough space, to drop older messages.

Impact
Showstopper, if someone can DoS a device with this tactic, it isn't a viable device

Logs and console output

[00:00:05.491,821] <err> sense: acc
[00:00:05.492,523] <err> sense: acc
[00:00:05.493,225] <err> sense: acc
[00:00:05.493,896] <err> sense: acc
[00:00:05.573,028] <err> sense: acc
[00:00:05.573,730] <err> sense: acc
[00:00:05.574,462] <err> sense: acc
[00:00:05.575,164] <err> sense: acc
[00:00:05.575,866] <err> sense: acc
[00:00:05.576,568] <err> sense: acc
[00:00:05.577,331] <err> sense: acc
[00:00:05.578,063] <err> sense: acc
[00:00:05.578,826] <err> sense: acc
[00:00:05.579,589] <err> sense: acc
[00:00:05.580,352] <err> sense: acc
[00:00:05.581,085] <err> sense: acc
[00:00:05.581,848] <err> sense: acc
[00:00:05.582,031] <err> os: ***** MPU FAULT *****
[00:00:05.582,061] <err> os:   Data Access Violation
[00:00:05.582,061] <err> os:   MMFAR Address: 0x0
[00:00:05.582,122] <err> os: r0/a1:  0x0006619c  r1/a2:  0x00000008  r2/a3:  0x00066968
[00:00:05.582,153] <err> os: r3/a4:  0x00000000 r12/ip:  0x00000000 r14/lr:  0x00024e4d
[00:00:05.582,183] <err> os:  xpsr:  0x81000000
[00:00:05.582,214] <err> os: s[ 0]:  0x00000000  s[ 1]:  0x00000000  s[ 2]:  0x00000000  s[ 3]:  0x00000000
[00:00:05.582,244] <err> os: s[ 4]:  0x00000000  s[ 5]:  0x00000000  s[ 6]:  0x00000000  s[ 7]:  0x00000000
[00:00:05.582,275] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x00000000  s[10]:  0x00000000  s[11]:  0x00000000
[00:00:05.582,305] <err> os: s[12]:  0x00000000  s[13]:  0x00000000  s[14]:  0x00000000  s[15]:  0x00000000
[00:00:05.582,305] <err> os: fpscr:  0x00000000
[00:00:05.582,336] <err> os: r4/v1:  0x0006619c  r5/v2:  0x00000001  r6/v3:  0x00000000
[00:00:05.582,366] <err> os: r7/v4:  0x00000008  r8/v5:  0x00000001  r9/v6:  0x00000001
[00:00:05.582,397] <err> os: r10/v7: 0x200012d4  r11/v8: 0x00000000    psp:  0x200170c0
[00:00:05.582,427] <err> os: EXC_RETURN: 0xffffffed
[00:00:05.582,458] <err> os: Faulting instruction address (r15/pc): 0x0002964c
[00:00:05.582,489] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:05.582,519] <err> os: Current thread: 0x20003b10 (logging)
[00:00:05.685,638] <err> os: ***** HARD FAULT *****
[00:00:05.685,668] <err> os:   Fault escalation (see below)
[00:00:05.685,699] <err> os: ***** MPU FAULT *****
[00:00:05.685,699] <err> os:   Data Access Violation
[00:00:05.685,729] <err> os:   MMFAR Address: 0x0
[00:00:05.685,791] <err> os: r0/a1:  0x0006619c  r1/a2:  0x0000001e  r2/a3:  0x00066968
[00:00:05.685,791] <err> os: r3/a4:  0x00000000 r12/ip:  0x00000000 r14/lr:  0x00024e4d
[00:00:05.685,821] <err> os:  xpsr:  0x81000004
[00:00:05.685,852] <err> os: s[ 0]:  0x00000000  s[ 1]:  0x00000000  s[ 2]:  0x00000000  s[ 3]:  0x00000000
[00:00:05.685,882] <err> os: s[ 4]:  0x00000000  s[ 5]:  0x00000000  s[ 6]:  0x00000000  s[ 7]:  0x00000000
[00:00:05.685,913] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x00000000  s[10]:  0x00000000  s[11]:  0x00000000
[00:00:05.685,943] <err> os: s[12]:  0x00000000  s[13]:  0x00000000  s[14]:  0x00000000  s[15]:  0x00000000
[00:00:05.685,974] <err> os: fpscr:  0x00000000
[00:00:05.686,004] <err> os: r4/v1:  0x0006619c  r5/v2:  0x00000001  r6/v3:  0x00000000
[00:00:05.686,035] <err> os: r7/v4:  0x0000001e  r8/v5:  0x00000001  r9/v6:  0x00000001
[00:00:05.686,065] <err> os: r10/v7: 0x200012d4  r11/v8: 0x00400804    psp:  0x200170c0
[00:00:05.686,096] <err> os: EXC_RETURN: 0xffffffe1
[00:00:05.686,096] <err> os: Faulting instruction address (r15/pc): 0x0002964c
[00:00:05.686,157] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:05.686,157] <err> os: Fault during interrupt handling

[00:00:05.686,218] <err> os: Current thread: 0x20003b10 (logging)
[00:00:06.883,453] <err> fatal_error: Resetting system

Where 0x0002964c is zephyr/include/zephyr/sys/atomic_builtin.h:88 which is the atomic_add function

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: Zephyr SDK 0.14.0
  • Commit SHA or Version used: Using nRF connect SDK 2.0.0
@nordicjm nordicjm added the bug The issue is a bug, or the PR is fixing a bug label Jun 28, 2022
@fabiobaltieri fabiobaltieri added the priority: medium Medium impact/importance bug label Jun 28, 2022
@mmahadevan108
Copy link
Collaborator

@nordic-krch, can this issue be closed?

@thedjnK
Copy link
Collaborator

thedjnK commented Jul 18, 2022

@nordic-krch, can this issue be closed?

This issue is not resolved

@nordic-krch
Copy link
Contributor

I tried to reproduced it by logging from timer interrupt but it does not fail: e253a49

Any suggestion how can test be improved to trigger the issue.

@thedjnK
Copy link
Collaborator

thedjnK commented Jul 27, 2022

I no longer have the hardware or firmware which was used to reproduce this issue, I'll have a look around what breakout boards I have and see if I can create a new application which demonstrates the issue

@fabiobaltieri
Copy link
Member

fabiobaltieri commented Sep 8, 2022

I no longer have the hardware or firmware which was used to reproduce this issue, I'll have a look around what breakout boards I have and see if I can create a new application which demonstrates the issue

Hey I'll bump this to low priority since it's been here for a couple months, don't have a case for reproducing it and there's no other report for this. Happy to bump ti back up if that changes.

@fabiobaltieri fabiobaltieri added priority: low Low impact/importance bug and removed priority: medium Medium impact/importance bug labels Sep 8, 2022
@nashif nashif added this to Logging Nov 20, 2022
@nashif nashif moved this to 🆕 New in Logging Nov 20, 2022
@github-actions
Copy link

github-actions bot commented Jan 2, 2023

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label Jan 2, 2023
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Logging Jan 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Logging bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug Stale
Projects
Status: Done
Development

No branches or pull requests

6 participants