MPU FAULT, Data access violation

Hello,

I am working with the nRF52 DK and communicating with a sensor over I2C. The communication works and I can read values from the sensor. I record the data in a vector for 15 seconds at a sample rate of 25 sps. After that, I want to process the data, and at this point the following error appears:

[00:00:13.764,984] <err> os: ***** BUS FAULT *****
[00:00:13.765,014] <err> os: ***** HARD FAULT *****
[00:00:13.766,357] <err> os: Fault escalation (see below)
[00:00:13.767,669] <err> os: ***** MPU FAULT *****
[00:00:13.773,345] <err> os: Data Access Violation
[00:00:13.779,144] <err> os: MMFAR Address: 0x1ffffb94
[00:00:13.785,644] <err> os: r0/a1: 0x00018963 r1/a2: 0x200007f1 r2/a3: 0x00000008
[00:00:13.791,351] <err> os: r3/a4: 0x00018963 r12/ip: 0x00000069 r14/lr: 0x0000783f
[00:00:13.797,241] <err> os: xpsr: 0x21000005
[00:00:13.803,527] <err> os: Faulting instruction address (r15/pc): 0x0000a8a8
[00:00:13.812,622] <err> os: >>> ZEPHYR FATAL ERROR 19: Unknown error on CPU 0

When debugging, the processing starts without any problem until, at some point, variables start changing to strange values, so it seems to be a memory-related problem. I have tried fixing it by increasing CONFIG_MAIN_STACK_SIZE.

I have also looked up the faulting instruction address using addr2line, but the result is a function inside the Zephyr drivers, and it changes every time I run it.

Does anyone know what could be the problem here?

Top Replies

Vidar Berg over 1 year ago in reply to JAS0 +1 verified

Thanks for sharing your project. I noticed that you're placing several large arrays on the stack in the analysis() function and its subroutines. As a test, could you try increasing the CONFIG_MAIN_STACK_SIZE…
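
To illustrate the point about large arrays on the stack, here is a minimal sketch. It assumes 375 float samples (15 s at 25 sps) and uses a hypothetical 3-point moving average as a stand-in for the real analysis() code, which is not shown in this thread. Declaring the work buffers `static` places them in .bss instead of on the main thread's stack, so they no longer count against CONFIG_MAIN_STACK_SIZE.

```c
#include <stddef.h>

#define NUM_SAMPLES 375  /* 15 s * 25 sps, from the question */

/* Static storage: placed in .bss, not on the calling thread's stack.
 * The element type (float) is an assumption. */
static float samples[NUM_SAMPLES];
static float filtered[NUM_SAMPLES];

/* Hypothetical processing step: 3-point moving average.
 * The first and last elements are copied unchanged. */
static void analysis(const float *in, float *out, size_t n)
{
    if (n == 0) {
        return;
    }
    out[0] = in[0];
    for (size_t i = 1; i + 1 < n; i++) {
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
    }
    out[n - 1] = in[n - 1];
}
```

A caller would fill `samples` from the sensor and then call `analysis(samples, filtered, NUM_SAMPLES)`. The trade-off is that the buffers occupy RAM permanently and the function is no longer reentrant.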

0 Vidar Berg over 1 year ago

Hello,

It sounds like you may be experiencing a stack overflow in one of your threads, but this should normally be caught by the stack guard (MPU-assisted stack overflow detection). Please check the <build directory>/zephyr/.config file to see whether CONFIG_HW_STACK_PROTECTION is actually enabled in your build.

Thanks,

Vidar

0 JAS0 over 1 year ago in reply to Vidar Berg

Hello,

I just checked, and CONFIG_HW_STACK_PROTECTION is enabled.

If I reduce the main stack size, the error changes to a stack overflow. That's why I increased it, but I'm not sure whether this error has the same cause.

0 Vidar Berg over 1 year ago in reply to JAS0

Hello,

Thanks for confirming that the stack guard was enabled.

After you increased the main stack size, is the program consistently crashing at the same "Faulting instruction address", or does it change every time?

0 JAS0 over 1 year ago in reply to Vidar Berg

Hello,
When the program crashes the first time, it prints this error:

[00:00:13.768,218] <err> os: ***** BUS FAULT *****
[00:00:13.768,890] <err> os: Stacking error
[00:00:13.770,202] <err> os: Precise data bus error
[00:00:13.771,514] <err> os: BFAR Address: 0x1ffffc6c
[00:00:13.777,221] <err> os: ***** HARD FAULT *****
[00:00:13.782,501] <err> os: Fault escalation (see below)
[00:00:13.788,482] <err> os: ***** BUS FAULT *****
[00:00:13.794,647] <err> os: Precise data bus error
[00:00:13.800,445] <err> os: BFAR Address: 0x1ffffc58
[00:00:13.806,945] <err> os: r0/a1: 0x20001984 r1/a2: 0x1ffffc58 r2/a3: 0x00000020
[00:00:13.812,652] <err> os: r3/a4: 0x20001984 r12/ip: 0x00000000 r14/lr: 0x00005035
[00:00:13.818,634] <err> os: xpsr: 0x21000205

After this, it reboots by itself, and when it crashes again the error message is the following:

[00:00:13.418,334] <err> os: ***** BUS FAULT *****
[00:00:13.419,006] <err> os: Stacking error
[00:00:13.420,288] <err> os: Precise data bus error
[00:00:13.421,600] <err> os: BFAR Address: 0x1ffffc6c

The third time it is just:
[00:00:13.418,060] <err> os: ***** BUS FAULT *****

I also don't understand why it reboots by itself when I have set CONFIG_RESET_ON_FATAL_ERROR=n.

0 Vidar Berg over 1 year ago in reply to JAS0

Hello,

The fault handler should not reset the device when CONFIG_RESET_ON_FATAL_ERROR=n. Maybe you are experiencing a CPU lockup reset (which happens if you encounter a fault within a fault), but the lockup reset should only occur if the chip is not in debug interface mode. Either way, I think it would be helpful if you could try enabling the Thread analyzer to see if there is high stack usage in any of the threads. Additionally, please check that all threads are included in the output, as missing threads could also indicate a stack overflow.
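
The fault addresses reported in the logs (MMFAR 0x1ffffb94, BFAR 0x1ffffc6c and 0x1ffffc58) can be checked against the memory map. The sketch below assumes data RAM starts at 0x20000000, as on the nRF52 series; the helper names are hypothetical. All three addresses land roughly 0.9 to 1.1 KB below the start of RAM, which would be consistent with a stack growing downward past the bottom of RAM, although it does not prove that this is the cause.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: data RAM on the nRF52 series starts at 0x20000000. */
#define RAM_BASE 0x20000000u

/* True if the faulting address lies below the start of RAM. */
static bool below_ram(uint32_t fault_addr)
{
    return fault_addr < RAM_BASE;
}

/* Number of bytes by which the address undershoots the start of RAM
 * (0 if the address is not below RAM). */
static uint32_t bytes_below_ram(uint32_t fault_addr)
{
    return below_ram(fault_addr) ? RAM_BASE - fault_addr : 0u;
}
```

For example, `bytes_below_ram(0x1ffffb94u)` gives 1132 and `bytes_below_ram(0x1ffffc6cu)` gives 916.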

0 JAS0 over 1 year ago in reply to Vidar Berg

Hello,

The code does not define any other threads; there is just main, plus two additional header files with their respective .c files containing functions for sensor initialization and data processing. The error occurs in one of the data-processing functions.

Is this what you mean by the Thread analyzer?

[00:00:00.266,845] <inf> thread_analyzer: Thread analyze:
[00:00:00.266,998] <inf> thread_analyzer: thread_analyzer : STACK: unused 700 usage 324 / 1024 (31 %); CPU: 3 %
[00:00:00.267,028] <inf> thread_analyzer: : Total CPU cycles used: 5
[00:00:00.267,120] <inf> thread_analyzer: logging : STACK: unused 240 usage 528 / 768 (68 %); CPU: 79 %
[00:00:00.267,120] <inf> thread_analyzer: : Total CPU cycles used: 124
[00:00:00.267,303] <inf> thread_analyzer: idle : STACK: unused 288 usage 32 / 320 (10 %); CPU: 0 %
[00:00:00.267,303] <inf> thread_analyzer: : Total CPU cycles used: 0
[00:00:00.267,974] <inf> thread_analyzer: main : STACK: unused 3080 usage 5112 / 8192 (62 %); CPU: 14 %
[00:00:00.268,005] <inf> thread_analyzer: : Total CPU cycles used: 27
[00:00:00.268,249] <inf> thread_analyzer: ISR0 : STACK: unused 1656 usage 392 / 2048 (19 %)

With the thread analyzer enabled there is no error message; it just resets at the same point in the code.
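
The thread analyzer output above was taken at about 0.27 s, before the processing starts, and shows 3080 bytes unused on the main stack. A rough sketch of the headroom arithmetic follows, assuming the processing functions keep local arrays of 375 elements (15 s at 25 sps). The element sizes and helper names are assumptions, since the actual code is not shown here.

```c
#include <stddef.h>

#define NUM_SAMPLES 375  /* 15 s * 25 sps */

/* Bytes of stack taken by `count` local arrays of NUM_SAMPLES elements,
 * each elem_size bytes (ignores saved registers and other locals). */
static size_t local_arrays_bytes(size_t count, size_t elem_size)
{
    return count * NUM_SAMPLES * elem_size;
}

/* How many such arrays fit in the given stack headroom. */
static size_t arrays_that_fit(size_t headroom, size_t elem_size)
{
    return headroom / (NUM_SAMPLES * elem_size);
}
```

Under these assumptions, `arrays_that_fit(3080, 4)` is 2, so a third 4-byte-per-element array (4500 bytes in total) would already exceed the remaining stack.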

0 Vidar Berg over 1 year ago in reply to JAS0

Hello,

Yes, this is the log output I was expecting, but I thought maybe you had more threads running. Would you be able to share your project here or in a private ticket? I probably won't be able to reproduce the issue since I don't have your I2C sensor, but I can check the code for potential issues.

0 JAS0 over 1 year ago in reply to Vidar Berg

Hello,

I will create a private ticket, thank you.