Stack overflow

Hello,

I had been running this program without any problems. After I added some code to read from and write to flash using NVS, the error shown below appeared.

I am running a single-threaded application (main() only) in Segger Embedded Studio v5.34a.

*** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
[00:00:13.434,600] <err> os: ***** USAGE FAULT *****
[00:00:13.434,600] <err> os: Stack overflow (context area not valid)
[00:00:13.434,600] <err> os: r0/a1: 0x00000000 r1/a2: 0x0000cbfd r2/a3: 0x00000000
[00:00:13.434,600] <err> os: r3/a4: 0x0000cbfd r12/ip: 0x207bc4ae r14/lr: 0x77fadffe
[00:00:13.434,600] <err> os: xpsr: 0xfdd9ca00
[00:00:13.434,600] <err> os: Faulting instruction address (r15/pc): 0x05221210
[00:00:13.434,631] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
[00:00:13.434,631] <err> os: Current thread: 0x20000668 (unknown)
[00:00:13.696,472] <err> fatal_error: Resetting system

If the solution is to increase the stack size, how do I do it?

Can someone please help?

Kind regards

Mohamed

  • Learner said:
    Could it be that the thread where the fault is occurring is the workqueue thread main_work_pid4?

    Yes, of course. I answered too quickly earlier, my apologies for that. If the fault occurred in the work item handler, then the system workqueue thread was the one running, and I think CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE is the option to increase.

    If increasing that option doesn't help, I'll get back to you tomorrow about how to debug the MPU fault.
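
    As a rough illustration of the suggestion above, stack sizes in a Zephyr / nRF Connect SDK project are normally set in the application's prj.conf. CONFIG_MAIN_STACK_SIZE applies to the main() thread and CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE to the system workqueue thread. The values below are assumptions chosen only for illustration, not recommendations for this particular application; the right size depends on what the NVS code and the work handlers actually use.

```conf
# prj.conf (sketch; the sizes are illustrative assumptions, in bytes)

# Stack for the main() thread
CONFIG_MAIN_STACK_SIZE=4096

# Stack for the system workqueue thread that runs work item handlers
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=2048
```

    After changing prj.conf, the project usually needs to be re-configured (or the build directory regenerated) for the new values to take effect.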

  • Hi Simon,

    I rolled back to an old version of the firmware which ran fine without any stack crash. I then gradually added the new code with which I had experienced the stack crash. I have now finished adding all the code (I think), but there is no sign of the stack crash. I am rather puzzled by this, because I wanted the crash to occur again so that I could identify the line(s) of code that caused it in the first place.

    I would still like you to send me more information on the MPU FAULT messages "Stacking error (context area might be not valid)" and "Data Access Violation", and on how to debug such problems.

    Thank you.

    Kind regards

    Mohamed

    The MPU fault is reported as a data access violation. However, from browsing DevZone, Google and the Zephyr GitHub issues, a stack overflow is almost always the cause of an MPU fault in Zephyr. I believe this is the case for you too, since "Stack overflow" was logged first, followed by the MPU fault. I think the MPU fault happens because the code tries to access a memory address outside the stack.

    To be honest, I don't have much experience debugging hard faults, but if you want to understand them better I would recommend taking a look at the Arm Cortex-M33 documentation.

    Best regards,

    Simon
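
The fault messages discussed in this thread correspond to individual bits in the Cortex-M Configurable Fault Status Register (CFSR). The sketch below is a host-side illustration of how such a value could be decoded. The bit positions follow the Arm architecture documentation as I understand it, while the table and the function name decode_cfsr are hypothetical and not part of Zephyr. On a target, the value would come from the CFSR register itself rather than from a function argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One CFSR bit and the message it is reported with in the logs above. */
struct cfsr_flag {
    uint32_t mask;
    const char *text;
};

/* Subset of CFSR bits relevant to this thread (assumed from the Arm docs). */
static const struct cfsr_flag cfsr_flags[] = {
    { 1u << 1,  "MPU FAULT: Data Access Violation" },                  /* DACCVIOL */
    { 1u << 4,  "MPU FAULT: Stacking error (context area might be not valid)" }, /* MSTKERR */
    { 1u << 20, "USAGE FAULT: Stack overflow (context area not valid)" },        /* STKOF, Armv8-M */
};

/* Hypothetical helper: prints one line per set flag, returns how many matched. */
int decode_cfsr(uint32_t cfsr, FILE *out)
{
    int matches = 0;

    for (size_t i = 0; i < sizeof(cfsr_flags) / sizeof(cfsr_flags[0]); i++) {
        if (cfsr & cfsr_flags[i].mask) {
            fprintf(out, "%s\n", cfsr_flags[i].text);
            matches++;
        }
    }
    return matches;
}
```

For example, calling decode_cfsr(0x00100000u, stdout) reports only the stack overflow usage fault, whereas a value with bits 1 and 4 set reports both MPU fault messages. Seeing the stacking error together with a data access violation is consistent with a stack that has already run past its guard region.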
