
Stack overflow

Hello,

I had been running this program without any problems. Then I added some code to read and write to flash using NVS, and the error shown below appeared.

I am running a single-threaded application (main() only) in Segger Embedded Studio v5.34a.

*** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
[00:00:13.434,600] <err> os: ***** USAGE FAULT *****
[00:00:13.434,600] <err> os: Stack overflow (context area not valid)
[00:00:13.434,600] <err> os: r0/a1: 0x00000000 r1/a2: 0x0000cbfd r2/a3: 0x00000000
[00:00:13.434,600] <err> os: r3/a4: 0x0000cbfd r12/ip: 0x207bc4ae r14/lr: 0x77fadffe
[00:00:13.434,600] <err> os: xpsr: 0xfdd9ca00
[00:00:13.434,600] <err> os: Faulting instruction address (r15/pc): 0x05221210
[00:00:13.434,631] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
[00:00:13.434,631] <err> os: Current thread: 0x20000668 (unknown)
[00:00:13.696,472] <err> fatal_error: Resetting system

If the solution is to increase the stack size, how do I do it?

Can someone please help?

Kind regards

Mohamed

Parents Reply
  • Thank you Simon for your help over the weekend. Your help is really appreciated.

    It had been a few days since I last rebooted my laptop. So, I rebooted it yesterday, enabled the stack analyzer module in prj.conf, and then started to debug my application in SES. I found that stack usage is in fact not a problem at all: I have plenty of unused stack in the main thread, and the application runs without any problems. So, it looks like my laptop was causing the problem I was seeing in SES/zephyr/application. I am rather confused as to how a problem with my laptop could cause an embedded application running on a separate target to fail with a stack overflow error. Maybe you can enlighten me and tell me how this is possible.

    Thank you.

    Kind regards

    Mohamed

Children
  • Hmm.. Maybe it was the fact that you reopened SES that changed the behaviour? Be aware that changes in the overlay or Kconfig will not take effect until SES is restarted (or the CMake logic is re-run in some other way).

    Best regards,

    Simon

  • Thank you Simon.

    I think changes to any configuration settings (prj.conf, overlay, Kconfig...) require re-opening the nRF Connect SDK project via File... I did not think SES had to be restarted.

    Anyway, the problem has disappeared for now. Let's hope it will not show its ugly head again.

    Is there anything in the map file that could give me pointers when the stack size is not big enough?

    Kind regards

    Mohamed

  • Learner said:
    Is there anything in the map file that could give me pointers when the stack size is not big enough?

    I'm not totally sure about this. You could search for the faulting address in the map file to figure out what's causing the stack overflow fault, set a breakpoint at that location to see which thread it's running from, and then increase the stack size of that thread.
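The map-file lookup described above can be sketched in plain C. A map file is essentially a list of symbols with start addresses and sizes, so finding the function that contains a faulting address means finding the symbol whose address range covers it. The symbol names and addresses below are made up for illustration and are not taken from this build; `find_symbol` and `struct map_symbol` are hypothetical names. Note also that when the fault dump says the context area is not valid, the reported pc may itself be unreliable, so a lookup like this only helps when the address falls inside the image.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a (hypothetical) symbol table extracted from a map file. */
struct map_symbol {
    uint32_t start;   /* start address of the symbol */
    uint32_t size;    /* size of the symbol in bytes */
    const char *name; /* symbol name */
};

/* Made-up example entries; real ones would come from the build's map file. */
static const struct map_symbol example_symbols[] = {
    { 0x0000c000u, 0x120u, "example_init"  },
    { 0x0000c120u, 0x0a0u, "example_read"  },
    { 0x0000c1c0u, 0x200u, "example_write" },
};

/* Return the name of the symbol whose range contains addr, or NULL if none does. */
static const char *find_symbol(const struct map_symbol *table, size_t count,
                               uint32_t addr)
{
    for (size_t i = 0; i < count; i++) {
        /* addr - start cannot wrap here because addr >= start is checked first. */
        if (addr >= table[i].start && addr - table[i].start < table[i].size) {
            return table[i].name;
        }
    }
    return NULL;
}
```

Calling `find_symbol(example_symbols, 3, 0x0000c130u)` returns the entry for `example_read`, while an address outside every range, such as the 0x05221210 reported in the first fault dump, returns NULL.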

    The Thread analyzer is probably the best way of analyzing the stack usage of the threads.

    Learner said:
    I think changes to any configuration settings (prj.conf, overlay, Kconfig...) require re-opening the nRF Connect SDK project via File... I did not think SES had to be restarted.

    Yes, you are correct about this. But reopening SES means that you run File... again, and that might pick up new dts/overlay/Kconfig changes that weren't present before you reopened SES. However, something completely different might have triggered the fault, and let's hope it doesn't show up again.

    Best regards,

    Simon

  • Thank you Simon.

    Yes, the Thread analyzer has proved to be very useful.

    Let's just hope that it will not occur again.

    Kind regards

    Mohamed

  • Hi Simon,

    The stack overflow problem is showing its ugly head again.

    The fault is occurring in the function k_work_handler_t pid4_tasks( void ), which is set up in main() as follows:

    static struct k_work main_work_pid4;

    void main( void )
    {
           k_work_init( &main_work_pid4, pid4_tasks );
           ...

           while (1)
           {
                ...

                k_work_submit( &main_work_pid4 );

                ...

           }

    }
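A note on the snippet above, offered as a sketch rather than a statement about this particular application: in Zephyr, `k_work_handler_t` is the type of the handler itself (a pointer to a function that takes a `struct k_work *` and returns void), and `k_work_submit()` only queues the item. The handler runs later on the system workqueue thread, on that thread's stack rather than on main's. The plain-C model below imitates that split. All `model_*` names are hypothetical; this is not Zephyr code and does not use the Zephyr API.

```c
#include <stddef.h>

struct model_work;

/* Same shape as a work handler: receives the work item, returns nothing. */
typedef void (*model_work_handler_t)(struct model_work *work);

struct model_work {
    model_work_handler_t handler; /* function to run later */
    struct model_work *next;      /* link in the queue */
    int pending;                  /* 1 while the item sits in the queue */
};

struct model_queue {
    struct model_work *head;
    struct model_work *tail;
};

static void model_work_init(struct model_work *w, model_work_handler_t h)
{
    w->handler = h;
    w->next = NULL;
    w->pending = 0;
}

/* Called by the submitter (for example main): only links the item in. */
static int model_work_submit(struct model_queue *q, struct model_work *w)
{
    if (w->pending) {
        return 0; /* already queued, nothing to do */
    }
    w->pending = 1;
    w->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = w;
    } else {
        q->head = w;
    }
    q->tail = w;
    return 1;
}

/* Called by the queue's own thread: this is where handlers actually run,
 * so any stack they need comes from that thread, not from the submitter. */
static int model_queue_run(struct model_queue *q)
{
    int ran = 0;
    while (q->head != NULL) {
        struct model_work *w = q->head;
        q->head = w->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        w->pending = 0;
        w->handler(w);
        ran++;
    }
    return ran;
}

/* Stand-in for the real handler; it just counts how often it was called. */
static int model_calls;
static void model_pid4_tasks(struct model_work *work)
{
    (void)work;
    model_calls++;
}
```

In this model, submitting the item from one context does not call `model_pid4_tasks`; the call happens only when `model_queue_run` executes. If the real application behaves the same way, the stack that matters for pid4_tasks would be the system workqueue's (set by CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE), not the one set by CONFIG_MAIN_STACK_SIZE.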

    The main() stack is configured in the overlay file and so is the THREAD_ANALYZER stack,

    CONFIG_MAIN_STACK_SIZE=4096

    CONFIG_THREAD_ANALYZER_AUTO_STACK_SIZE=2048

    The stack analyzer output is shown below. Although I am configuring the main stack to be 4096, the stack analyzer is showing stack sizes of 2048, 1024, 320 and 4096. Why is this?

    The stack analyzer is not showing stack usage greater than 4096. In fact, I doubled the size of the main stack, but the fault is still occurring. So, it could be that the stack overflow is a consequence of this error: "Stacking error (context area might be not valid)".
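One possible way to read the trace below: each Thread analyze line describes a different thread with its own stack, which would explain the sizes other than 4096, and the fault dump names the current thread as 0x20002860. In the analyzer output that address has a 1024-byte stack at 66 to 70 % usage, whereas the 4096-byte stack belongs to 0x20002e60. The sketch below parses lines in the format shown in the trace so the faulting thread can be matched to its stack size. The struct and function names are hypothetical, and which thread is which remains an assumption, since the trace shows addresses only.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of one "Thread analyze" line, for example:
 * 0x20002860 : STACK: unused 300 usage 724 / 1024 (70 %); CPU: 0 %
 */
struct thread_stack_info {
    uint32_t thread; /* thread address as printed */
    unsigned unused; /* bytes never touched */
    unsigned usage;  /* high-water mark in bytes */
    unsigned size;   /* total stack size in bytes */
};

/* Parse one analyzer line. Returns 1 on success, 0 if the line does not match. */
static int parse_thread_line(const char *line, struct thread_stack_info *out)
{
    unsigned int addr;
    int n = sscanf(line, " 0x%x : STACK: unused %u usage %u / %u",
                   &addr, &out->unused, &out->usage, &out->size);

    if (n != 4 || out->size == 0) {
        return 0; /* not an analyzer line, or a size that cannot be used */
    }
    out->thread = addr;
    return 1;
}

/* Percentage of the stack that has been used (integer, rounded down). */
static unsigned stack_usage_percent(const struct thread_stack_info *info)
{
    return info->usage * 100u / info->size;
}
```

Feeding each trace line to `parse_thread_line` and comparing `thread` with the address in the "Current thread" line of the fault dump shows which stack size applies to the faulting thread.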

    I am attaching a picture of the debugger status at the point the fault occurs.

    I did not want to increase 

    Below is a trace capture of the Debug Terminal in SES IDE. 

    Please help.

    Kind regards

    Mohamed

    *** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
    Thread analyze:
    0x20002780 : STACK: unused 1448 usage 600 / 2048 (29 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage 412 / 2048 (20 %); CPU: 0 %
    0x20002db8 : STACK: unused 628 usage 396 / 1024 (38 %); CPU: 0 %
    0x20002860 : STACK: unused 348 usage 676 / 1024 (66 %); CPU: 0 %
    0x20002f08 : STACK: unused 288 usage 32 / 320 (10 %); CPU: 0 %
    0x20002e60 : STACK: unused 3508 usage 588 / 4096 (14 %); CPU: 99 %
    Thread analyze:
    0x20002780 : STACK: unused 1208 usage 840 / 2048 (41 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage 412 / 2048 (20 %); CPU: 0 %
    0x20002db8 : STACK: unused 628 usage 396 / 1024 (38 %); CPU: 0 %
    0x20002860 : STACK: unused 348 usage 676 / 1024 (66 %); CPU: 0 %
    0x20002f08 : STACK: unused 36 usage 284 / 320 (88 %); CPU: 19 %
    0x20002e60 : STACK: unused 3508 usage 588 / 4096 (14 %); CPU: 80 %
    LR1110 Driver Version: v2.0.1
    LR1110 Firmware Version: HW: 22 Type: 01, FW: 03.03
    System Errors = 0x20
    System Errors = 0x0
    Counter = 1
    LR1110 Modem Packet Type = 1
    Counter = 2

    Thread analyze:
    0x20002780 : STACK: unused 1208 usage 840 / 2048 (41 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage 412 / 2048 (20 %MCU TEMP = -588.24 0.000660 21.95 °C
    ); CPU: 0 %
    --- 8 messages dropped ---
    delta_time_ms = 0
    Thread analyze:
    0x20002780 : STACK: unused 1208 usage 840 / 2048 (41 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1036 usage 1012 / 2048 (49 %); CPU: 42 %
    0x20002db8 : STACK: unused 628 usage 396 / 1024 (38 %); CPU: 0 %
    0x20002860 : STACK: unused 300 usage 724 / 1024 (70 %); CPU: 0 %
    0x20002f08 : STACK: unused 36 usage 284 / 320 (88 %); CPU: 16 %
    0x20002e60 : STACK: unused 2988 usage 1108 / 4096 (27 %); CPU: 41 %
    MCU TEMP = -588.24 0.000660 21.95 °C
    [00:01:03.042,20[00:01:03.042,20delta_time_ms = 0


    [00:01:23.802,246] <err> os: ***** MPU FAULT *****
    [00:01:23.802,276] <err> os: Stacking error (context area might be not valid)
    [00:01:23.802,276] <err> os: Data Access Violation
    [00:01:23.802,307] <err> os: MMFAR Address: 0x200056dc
    [00:01:23.802,307] <err> os: r0/a1: 0x00000000 r1/a2: 0xaaaaaaaa r2/a3: 0xe288c458
    [00:01:23.802,337] <err> os: r3/a4: 0xdafdd530 r12/ip: 0x568e7d0e r14/lr: 0x3c74b9ef
    [00:01:23.802,337] <err> os: xpsr: 0xf48f4a00
    [00:01:23.802,368] <err> os: Faulting instruction address (r15/pc): 0x6052b7d0
    [00:01:23.802,368] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
    [00:01:23.802,398] <err> os: Current thread: 0x20002860 (unknown)
    [00:01:24.103,271] <err> fatal_error: Resetting system


    *** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
    Thread analyze:
    0x20002780 : STACK: unused 1448 usage 600 / 2048 (29 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage 412 / 2048 (20 %); CPU: 1 %
    0x20002db8 : STACK: unused 628 usage 396 / 1024 (38 %); CPU: 1 %
    0x20002860 : STACK: unused 348 usage 676 / 1024 (66 %); CPU: 18 %
    0x20002f08 : STACK: unused 288 usage 32 / 320 (10 %); CPU: 0 %
    0x20002e60 : STACK: unused 3508 usage 588 / 4096 (14 %); CPU: 78 %
    Thread analyze:
    0x20002780 : STACK: unused 1208 usage 840 / 2048 (41 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage 412 / 2048 (20 %); CPU: 0 %
    0x20002db8 : STACK: unused 628 usage 396 / 1024 (38 %); CPU: 0 %
    0x20002860 : STACK: unused 348 usage 676 / 1024 (66 %); CPU: 0 %
    0x20002f08 : STACK: unused 36 usage 284 / 320 (88 %); CPU: 99 %
    0x20002e60 : STACK: unused 3508 usage 588 / 4096 (14 %); CPU: 0 %
    LR1110 Driver Version: v2.0.1
    LR1110 Firmware Version: HW: 22 Type: 01, FW: 03.03
    System Errors = 0x20
    System Errors = 0x0
    Counter = 1


    LR1110 Modem Packet Type = 1
    Counter = 2


    Thread analyze:
    0x20002780 : STACK: unused 1208 usage 840 / 2048 (41 %); CPU: 0 %
    0x20002fc8 : STACK: unused 1636 usage
