Stack overflow

Hello,

I had been running this program without any problems. After I added some code to read from and write to flash using NVS, the error shown below appeared.

I am running a single-threaded application (only the main() thread) in SEGGER Embedded Studio v5.34a.

*** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
[00:00:13.434,600] <err> os: ***** USAGE FAULT *****
[00:00:13.434,600] <err> os: Stack overflow (context area not valid)
[00:00:13.434,600] <err> os: r0/a1: 0x00000000 r1/a2: 0x0000cbfd r2/a3: 0x00000000
[00:00:13.434,600] <err> os: r3/a4: 0x0000cbfd r12/ip: 0x207bc4ae r14/lr: 0x77fadffe
[00:00:13.434,600] <err> os: xpsr: 0xfdd9ca00
[00:00:13.434,600] <err> os: Faulting instruction address (r15/pc): 0x05221210
[00:00:13.434,631] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
[00:00:13.434,631] <err> os: Current thread: 0x20000668 (unknown)
[00:00:13.696,472] <err> fatal_error: Resetting system

If the solution is to increase the stack size, how do I do it?

Can someone please help?

Kind regards

Mohamed
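
For a single-threaded application like the one described above, the stack of the main() thread is sized by the Kconfig option CONFIG_MAIN_STACK_SIZE, which can be set in prj.conf. The fragment below is a minimal sketch that assumes the overflow really is in the main thread; the value 4096 is purely illustrative and not a recommendation.

```
# prj.conf
# Stack size of the main() thread, in bytes (illustrative value only).
CONFIG_MAIN_STACK_SIZE=4096
```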

  • Hi Simon,

    I have had a look at the thread-analyzer link and it looks like it could help debug the stack overflow I am seeing. However, I am not sure how to use it. Can you please provide example C code that uses thread_analyzer_run() and thread_analyzer_print()?

    Thank you.

    Kind regards

    Mohamed
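
The request above can be illustrated with a minimal sketch. It assumes CONFIG_THREAD_ANALYZER=y is set in prj.conf, and it uses the header path and signatures from the Zephyr 2.4 era shown in the boot banner; newer Zephyr releases moved the header and changed the signatures, so check the SDK version in use. The helper names stack_used_pct, stack_is_nearly_full and report_thread, the STACK_WARN_PCT threshold and the LOG_LINE macro are hypothetical and exist only for this illustration. The host-side stand-in struct is likewise an assumption, there only so the reporting logic can be compiled off-target.

```c
#include <stddef.h>

#ifdef __ZEPHYR__
/* Header paths as in Zephyr 2.4; newer trees use <zephyr/...> instead. */
#include <zephyr.h>
#include <sys/printk.h>
#include <debug/thread_analyzer.h>
#define LOG_LINE(...) printk(__VA_ARGS__)
#else
/* Host-side stand-in (assumption): mirrors only the fields used below,
 * so the reporting logic can be compiled and checked off-target. */
#include <stdio.h>
struct thread_analyzer_info {
	const char *name;
	size_t stack_size;
	size_t stack_used;
};
#define LOG_LINE(...) printf(__VA_ARGS__)
#endif

/* Hypothetical threshold: flag a thread once 80% of its stack was used. */
#define STACK_WARN_PCT 80U

/* Percentage of the stack used at its high-water mark. */
unsigned int stack_used_pct(const struct thread_analyzer_info *info)
{
	if (info->stack_size == 0U) {
		return 0U;
	}
	return (unsigned int)((info->stack_used * 100U) / info->stack_size);
}

/* Returns 1 when usage is at or above STACK_WARN_PCT, otherwise 0. */
int stack_is_nearly_full(const struct thread_analyzer_info *info)
{
	return stack_used_pct(info) >= STACK_WARN_PCT;
}

/* Callback matching the signature thread_analyzer_run() expects. */
void report_thread(struct thread_analyzer_info *info)
{
	LOG_LINE("%s: %u of %u bytes used (%u%%)%s\n", info->name,
		 (unsigned int)info->stack_used,
		 (unsigned int)info->stack_size,
		 stack_used_pct(info),
		 stack_is_nearly_full(info) ? " <-- nearly full" : "");
}

#ifdef __ZEPHYR__
void main(void)
{
	/* ... application code, e.g. the NVS reads and writes ... */

	/* Built-in report covering every thread. */
	thread_analyzer_print();

	/* Same data, delivered one thread at a time to our callback. */
	thread_analyzer_run(report_thread);
}
#endif
```

Calling the analyzer after the code path under suspicion (here, the NVS operations) matters, because the reported figure is a high-water mark: it only reflects stack that has actually been touched so far.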

  • Could you first try to do the following?

    • cd <application folder>/<build_folder>/zephyr
    • addr2line -e zephyr.elf 0xaec010

    I'm using the addr2line that comes with MinGW. That command gives you the exact source location that caused the fault. I got the address 0xaec010 from here: [00:00:41.795,074] <err> os: Faulting instruction address (r15/pc): 0x00aec010. Use the faulting instruction address from your own log instead.

    Then start a debug session and set a breakpoint at the place that caused the fault. You will find the corresponding thread at the bottom of the call stack, like in the call stack in https://devzone.nordicsemi.com/f/nordic-q-a/70100/ecdsa-signing-crashes-when-implemented-with-bluetooth/288468#288468.

    If the issue is due to a stack overflow, increasing the stack size of the thread you found should solve the problem.

    There may be some more efficient ways of resolving this, but this should work.

    Best regards,

    Simon

  • Thank you Simon for your help over the weekend. Your help is really appreciated.

    It had been a few days since I last rebooted my laptop, so I rebooted it yesterday, enabled the thread analyzer module in prj.conf and started debugging my application in SES. I found that stack usage is in fact not a problem at all: I have plenty of unused stack in the main thread, and the application runs without any problems. So it looks like my laptop was causing the problem I was seeing in SES/zephyr/application. I am rather confused as to how a problem with my laptop could cause an embedded application running on a separate target to fail with a stack overflow error. Maybe you can enlighten me and tell me how this is possible.

    Thank you.

    Kind regards

    Mohamed
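
Enabling the analyzer in prj.conf, as described above, could look like the fragment below. This is a sketch rather than the exact configuration used in the thread; the 5-second interval is an illustrative choice.

```
# prj.conf
# Enable the thread analyzer and print its report with printk.
CONFIG_THREAD_ANALYZER=y
CONFIG_THREAD_ANALYZER_USE_PRINTK=y
# Run the analyzer automatically every 5 seconds (illustrative interval).
CONFIG_THREAD_ANALYZER_AUTO=y
CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=5
# Show thread names instead of raw addresses in the report.
CONFIG_THREAD_NAME=y
```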

  • Hmm.. Maybe it was the fact that you reopened SES that changed the behaviour? Be aware that changes in the overlay or Kconfig will not take effect until you restart SES (or re-run the CMake logic in some other way).

    Best regards,

    Simon

  • Thank you Simon.

    I think changes to any configuration settings (prj.conf, overlay, Kconfig...) make it necessary to re-open the nRF Connect SDK project via File... I did not think SES had to be restarted.

    Anyway, the problem has disappeared for now. Let's hope it will not rear its ugly head again.

    Is there anything in the map file that could give me pointers when the stack size is not big enough?

    Kind regards

    Mohamed
