
Stack overflow

Hello,

I had been running this program without any problems. After I added some code to read from and write to flash using NVS, the error shown below appeared.

I am running a single-threaded application (main() only) in Segger Embedded Studio v5.34a.

*** Booting Zephyr OS build v2.4.99-ncs1-rc1 ***
[00:00:13.434,600] <err> os: ***** USAGE FAULT *****
[00:00:13.434,600] <err> os: Stack overflow (context area not valid)
[00:00:13.434,600] <err> os: r0/a1: 0x00000000 r1/a2: 0x0000cbfd r2/a3: 0x00000000
[00:00:13.434,600] <err> os: r3/a4: 0x0000cbfd r12/ip: 0x207bc4ae r14/lr: 0x77fadffe
[00:00:13.434,600] <err> os: xpsr: 0xfdd9ca00
[00:00:13.434,600] <err> os: Faulting instruction address (r15/pc): 0x05221210
[00:00:13.434,631] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
[00:00:13.434,631] <err> os: Current thread: 0x20000668 (unknown)
[00:00:13.696,472] <err> fatal_error: Resetting system

If the solution is to increase the stack size, how do I do it?

Can someone please help?

Kind regards

Mohamed

  • Could you first try to do the following?

    • cd <application folder>/<build_folder>/zephyr
    • addr2line -e zephyr.elf 0xaec010

    I'm using the addr2line that comes with MinGW. That command prints the source file and line that caused the fault. In my case, the address 0xaec010 came from this log line: [00:00:41.795,074] <err> os: Faulting instruction address (r15/pc): 0x00aec010. Use the faulting instruction address from your own log instead.

    Then start a debug session and put a breakpoint at the location that caused the fault. You will find the corresponding thread at the bottom of the call stack, as in the call stack shown in https://devzone.nordicsemi.com/f/nordic-q-a/70100/ecdsa-signing-crashes-when-implemented-with-bluetooth/288468#288468.

    If the issue is due to a stack overflow, increasing the stack size of the thread you found should solve the problem.

    There may be some more efficient ways of resolving this, but this should work.

    Best regards,

    Simon
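The addr2line step above can be sketched as a small shell snippet. It is only an illustration: the log line is copied from the original post, the variable names are made up, and it assumes a POSIX shell with sed and that it is run from the folder containing zephyr.elf. It prints the command rather than running it, so the result can be checked first.

```shell
# Fault line copied from the log in the original post.
log_line='[00:00:13.434,600] <err> os: Faulting instruction address (r15/pc): 0x05221210'

# Pull out the hex value that follows "(r15/pc): ".
pc=$(printf '%s\n' "$log_line" | sed -n 's/.*(r15\/pc): \(0x[0-9a-fA-F]*\).*/\1/p')

# Build the addr2line command for that address (zephyr.elf as in the reply above).
cmd="addr2line -e zephyr.elf $pc"
echo "$cmd"
```

Adding the -f option to addr2line also prints the name of the enclosing function.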

  • Thank you Simon for your help over the weekend. Your help is really appreciated.

    It had been a few days since I last rebooted my laptop. So I rebooted it yesterday, enabled the stack analyzer module in prj.conf, and started debugging my application in SES. I found that the stack usage is in fact not a problem at all: I have plenty of unused stack in the main thread. The application also runs without any problems. So it looks like my laptop was causing the problem I was seeing in SES/zephyr/application. I am rather confused as to how a problem with my laptop could cause an embedded application running on a separate target to fail with a stack overflow error. Maybe you can enlighten me and tell me how this is possible.

    Thank you.

    Kind regards

    Mohamed

  • Hmm.. Maybe it was the fact that you reopened SES that changed the behaviour? Be aware that changes in the overlay or Kconfig do not take effect until you restart SES (or re-run the CMake logic in some other way).

    Best regards,

    Simon

  • Thank you Simon.

    I think a change to any configuration setting (prj.conf, overlay, Kconfig...) requires re-opening the nRF Connect SDK project via File... I did not think SES had to be restarted.

    Anyway, the problem has disappeared for now. Let's hope it will not rear its ugly head again.

    Is there anything in the map file that could give me pointers when the stack size is not big enough?

    Kind regards

    Mohamed

  • Learner said:
    Is there anything in the map file that could give me pointers when the stack size is not big enough?

    I'm not totally sure about this. You could search for the faulting address in the map file to figure out what is causing the stack overflow, set a breakpoint at that location to see which thread it is running from, and then increase the stack size of that thread.

    The Thread analyzer is probably the best way of analyzing the stack usage of the threads.

    Learner said:
    I think a change to any configuration setting (prj.conf, overlay, Kconfig...) requires re-opening the nRF Connect SDK project via File... I did not think SES had to be restarted.

    Yes, you are correct about this. But after reopening SES you have to run File.. again, and that might pick up new dts/overlay/Kconfig changes that weren't applied before you reopened SES. However, something completely different might have triggered the fault, and let's hope it doesn't show up again.

    Best regards,

    Simon
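For reference, a prj.conf fragment along the following lines is one way to enable the Thread analyzer and to raise the main thread's stack size, which was the original question. Treat it as a sketch: the values are illustrative only, and the exact set of options and their defaults should be checked against the Kconfig reference for the SDK version in use.

```conf
# Periodically print per-thread stack usage (interval in seconds, illustrative)
CONFIG_THREAD_ANALYZER=y
CONFIG_THREAD_ANALYZER_USE_PRINTK=y
CONFIG_THREAD_ANALYZER_AUTO=y
CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=5

# Show thread names instead of raw addresses in the report
CONFIG_THREAD_NAME=y

# Stack size of the main() thread in bytes (illustrative value)
CONFIG_MAIN_STACK_SIZE=4096
```

As noted earlier in the thread, changes to prj.conf only take effect once the project configuration has been regenerated.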
