Spurious fault reboot

Custom board, ncs v1.4.0, modem FW v1.2.3, SES v5.34a. The code is based on the Asset Tracker (v1): LTE with PSM, motion-activated GPS, and MQTT to send messages to a private backend.

I'm seeing random reboots. To find out how we get there, I set a breakpoint on sys_reboot() in reboot.c and hit Go. When execution stops, the debugger has stopped on a spurious fault, which would lead to a reboot, and there's nothing helpful in the call stack. It's quite intermittent. So far, I have seen it right after an MQTT Subscribe and during a GPS restart. Any ideas on how to debug this?

Didrik Rokhaug over 4 years ago

Hi,

"There's nothing helpful in the call stack."

If you step a bit further through the fault handler, it should eventually give you more information.

In addition, setting CONFIG_RESET_ON_FATAL_ERROR=n and CONFIG_LOG=y should make the fault handler print out an error message.

CONFIG_LOG_IMMEDIATE=y disables deferred logging, so each log message is printed at the point in the code where it is logged. That way, you will not miss any messages that were still waiting to be printed when the fault occurred.
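
To illustrate why deferred logging can hide the last messages before a fault, here is a toy model in plain C that runs on a PC. It is only a sketch of the idea and not how Zephyr's logger is implemented; struct toy_logger, toy_log() and toy_log_process() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_DEPTH 8
#define SINK_SIZE   256

/* Toy logger: "sink" stands in for the UART/RTT output. */
struct toy_logger {
    bool immediate;                   /* true: print at the call site */
    const char *pending[QUEUE_DEPTH]; /* messages waiting to be printed */
    size_t pending_count;
    char sink[SINK_SIZE];
};

static void toy_log_init(struct toy_logger *lg, bool immediate)
{
    assert(lg != NULL);
    memset(lg, 0, sizeof(*lg));
    lg->immediate = immediate;
}

/* Append a message to the sink, truncating if the sink is full. */
static void sink_write(struct toy_logger *lg, const char *msg)
{
    size_t used = strlen(lg->sink);
    strncat(lg->sink, msg, SINK_SIZE - 1 - used);
}

static void toy_log(struct toy_logger *lg, const char *msg)
{
    if (lg->immediate) {
        /* Immediate mode: output happens right here. */
        sink_write(lg, msg);
    } else if (lg->pending_count < QUEUE_DEPTH) {
        /* Deferred mode: only remember the message for later. */
        lg->pending[lg->pending_count++] = msg;
    }
    /* In this toy model, a full queue silently drops the message. */
}

/* In deferred mode, a separate context calls this later to flush. */
static void toy_log_process(struct toy_logger *lg)
{
    for (size_t i = 0; i < lg->pending_count; i++) {
        sink_write(lg, lg->pending[i]);
    }
    lg->pending_count = 0;
}
```

If a fault stops the system after toy_log() but before toy_log_process() gets a chance to run, the deferred logger's sink is still empty, while the immediate logger has already written everything.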

Remember that you have to use 'Project -> Run CMake...' or re-open the project for changes to prj.conf or other configuration files to take effect when using SES.

Best regards,

Didrik

spline_pete over 4 years ago in reply to Didrik Rokhaug

With these set:

CONFIG_LOG=n
CONFIG_LOG_IMMEDIATE=y
CONFIG_RESET_ON_FATAL_ERROR=n

I still see reboots, but there are no fault handler messages. Do you have an example of what I should expect to see?

Didrik Rokhaug over 4 years ago in reply to spline_pete

spline_pete said:
CONFIG_LOG=n

Sorry, that should be 'y', not 'n'.

I've edited my original post so that it is now correct.

spline_pete said:
Do you have an example of what I should expect to see?

Here is an example where I am writing to a NULL pointer:

SPM: NS image at 0x8000
SPM: NS MSP at 0x2001bca0
SPM: NS reset vector at 0xbb95
SPM: prepare to jump to Non-Secure image.
*** Booting Zephyr OS build v2.4.99-ncs1  ***
Starting GPS application
[00:00:00.207,763] <err> os: Exception occurred in Secure State
[00:00:00.214,324] <err> os: ***** HARD FAULT *****
[00:00:00.219,848] <err> os:   Fault escalation (see below)
[00:00:00.226,074] <err> os: ***** BUS FAULT *****
[00:00:00.231,536] <err> os:   Precise data bus error
[00:00:00.237,243] <err> os:   BFAR Address: 0x50008158
[00:00:00.243,133] <err> os: r0/a1:  0x00000019  r1/a2:  0x00000000  r2/a3:  0x00000000
[00:00:00.251,770] <err> os: r3/a4:  0x00000000 r12/ip:  0x00000000 r14/lr:  0x00009aed
[00:00:00.260,406] <err> os:  xpsr:  0x41000000
[00:00:00.265,594] <err> os: s[ 0]:  0x00000000  s[ 1]:  0x00000000  s[ 2]:  0x00000000  s[ 3]:  0x00000000
[00:00:00.275,970] <err> os: s[ 4]:  0x00000000  s[ 5]:  0x00000000  s[ 6]:  0xffffffff  s[ 7]:  0xffffffff
[00:00:00.286,315] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x00000001  s[10]:  0x00000000  s[11]:  0xffffffff
[00:00:00.296,691] <err> os: s[12]:  0x00000000  s[13]:  0x00000000  s[14]:  0x00000000  s[15]:  0x00000000
[00:00:00.307,067] <err> os: fpscr:  0xb873db2f
[00:00:00.312,255] <err> os: Faulting instruction address (r15/pc): 0x00009aee
[00:00:00.320,129] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:00.327,911] <err> os: Current thread: 0x20018a98 (unknown)
[00:00:00.334,655] <err> os: Halting system
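
When reading a dump like this, two details of the Cortex-M register values are useful. The value in r14/lr has bit 0 set to mark Thumb state, so that bit should be cleared before looking the address up in the map file or with a tool such as addr2line. The low nine bits of xpsr hold the exception number of the interrupted context, where 0 means thread mode. Here is a small host-side C sketch that decodes the values from the log; strip_thumb_bit() and decode_xpsr() are hypothetical helper names made up for this example and are not part of Zephyr or ncs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The xPSR fields that matter most when reading a fault dump. */
struct xpsr_info {
    bool thumb;             /* T bit (bit 24), always 1 on Cortex-M */
    bool zero_flag;         /* Z condition flag (bit 30) */
    uint32_t exception_num; /* IPSR, bits [8:0]; 0 means thread mode */
};

/* Clear bit 0 of a code address. Return addresses such as LR carry
 * the Thumb bit there; it is not part of the instruction address. */
static uint32_t strip_thumb_bit(uint32_t addr)
{
    return addr & ~UINT32_C(1);
}

/* Split a stacked xPSR value into the fields above. */
static void decode_xpsr(uint32_t xpsr, struct xpsr_info *out)
{
    assert(out != NULL);
    out->thumb = ((xpsr >> 24) & 1u) != 0;
    out->zero_flag = ((xpsr >> 30) & 1u) != 0;
    out->exception_num = xpsr & 0x1FFu;
}
```

With the values from the log, lr 0x00009aed corresponds to code address 0x00009aec, and xpsr 0x41000000 decodes to T bit set, Z flag set and exception number 0, meaning the fault hit thread-level code rather than an interrupt handler. That is consistent with the "Current thread" line in the log.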

spline_pete over 4 years ago in reply to Didrik Rokhaug

Updating ncs from v1.4.0 to v1.5.0 seems to have fixed the spurious faults. I ran a couple of units overnight and saw no MQTT disconnects or reboots, both of which had been happening until recently.