Softdevice Random Hardfault at 0x00001398

Hello,

We are currently using softdevice version 7.0.1 and SDK version 16.0.0.
We are having difficulty nailing down a rare hard faulting bug.
Very rarely, the device will run for an undetermined length of time and then hardfault.
Most of the time, however, the device runs indefinitely without issue.
This has made catching the bug with a debugger connected extremely difficult.

Today was the first and only day we were able to catch the bug with the programmer connected.
Running through the stack, I was able to find that the program counter was set to 0xCAFEBABE with a return address of 0x00001398.
Since this address is located within the softdevice, I was wondering if someone from Nordicsemi could look up what condition would cause a jump to 0xCAFEBABE?

I know that it can be used internally by the softdevice to signal particular error conditions (See Link Below).
https://devzone.nordicsemi.com/f/nordic-q-a/30431/biref-description-about-deadbeef-how-it-is-used-where/120541#120541

We have not been able to easily reproduce this error.
FYI, only the hardfault handler is called; the app error handler is not.

Thanks

  • We are using the NRF52832 rev. AAEO with softdevice variant s132 running version 7.0.1.
    We do not write or modify the MWU peripheral.

    Unfortunately, this was the only time we have been able to catch this hardfault with the debugger connected.
    I didn't realize you could save the state with the Ozone debugger; I will do this if I am able to catch it in action again.
    As a result, I no longer have a full register dump or call stack to look over.

    When I did catch it, I was able to see 0xFFFFFFFE in the link register.
    Ozone displayed one level up in the call stack, which had the register values
    PC: 0xCAFEBABE
    LR: 0x00001398

    I know this isn't much information to work with, but any idea on direction would be immensely helpful.
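
    For reference, the PC and LR shown one level up are the values the core stacked on exception entry. Below is a minimal sketch of how those eight stacked words could be pulled out of a raw memory dump if the fault is caught again. The names `stacked_frame_t`, `frame_from_words` and `pc_in_flash` are hypothetical helpers, not SDK functions, and the flash range assumes the 512 kB variant of the nRF52832. Whether the words sit on MSP or PSP depends on the EXC_RETURN value, which is not handled here.

    ```c
    #include <assert.h>
    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Basic Cortex-M exception frame, in the order the core pushes it.
     * With an active FPU context more words follow, but these 8 come first. */
    typedef struct {
        uint32_t r0, r1, r2, r3, r12;
        uint32_t lr;    /* LR of the interrupted code */
        uint32_t pc;    /* address the interrupted code would have resumed at */
        uint32_t xpsr;
    } stacked_frame_t;

    /* Hypothetical helper: interpret 8 words read from the faulting stack pointer. */
    static stacked_frame_t frame_from_words(const uint32_t w[8])
    {
        assert(w != NULL);
        stacked_frame_t f = { w[0], w[1], w[2], w[3], w[4], w[5], w[6], w[7] };
        return f;
    }

    /* Assumption: 512 kB flash mapped at 0x00000000..0x0007FFFF. */
    static bool pc_in_flash(uint32_t addr)
    {
        return addr < 0x00080000u;
    }

    /* Print the interesting registers and flag a PC that cannot be real code. */
    static void print_frame(const stacked_frame_t *f)
    {
        printf("PC=0x%08" PRIX32 " LR=0x%08" PRIX32 " xPSR=0x%08" PRIX32 "%s\n",
               f->pc, f->lr, f->xpsr,
               pc_in_flash(f->pc) ? "" : "  <- PC outside flash");
    }
    ```

    Fed with the values above (LR 0x00001398, PC 0xCAFEBABE), `print_frame` would flag the PC as lying outside flash, while the LR would pass the same check.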

  • It's not something we have seen before. The theory is that there is a bug in the application (rather than the softdevice) that corrupts the call stack, so that the exception return somehow goes to the wrong address. This theory would be strengthened if you also experience other hardfaults: are they always CAFEBABE in PC and 0x1398 in LR, or is there sometimes other content?

    You may check whether you have any application interrupts at all above SVC priority. Those should be prio 2 or 3, since prio 0 and 1 are reserved by the SD and prio 4 is what the SVC is using.
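
    As a rough sketch of that check, the helper below classifies a list of interrupt priorities using only the rule stated above (0, 1 and 4 taken by the SD). The function names are hypothetical, and the 0..7 range assumes the 3 priority bits of the nRF52. On target, the array could be filled with the results of the CMSIS call `NVIC_GetPriority()` for each IRQ the application enables.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Per the rule above: prio 0 and 1 are reserved by the SD, prio 4 is the SVC. */
    static bool prio_reserved_by_sd(uint8_t prio)
    {
        return prio == 0 || prio == 1 || prio == 4;
    }

    /* Lower number means higher urgency, so "above SVC" is anything below 4. */
    static bool prio_above_svc(uint8_t prio)
    {
        return prio < 4;
    }

    /* Count application IRQ priorities that collide with an SD-reserved level. */
    static size_t count_reserved_prios(const uint8_t *prios, size_t n)
    {
        assert(prios != NULL || n == 0);
        size_t bad = 0;
        for (size_t i = 0; i < n; i++) {
            if (prio_reserved_by_sd(prios[i])) {
                bad++;
            }
        }
        return bad;
    }
    ```

    Any non-zero count would be worth investigating before looking further into the softdevice itself.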

    If you experience the problem again, you should try to read out the hardfault status registers; this might help in understanding what actually caused the hardfault.
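
    A minimal sketch of decoding those registers follows. It takes raw values as arguments so it can be run on a PC against numbers copied from the debugger; on target the same values are available through CMSIS as `SCB->CFSR` (0xE000ED28) and `SCB->HFSR` (0xE000ED2C), with `SCB->MMFAR` and `SCB->BFAR` holding the fault address when the matching VALID bit is set. The helper names `decode_cfsr` and `hfsr_forced` are hypothetical; the bit positions are those of the Cortex-M4 fault status registers.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint32_t mask; const char *name; } fault_bit_t;

    /* CFSR = MemManage [7:0] | BusFault [15:8] | UsageFault [31:16]. */
    static const fault_bit_t cfsr_bits[] = {
        {1u << 0,  "IACCVIOL"},  {1u << 1,  "DACCVIOL"},    {1u << 3,  "MUNSTKERR"},
        {1u << 4,  "MSTKERR"},   {1u << 7,  "MMARVALID"},
        {1u << 8,  "IBUSERR"},   {1u << 9,  "PRECISERR"},   {1u << 10, "IMPRECISERR"},
        {1u << 11, "UNSTKERR"},  {1u << 12, "STKERR"},      {1u << 15, "BFARVALID"},
        {1u << 16, "UNDEFINSTR"},{1u << 17, "INVSTATE"},    {1u << 18, "INVPC"},
        {1u << 19, "NOCP"},      {1u << 24, "UNALIGNED"},   {1u << 25, "DIVBYZERO"},
    };

    /* Write the names of all set CFSR bits into out, space separated.
     * Returns the number of names written; stops early if out is too small. */
    static int decode_cfsr(uint32_t cfsr, char *out, size_t out_len)
    {
        assert(out != NULL && out_len > 0);
        size_t used = 0;
        int count = 0;
        out[0] = '\0';
        for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
            if ((cfsr & cfsr_bits[i].mask) == 0) {
                continue;
            }
            int n = snprintf(out + used, out_len - used, "%s%s",
                             count ? " " : "", cfsr_bits[i].name);
            if (n < 0 || (size_t)n >= out_len - used) {
                out[used] = '\0';   /* drop the partially written name */
                break;
            }
            used += (size_t)n;
            count++;
        }
        return count;
    }

    /* HFSR bit 30 (FORCED): a configurable fault escalated to HardFault,
     * in which case CFSR holds the underlying cause. */
    static bool hfsr_forced(uint32_t hfsr)
    {
        return (hfsr & (1u << 30)) != 0;
    }
    ```

    If `hfsr_forced` returns true for the captured HFSR, the names produced by `decode_cfsr` indicate which kind of fault was escalated, which should narrow down whether a bad exception return or a bad memory access is involved.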

    Kenneth
