
hard fault after a while when connected

I'm developing a product using a module with a built-in nRF51822, and I'm experiencing a strange error:

After staying in a connection for a long time (hours to days), the device suddenly enters the hard fault handler. The system is simple: BLE module, sensor (I2C with a bit-banged implementation), battery, button, and LCD/LCD driver (SPI). Every 1 s the sensor data is read and the characteristics are updated (and usually notified).

I back-traced the instruction causing the hard fault to the addresses 0x0000B65F and 0x0000B5E1 when using the SoftDevice S110 7.0.0 or S110 7.1.0. This seems to be somewhere in the SoftDevice area, where I have no source code and no way to debug.

I am using the ARM mbed library and BLE_API. The stack and heap pointers don't change during the established connection, and the hard fault occurs outside of my application/periodic callback (both verified through debug messages), so I guess there should not be a problem with nested interrupts or similar.

Because the error occurs outside of my application, it's very hard for me to debug and find the root cause. But I have tried several things, and one led to a decreased chance of the error happening (from >80% failure chance after two days to <20%):

  • Not using the hardware SPI and using a bit-banged implementation instead.

The following actions had no influence:

  • not using the sd_flash API calls
  • not updating characteristics
  • compiler optimization level
  • used memory model (2 region or 1 region)
  • SoftDevice version 7.0.0 or 7.1.0

Does anyone have any idea what might be the cause here or what else I could try? In particular, do the Nordic guys have any information about what happens at 0x0000B65F / 0x0000B5E1?

Thanks

  • I am pretty sure this is a SoftDevice assert, which should call the assert callback and then produce a HardFault by doing an unaligned memory access. The parameters given to the assert callback will provide you with a file name, a line number and the current program counter. HardFaulting is intentional to stop the stack from running when in an invalid state. You could catch this by setting a breakpoint in the stack assert handler, or by writing the parameters to some transport (UART/SPI) or flash.

  • Here is the outcome:

    line_num: 0x0574 -> 1396

    file_name: src\ll_lm.s0.c

    SoftDevice: V7.0.0

    which, according to Ulrich, is a known problem. I moved this to a Nordic support case with this information. Thanks! :)
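
For readers who want to capture the same information, here is a minimal sketch of an assert handler that stores its parameters so they can be inspected with a debugger or written out later. It assumes the callback receives the program counter, line number, and file name, as described in the reply above. The names `assert_record_t`, `g_last_assert`, and `softdevice_assert_record` are hypothetical, and the exact callback signature and the way it is registered depend on the SoftDevice, SDK, or mbed port in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the last SoftDevice assert (not part of any SDK). */
typedef struct {
    uint32_t pc;             /* program counter reported by the stack */
    uint16_t line_num;       /* line number inside the stack source file */
    char     file_name[32];  /* truncated, NUL-terminated copy of the file name */
    uint8_t  valid;          /* set to 1 once an assert has been recorded */
} assert_record_t;

static assert_record_t g_last_assert;

/* Assumed callback shape: (program counter, line number, file name). */
void softdevice_assert_record(uint32_t pc, uint16_t line_num,
                              const uint8_t *file_name)
{
    size_t i = 0;

    g_last_assert.pc = pc;
    g_last_assert.line_num = line_num;

    /* Bounded copy: the file name pointer comes from the stack, so do not
       assume it fits into the local buffer. */
    if (file_name != NULL) {
        while (i < sizeof(g_last_assert.file_name) - 1 && file_name[i] != '\0') {
            g_last_assert.file_name[i] = (char)file_name[i];
            i++;
        }
    }
    g_last_assert.file_name[i] = '\0';
    g_last_assert.valid = 1;

    /* On the target, set a breakpoint here, or write g_last_assert to
       UART/SPI or flash before the intentional hard fault follows. */
}
```

With the values reported in this thread, the record would hold line number 0x0574 (1396) and the file name `src\ll_lm.s0.c`.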

  • I can confirm that using the mbed Ticker can lead to a SoftDevice assert and a hard fault. The Ticker globally disables interrupts (for ~62 µs), and this can lead to a timing violation within the SoftDevice. If you encounter this problem as well, use the Nordic app_timer API instead of the mbed Ticker.

    This should be solved in the mbed library; you don't want a hard fault time bomb in your code.
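
When moving the 1 s sensor period from the Ticker to app_timer, note that app_timer timeouts are given in RTC ticks rather than in seconds. The sketch below shows the conversion arithmetic only. It assumes the timer is clocked from the 32.768 kHz low-frequency clock divided by (prescaler + 1). `ms_to_rtc_ticks` is a hypothetical helper, not an SDK function, so check the SDK's own conversion macro before relying on it.

```c
#include <stdint.h>

/* Assumption: the timer runs from the 32.768 kHz low-frequency clock. */
#define RTC_CLOCK_HZ 32768u

/* Hypothetical helper: convert milliseconds to RTC ticks for a given
   prescaler, rounded to the nearest tick. */
static uint32_t ms_to_rtc_ticks(uint32_t ms, uint32_t prescaler)
{
    /* 64-bit intermediates avoid overflow for long timeouts. */
    uint64_t numerator   = (uint64_t)ms * RTC_CLOCK_HZ;
    uint64_t denominator = ((uint64_t)prescaler + 1u) * 1000u;

    return (uint32_t)((numerator + denominator / 2u) / denominator);
}
```

Under these assumptions, the 1 s read interval with prescaler 0 corresponds to `ms_to_rtc_ticks(1000, 0)`, i.e. 32768 ticks.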
