OTA - Brick prevention - Buttonless DFU

Hello,

Our device uses an nRF52832 with SDK version 17.0.2.
It has only one button, connected to P02. Because the device must be water resistant, we decided against a removable battery
and against a switch that completely disconnects the nRF from the battery.

For future firmware updates we use secure buttonless DFU, performed from our Android application.

We have read a lot of the documentation and the related questions and answers here in the forum,
but we are still not sure how to prevent potential bricking of the devices. The OTA update itself seems to be as
secure and stable as possible, and we have to trust Nordic on this one (at least beyond SDK 12).

What worries us are potential bugs in our code that could stop DFU from responding. (We are experienced with neither C nor embedded development.)

Can pointer or memory-access bugs in our code make buttonless DFU unresponsive?
How should we deal with such things? Any suggestions for best practices?
Has anyone out there had such a problem?

Does a button event listener always react? For example, a very simple one that only reacts to a long press and then reboots the nRF. Or can other code prevent such a listener from running?

At some point we managed to use TWI incorrectly, which led to a reboot loop. We are not able to reproduce this. But how should we deal with such a problem?

As a last resort, in case everything else fails and we release firmware that gets stuck, we thought of the following:
- Let the user drain the battery
- Tell the user to prepare the Android app for the firmware update
- The user connects the 2-pin magnetic charging cable
- The bootloader is programmed to stay in DFU advertising for x seconds before entering the (faulty) application

But for this option we would need to modify the bootloader to delay the start of the application. We have found similar posts here
in the forum that refer to "Timers", but so far we are not sure how to modify the bootloader this way (we come from a non-embedded background). Does anyone have example code, a tutorial or a good explanation of how to do this?
And: is there a potential drawback to this idea, apart from the device being unresponsive for x seconds after a full discharge? We can live with that. But could this modification leave us stuck in the bootloader?

Thanks in advance for suggestions,

Max

  • Hi,

    If you are afraid of your application bricking the device, then here are some suggestions on how to deal with this:

    1) Always go into DFU mode after a watchdog reset. The default DFU timeout before the bootloader starts the app again is 2 min (NRF_BL_DFU_INACTIVITY_TIMEOUT_MS).
    2) For hard faults and app errors, the error handler could enter DFU mode by setting the GPREGRET register and calling NVIC_SystemReset(), or it could loop until the watchdog times out.
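
Suggestion 2 could look roughly like the sketch below. It is host-runnable C that uses a stand-in for the hardware, so `fake_mcu_t`, `fake_gpregret_write`, `fake_system_reset` and `fault_handler_request_dfu` are hypothetical names, not SDK functions. The magic value 0xB1 is assumed to match `BOOTLOADER_DFU_START` in the SDK's `nrf_bootloader_info.h`; please verify it against your SDK version. On the target, the two fake calls would become a GPREGRET write (`sd_power_gpregret_set` while the SoftDevice is enabled, or `NRF_POWER->GPREGRET` otherwise) followed by `NVIC_SystemReset()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal BOOTLOADER_DFU_START; verify in nrf_bootloader_info.h. */
#define DFU_START_MAGIC 0xB1u

/* Host-side stand-in for the two pieces of hardware state we touch. */
typedef struct {
    uint32_t gpregret;        /* models the retained GPREGRET register */
    bool     reset_requested; /* set when a system reset is requested  */
} fake_mcu_t;

static fake_mcu_t g_mcu;

/* GPREGRET is 8 bits wide on the nRF52832, so keep only the low byte. */
static void fake_gpregret_write(uint32_t value)
{
    g_mcu.gpregret = value & 0xFFu;
}

/* On target this would be NVIC_SystemReset(), which does not return. */
static void fake_system_reset(void)
{
    g_mcu.reset_requested = true;
}

/* Hypothetical fault hook: ask the bootloader for DFU, then reset. */
void fault_handler_request_dfu(void)
{
    fake_gpregret_write(DFU_START_MAGIC);
    fake_system_reset();
}
```

Keeping the handler this small is deliberate: after a hard fault the rest of the application state cannot be trusted, so the less code runs before the reset, the better.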

    At some point we managed to use TWI incorrectly, which led to a reboot loop. We are not able to reproduce this. But how should we deal with such a problem?

     This link might be useful: https://www.i2c-bus.org/i2c-primer/analysing-obscure-problems/blocked-bus/

  • A watchdog will generally save you from a stuck application. As Sigurd said, you can configure the bootloader to stay in bootloader mode after a watchdog reset.
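
One way the bootloader can recognise a watchdog reset is the DOG flag in the POWER peripheral's RESETREAS register. The sketch below only shows the bit test. `reset_was_watchdog` is a hypothetical helper, and the mask (bit 1) is assumed from the nRF52832 register layout (`POWER_RESETREAS_DOG_Msk` in the MDK headers), so check it against the product specification. Where the check is called from inside the bootloader is not shown and is up to your design.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the DOG flag in POWER->RESETREAS (bit 1). */
#define RESETREAS_DOG_MASK (1u << 1)

/*
 * Hypothetical helper: returns true if the given RESETREAS value says
 * that a watchdog reset has happened. On target the argument would be
 * the value read from NRF_POWER->RESETREAS early in the bootloader.
 */
bool reset_was_watchdog(uint32_t resetreas)
{
    return (resetreas & RESETREAS_DOG_MASK) != 0u;
}
```

Note that RESETREAS flags are cumulative and stay set until they are cleared by writing 1 to them, so the bootloader should clear the flag once it has acted on it. Otherwise a later, unrelated reset could be mistaken for a watchdog reset.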

    Note that your last-resort case can also be handled without modifying the bootloader: you can configure the bootloader to stay in BL mode if a button is pressed during boot. So add to your user instructions that the button must be kept pressed while connecting the charging cable.
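
As a rough illustration, the button entry method is enabled in the bootloader project's `sdk_config.h` (not the application's). The fragment below assumes that "P02" means GPIO pin number 2 and that the stock bootloader reads the button as active low. Both are assumptions to check against your board and bootloader sources.

```c
/* Bootloader sdk_config.h fragment; values are assumptions for this board. */

/* Enter DFU mode if the button is held during boot. */
#ifndef NRF_BL_DFU_ENTER_METHOD_BUTTON
#define NRF_BL_DFU_ENTER_METHOD_BUTTON 1
#endif

/* GPIO pin number of the button (assumed: P02 means pin 2). */
#ifndef NRF_BL_DFU_ENTER_METHOD_BUTTON_PIN
#define NRF_BL_DFU_ENTER_METHOD_BUTTON_PIN 2
#endif
```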

    A trickier case is if your application is faulty in such a way that it can't enter DFU, but isn't completely stuck either. How to enter the bootloader in that case?

  • mrono said:
    How to enter the bootloader in that case?

    This would be covered by suggestion #2, i.e. when app_error_handler is called. Note that when the device resets, it will boot the bootloader, and the bootloader will decide if it should enter DFU mode, or start/jump to the application. 
