
Buttonless BLE DFU - nrf_fstorage_write() failed with error 0x4

In our product we have implemented buttonless DFU over BLE with the secure bootloader (unbonded) on SDK 15.2. The bootloader is unmodified except for the addition of PA/LNA control.

We are undertaking pre-production testing of our product and have been using the Nordic DFU libraries for iOS/Android to push firmware updates to our beta testers. Most of the time the DFU process completes fine, but occasionally (maybe 1 time in 10) it fails.

We've managed to replicate this in nRFConnect by repeatedly performing DFU until it fails; nRFConnect shows the same error we see reported in our app logs.

After the error the device remains in DFU mode until the inactivity timeout. This makes for a terrible user experience: the user has to wait two minutes (the DFU timeout) before the application starts again and we can re-establish a connection from the app to the product. We do the firmware version check/update when the user opens our app to interact with the product, so asking them to wait two minutes for the product to reset is far from ideal.

I enabled logging on the bootloader and managed to replicate the problem with the product attached to the debugger. Here is what gets logged:

 app: Entering DFU mode.
 nrf_dfu_validation: Signature required. Checking signature.
 nrf_dfu_validation: Calculating init packet hash (init packet len: 58)
 nrf_dfu_validation: Verify signature
 nrf_dfu_validation: Image verified
 nrf_dfu_settings: Backing up settings page to address 0xFE000.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_dfu_flash: nrf_fstorage_write() failed with error 0x4.
 nrf_bootloader_wdt: Internal feed
 app: Inactivity timeout.

I think error code 0x4 is NRF_ERROR_NO_MEM (no memory), which seems like an odd error to be occurring in the bootloader.
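
For reference, a minimal lookup sketch for decoding codes like the one in the log. The numeric values follow the global error codes in the SDK's nrf_error.h (where NRF_ERROR_BASE_NUM is 0); only a subset is listed. The helpers nrf_error_name() and example_decode() are illustrative names, not SDK functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative helper (not part of the SDK): map a subset of the
 * global nRF error codes to their symbolic names. */
const char *nrf_error_name(uint32_t err_code)
{
    switch (err_code) {
    case 0x0:  return "NRF_SUCCESS";
    case 0x3:  return "NRF_ERROR_INTERNAL";
    case 0x4:  return "NRF_ERROR_NO_MEM";
    case 0x7:  return "NRF_ERROR_INVALID_PARAM";
    case 0x8:  return "NRF_ERROR_INVALID_STATE";
    case 0x11: return "NRF_ERROR_BUSY";
    default:   return "(not in this table)";
    }
}

/* Decode the code reported by nrf_dfu_flash in the log above. */
void example_decode(void)
{
    const char *name = nrf_error_name(0x4);
    assert(strcmp(name, "NRF_ERROR_NO_MEM") == 0);
    printf("nrf_fstorage_write() failed with %s\n", name);
}
```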

Any suggestions as to what might be causing the issue and how to mitigate it would be appreciated.

  • Hi,

    According to the documentation for nrf_fstorage_write(), when using the SoftDevice API for writing flash you get NRF_ERROR_NO_MEM if the internal queue of operations is full.

    I will try to explain what that means. The flash storage module puts all flash operations in a queue, and operations are started from that queue whenever the SoftDevice is ready to execute a new flash operation. If you get NRF_ERROR_NO_MEM a lot (as you do), you can increase the size of the queue by increasing NRF_FSTORAGE_SD_QUEUE_SIZE in sdk_config.h.

    Note that flash operations halt the CPU, so the SoftDevice must have free time slots in which to schedule them. Erase operations in particular take a long time to complete and so are hard to schedule. Depending on what the SoftDevice is doing (for instance, running a short connection interval), executing the pending flash operations may therefore take quite some time. The bootloader example is tuned to work well, but if you change how the SoftDevice operates you may end up in a situation where flash operations cannot be executed, and the operation queue fills up faster than operations are handled.

    Regards,
    Terje
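
The queue behaviour described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the nrf_fstorage implementation: op_queue_model_t, model_enqueue(), model_execute_one() and model_burst() are hypothetical names, and the rates are made-up numbers chosen only to show the effect. Requests arrive back to back, while the modelled SoftDevice finds a slot to execute one queued operation only every few requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SUCCESS 0u
#define MODEL_NO_MEM  4u  /* same numeric value as NRF_ERROR_NO_MEM */

/* Hypothetical stand-in for the operation queue. */
typedef struct {
    uint32_t capacity;  /* plays the role of NRF_FSTORAGE_SD_QUEUE_SIZE */
    uint32_t pending;   /* operations queued but not yet executed */
} op_queue_model_t;

/* Queue one flash operation, or reject it if the queue is full. */
uint32_t model_enqueue(op_queue_model_t *q)
{
    assert(q != NULL);
    if (q->pending >= q->capacity) {
        return MODEL_NO_MEM;
    }
    q->pending++;
    return MODEL_SUCCESS;
}

/* The "SoftDevice" found a time slot: one queued operation completes. */
void model_execute_one(op_queue_model_t *q)
{
    assert(q != NULL);
    if (q->pending > 0) {
        q->pending--;
    }
}

/* Issue `writes` requests back to back; one queued operation is executed
 * after every `slot_period` requests (0 means never).
 * Returns how many requests were rejected with MODEL_NO_MEM. */
uint32_t model_burst(uint32_t capacity, uint32_t writes, uint32_t slot_period)
{
    op_queue_model_t q = { capacity, 0 };
    uint32_t rejected = 0;

    for (uint32_t i = 1; i <= writes; i++) {
        if (model_enqueue(&q) != MODEL_SUCCESS) {
            rejected++;
        }
        if (slot_period != 0 && i % slot_period == 0) {
            model_execute_one(&q);
        }
    }
    return rejected;
}
```

In this model, a burst of 40 requests with one execution per 4 requests sees rejections with a capacity of 16 and none with a capacity of 32. A much longer burst at the same rates (for example 400 requests) still overflows the larger queue, which matches the point above: a bigger queue absorbs bursts, but it does not help if operations are queued faster than they can be executed over a sustained period.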

  • Hi tesc, thanks for your help.

    We have not modified the stock bootloader example other than to add PA/LNA support, which is driven by PPI and therefore, I think, should not affect the behaviour of the SoftDevice. We have not changed the BLE connection parameters from the defaults in the example.

    We can try increasing NRF_FSTORAGE_SD_QUEUE_SIZE. In the stock bootloader example this is set to 16 - how much would you recommend we increase it?

    Unfortunately, due to the intermittent nature of the issue, a trial-and-error approach to testing is not really practical.

  • Ok, we tried doubling NRF_FSTORAGE_SD_QUEUE_SIZE to 32 and that seems to have resolved the issue. I suspect the underlying cause of the problem is a change implemented in SDK 15 to back up the bootloader settings page to flash, which seems to happen more times than necessary:

    <info> nrf_bootloader_wdt: WDT enabled CRV:3932160 ticks
    <info> nrf_bootloader_wdt: Starting a timer (3928960 ticks) for feeding watchdog.
    <info> app: Entering DFU mode.
    <info> nrf_dfu_validation: Signature required. Checking signature.
    <info> nrf_dfu_validation: Calculating init packet hash (init packet len: 58)
    <info> nrf_dfu_validation: Verify signature
    <info> nrf_dfu_validation: Image verified
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> app: Inside main
    <info> nrf_bootloader_wdt: WDT enabled CRV:3932160 ticks
    <info> nrf_bootloader_wdt: Starting a timer (3928960 ticks) for feeding watchdog.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> nrf_dfu_settings: Backing up settings page to address 0xFE000.
    <info> app: Inside main
    <info> app: No firmware to activate.
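
For anyone applying the same change, it is a one-line edit in the bootloader project's sdk_config.h. The fragment below is a sketch in the guarded form that file normally uses. The value 32 is simply the one tried here, not a tuned recommendation, and a larger queue presumably costs some extra bootloader RAM, so it is worth checking that the bootloader still fits.

```c
/* sdk_config.h (bootloader project): the stock example uses 16. */
#ifndef NRF_FSTORAGE_SD_QUEUE_SIZE
#define NRF_FSTORAGE_SD_QUEUE_SIZE 32
#endif
```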
    

    On another note, during testing of this fix we identified another issue with the OTA DFU process: a clash between peer manager garbage collection and the setting of the advertising name for DFU.

    This issue occurs because the BLE connection is torn down and brought up again by the iOS/Android DFU libraries at the start of the DFU process. Occasionally when this happens the peer manager generates a PM_EVT_STORAGE_FULL event and we run garbage collection to free up room in flash for the peer data to be updated. While GC is ongoing, a request comes in from the app libraries to change the advertising name (to e.g. Dfu58366), but the SoftDevice rejects this request, presumably because of the ongoing garbage collection.

    This issue is nowhere near as bad as the previous one, as the DFU process fails cleanly and can be retried immediately. It could probably be resolved completely by adding a 1-second delay between connecting and sending the request to update the advertising name on the app side of the process, or by not dropping the connection when starting DFU.
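
A retry-after-delay variant of the delay idea can be sketched as follows. The real change would live in the iOS/Android DFU libraries; it is modelled here in plain C only to show the control flow. Everything in it is hypothetical: the req_status_t values (the actual rejection code was not captured), retry_while_busy(), and the fake_device_t stand-in for a device that is busy with garbage collection for its first few requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of asking the device to change its advertising name. */
typedef enum { REQ_OK, REQ_BUSY, REQ_FAILED } req_status_t;

typedef req_status_t (*request_fn_t)(void *ctx);
typedef void (*delay_fn_t)(uint32_t ms);

/* Send the request; while the device reports busy, wait `delay_ms` and
 * try again, up to `max_attempts` attempts in total. Any other status
 * (success or a hard failure) is returned immediately. */
req_status_t retry_while_busy(request_fn_t request, void *ctx,
                              delay_fn_t delay, uint32_t delay_ms,
                              uint32_t max_attempts)
{
    assert(request != NULL && max_attempts > 0);

    req_status_t status = REQ_BUSY;
    for (uint32_t attempt = 0; attempt < max_attempts; attempt++) {
        status = request(ctx);
        if (status != REQ_BUSY) {
            break;
        }
        /* No point in waiting after the final attempt. */
        if (delay != NULL && attempt + 1 < max_attempts) {
            delay(delay_ms);
        }
    }
    return status;
}

/* Stand-in device: rejects the first `busy_calls_left` requests. */
typedef struct {
    uint32_t busy_calls_left;
    uint32_t calls;
} fake_device_t;

req_status_t fake_set_adv_name(void *ctx)
{
    fake_device_t *dev = ctx;
    dev->calls++;
    if (dev->busy_calls_left > 0) {
        dev->busy_calls_left--;
        return REQ_BUSY;
    }
    return REQ_OK;
}
```

Compared with a fixed 1-second delay, retrying only when the request is rejected avoids slowing down the common case where no garbage collection is running.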
