MCUboot bug: swapping back and forth forever if the remote host doesn't confirm

NCS 2.7 with MCUboot via sysbuild

Custom board with nRF52840, two custom applications (manufacturing and end-user)

In my application, DFU will be performed over BLE. I have enabled the SMP service with GRP_OS and GRP_IMG and Image Manager. I was testing DFU with the nRF Connect app on Android.

The update process seems to depend on the DFU-managing application (nRF Connect) on the remote Bluetooth device (Android phone) to complete the update sequence with an image confirm. During my test, an extended driver startup time (downloading to an external component) caused the update on the remote Bluetooth host to time out. Since the process was not completed with the image confirm, MCUboot swapped images on every reset.

Luckily, in my test the two images were significantly different (manufacturing vs. end-user), and it was quite noticeable that the image was getting swapped back and forth. However, this seems like a very tricky bug to discover in the field.

From my perspective, relying on an external system to complete the DFU procedure is a major flaw. End users are very unpredictable; they cannot be expected to keep their phone within range while the images are being swapped. Swapping back and forth on every reset forever does not seem like intended behavior for a typical DFU use case.

Is there a sample that demonstrates how to build a bulletproof OTA DFU, one that handles the case where the remote Bluetooth host is not available to confirm the final firmware?

  • Hello,

    There is a function you can call from the application so that the image confirms itself. The intention is that you can run some self-checks first, e.g. check that you can still communicate with some physical sensor, before confirming the image. If you do this at the start of your application, MCUboot will not swap back to the older image after rebooting.

    Please see the declaration of boot_write_img_confirmed() in ncs\zephyr\include\zephyr\dfu\mcuboot.h:

    /**
     * @brief Marks the currently running image as confirmed.
     *
     * This routine attempts to mark the currently running firmware image
     * as OK, which will install it permanently, preventing MCUboot from
     * reverting it for an older image at the next reset.
     *
     * This routine is safe to call if the current image has already been
     * confirmed. It will return a successful result in this case.
     *
     * @return 0 on success, negative errno code on fail.
     */
    int boot_write_img_confirmed(void);

    Call that from your application, and it will not swap back to the previous image.
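
    As a rough sketch of what that could look like (not taken from an official sample): the helper name confirm_image_if_healthy() and its self_test_passed argument are hypothetical, and the self-check itself is up to your product. boot_is_img_confirmed() and boot_write_img_confirmed() come from zephyr/dfu/mcuboot.h; the #else branch only provides host-side stand-ins so the logic can be compiled off-target.

    ```c
    #include <errno.h>
    #include <stdbool.h>

    #if defined(__ZEPHYR__)
    /* On target: real MCUboot helpers from Zephyr. */
    #include <zephyr/dfu/mcuboot.h>
    #else
    /* Off target: minimal stand-ins (assumption, for host builds only). */
    static bool stub_confirmed;
    static bool boot_is_img_confirmed(void) { return stub_confirmed; }
    static int boot_write_img_confirmed(void) { stub_confirmed = true; return 0; }
    #endif

    /*
     * Hypothetical helper: call early in main(), after your own self-checks
     * (e.g. the external component responded) have produced a pass/fail result.
     *
     * Returns 0 if the running image is confirmed (already or just now),
     * or a negative errno value if it was left unconfirmed.
     */
    int confirm_image_if_healthy(bool self_test_passed)
    {
        if (boot_is_img_confirmed()) {
            /* Nothing to do: image was confirmed on an earlier boot. */
            return 0;
        }

        if (!self_test_passed) {
            /* Leave the image unconfirmed so MCUboot can revert at the next reset. */
            return -EIO;
        }

        /* Self-checks passed: make the running image permanent. */
        return boot_write_img_confirmed();
    }
    ```

    With something like this, the confirm no longer depends on the phone still being connected after the swap; a failed self-check leaves the image unconfirmed so the revert still acts as a safety net.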

    Best regards,

    Edvin
