OTA DFU Stuck in pending

I have adapted and incorporated the NCS OTA DFU sample into my project. Initially, FOTA updates worked. Now, after pushing an update, the image is stuck pending in slot=1 (see mcumgr output below). It does not swap to slot=0 after a reset. Additionally, I can't erase or overwrite this image in slot=1. I get various errors such as "Mcu Mgr Error: BAD_STATE (6)" or "Mcu Mgr Error: NO_MEMORY (2)".

This device is already deployed so programming pins are inaccessible; remote management is the only way I can possibly recover this.

Lastly, and I'm not sure if this matters, I wanted to use LittleFS, but it was conflicting with NVS, so I made this change to the nrf repo:
devzone.nordicsemi.com/.../308671

Is there any way I can complete this update, or cancel it and start a new one?

sudo mcumgr --conntype ble <connection string> image list

Images:
 image=0 slot=0
    version: 0.0.0
    bootable: true
    flags: active confirmed
    hash: 8cc09b4293faaac3c0d613aaadbeab0aefc4fd87aa85ecc3ad29ac4edbfc3e41
 image=0 slot=1
    version: 0.0.0
    bootable: true
    flags: pending
    hash: 20e3bbee692b24aeb6bdb9cb5cf3fbb8ccc9929ce5cf58cebfc5f2ddf3c771cc
Split status: N/A (0)
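
For anyone who wants to detect this state from a script, here is a minimal sketch that parses `image list` text like the output above and reports slots still flagged as pending. The function names `parse_image_list` and `pending_slots` are hypothetical, and the parser assumes the exact text layout shown in this post; other mcumgr versions may print something different.

```python
import re

# Sample text copied from the `image list` output in this post.
SAMPLE_OUTPUT = """\
Images:
 image=0 slot=0
    version: 0.0.0
    bootable: true
    flags: active confirmed
    hash: 8cc09b4293faaac3c0d613aaadbeab0aefc4fd87aa85ecc3ad29ac4edbfc3e41
 image=0 slot=1
    version: 0.0.0
    bootable: true
    flags: pending
    hash: 20e3bbee692b24aeb6bdb9cb5cf3fbb8ccc9929ce5cf58cebfc5f2ddf3c771cc
Split status: N/A (0)
"""

# Only these per-slot fields are collected; anything else is ignored.
KNOWN_KEYS = {"version", "bootable", "flags", "hash"}


def parse_image_list(text):
    """Parse `image list` text into a list of per-slot dictionaries."""
    slots = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"image=(\d+) slot=(\d+)$", line)
        if header:
            current = {
                "image": int(header.group(1)),
                "slot": int(header.group(2)),
                "flags": [],
            }
            slots.append(current)
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if current is None or not sep or key not in KNOWN_KEYS:
            continue
        # Flags are a space-separated list; other fields are plain strings.
        current[key] = value.split() if key == "flags" else value.strip()
    return slots


def pending_slots(slots):
    """Return the slots whose flags still include 'pending'."""
    return [s for s in slots if "pending" in s["flags"]]
```

Calling `pending_slots(parse_image_list(SAMPLE_OUTPUT))` before and after a reset would show whether the pending flag on slot=1 ever clears.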

BT40F (nRF5340)
git describe in sdk-nrf: v1.8.0-97-gb57588840
git describe in sdk-zephyr: v2.7.0-ncs1-25-g2dca349769

  • Hi Qualry, 

    Could you try running mcumgr image test with the hash of the second image (instead of list)?

    Doing that marks the image in the second slot for a test boot, so MCUboot should swap it in and run it on the next reset. The image then has to be confirmed to become permanent.

    If you get an error when doing that, could you send the log?

    Have you tried to reproduce the issue on a local device that you can attach a debugger to?

  • > Could you try to do a mcumgr image test with the hash of the 2nd image (instead of list) ?

    I am unable to do that. When I initially attempted an OTA update I uploaded the image, tested the image in the slot, then reset the device. Since then the image has remained in slot=1 with the pending status. In the nRF Device Manager app the option to test the image is grayed out. Also, `mcumgr <connection string> image test hash` doesn't give an error but also doesn't seem to do anything.

    > Have you tried to reproduce the issue on a local device that you can attach a debugger on ?

    No. I am trying, but have not been able to.

  • Hi Qualry, 

    Without debug information, it is hard for us to know what the reason could be.

    I assume that if you do a FOTA update with the same image on the same application on your test device, you see no error?

    And if you try another FOTA update on the deployed device, you receive the BAD_STATE and NO_MEMORY errors? (Please send us the log when you receive these errors.)

    Have you tried using the phone to do the FOTA update?

    Do you store any other data in the flash?

    Could you send the partitions.yml file of both the old and new image builds?

  • I ran into something similar recently when bootloading using external flash. I had mistakenly made mcuboot_primary a different size from mcuboot_secondary (set using pm_static.yml). I found that mcuboot would not try to test the new/pending image. Instead, it would display an error, leave the pending flag on the secondary image as-is, and jump back to the application.

    It seems like this is a bug in mcuboot because a device can no longer accept new firmware updates.  The pending flag is set on the secondary image so this image cannot be deleted.  Resetting doesn't clear the pending flag because of the size mismatch between the primary and secondary images.  Performing an image test again doesn't do anything because the pending flag is already set.

    It would be better if mcuboot always cleared the pending flag (even if there is an error).  Alternatively, it would be good to be able to force a delete of the secondary image, even if the pending flag is set.  This would allow recovering from this "bricked" scenario.

    It is probably possible to recover from this using the serial recovery feature, but this isn't always feasible on a sealed device without exposed reset and UART pins.
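
The size mismatch described in the reply above could be caught before an image ever ships. The following is a minimal sketch, assuming partitions.yml has already been loaded into a dictionary keyed by partition name, where each entry carries an integer `size` field. The function name `check_slot_sizes` is hypothetical and is not part of any Nordic or MCUboot tool.

```python
def check_slot_sizes(partitions):
    """Return a list of problems found in a partitions.yml-style mapping.

    `partitions` is assumed to map partition names to dictionaries
    that contain an integer "size" entry.
    """
    problems = []
    primary = partitions.get("mcuboot_primary")
    secondary = partitions.get("mcuboot_secondary")
    if primary is None or secondary is None:
        problems.append("missing mcuboot_primary or mcuboot_secondary")
        return problems
    if primary["size"] != secondary["size"]:
        # This is the condition the reply above describes: the two
        # slots differ in size, so the pending image is never tested.
        problems.append(
            "slot size mismatch: primary=0x%x secondary=0x%x"
            % (primary["size"], secondary["size"])
        )
    return problems
```

A build script could load the generated file with a YAML parser (for example `yaml.safe_load` from PyYAML) and fail the build whenever the returned list is not empty.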

  • > Could you send the partitions.yml file of both the old and new image build. 

    This is likely the issue: older partitions.yml  newer partitions.yml

    I have recreated this situation on an nRF5340 DK. Here's the log: log output

    In my search I've seen other people in the same predicament. It would be nice if this were implemented:

    https://github.com/apache/mynewt-mcumgr/issues/157
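
Since the older and newer partitions.yml files are the suspected cause here, one way to spot what moved between two builds is to diff the two layouts field by field. This is only a sketch: `diff_layouts` is a hypothetical helper, and it assumes both files have been loaded into dictionaries keyed by partition name with `address` and `size` entries.

```python
def diff_layouts(old, new, fields=("address", "size")):
    """Report partitions that were added, removed, or changed.

    Returns a dictionary mapping each affected partition name to
    "added", "removed", or a dict of {field: (old_value, new_value)}.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name)
        after = new.get(name)
        if before is None:
            changes[name] = "added"
        elif after is None:
            changes[name] = "removed"
        else:
            changed = {
                f: (before.get(f), after.get(f))
                for f in fields
                if before.get(f) != after.get(f)
            }
            if changed:
                changes[name] = changed
    return changes
```

Any entry reported for mcuboot_primary or mcuboot_secondary would be worth a close look before pushing an update to devices that were built with the older layout.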
