DFU OTA mcumgr slot 1 stuck (bad state 6)

Hi, I have an issue where the slot 1 image seems stuck. It cannot be erased, confirmed, or swapped in its current state. When I click erase, the error "bad state 6" is shown. According to the source, this error code is defined as "The device is not currently in a state to handle the request."
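
For reference, the numeric code in that error can be mapped back to its SMP name. This is a minimal sketch, assuming the legacy mcumgr return codes as recalled from Zephyr's `mgmt.h`. The table is partial, and the helper name `describe_mgmt_err` is made up for illustration.

```python
# Partial table of legacy mcumgr/SMP return codes (assumption: recalled from
# Zephyr's mgmt.h; verify against the header in your SDK version).
MGMT_ERR = {
    0: "MGMT_ERR_EOK",
    1: "MGMT_ERR_EUNKNOWN",
    2: "MGMT_ERR_ENOMEM",
    3: "MGMT_ERR_EINVAL",
    4: "MGMT_ERR_ETIMEOUT",
    5: "MGMT_ERR_ENOENT",
    6: "MGMT_ERR_EBADSTATE",
    7: "MGMT_ERR_EMSGSIZE",
    8: "MGMT_ERR_ENOTSUP",
    9: "MGMT_ERR_ECORRUPT",
}


def describe_mgmt_err(rc: int) -> str:
    """Hypothetical helper: turn an SMP 'rc' value into a readable name."""
    return MGMT_ERR.get(rc, f"unknown rc {rc}")


print(describe_mgmt_err(6))
```

Under these assumptions, code 6 corresponds to `MGMT_ERR_EBADSTATE`, which matches the "bad state" wording shown by the app.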

This firmware image has been thoroughly tested in production and has been working for 2 years. The version is stable and supports the same SMP server version and implementation as the other versions in slot 1 and slot 0.

Can the slot 1 still be recovered in some way?

nRF Connect SDK v2.4.2
nRF Connect Device Manager

  • Here is additional context.

    On another nRF5340 device, the same flags are set for another image. When I send the confirm command, the permanent flag changes to true, so I believe this or the reset command was the last command sent to the device before it got stuck.

    zephyr 28a3fca7da5

    The first one points to img_mgmt_erase, which reads the info from the specified slot, so I believe this fails when it checks whether the image in the secondary slot is flagged as confirmed or pending.

    Reset is not swapping the images either, so does this mean MCUboot is not swapping the images because of version downgrade protection or something similar? I have downgraded the device many times before, so I don't know why it is an issue this time.

    Do confirm and test set some additional flags to check this version, which would prevent it from downgrading? I know that the nRF5340 does not support the mcumgr test command because of its multi-core nature. Does this mean I can soft-brick my device using the confirm and test commands?
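
The erase refusal described above can be illustrated with a small host-side sketch. It assumes the image state flag bits from Zephyr's `img_mgmt` (pending 0x01, confirmed 0x02, active 0x04, permanent 0x08) and paraphrases the older `img_mgmt_slot_in_use()` check, in which a slot counts as in use if it is active, confirmed, or pending. The Python names are made up for illustration, and the exact logic in a given SDK version may differ.

```python
# Image state flag bits (assumption: recalled from Zephyr's img_mgmt.h;
# verify against the header shipped with your SDK version).
F_PENDING = 0x01
F_CONFIRMED = 0x02
F_ACTIVE = 0x04
F_PERMANENT = 0x08

MGMT_ERR_EOK = 0
MGMT_ERR_EBADSTATE = 6


def slot_in_use(flags: int) -> bool:
    """Paraphrase of the older in-use check: active, confirmed, or pending."""
    return bool(flags & (F_ACTIVE | F_CONFIRMED | F_PENDING))


def erase_result(flags: int) -> int:
    """Hypothetical model of erase: refused while the slot counts as in use."""
    return MGMT_ERR_EBADSTATE if slot_in_use(flags) else MGMT_ERR_EOK


# A slot 1 image left pending and permanent is refused; a clean slot is not.
print(erase_result(F_PENDING | F_PERMANENT))
print(erase_result(0))
```

If this model holds for the SDK in use, an image in slot 1 that stays flagged as pending would keep returning code 6 on erase until the flag is cleared by a successful swap or by erasing the partition externally.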

  • Hi, 

    This firmware image has been thoroughly tested in production and has been working for 2 years. The version is stable and supports the same SMP server version and implementation as the other versions in slot 1 and slot 0.

    Is downgrade protection enabled in the firmware? Is the new firmware built with the same partition layout?

    Can the slot 1 still be recovered in some way?

    You can upload a new image built with the same partition layout as the image in slot 0. If downgrade protection is enabled, also make sure to increase the version.

    86s9f5w8 said:
    Do confirm and test set some additional flags to check this version, which would prevent it from downgrading?

    No, MCUboot would check it before swapping.

    Could you upload the MCUboot log, if possible?

    Regards,
    Amanda H.

  • Is downgrade protection enabled in the firmware?

    CONFIG_MCUBOOT_DOWNGRADE_PREVENTION is not set. CONFIG_BOOT_UPGRADE_ONLY is not set. We have functionality to downgrade our devices to older firmware, which is why these options are not set. This has also been working for 2 years now, across thousands of firmware updates.

    Is the new firmware built with the same partition layout?

    Yes, the same yml is used with the partitions specified.

    You can upload a new image built with the same partition layout as the image in slot 0. If downgrade protection is enabled, also make sure to increase the version.

    I have tried this solution, but with no luck.

    Could you upload the MCUboot log, if possible?

    Not at the moment, because this is a production device with no access to the JTAG interface.

  • Hi, 

    It sounds like a known issue that is addressed in v2.6.0. To recover from this, you must physically attach a debugger to the device and erase the secondary partition.

    86s9f5w8 said:
    Not at the moment, because this is a production device with no access to the JTAG interface.

    The device cannot be recovered from this state without JTAG access.

    Have you already distributed the FW update to your customers, or did you catch this issue during release testing? 

    We need to figure out what caused the update to become stuck in the pending state in the first place, to prevent it from happening again on other devices.

    -Amanda H.

  • It sounds like a known issue that is addressed in v2.6.0. 

    Can you link or describe this issue? On our side, we need to document this to know the full extent. We are currently on 2.4.2.

    Have you already distributed the FW update to your customers, or did you catch this issue during release testing? 

    No, this FW package is not in distribution. This was noticed during DFU testing. 
