MCUBoot + MCUmgr: Image in the secondary slot is not valid!

Hi!

I am working on a custom board and would like to use an external flash for BLE OTA updates. The firmware is based on NCS v1.7.1 with MCUBoot as the bootloader. I would like to use the MX25R64 external flash, connected via SPI (not QSPI), for the secondary bootloader slot.

I started with the SMP Server sample and got it working on an nRF52840DK using QSPI (see NCS 1.7 ota use external flash). I then followed the discussion ncs-external-flash-ota-change-qspi-to-spi to use SPI instead of QSPI. When I use the Android 'Device Manager' app in combination with the provided sample application 1207.hello_world_spi_nor_ext_flash.zip, everything works fine. However, the update process fails if I use the MCUmgr CLI for the image upload. In this case I get the error message 'Image in the secondary slot is not valid!'. The MCUmgr CLI runs on a Raspberry Pi with an nRF52840 Dongle as the BLE HCI interface.

I also tried NCS v1.8.0 and got the same error message.


Debugging MCUBoot with Ozone shows that the image signature/hash is incorrect when MCUmgr CLI is used for the image upload. Further investigation shows that the integrity check performed in the bootutil_img_validate function fails. In this case the variable hash is invalid, but the content of the variable buf is correct. When I use the Device Manager app to upload the image, the signature is valid and the update is successful.

    /*
     * Traverse through all of the TLVs, performing any checks we know
     * and are able to do.
     */
    while (true) {
        rc = bootutil_tlv_iter_next(&it, &off, &len, &type);
        if (rc < 0) {
            goto out;
        } else if (rc > 0) {
            break;
        }

        if (type == IMAGE_TLV_SHA256) {
            /*
             * Verify the SHA256 image hash.  This must always be
             * present.
             */
            if (len != sizeof(hash)) {
                rc = -1;
                goto out;
            }
            rc = LOAD_IMAGE_DATA(hdr, fap, off, buf, sizeof(hash));
            if (rc) {
                goto out;
            }

            FIH_CALL(boot_fih_memequal, fih_rc, hash, buf, sizeof(hash));
            if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
                goto out;
            }

            sha256_valid = 1;
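
For readers who want to reproduce this comparison on a dump of the secondary slot, here is a minimal host-side sketch of the same step: walk the TLV area, find the SHA256 entry, and compare it with a digest computed separately over the image. The function name tlv_sha256_check is hypothetical and not part of MCUBoot. The constants (TLV info magic 0x6907, SHA256 TLV type 0x10, 4-byte little-endian headers) are assumptions based on MCUBoot's image.h and should be checked against the MCUBoot version in your tree. Computing the digest itself is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values from MCUBoot's image.h; verify against your version. */
#define TLV_INFO_MAGIC  0x6907u /* marker of the unprotected TLV area */
#define TLV_TYPE_SHA256 0x10u   /* IMAGE_TLV_SHA256 */
#define SHA256_LEN      32u

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/*
 * Hypothetical helper (not an MCUBoot API): scan a TLV area held in RAM
 * and compare the stored SHA256 entry with a separately computed digest.
 * Returns 0 on match, 1 on mismatch, -1 if the area is malformed or has
 * no SHA256 entry.
 */
int tlv_sha256_check(const uint8_t *tlv, size_t tlv_size,
                     const uint8_t computed[SHA256_LEN])
{
    if (tlv_size < 4 || rd16(tlv) != TLV_INFO_MAGIC) {
        return -1;
    }

    /* Total size of the TLV area, assumed to include this 4-byte header. */
    size_t total = rd16(tlv + 2);
    if (total < 4 || total > tlv_size) {
        return -1;
    }

    size_t off = 4;
    while (off + 4 <= total) {
        uint16_t type = rd16(tlv + off);
        uint16_t len = rd16(tlv + off + 2);
        off += 4;

        if (len > total - off) {
            return -1; /* entry runs past the end of the area */
        }
        if (type == TLV_TYPE_SHA256) {
            if (len != SHA256_LEN) {
                return -1;
            }
            return memcmp(tlv + off, computed, SHA256_LEN) == 0 ? 0 : 1;
        }
        off += len;
    }
    return -1;
}
```

A result of 1 corresponds to the failure described above: the stored TLV value is intact, but the digest computed over the slot contents differs from it.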



Can you give me any advice to fix the error?


Best regards,

Thomas

  • Hi,

    Tom_H said:
    I further investigated the error and I think it is somehow related to the image transfer speed. Instead of a Raspberry Pi I used a PC with Ubuntu and did several tests. Before each test I erased the external flash.

    I am glad to hear that you got it working!

    Tom_H said:
    The update process works when using the internal HCI device. Upload speed is around 3.0 KiB/s and it took 1m7s.

    When I use the Nordic Dongle with BLE HCI firmware the update process fails. In this case upload speed is about 11.8 KiB/s and takes about 16 seconds.

    Do you have any advice?

    I see the same speed at my end. Unfortunately, this is a current restriction with updating. I have opened up an internal improvement report to look at the speed of the overall DFU process. 

     

    Kind regards,

    Håkon

  • Thanks for your response.

    Is there a way to reduce the image transfer speed when using a Nordic device with BLE HCI firmware, so that BLE OTA works in our intended production environment? Our gateway is based on a custom board with a Raspberry Pi Compute Module and a pre-certified nRF52840 module.

    Best regards,
    Thomas

  • Hi Thomas,

     

    As mentioned, this is unfortunately the state of the image transfer at this time. I have filed this as an internal improvement report. We are continuously trying to improve our deliveries, and transfer speed is one of the areas we are working on. However, as of writing this answer (latest stable version NCS v1.8.0), the speed is approx. 3 kB/s.

     

    Kind regards,

    Håkon

  • Many thanks for the support.

    Does this mean that, in the current state, we cannot use a dongle as the HCI device for BLE OTA if the secondary bootloader slot of the peripheral device is located on external flash connected via SPI?

    I have tested the following combinations with MCUmgr CLI and a Nordic dongle as the HCI device on a Linux system to update a BLE peripheral device:
    - Works: peripheral with secondary slot on internal flash
    - Works: peripheral with secondary slot on external flash (QSPI)
    - Fails: peripheral with secondary slot on external flash (SPI) -> our use case (example above)

    Our custom sensor node (peripheral device) is already in production, and we would need a redesign if the update process does not work. At the moment we are using the old nRF5 SDK with the included bootloader and are working on porting the application to NCS and Zephyr.

    Best regards,
    Thomas

  • Hi Thomas,

     

    I am sorry, I must have misunderstood the scenario here. Given the speed you reported, I assumed that you were referring to the nRF5 SDK implementation, but yes, it seems you can get this speed with mcumgr when overwriting the same image.

    Tom_H said:
    - Fails: peripheral with secondary slot on external flash (SPI) -> our use case (example above)

    I set up hci_uart with this configuration and tried it locally. I can see that the issue is present on my side as well:

    *** Booting Zephyr OS build v2.6.99-ncs1  ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: test
    E: Image in the secondary slot is not valid!
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1  ***
    

     

    The strange part is that if I manually program "app_update_test.hex" into the primary slot, the hash of the image is equal to the one I get when doing the same procedure over the air, which indicates that there is a problem with moving the image from SPI flash to internal flash. I am not sure where the issue is in your test application.

     

    Could you try this sample and see if the same issue occurs on your end?

    smp_svr_nrf52840dk_spi_nor.zip

    Note: this one advertises as "Zephyr".

     

    Kind regards,

    Håkon
