MCUBoot + MCUmgr: Image in the secondary slot is not valid!

Hi!

I am working on a custom board and would like to use an external flash for BLE OTA updates. The FW is based on NCS v1.7.1 and MCUBoot as the bootloader. I would like to use the MX25R64 external flash connected via SPI (not QSPI) for the secondary bootloader slot. 

I started with the SMP Server sample and got it working on an nRF52840DK using QSPI (see NCS 1.7 ota use external flash). I've followed the discusion ncs-external-flash-ota-change-qspi-to-spi to use SPI instead of QSPI. When using the Android 'Device Manager' App in combination with the provided sample application 1207.hello_world_spi_nor_ext_flash.zip everthing works fine. But the update process fails, if I use the MCUmgr CLI interface for the image upload. In this case I get the error message 'Image in the secondary slot is not valid!'. The MCUmgr CLI runs on a Raspberry PI with an nRF52840 Dongle as BLE HCI Interface.

I also tried tried NCS v1.8.0 and got the same error message.


Debugging MCUBoot with Ozone shows that the image signature/hash is incorrect when using MCUmgr CLI for the image upload. Further investigation of the error shows that the integrity check perforrmed in the bootutil_img_validate function fails. In this case the variable hash is invalid, but the content of the variable buf is correct. When I use the Device Manager App to  upload the image, the signature is valid and the update is successful.

  /*
     * Traverse through all of the TLVs, performing any checks we know
     * and are able to do.
     */
    while (true) {
        rc = bootutil_tlv_iter_next(&it, &off, &len, &type);
        if (rc < 0) {
            goto out;
        } else if (rc > 0) {
            break;
        }

        if (type == IMAGE_TLV_SHA256) {
            /*
             * Verify the SHA256 image hash.  This must always be
             * present.
             */
            if (len != sizeof(hash)) {
                rc = -1;
                goto out;
            }
            rc = LOAD_IMAGE_DATA(hdr, fap, off, buf, sizeof(hash));
            if (rc) {
                goto out;
            }

            FIH_CALL(boot_fih_memequal, fih_rc, hash, buf, sizeof(hash));
            if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
                goto out;
            }

            sha256_valid = 1;



Can you give me any advice to fix the error?


Best regards,

Thomas

  • Hi,

     

    Thank you for providing your test firmware.

    I tried to replicate the issue you are seeing using these commands on the PC side:

     

    sudo /path/to/mcumgr --conntype ble --connstring ctlr_name=hci0,peer_name='Nordic_LBS' image upload zephyr/app_update.bin

    Here's the "image list" after successful upload:

    sudo /path/to/mcumgr --conntype ble --connstring ctlr_name=hci0,peer_name='Nordic_LBS' image list
    Images:
     image=0 slot=0
        version: 0.0.0
        bootable: true
        flags: active confirmed
        hash: 970ee41e6186bebf631aab7aebbc7c7ae0d6b027ac8149f782104c06884673b7
     image=0 slot=1
        version: 0.0.0
        bootable: true
        flags: 
        hash: 24bc32212e7cc352aafc82a5143dd3f79bd9819869c548dacaf41d2e2eddfe11
    Split status: N/A (0)
    

    Now test the second image and see that it is set to "pending" successfully:

    sudo /path/to/mcumgr --conntype ble --connstring ctlr_name=hci0,peer_name='Nordic_LBS' image test 24bc32212e7cc352aafc82a5143dd3f79bd9819869c548dacaf41d2e2eddfe11
    Images:
     image=0 slot=0
        version: 0.0.0
        bootable: true
        flags: active confirmed
        hash: 970ee41e6186bebf631aab7aebbc7c7ae0d6b027ac8149f782104c06884673b7
     image=0 slot=1
        version: 0.0.0
        bootable: true
        flags: pending
        hash: 24bc32212e7cc352aafc82a5143dd3f79bd9819869c548dacaf41d2e2eddfe11
    Split status: N/A (0)
    

    And here's the output after reset where you can see that the timestamp has changed (image tested OK):

    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: none
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 09:25:17
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fd0
    I: data wra: 0, 1c
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started
    Connected
    I: Swap type: none
    I: Swap type: test
    I: Swap type: test
    Disconnected (reason 19)
    Connected
    Disconnected (reason 19)
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: test
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 09:33:22
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fd0
    I: data wra: 0, 1c
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started
    

     

    Debugging MCUBoot with Ozone shows that the image signature/hash is incorrect when using MCUmgr CLI for the image upload. Further investigation of the error shows that the integrity check perforrmed in the bootutil_img_validate function fails. In this case the variable hash is invalid, but the content of the variable buf is correct. When I use the Device Manager App to  upload the image, the signature is valid and the update is successful.

    If the hash is incorrect, then there's a problem with the integrity of the image. Since you are working with a nRF52840-DK, could you try to erase the flash and QSPI flash?

    nrfjprog -e
    nrfjprog --reset
    nrfjprog -qspieraseall

    Then flash your sample.

     

    Kind regards,

    Håkon

  • Many thanks for the fast response!
    I followed your steps but still get the same error message.

    As a first step, I erased the flash and QSPI flash

    nrfjprog -e
    nrfjprog --reset
    nrfjprog --qspieraseall


    Than I build and flashed the sample using NCS v1.7.1
    west build -p always -b nrf52840dk_nrf52840
    west flash


    I rebuilt the sample application, copied the update binary to a Raspberry Pi and uploaded the image using MCUmgr
    ubuntu@ubuntu:~$ sudo go/bin/mcumgr -c ble_lbs image upload app_update.bin
     200.50 KiB / 200.50 KiB [======================================================================] 100.00% 9.76 KiB/s 20s
    Done

    List the images

    ubuntu@ubuntu:~$ sudo go/bin/mcumgr -c ble_lbs image list
    Images:
     image=0 slot=0
        version: 0.0.0
        bootable: true
        flags: active confirmed
        hash: 30372ead86b3fe06019e4c90d02c2b680bd97195ddc91f20cc5ec68b7516a4a6
     image=0 slot=1
        version: 0.0.0
        bootable: true
        flags:
        hash: 56385894ea602db72850094e158c58afb4925607790ce3d79d64b1024a2a3d3d
    Split status: N/A (0)

    Triggered an image test and reset the device using nrfjprog

    ubuntu@ubuntu:~$ sudo go/bin/mcumgr -c ble_lbs image test 56385894ea602db72850094e158c58afb4925607790ce3d79d64b1024a2a3d
    3d
    Images:
     image=0 slot=0
        version: 0.0.0
        bootable: true
        flags: active confirmed
        hash: 30372ead86b3fe06019e4c90d02c2b680bd97195ddc91f20cc5ec68b7516a4a6
     image=0 slot=1
        version: 0.0.0
        bootable: true
        flags: pending
        hash: 56385894ea602db72850094e158c58afb4925607790ce3d79d64b1024a2a3d3d
    Split status: N/A (0)

    However, testing the image fails with the same error message
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: none
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 13:30:44
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fe8
    I: data wra: 0, 0
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started
    Connected
    I: Swap type: none
    Disconnected (reason 19)
    Connected
    I: Swap type: none
    I: Swap type: none
    E: Unable to allocate TX context
    E: Unable to allocate TX context
    E: Unable to allocate TX context
    E: Unable to allocate TX context
    Disconnected (reason 8)
    Connected
    I: Swap type: none
    I: Swap type: none
    Disconnected (reason 19)
    Connected
    I: Swap type: none
    I: Swap type: test
    I: Swap type: test
    Disconnected (reason 19)
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: test
    E: Image in the secondary slot is not valid!
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 13:30:44
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fd0
    I: data wra: 0, 1c
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started

    Did I do something wrong?

    Best regards,

    Thomas

  • Hi,

     

    Looks like there's a problem allocating TX packets at your end, this could be due to the bluetooth mtu that your bluetooth host uses. Could you try the following configs

    CONFIG_BT_L2CAP_TX_MTU=252
    CONFIG_BT_BUF_ACL_RX_SIZE=256
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=2304

     

    Kind regards,

    Håkon

  • Thanks for the info.

    The configuration for the larger MTU size has already been set.

    # Allow for large Bluetooth data packets.
    CONFIG_BT_L2CAP_TX_MTU=252
    CONFIG_BT_BUF_ACL_RX_SIZE=256
    
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096


    Setting the MTU configuration from the throughput example fixes the allocation problem, but the image verification error remains the same

    CONFIG_BT_BUF_ACL_RX_SIZE=251
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    CONFIG_BT_BUF_ACL_TX_COUNT=40
    CONFIG_BT_L2CAP_TX_BUF_COUNT=40
    CONFIG_BT_ATT_PREPARE_COUNT=2
    CONFIG_BT_L2CAP_TX_MTU=247
    CONFIG_BT_CTLR_RX_BUFFERS=2
    CONFIG_BT_CTLR_DATA_LENGTH_MAX=251


    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=good, swap_type=0x4, copy_done=0x1, image_ok=0x1
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: none
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 15:31:58
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fd0
    I: data wra: 0, 1c
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started
    Connected
    I: Swap type: none
    Disconnected (reason 19)
    Connected
    I: Swap type: none
    I: Swap type: none
    Disconnected (reason 8)
    Connected
    I: Swap type: none
    I: Swap type: none
    Disconnected (reason 19)
    Connected
    I: Swap type: none
    I: Swap type: test
    I: Swap type: test
    Disconnected (reason 19)
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    I: Starting bootloader
    I: Primary image: magic=good, swap_type=0x4, copy_done=0x1, image_ok=0x1
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: test
    E: Image in the secondary slot is not valid!
    I: Bootloader chainload address offset: 0xc000
    *** Booting Zephyr OS build v2.6.99-ncs1-1  ***
    Starting Bluetooth Peripheral LBS example
    build time: Jan 19 2022 15:31:58
    I: 2 Sectors of 4096 bytes
    I: alloc wra: 0, fd0
    I: data wra: 0, 1c
    I: SoftDevice Controller build revision: 
    I: 3f 47 70 8e 81 95 4e 86 |?Gp...N.
    I: 9d d3 a2 95 88 f6 30 0a |......0.
    I: 7f 53 49 fd             |.SI.    
    I: No ID address. App must call settings_load()
    Bluetooth initialized
    Advertising successfully started


    But I think the problem is related to buffer or SPI configuration, cause the image upload throughput is much higher when using MCUmgr and it works in combination with QSPI.

    Best regards,
    Thomas

  • I further investigated the error and I think it is somehow related to the image transfer speed. Instead of a Raspberry Pi I used a PC with Ubuntu and did several tests. Before each test I erased the external flash

    nrfjprog --qspieraseall


    The update process works when using the internal HCI device. Upload speed is around 3.0 KiB/s and it took 1m7s.

    When I use the Nordic Dongle with BLE HCI firmware the update process fails. In this case upload speed is about 11,8 KiB/s and takes about 16 seconds.

    Do you have any advice?

Related