DFU using USB file system (not serial emulation)?

My project has an nRF5340 plus external flash, and an external USB-C connector. The device exposes a USB file system (FAT), hosted on a partition of the external flash, which appears as an external disk when connected to a host. This all works well and is used to update config, image files etc. for the device to use.

I would like to be able to do application / network firmware updates by simply copying the new image (hex or bin) to this file system and rebooting. Has anyone done this? I'm guessing it involves using the MCUBoot bootloader and copying the image into the secondary image slot somehow? I also want to put the secondary image slot on my external flash, as I won't have space on the internal flash. Even better would be if MCUBoot could take the image directly from the file in the FS...

By the way, to be clear, I do NOT want to use the 'DFU over USB serial emulation' method, as this requires the user to install a specific tool on the host machine, which may not be available for their host or may not be allowed by IT policy (or may be beyond their level of expertise...).

thanks for any pointers!
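
Editor's sketch: one possible shape for the "copy the file into the secondary slot" step is a chunked stream from a reader to a writer. The version below is plain C with hypothetical callback types (`image_read_fn`, `image_write_fn`) so it can be tried on a host. On the device, the reader would presumably wrap Zephyr's `fs_read()` and the writer `flash_img_buffered_write()`, followed by a `boot_request_upgrade()` call, but that wiring is an assumption and is not confirmed anywhere in this thread. The file would most likely need to be the signed binary, since a .hex would have to be parsed first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 512u

/* Reader: returns bytes read, 0 at end of file, negative on error. */
typedef long (*image_read_fn)(void *ctx, uint8_t *buf, size_t len);
/* Writer: returns 0 on success, negative on error; flush is true on the last call. */
typedef int (*image_write_fn)(void *ctx, const uint8_t *buf, size_t len, bool flush);

/* Stream an image from reader to writer; returns bytes copied or a negative error. */
long copy_image(image_read_fn rd, void *rd_ctx, image_write_fn wr, void *wr_ctx)
{
    uint8_t buf[CHUNK_SIZE];
    long total = 0;
    int rc;

    for (;;) {
        long n = rd(rd_ctx, buf, sizeof(buf));

        if (n < 0) {
            return n;
        }
        if (n == 0) {
            break; /* end of file */
        }
        rc = wr(wr_ctx, buf, (size_t)n, false);
        if (rc < 0) {
            return rc;
        }
        total += n;
    }
    /* Zero-length final call so the writer can flush any buffered bytes. */
    rc = wr(wr_ctx, buf, 0, true);
    return (rc < 0) ? rc : total;
}

/* Host-side stand-ins (hypothetical) so the loop can be exercised off-target. */
struct mem_source { const uint8_t *data; size_t len; size_t pos; };
struct count_sink { size_t bytes; bool flushed; };

long mem_read(void *ctx, uint8_t *buf, size_t len)
{
    struct mem_source *s = ctx;
    size_t left = s->len - s->pos;
    size_t n = (left < len) ? left : len;

    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (long)n;
}

int count_write(void *ctx, const uint8_t *buf, size_t len, bool flush)
{
    struct count_sink *k = ctx;

    (void)buf;
    k->bytes += len;
    k->flushed = k->flushed || flush;
    return 0;
}
```

Keeping the reader and writer behind callbacks means the same loop could feed either internal or external flash, which matters here because the secondary slot is meant to live on the external part.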

  • Hi Brian, 
    You should set

    CONFIG_MCUBOOT_SERIAL=n

    in the mcuboot.conf overlay, so that the setting applies to the MCUBoot child image and not to the application. That file usually goes in the child_image folder.
    Anyway, MCUBOOT_SERIAL is not enabled by default, so you don't really have to worry about it (double-check the .config file in the MCUBoot build folder).
  • Ok... I removed all the wifi stuff from my app, and added a mcuboot.conf in the child_image folder with:

    CONFIG_MCUBOOT_SERIAL=n
    CONFIG_BOOT_USE_MIN_PARTITION_SIZE=y

    This didn't fix it, although it has changed things.

    Latest build error is:

    [373/415] Performing build step for 'mcuboot_subimage'

    ...

    `text' will not fit in region `FLASH'
    c:/ncs/toolchains/cf2149caf2/opt/zephyr-sdk/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld.bfd.exe: region `FLASH' overflowed by 37320 bytes
    collect2.exe: error: ld returned 1 exit status
    ninja: build stopped: subcommand failed.

    ...
    [396/415] Linking C executable zephyr\zephyr_pre0.elf
    c:/ncs/toolchains/cf2149caf2/opt/zephyr-sdk/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld.bfd.exe: warning: orphan section `.fonts' from `app/libapp.a(u8g2_fonts.c.obj)' being placed in section `.fonts'
    [397/415] Generating ../../zephyr/net_core_app_test_update.hex
    image.py: sign the payload
    [402/415] Linking C executable zephyr\zephyr.elf
    FAILED: modules/mcuboot/mcuboot_subimage-prefix/src/mcuboot_subimage-stamp/mcuboot_subimage-build mcuboot/zephyr/zephyr.hex mcuboot/zephyr/zephyr.elf C:/work/dev/if-device-nrf53/cc1-med/build/modules/mcuboot/mcuboot_subimage-prefix/src/mcuboot_subimage-stamp/mcuboot_subimage-build C:/work/dev/if-device-nrf53/cc1-med/build/mcuboot/zephyr/zephyr.hex C:/work/dev/if-device-nrf53/cc1-med/build/mcuboot/zephyr/zephyr.elf
    ...
    Memory region Used Size Region Size %age Used
    FLASH: 446762 B 1031680 B 43.30%
    RAM: 152572 B 440 KB 33.86%
    IDT_LIST: 0 GB 32 KB 0.00%
    ninja: build stopped: subcommand failed.

    I note that the application built correctly, and is around 446 kB (with the wifi enabled I think this was showing 901 kB...). But this didn't stop the mcuboot subimage failing, this time with its 'text' section overflowing the FLASH region...

    Maybe I should try to reconfigure to stop using the partition manager, if it's creating a partition for mcuboot which is too small?

    Or just build the mcuboot child image on its own to check if that is the issue?
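
Editor's sketch: the linker message gives enough to estimate how much the MCUBoot partition would have to grow. The helper below rounds the reported overflow up to whole flash pages. It assumes a 4096-byte page (the nRF5340 application core flash page size) and that the partition has to stay page-aligned. The current partition size is not shown in the log, so only the increase is computed. If memory serves, the size is controlled by `CONFIG_PM_PARTITION_SIZE_MCUBOOT` in the MCUBoot child image configuration, but that option name should be checked against the SDK version in use.

```c
#include <stdint.h>

/* Round the linker-reported overflow up to whole flash pages. */
uint32_t extra_partition_bytes(uint32_t overflow_bytes, uint32_t page_size)
{
    uint32_t pages = (overflow_bytes + page_size - 1u) / page_size;

    return pages * page_size;
}
```

For the 37320-byte overflow above, `extra_partition_bytes(37320, 4096)` gives 40960 bytes (0xA000), i.e. ten extra pages.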

  • Hi Brian, 

    As far as I know you only need to call boot_write_img_confirmed() in the application at its first boot after the swap then it should be permanent. 

    You can find the call in the \nrf\samples\bluetooth\mesh\dfu\target sample, where we call dfu_target_image_confirm() after Bluetooth is initialized.

    From what I can see in the code swap type should be set to 

    swap_type = BOOT_SWAP_TYPE_PERM
    in boot_write_img_confirmed-> boot_set_next()

    Could you confirm that you call dfu_target_image_confirm() in the application on the first boot after swapping?
    Please make sure you give the application enough time to write to flash before reset. 
  • As far as I know you only need to call boot_write_img_confirmed() in the application at its first boot after the swap then it should be permanent. 

    Yes, that was my understanding. My code calls 

    ret = boot_write_img_confirmed();
    exactly as per the sample. However, the swap_type remains as BOOT_SWAP_TYPE_TEST. As far as I can see (inspecting the bootutil code on the mcu-tools GitHub), boot_set_next() just updates the image_ok flag to SET...
    Where do you see the swap_type being updated?
  • Please make sure you give the application enough time to write to flash before reset. 

    Yes, I give it at least 10s and this appears to be ok, as mcuboot sees image_ok as 0x1 after the reboot.

  • Hi Brian, 
    I was looking at this line https://github.com/mcu-tools/mcuboot/blob/main/boot/bootutil/src/bootutil_public.c#L534

    But the magic of the slot was not BOOT_MAGIC_UNSET ? I assume it's BOOT_MAGIC_GOOD?

    As far as I know at the first boot, the new image should already be in the primary slot and the old image should be in the secondary (test swap). 


    There are 3 scenarios in which action (a new swap) needs to be taken, as listed here.
    In your case if magic_primary_slot=BOOT_MAGIC_GOOD and image_ok_primary_slot= BOOT_FLAG_SET then I don't see why it would do a revert. 

  • Hi,

    Yes, that's what confuses me. At boot, mcuboot tells me what it has found:

    *** Booting nRF Connect SDK 3758bcbfa5cd ***
    I: Starting bootloader
    I: Primary image: magic=good, swap_type=0x2, copy_done=0x1, image_ok=0x1
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Image index: 0, Swap type: test
    I: Starting swap using move algorithm.
    I: Bootloader chainload address offset: 0xe000

    In your case if magic_primary_slot=BOOT_MAGIC_GOOD and image_ok_primary_slot= BOOT_FLAG_SET then I don't see why it would do a revert. 

    And as you can see it has that, so it SHOULD NOT REVERT: and yet it does.

    Workaround (but not a good long-term solution): I destroy the secondary slot by running a fake dfu_target process on it with just 1 buffer of 0's (which essentially makes the image invalid), so mcuboot is forced to stay with my new primary.

    Not very satisfactory, as it means I have no rollback later... but at least it stops it flip-flopping at every boot!

  • Hi Brian, 
    I haven't got the time to look into this yet. 
    But having to erase 2nd slot doesn't sound like a best option. 

    I would suggest checking the code in MCUBoot that decides whether to revert, to see why it reverts. Also try a normal DFU, over BLE or UART for example, and see how the image magic and image_ok are written on the first boot and on the second boot.
    I will try to find time to look into this next week.

  • But having to erase 2nd slot doesn't sound like a best option. 

    Totally agree.

    I would suggest checking the code in MCUBoot that decides whether to revert, to see why it reverts.

    Well, I was hoping to avoid debugging mcuboot... especially as I am using the standard high level api calls to do the DFU (dfu_target) and mark the image as ok...

    MCUBoot logic is convoluted, to say the least. I have gone through it at least twice, and my understanding is that if image_ok is SET and the magic is GOOD, then it should NOT revert....

    Hopefully fresh eyes will let you see what the corner case is that stops it doing what is expected... 

  • Hi Brian, 

    Sorry for the late response. 
    I took a look at your result again. It seems that what you have:
    I: Primary image: magic=good, swap_type=0x2, copy_done=0x1, image_ok=0x1
    I: Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3

    matches this scenario:
    {
            .magic_primary_slot =       BOOT_MAGIC_ANY,
            .magic_secondary_slot =     BOOT_MAGIC_GOOD,
            .image_ok_primary_slot =    BOOT_FLAG_ANY,
            .image_ok_secondary_slot =  BOOT_FLAG_UNSET,
            .copy_done_primary_slot =   BOOT_FLAG_ANY,
            .swap_type =                BOOT_SWAP_TYPE_TEST,
    },

    You have magic_secondary_slot = good and image_ok_secondary = 0x3 (UNSET); the rest is "any", so it matches. That explains why the swap type is test instead of none, and why the images are swapped.

    Here is what I have in my test, first when I send the test image: 

    It matched with the scenario above => swap test.

    Then if I don't do anything, it will revert: 

    {
            .magic_primary_slot =       BOOT_MAGIC_GOOD,
            .magic_secondary_slot =     BOOT_MAGIC_UNSET,
            .image_ok_primary_slot =    BOOT_FLAG_UNSET,
            .image_ok_secondary_slot =  BOOT_FLAG_ANY,
            .copy_done_primary_slot =   BOOT_FLAG_SET,
            .swap_type =                BOOT_SWAP_TYPE_REVERT,
    },

    If I confirm the image instead, next boot is like this: 

    So it's very important that the secondary image's magic is unset and the primary image_ok is set.
    What I noticed in my case is that at the beginning, before the first test swap, the primary image's magic is unset. This is very important. The primary image magic will become the secondary image magic after the swap (it follows the image). So if the primary image magic is already good at the beginning, it may result in what you observed.
    The only difference in my no-swap boot is that the secondary magic is unset (and swap_type = 0x01, but that's not important in my opinion).
    Please take a look and check why both of your images have magic = good.
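
Editor's sketch: the table matching described in this post can be modelled in a few lines of host-side C. The model below contains only the two rows quoted in the thread (the upstream table has at least one more row that is not reproduced here), and the type and function names are hypothetical rather than the bootutil ones. The values for set/unset and none/test follow the log output in this thread; the value used for revert is arbitrary.

```c
#include <stddef.h>

/* SET/UNSET and NONE/TEST mirror the thread's logs (0x1, 0x3 and 0x1, 0x2). */
enum flag  { FLAG_SET = 1, FLAG_UNSET = 3, FLAG_ANY = 4 };
enum magic { MAGIC_GOOD = 1, MAGIC_UNSET = 3, MAGIC_ANY = 4 };
enum swap  { SWAP_NONE = 1, SWAP_TEST = 2, SWAP_REVERT = 4 /* arbitrary */ };

struct slot_state { enum magic magic; enum flag copy_done; enum flag image_ok; };

struct swap_rule {
    enum magic magic_primary, magic_secondary;
    enum flag image_ok_primary, image_ok_secondary, copy_done_primary;
    enum swap result;
};

/* Only the two rows quoted in the thread are modelled here. */
static const struct swap_rule rules[] = {
    { MAGIC_ANY,  MAGIC_GOOD,  FLAG_ANY,   FLAG_UNSET, FLAG_ANY, SWAP_TEST },
    { MAGIC_GOOD, MAGIC_UNSET, FLAG_UNSET, FLAG_ANY,   FLAG_SET, SWAP_REVERT },
};

static int magic_ok(enum magic want, enum magic got)
{
    return want == MAGIC_ANY || want == got;
}

static int flag_ok(enum flag want, enum flag got)
{
    return want == FLAG_ANY || want == got;
}

/* First matching row wins; no match means no swap. */
enum swap decide_swap(struct slot_state pri, struct slot_state sec)
{
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        const struct swap_rule *r = &rules[i];

        if (magic_ok(r->magic_primary, pri.magic) &&
            magic_ok(r->magic_secondary, sec.magic) &&
            flag_ok(r->image_ok_primary, pri.image_ok) &&
            flag_ok(r->image_ok_secondary, sec.image_ok) &&
            flag_ok(r->copy_done_primary, pri.copy_done)) {
            return r->result;
        }
    }
    return SWAP_NONE;
}
```

Feeding in the state from the log (primary good/set/set, secondary good with image_ok unset) selects the test row regardless of the primary's image_ok, which is consistent with the swap repeating at every boot. Changing only the secondary magic to unset makes the same primary state fall through to no swap.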

  • I don't think I explicitly set the magic to good anywhere... the code only does the call to say 'image ok' once it runs...

    And if we have done a successful update (secondary->primary), the primary will then have a good magic (because it was a good secondary), so for the next update both primary and secondary have magic=good...

    I thought the magic was just to let mcuboot know that the image is a legitimate mcuboot image, and that it's set by the build process?

  • So... I come back to this dfu issue after several weeks (months) of having fun migrating to NCS2.9, dealing with sysbuild etc....

    Now, I have 3 image slots: for the app, CPU-NET (it's an nRF5340) and the wifi firmware (because the wifi code is so big, I needed to push the wifi fw (78 kB!) out to my external flash).

    First moan: the dfu_target_xx subsystem (supposed to hide the bootloader from me) lacks both a 'check if image is confirmed' and a 'confirm the primary slot of an image is good' method. So I have to go directly to the mcuboot methods to do this anyway (but we already noted the failings of the dfu_target API in the thread above...)

    Second moan: the mcuboot API (exposed in include/zephyr/dfu/mcuboot.h) contains a multi-image compatible call to confirm an image (boot_write_img_confirmed_multi(image_number)), but not a function to check if the image is already confirmed (to avoid writing to the mcuboot header at every boot...): there is boot_is_img_confirmed() but not boot_is_img_confirmed_multi(image_number). WHY OH WHY? And the function isn't even in bootutil.c (which mcuboot.c delegates all the multi-image stuff to...)

    So I have to add my own function to bootloader/mcuboot/boot/bootutil/src/bootutil_public.c (as this is the code that has the mapping table for the image number -> flash area id...). Not great for future NCS updates....
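
Editor's sketch: the "check first, write only if needed" policy described in this post can be kept separate from the MCUboot calls by passing them in as callbacks. The code below is a host-testable model with hypothetical names. On target, `confirm` would presumably wrap `boot_write_img_confirmed_multi()`, and `is_confirmed` would have to be the poster's own per-image check, since the post says no such function exists upstream. The fakes reproduce the situation in the thread's log: image 2 already confirmed, image 1 failing with -5.

```c
#include <stdbool.h>

/* Hooks so the policy can run on a host; on target they would wrap MCUboot calls. */
typedef bool (*is_confirmed_fn)(int image);
typedef int (*confirm_fn)(int image);

/* Confirm each image only if needed; returns how many confirm calls failed. */
int confirm_images_once(int image_count, is_confirmed_fn is_confirmed,
                        confirm_fn confirm, int *writes)
{
    int failures = 0;

    for (int image = 0; image < image_count; image++) {
        if (is_confirmed(image)) {
            continue; /* already confirmed: skip the flash write */
        }
        (*writes)++;
        if (confirm(image) != 0) {
            failures++;
        }
    }
    return failures;
}

/* Host-side fakes (hypothetical): image 2 already confirmed, image 1 fails. */
static bool confirmed[3] = { false, false, true };

bool fake_is_confirmed(int image)
{
    return confirmed[image];
}

int fake_confirm(int image)
{
    if (image == 1) {
        return -5; /* same error code as in the application log */
    }
    confirmed[image] = true;
    return 0;
}
```

With these fakes, the first pass attempts two writes (images 0 and 1) and reports one failure; a second pass attempts only the still-unconfirmed image 1.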

    Now, with sysbuild when I flash the merged.hex, the initial state of the slots is

    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting bootloader

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 0, Swap type: none

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 1, Swap type: none

    Primary image: magic=good, swap_type=0x0, copy_done=0x1, image_ok=0x1
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 2, Swap type: none

    Bootloader chainload address offset: 0x11000

    Only the primary slots for images 0 and 2 are relevant, as they have been written from the merged.hex.

    Immediately we can see that the build sets the magic of the application image (0) primary slot to be UNSET, while the nrf70 wifi fw (2) primary slot has magic=good!

    But let's see if updating to NCS 2.9 has fixed the mcuboot 'confirmation' issue. I remove my 'break the secondary image' code, and make it confirm all 3 slots as 'ok' after 30s of execution....

    When I DFU the wifi fw in image 2, it all works ok :

    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting bootloader

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 0, Swap type: none

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 1, Swap type: none

    Primary image: magic=good, swap_type=0x0, copy_done=0x1, image_ok=0x1
    Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 2, Swap type: test
    Starting swap using move algorithm.

    The application then runs, and my code confirms all images as ok using the mcuboot function after 30s.

    on next reboot:


    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting bootloader

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 0, Swap type: none

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 1, Swap type: none

    Primary image: magic=good, swap_type=0x0, copy_done=0x1, image_ok=0x1
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 2, Swap type: none

    The image 2 secondary slot has had its magic 'unset' and the swap_type set to NONE (1). So, no reverting. 

    BUT: when I do exactly the same operation for the application, the 'confirmation' does not work:

    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting bootloader

    Primary image: magic=good, swap_type=0x2, copy_done=0x1, image_ok=0x1
    Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 0, Swap type: test

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 1, Swap type: none

    Primary image: magic=good, swap_type=0x0, copy_done=0x1, image_ok=0x1
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 2, Swap type: none

    Starting swap using move algorithm.
    Bootloader chainload address offset: 0x11000

    -> update done ok
    after 30s my code confirms the images, exactly as it did for the wifi fw.
    [00:00:30.417,694] <inf> app: Boot : confirmed image 0 ok to mcuboot
    [00:00:30.417,816] <wrn> app: Boot : failed to confirm mcuboot image 1 as ok (-5)
    [00:00:30.418,182] <inf> app: Boot : image 2 is already flagged as ok to mcuboot

    reboot:


    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting bootloader

    Primary image: magic=good, swap_type=0x2, copy_done=0x1, image_ok=0x1
    Secondary image: magic=good, swap_type=0x2, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 0, Swap type: test

    Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 1, Swap type: none

    Primary image: magic=good, swap_type=0x0, copy_done=0x1, image_ok=0x1
    Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    Boot source: none
    Image index: 2, Swap type: none
    Starting swap using move algorithm.

    And we're back to flip-flop between the 2 slots at each reboot.

    In the wifi case, I don't see what code has set the secondary slot's magic to unset, which is the difference between the wifi and the app cases... I'm wondering if it's actually the wifi setup firmware loader code?

    Any ideas?
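
Editor's sketch: when comparing boot logs like the ones above, it can help to parse the trailer lines instead of reading them by eye. The helper below is a small host-side utility with hypothetical names. It assumes the line format shown in this thread and copes with an optional prefix such as "I: " by searching for "magic=" first.

```c
#include <stdio.h>
#include <string.h>

struct trailer {
    char magic[16];
    unsigned swap_type, copy_done, image_ok;
};

/* Parse one "... image: magic=..., swap_type=..., ..." line; 0 on success. */
int parse_trailer(const char *line, struct trailer *out)
{
    const char *p = strstr(line, "magic=");
    int n;

    if (p == NULL) {
        return -1;
    }
    n = sscanf(p, "magic=%15[^,], swap_type=%x, copy_done=%x, image_ok=%x",
               out->magic, &out->swap_type, &out->copy_done, &out->image_ok);
    return (n == 4) ? 0 : -1;
}

/* Return 1 if the two parsed trailers differ in any field. */
int trailer_differs(const struct trailer *a, const struct trailer *b)
{
    return strcmp(a->magic, b->magic) != 0 || a->swap_type != b->swap_type ||
           a->copy_done != b->copy_done || a->image_ok != b->image_ok;
}
```

Run over the secondary-slot lines after confirmation, this would flag the wifi image (magic=unset, swap_type=0x1) and the application image (magic=good, swap_type=0x2) as differing, which is the discrepancy the post is asking about.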
