Sysbuild doesn't include MCUBoot headers in merged.hex when using upgradeable bootloader

I am in the process of migrating to NCS 2.7. My project includes NSIB, MCUBoot, TFM, and the application. Everything works fine with multi-image builds.

However, when using sysbuild, MCUBoot (executing out of slot 0) is unable to find the headers for s1_image and therefore aborts the boot. This happens because the merged.hex file flashed via 'west flash --erase' does not include the MCUBoot headers for either the mcuboot image or s1_image. If I manually flash signed_by_mcuboot_and_b0_s1_image.hex on top of the existing image, it boots fine.

I am using a pm_static.yml, but I see the same behavior with dynamically generated partitions as well. For reference, here are the flash regions from my partition_manager_report:

  external_flash (0x200000 - 2048kB):
+------------------------------------------------------+
| 0x0: mcuboot_secondary (0xd7000 - 860kB)             |
| 0xd7000: persistent_settings (0x1000 - 4kB)          |
| 0xd8000: persistent_blob_oolong_task (0x4000 - 16kB) |
| 0xdc000: EXTERNAL_FLASH_UNUSED (0x124000 - 1168kB)   |
| 0x200000: external_flash (0x0 - 0B)                  |
+------------------------------------------------------+

  flash_primary (0x100000 - 1024kB):
+---------------------------------------------------+
+---0x0: b0_container (0x8000 - 32kB)---------------+
| 0x0: b0 (0x8000 - 32kB)                           |
+---0x8000: s0 (0x10000 - 64kB)---------------------+
| 0x8000: s0_pad (0x200 - 512B)                     |
+---0x8200: s0_image (0xee00 - 59kB)----------------+
| 0x8200: mcuboot (0xee00 - 59kB)                   |
+---------------------------------------------------+
| 0x17000: RESERVED_FOR_MOVE_S0 (0x1000 - 4kB)      |
+---0x18000: s1 (0x10000 - 64kB)--------------------+
| 0x18000: s1_pad (0x200 - 512B)                    |
| 0x18200: s1_image (0xee00 - 59kB)                 |
| 0x27000: RESERVED_FOR_MOVE_S1 (0x1000 - 4kB)      |
+---0x28000: mcuboot_primary (0xd8000 - 864kB)------+
| 0x28000: mcuboot_pad (0x200 - 512B)               |
+---0x28200: app_image (0xd6e00 - 859kB)------------+
+---0x28200: mcuboot_primary_app (0xd6e00 - 859kB)--+
+---0x28200: tfm_app (0xd6e00 - 859kB)--------------+
+---0x28200: tfm_secure (0x7e00 - 31kB)-------------+
| 0x28200: tfm (0x7e00 - 31kB)                      |
+---0x30000: tfm_nonsecure (0xcf000 - 828kB)--------+
| 0x30000: app (0xcf000 - 828kB)                    |
+---------------------------------------------------+
| 0xff000: RESERVED_FOR_MOVE_PRIMARY (0x1000 - 4kB) |
+---------------------------------------------------+

  otp (0x2f4 - 756B):
+------------------------------------+
| 0xff8108: provision (0x280 - 640B) |
| 0xff8388: otp (0x74 - 116B)        |
+------------------------------------+

These are all the files that partition_manager.cmake merges into merged.hex:

  • 'BUILD/app_provision.hex' (implicit) 
  • 'BUILD/b0_container.hex' (implicit)
  • 'BUILD/s0_image.hex' (implicit)
  • 'BUILD/s0.hex' (implicit)
  • 'BUILD/s1.hex' (implicit)
  • 'BUILD/b0/zephyr/zephyr.hex' (explicit, unsigned hex name for b0 image)
  • 'BUILD/signed_by_b0_mcuboot.hex' (explicit, signed hex name for mcuboot image)
  • 'BUILD/signed_by_b0_s1_image.hex' (explicit, signed hex name for s1_image image)
  • 'BUILD/teabox/zephyr/teabox.signed.hex' (explicit, signed hex name for application image)

The files marked implicit are assigned at partition_manager.cmake#L235, whereas those marked explicit are assigned at partition_manager.cmake#L224. The implicit files are largely redundant and could be cleaned up by renaming the partitions so that they no longer match the actual hex file names. More importantly, the signed mcuboot and s1_image hex files in this list are the ones that have only been signed by b0. These images do not have valid MCUBoot headers or signatures, which can be seen by dumping the hex file address ranges:

$ diff BUILD/s0.hex BUILD/s0_image.hex
$ diff BUILD/s0.hex BUILD/signed_by_b0_mcuboot.hex
$ srec_info BUILD/signed_by_b0_mcuboot.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   008200 - 011FC7
$ srec_info BUILD/signed_by_mcuboot_and_b0_mcuboot.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   008000 - 01205E

$ diff BUILD/s1.hex BUILD/signed_by_b0_s1_image.hex
$ srec_info BUILD/signed_by_b0_s1_image.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   018200 - 021FC7
$ srec_info BUILD/signed_by_mcuboot_and_b0_s1_image.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   018000 - 02205F
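
How much the b0-only file lacks can be worked out from the ranges above. This is a minimal sketch in Python that uses only the addresses printed by srec_info for the mcuboot image; the variable names are illustrative and not part of any NCS tooling.

```python
# Ranges printed by srec_info above, as (first address, last address).
signed_by_b0 = (0x008200, 0x011FC7)
signed_by_mcuboot_and_b0 = (0x008000, 0x01205E)

# Bytes that exist before the image start only in the MCUboot-signed file.
header_bytes = signed_by_b0[0] - signed_by_mcuboot_and_b0[0]
# Bytes that exist after the image end only in the MCUboot-signed file.
trailer_bytes = signed_by_mcuboot_and_b0[1] - signed_by_b0[1]

print(hex(header_bytes), trailer_bytes)
```

The 0x200 bytes at the front match the size of the s0_pad partition in the report. The extra bytes at the end are presumably the MCUBoot signature data appended by signing, although the range dump alone does not show their contents.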

As shown above, the ranges 0x8000-0x8200 and 0x18000-0x18200 are missing from the b0-signed mcuboot and s1_image files respectively. As expected, they are also missing from merged.hex:

$ srec_info BUILD/merged.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   000000 - 005D07
        008200 - 011FC7
        018200 - 021FC7
        028000 - 06B54A
        FF8130 - FF8193
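
The same check can be scripted so that it can run after every build. The sketch below is a rough illustration in Python rather than an existing NCS tool: it parses the srec_info text pasted above and reports which pad regions contain no data. The function names and the `pads` table are hypothetical, and the pad addresses are taken from the partition report earlier in this post.

```python
import re

SREC_INFO_OUTPUT = """\
Format: Intel Hexadecimal (MCS-86)
Data:   000000 - 005D07
        008200 - 011FC7
        018200 - 021FC7
        028000 - 06B54A
        FF8130 - FF8193
"""

def parse_ranges(text):
    """Return (first, last) address pairs found in srec_info-style output."""
    pairs = re.findall(r"([0-9A-Fa-f]+) - ([0-9A-Fa-f]+)", text)
    return [(int(first, 16), int(last, 16)) for first, last in pairs]

def has_data_in(ranges, start, end):
    """True if any range overlaps the half-open window [start, end)."""
    return any(first < end and last >= start for first, last in ranges)

ranges = parse_ranges(SREC_INFO_OUTPUT)
# Pad regions from the partition report: name -> (start, end).
pads = {"s0_pad": (0x8000, 0x8200), "s1_pad": (0x18000, 0x18200)}
missing = [name for name, (start, end) in pads.items()
           if not has_data_in(ranges, start, end)]
print(missing)
```

With the dump above, both s0_pad and s1_pad are reported as missing. In practice the text would come from running srec_info on the build's merged.hex instead of a pasted string.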

The underlying issue in the sysbuild CMake structure is that BYPRODUCT_KERNEL_SIGNED_HEX_NAME is set to the b0-signed files at image_signing.cmake#L174, when it should be reset to the MCUBoot-signed files inside b0_mcuboot_signing.cmake. If I add a sysbuild_set function to Zephyr and call 'sysbuild_set("${output}.hex" IMAGE ${application} VAR BYPRODUCT_KERNEL_SIGNED_HEX_NAME CACHE)' at b0_mcuboot_signing.cmake#L91, the problem is fixed: the merged hex then contains signed_by_mcuboot_and_b0_mcuboot.hex and signed_by_mcuboot_and_b0_s1_image.hex and boots properly. The new merged.hex dump is shown below:

$ srec_info BUILD/merged.hex -Intel
Format: Intel Hexadecimal (MCS-86)
Data:   000000 - 005D07
        008000 - 01205E
        018000 - 02205E
        028000 - 06B54B
        FF8130 - FF8193

What puzzles me is that this should be broken for everybody using sysbuild with the upgradeable bootloader. I didn't test it, but a quick code inspection indicates that the problem still persists at tip of tree. Can someone at Nordic look into this and confirm whether that is the case?

  • Hello,

    Sorry for the late reply. 

    To me this sounds like the application is not properly ported for sysbuild, and that the bootloader is not properly included.

    Please have a look at the migration notes for NCS v2.7.0:

    https://docs.nordicsemi.com/bundle/ncs-2.9.0/page/nrf/releases_and_maturity/migration_guides.html

    Also, as an introduction to sysbuild, I really recommend going through the nRF Connect SDK Intermediate course on DevAcademy. Lesson 8 in particular focuses on sysbuild and multi-image builds:

    https://academy.nordicsemi.com/courses/nrf-connect-sdk-intermediate/lessons/lesson-8-sysbuild/

    Best regards,

    Edvin

  • No, my application is fine. I confirmed that I can reproduce the issue on a pristine NCS 2.9.0 using the Zephyr sample applications and the nrf9160dk board.

    Let's use zephyr/samples/sysbuild/with_mcuboot as an example. If I add just SB_CONFIG_SECURE_BOOT_APPCORE=y to its sysbuild.conf I get the exact same behavior as described above.

    This is my build command:

    west build . -b nrf9160dk/nrf9160/ns --sysbuild -DZEPHYR_TOOLCHAIN_VARIANT=gnuarmemb -DGNUARMEMB_TOOLCHAIN_PATH=$PREBUILT_PATH/gcc-arm-none-eabi-13.2.rel1/linux/

    Here is the partition manager report.

      flash_primary (0x100000 - 1024kB):
    +--------------------------------------------------+
    +---0x0: b0_container (0x8000 - 32kB)--------------+
    | 0x0: b0 (0x8000 - 32kB)                          |
    +---0x8000: s0 (0xc000 - 48kB)---------------------+
    | 0x8000: s0_pad (0x200 - 512B)                    |
    +---0x8200: s0_image (0xbe00 - 47kB)---------------+
    | 0x8200: mcuboot (0xbe00 - 47kB)                  |
    +--------------------------------------------------+
    | 0x14000: EMPTY_0 (0x4000 - 16kB)                 |
    +---0x18000: s1 (0xc000 - 48kB)--------------------+
    | 0x18000: s1_pad (0x200 - 512B)                   |
    | 0x18200: s1_image (0xbe00 - 47kB)                |
    +--------------------------------------------------+
    | 0x24000: EMPTY_1 (0x4000 - 16kB)                 |
    +---0x28000: mcuboot_primary (0x68000 - 416kB)-----+
    +---0x28000: tfm_secure (0x8000 - 32kB)------------+
    | 0x28000: mcuboot_pad (0x200 - 512B)              |
    +---0x28200: app_image (0x67e00 - 415kB)-----------+
    +---0x28200: mcuboot_primary_app (0x67e00 - 415kB)-+
    | 0x28200: tfm (0x7e00 - 31kB)                     |
    +---0x30000: tfm_nonsecure (0x60000 - 384kB)-------+
    | 0x30000: app (0x60000 - 384kB)                   |
    +--------------------------------------------------+
    | 0x90000: mcuboot_secondary (0x68000 - 416kB)     |
    | 0xf8000: EMPTY_2 (0x8000 - 32kB)                 |
    +--------------------------------------------------+

    The merged.hex consists of the following files:

    • BUILD/app_provision.hex
    • BUILD/b0_container.hex
    • BUILD/s0_image.hex
    • BUILD/s0.hex
    • BUILD/s1.hex
    • BUILD/b0/zephyr/zephyr.hex
    • BUILD/signed_by_b0_mcuboot.hex (<-- wrong as described above)
    • BUILD/signed_by_b0_s1_image.hex (<-- wrong as described above)
    • BUILD/with_mcuboot/zephyr/zephyr.signed.hex

    The missing mcuboot headers (and trailers) can also be seen clearly in the merged.hex dump:

    $ srec_info BUILD/merged.hex -Intel
    Format: Intel Hexadecimal (MCS-86)
    Data:   000000 - 0002EB
            0002F0 - 0050C7
            0050D0 - 005CB5
            008200 - 00E097
            00E0A0 - 00EA9D
            00EAA0 - 00EB4F
            018200 - 01E097
            01E0A0 - 01EA9D
            01EAA0 - 01EB4F
            028000 - 03822D
            FF8130 - FF817F
    
    $ srec_info BUILD/signed_by_b0_mcuboot.hex -Intel
    Format: Intel Hexadecimal (MCS-86)
    Data:   8200 - E097
            E0A0 - EA9D
            EAA0 - EB4F
    
    $ srec_info BUILD/signed_by_mcuboot_and_b0_mcuboot.hex -Intel
    Format: Intel Hexadecimal (MCS-86)
    Data:   8000 - EBE6

  • Nick Ewalt said:
    • BUILD/signed_by_b0_mcuboot.hex (<-- wrong as described above)
    • BUILD/signed_by_b0_s1_image.hex (<-- wrong as described above)

    I am sure you have described it, but can we simplify this issue? What symptoms are you seeing with this approach? When you flash the FW built in v2.9.0, does it work? Do you see any logs saying that something is wrong?

    Best regards,

    Edvin

  • The sample will boot as-is. Here is the output with CONFIG_MCUBOOT_LOG_LEVEL_INF=y:

    *** Booting nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Attempting to boot slot 0.
    Attempting to boot from address 0x8200.
    I: Trying to get Firmware version
    I: Verifying signature against key 0.
    I: Hash: 0x41...4a
    I: Firmware signature verified.
    Firmware version 1
    I: Setting monotonic counter (version: 1, slot: 0)
    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Image index: 0, Swap type: none
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Image index: 1, Swap type: none
    I: Bootloader chainload address offset: 0x28000
    *** Booting nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Address of sample 0x30000
    Hello sysbuild with mcuboot! nrf9160dk

    However, that doesn't mean it is working correctly. Without the headers, it won't be able to swap or revert updates. Additionally, it will fail to boot with CONFIG_BOOT_VALIDATE_SLOT0=n (which is how I originally noticed this in my application). Here are the logs:

    *** Booting nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Attempting to boot slot 0.
    Attempting to boot from address 0x8200.
    I: Trying to get Firmware version
    I: Verifying signature against key 0.
    I: Hash: 0x41...4a
    I: Firmware signature verified.
    Firmware version 1
    I: Setting monotonic counter (version: 1, slot: 0)
    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Image index: 0, Swap type: none
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Image index: 1, Swap type: none
    E:   bad image magic 0xffffffff; Image=1
    E: Unable to find bootable image

    This happens because this code path is missing the conditional on image_validated_by_nsib as seen here. Note that the comment immediately above that code is wrong: image 1 primary is certainly not the currently executing mcuboot image, and therefore has not been validated by NSIB.
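
The "bad image magic 0xffffffff" value in the log is what an erased, never-written pad would produce. The following is a minimal sketch in Python, assuming the standard MCUboot header layout in which the first field is a little-endian 32-bit magic equal to IMAGE_MAGIC (0x96f3b83d). The helper names are hypothetical, and the message string only imitates the log line above; it is not MCUboot's own code.

```python
import struct

IMAGE_MAGIC = 0x96F3B83D  # MCUboot image header magic

def read_magic(first_bytes):
    """Decode the 32-bit little-endian magic at the start of a slot."""
    (magic,) = struct.unpack_from("<I", first_bytes, 0)
    return magic

def describe(first_bytes):
    """Return a short verdict for the first four bytes of a slot."""
    magic = read_magic(first_bytes)
    if magic == IMAGE_MAGIC:
        return "valid header magic"
    return f"bad image magic 0x{magic:08x}"

erased_pad = b"\xff" * 4                      # erased flash, header never written
written_pad = struct.pack("<I", IMAGE_MAGIC)  # pad that carries an MCUboot header

print(describe(erased_pad))
print(describe(written_pad))
```

The first call reports 0xffffffff, matching the error for image 1, whose pad at 0x18000 is absent from merged.hex.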
