Bug in Secure Boot / b0n related partition configuration for nRF5340 under sysbuild in SDK v2.9.0?

Hi, I'm having problems migrating my project to SDK v2.9.0. I'm using mcuboot, and I want to use b0n on the net core together with net core application updates via mcuboot (running on the application core). Everything ran fine on SDK v2.5.3 with the multi-image build, and I managed to port the project to sysbuild on SDK v2.9.0 with mcuboot, but when I try to enable the net core update, the build fails.

I tried to re-create the configuration with the peripheral_lbs sample (SDK v2.9.0). After checking out the sample project, I added sysbuild.conf to the project folder:

#SB_CONFIG_NETCORE_APP_UPDATE=y
#SB_CONFIG_SECURE_BOOT_NETCORE=y
#SB_CONFIG_SECURE_BOOT_SIGNING_KEY_FILE="C:/<mypath>/mykey.pem"

SB_CONFIG_BOOTLOADER_MCUBOOT=y
SB_CONFIG_MCUBOOT_MODE_SINGLE_APP=n
SB_CONFIG_BOOT_SIGNATURE_KEY_FILE="C:/<mypath>/mykey.pem"
SB_CONFIG_BOOT_SIGNATURE_TYPE_ECDSA_P256=y

SB_CONFIG_NETCORE_HCI_IPC=y

So far, everything compiled without any problem. But when I uncomment the first three lines (for b0n and net core update support), the build fails. I receive the following error message seven times before the build is aborted:

In file included from C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/sysflash.h:12,
                 from C:/ncs/v2.9.0/bootloader/mcuboot/boot/bootutil/src/bootutil_priv.h:33,
                 from C:/ncs/v2.9.0/bootloader/mcuboot/boot/bootutil/src/tlv.c:24:
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h: In function '__flash_area_ids_for_slot':
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:20:37: error: 'PM_MCUBOOT_PRIMARY_1_ID' undeclared (first use in this function); did you mean 'PM_MCUBOOT_PRIMARY_ID'?
   20 | #define FLASH_AREA_IMAGE_1_SLOTS    PM_MCUBOOT_PRIMARY_1_ID, PM_MCUBOOT_SECONDARY_1_ID,
      |                                     ^~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:38:29: note: in expansion of macro 'FLASH_AREA_IMAGE_1_SLOTS'
   38 |                             FLASH_AREA_IMAGE_1_SLOTS
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:55:9: note: in expansion of macro 'ALL_AVAILABLE_SLOTS'
   55 |         ALL_AVAILABLE_SLOTS
      |         ^~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:20:37: note: each undeclared identifier is reported only once for each function it appears in
   20 | #define FLASH_AREA_IMAGE_1_SLOTS    PM_MCUBOOT_PRIMARY_1_ID, PM_MCUBOOT_SECONDARY_1_ID,
      |                                     ^~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:38:29: note: in expansion of macro 'FLASH_AREA_IMAGE_1_SLOTS'
   38 |                             FLASH_AREA_IMAGE_1_SLOTS
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:55:9: note: in expansion of macro 'ALL_AVAILABLE_SLOTS'
   55 |         ALL_AVAILABLE_SLOTS
      |         ^~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:20:62: error: 'PM_MCUBOOT_SECONDARY_1_ID' undeclared (first use in this function); did you mean 'PM_MCUBOOT_SECONDARY_ID'?
   20 | #define FLASH_AREA_IMAGE_1_SLOTS    PM_MCUBOOT_PRIMARY_1_ID, PM_MCUBOOT_SECONDARY_1_ID,
      |                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:38:29: note: in expansion of macro 'FLASH_AREA_IMAGE_1_SLOTS'
   38 |                             FLASH_AREA_IMAGE_1_SLOTS
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~
C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/include/sysflash/pm_sysflash.h:55:9: note: in expansion of macro 'ALL_AVAILABLE_SLOTS'
   55 |         ALL_AVAILABLE_SLOTS
      |         ^~~~~~~~~~~~~~~~~~~

It looks as if the build system suddenly uses different partition names when I enable b0n?

Under SDK v2.5.3, the app core partitions were just named mcuboot_primary (mcuboot_primary_app) and mcuboot_secondary. Both net core update and app core update were loaded in mcuboot_secondary for flashing. Now, under SDK v2.9.0 and sysbuild, there seems to be a different naming convention (see PM_MCUBOOT_PRIMARY_1_ID error messages).

Some feedback would be appreciated. Is this a bug in the build/configuration system? Or is there a new convention for partition naming that changes when b0n is enabled?

Best regards,
Michael

PS: Update: I tried disabling SB_CONFIG_NETCORE_APP_UPDATE while leaving secure boot for the net core active. In this case, I get the following error messages:

In file included from C:/ncs/v2.9.0/nrf/include/dfu/pcd.h:28,
                 from C:/ncs/v2.9.0/bootloader/mcuboot/boot/zephyr/main.c:95:
C:/ncs/v2.9.0/nrf/include/dfu/pcd_common.h: In function 'pcd_write_cmd_lock_debug':
C:/ncs/v2.9.0/nrf/include/dfu/pcd_common.h:39:25: error: 'PM__PCD_SRAM_ADDRESS' undeclared (first use in this function); did you mean 'PM_SRAM_ADDRESS'?
   39 | #define PCD_CMD_ADDRESS PM__PCD_SRAM_ADDRESS
      |                         ^~~~~~~~~~~~~~~~~~~~

Besides 'pcd_write_cmd_lock_debug', the same error occurs for the functions 'pcd_read_cmd_done' and 'pcd_read_cmd_lock_debug'.

  • Hi,

    Are you using a static partition file from your existing project? '_primary_1' is the emulated flash partition in RAM used to transfer the FW update to the network core. 

    Project I used to generate the memory report above (only verified that it builds without errors)

    peripheral_lbs_dfu_test_nrf5340_v290.zip

    Best regards,

    Vidar

  • Hi Vidar,

    Thanks for checking. Unfortunately, your sample uses external flash, and if I disable this in the configuration, I get the same error message about mcuboot_primary_1 again:

    #SB_CONFIG_PM_EXTERNAL_FLASH_MCUBOOT_SECONDARY=y

    Could you please look a little bit deeper into this? Normally, all partitions should be generated by the build system, which I can use as a basis for my static partitioning later. At least this is what the Nordic Academy course said. If the build system's partitioning mechanism does not work by itself without external flash, this is a software bug in my opinion...

    I also noticed that the build system creates mcuboot_secondary_1 in DT_CHOSEN(nordic_pm_ext_flash). Under SDK v2.5.3, I could use the same slot (mcuboot_secondary) for app core and net core updates. How can I get this running in SDK v2.9.0? Also, in SDK v2.5.3, I didn't have a ram_flash partition or region.

    I do not have external flash on my board, and I cannot reserve another 256K of internal flash for the net core's update image. Up until now, I used application-specific code to transfer new firmware images directly into mcuboot_secondary and flag them for a confirmed update via mcuboot (executed upon the next system restart). Mcuboot seemed to detect by itself whether this was an app core or a net core image (maybe based on metadata included in the signatures). We need to keep this convention in our update process.

  • Hi Michael,

    Glad to hear that it is working on your end as well. Just remember to test this with your bootloader hex from v2.5.3 if you have devices out in the field with this version of the bootloader.

    puz_md said:
    According to my understanding (and the partitioning training in Nordic Academy), the steps (2) through (4) should not be needed to be performed by the developer. The SDK should create partitions ready for use. Will this issue be fixed in future SDK versions?

    The problem is that dynamic partitioning is not supported with your configuration. This was partially addressed by the PR I linked to earlier. I will report this internally.

    puz_md said:
    Do I need the "CONFIG_MCUBOOT_VERIFY_IMG_ADDRESS=n"? It seems my project also works without it...

    Did you verify that the netcore image got updated even if you had additional address verification enabled? 

    puz_md said:
    And do I need the Kconfig and Kconfig.sysbuild? I've never used these files in my configuration. Currently, I'm using "SB_CONFIG_NETCORE_HCI_IPC=y" in sysbuild.conf. Does this do something similar to the ipc_radio configuration in Kconfig.sysbuild?

    Kconfig.sysbuild changes the default value to 'y' instead of explicitly enabling the HCI IPC image in sysbuild.conf. This allows the project to be built for other single-core devices (nRF52/nRF54) without triggering errors because the symbol can't be selected. Note that by selecting SB_CONFIG_NETCORE_HCI_IPC you are selecting HCI IPC instead of the IPC radio firmware.

    Best regards,

    Vidar

  • Hi Vidar,

    Glad to hear that it is working on your end as well. Just remember to test this with your bootloader hex from v2.5.3 if you have devices out in the field with this version of the bootloader.

    I tested it already and it worked.

    The problem was that I had not programmed the new mcuboot binary when I tested the following issue:

    puz_md said:
    Do I need the "CONFIG_MCUBOOT_VERIFY_IMG_ADDRESS=n"? It seems my project also works without it...

    Did you verify that the netcore image got updated even if you had additional address verification enabled? 

    You're right, it doesn't work without turning address verification off! I forgot to program the new mcuboot for testing.

    Unfortunately, I have now run into another problem with the new mcuboot: It programs the app core software just fine. But the net core software is not removed/invalidated from the slot after updating the net core (the update itself seems to work, by the way). During each power cycle, the system stays in mcuboot for many seconds and probably tries to update the net core again. After the update attempt, I can connect via Auterm again, and the slot readout reports the following bits set:

    The app core is updated just fine and Auterm doesn't report an image in the slot anymore (and the bootloader does not consume extra time during each startup after the first update attempt). Any idea what could be wrong with the net core processing this time?

    By the way, "Slot 1" only disappears in Auterm if I erase the flash pages of the secondary slot; after that, I can upload another update.

    Best regards,
    Michael

  • Hi Michael,

    puz_md said:
    Unfortunately, I ran into another problem now with the new mcuboot: It programs the app core software just fine. But the net core software is not removed/invalidated from the slot after updating the net core (which seems to work by the way):

    Could you please confirm if you see the same with your original MCUBoot version? I'm able to reproduce the same here but I could not find any relevant changes in loader.c between v2.5.x and v2.9.0.

    Best regards,

    Vidar

  • Hi Vidar,

    I did not see this behavior with the old mcuboot (SDK v2.5.3). It might be related to the swap feature (I did not configure the 'overwrite only' parameter, maybe I can try it another time). For now, this is the observation that I make with mcuboot SDK v2.9.0:

    (1) App core update: After the update is processed, the first and the last flash page of the secondary partition are erased (all bytes 0xff). The image is NOT swapped as I would expect (and as it worked with SDK v2.3.0).

    (2) Net core update: After the update, the secondary partition is untouched! In the last flash page of the secondary partition, the update/confirmed bits are still set! I think this should not happen. Also, considering the time delay after each power cycle in this state, it is possible that the net core is programmed again (I think in the past the keys were compared and if they matched, no extra programming took place...)

    Hint: I had to configure this intermediate 'ram_flash' partition for the net core which did not exist in SDK v2.3.0. Maybe mcuboot tries to invalidate the copy in 'ram_flash'?

    Here is a memory dump of the last flash page of my secondary partition (0x88000-0xf7fff). I cut the last 16 bytes as they might be a confidential key:

    0x000F7FC0 | FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF | ................
    0x000F7FD0 | FF FF FF FF FF FF FF FF 03 FF FF FF FF FF FF FF | ................
    0x000F7FE0 | FF FF FF FF FF FF FF FF 01 FF FF FF FF FF FF FF | ................

    With mcuboot SDK v2.5.3, the application was swapped. I did not test swapping with the net core though, but the slot was always available again after a net core update, so there were no blocked slots.

    My conclusion: There are two problems in mcuboot SDK v2.9.0. First, swap does not work anymore. Second, for the net core, the image in the secondary slot is not invalidated and triggers endless updates. The update procedure itself seems to work for both cores: the content in the primary slots is updated.

    I hope this helps with the debugging.

    Best regards,
    Michael

    Update: I checked it again with the net core update on SDK v2.5.3. I only compared some bytes by eye at 1K offset (compared 256 bytes at 0x89000, which is 1K into the secondary slot). The data did not change after the update, but in this case, the last secondary slot page (NOT the first page here!) was erased again, which prevented unwanted additional updates. So: apparently no swap for the net core, but correct unflagging of the image via the last slot page.

  • Hi Michael,

    Thank you for the update. I think this explains it. I had enabled the overwrite-only mode in my version (sysbuild.conf --> SB_CONFIG_MCUBOOT_MODE_OVERWRITE_ONLY=y), which means that the image trailer is not erased here: https://github.com/nrfconnect/sdk-mcuboot/blob/148712e7b4618aadbedd04e8d3ce5c3847d3be4f/boot/bootutil/src/loader.c#L1519. I've proposed this fix to the developers for the overwrite-only mode:

    diff --git a/boot/bootutil/src/loader.c b/boot/bootutil/src/loader.c
    index f35ec786..fd01ef5b 100644
    --- a/boot/bootutil/src/loader.c
    +++ b/boot/bootutil/src/loader.c
    @@ -1615,6 +1615,10 @@ boot_validated_swap_type(struct boot_loader_state *state,
                      */
                     rc = swap_erase_trailer_sectors(state,
                             secondary_fa);
    +#elif defined(MCUBOOT_OVERWRITE_ONLY)
    +                BOOT_LOG_INF("Erasing secondary slot");
    +                rc = boot_erase_region(secondary_fa, 0, secondary_fa->fa_size);
    +                assert(rc == 0);      
     #endif
                     swap_type = BOOT_SWAP_TYPE_NONE;
                 }
    
     

    Best regards,

    Vidar

  • Hi Vidar,

    Sorry for the confusion, but it seems I also had the overwrite-only configuration activated all along. I probably added it when evaluating your sample configuration. So we're using the same configuration, and I have not tested swap with the new SDK yet.

    Your fix works on my side, too. There is only one "beauty bug": After the net core update, the whole secondary slot is now erased, while after the app core update, only the first and the last sector are erased. Both work, but it would be nice if the behavior was the same.

    Unfortunately, the fix didn't make it into SDK v2.9.1 which was released today. There is also another fix still missing that was discussed with AmandaH about 1-2 weeks ago in another ticket and is still missing from the SDK. This fix is necessary for correct application of the net core's static partitioning file: Remove set(static_configuration) from C:\ncs\v2.9.0\nrf\cmake\sysbuild\partition_manager.cmake, line 126.

    Please talk to your developer team if they can bring both fixes in a v2.9.x release (so that I don't have to migrate again to SDK v3.0.0 which probably has major changes).

    Further, it would be good to have these two bugs in the "Known Issues" section of the SDK documentation (sections 'Bootloader' and 'Build System'). Maybe it can help other developers, too.

    Best regards,
    Michael

  • Hi Michael,

    Are you using the bootloader you had from v2.5.3 in production? In that case, I would consider continuing to use the same bootloader hex to ensure you have the same bootloader on all your devices.

    puz_md said:
    Your fix works on my side, too. There is only one "beauty bug": After the net core image, the whole second slot is erased now, and after the app core update, only the first and the last sector is erased. Works both, but would be nice if the behavior was the same.

    I was considering whether to erase just the header and trailer but decided to erase the whole slot for simplicity. You can use the code here as reference if you want to erase just the header+trailer.

    I've created a ticket for this case in our internal bug tracker so the developers can follow up on the issues we've discussed.

    Best regards,

    Vidar
