Fixing the "known issue" Partitioning limitation with MCUboot swap move

The subject issue is documented here as "NCSDK-20567: Partitioning limitation with MCUboot swap move":


This issue has been open for a while.

These two DevZone tickets refer to this issue and mention the impact is that you can use only around 95% of the flash space:
1. https://devzone.nordicsemi.com/f/nordic-q-a/112343/mcuboot-image-in-the-primary-slot-is-not-valid-unable-to-find-bootable-image

2. https://devzone.nordicsemi.com/f/nordic-q-a/107599/mcuboot-sign-image-possible-wrong-slot-size/475558

Here are my questions:

1. In the Workaround description, there is a formula to calculate the approximate size limitation. What CONFIG item is referred to for mcuboot_primary_size?  I am not able to figure out how to use that equation to result in a value of approximately 95% of the "Region Size" for the application image indicated in the build log.

2. Why is "additional" margin suggested?  Can that margin be quantified?

3. Are there still plans to fix this?  A lot of internal flash space seems to be wasted due to this issue.

4. When the fix is available, do you think the fix can be patched by us to an earlier NCS version, such as NCS 2.6?

  • First, let's add a reference here to the MCUboot design doc, which explains the trailer:
    https://github.com/mcu-tools/mcuboot/blame/268968f8e4a834ea3825d98439e4c321de0c6a78/docs/design.md#L493

    Q1) Swap means that the image in slot1 will at some point end up in slot0, and may be reverted if it was only uploaded for test. So any image has to fit into both slots.

    If slot1 is bigger than slot0, you may be able to upload something that will never fit into the other slot, or that fits but that MCUboot will never be able to revert, because it cannot write the trailer part.

    Think of it this way: if something is too big to fit into either slot, because of the trailer and swap info that would appear in that slot, then it is too big for the other slot as well, because in the end it will end up in the slot it is too big for.

    Q2) Trailer is added from the end of the slot (or a partition has been dedicated for it).

    Can you tell me if it is accurate?

    No, it is not. The trailer is written "backwards" from the end of the partition.

    You have positioned the trailer in your images; it is placed at the end of the partition/slot, not at the end of the app image.

    If you sign your image with imgtool so that it is already confirmed or set for test, the binary is extended to the full image slot size, because imgtool adds the trailer at that point.

    Let's say you have a slot size of 256k. If you build hello world, which takes ~34k, it will take ~34-35k signed. But if you sign it to be already confirmed, the signed image will take exactly 256k, because padding fills all the space between the image and the trailer, which has to be placed at the end of the signed image. The trailer starts at the end of the partition and is written "backwards".

    With swap move, MCUboot first shifts the slot0 image up by one page, to free a slot0 page for the swap from slot1. So you need this one extra page in the primary slot. It then writes the log from the back of the slot, after the trailer (again, backwards). The trailer does not have to take an entire page: it can cut into the last page taken by the application image (after the shift), because the image will probably not fill that page completely. If the trailer is bigger than the space left in that last page, the trailer needs a separate page, and the space left in the last page used by the application image goes unused.

    Swap-move does not use a scratch page. Instead it moves the image in slot0 up by one page, which lets it immediately place the first page from slot1 into the first page of slot0, since that page has already been moved. Basically it looks like this:
    1) in slot0, move page i to page i + 1, where i is N, N - 1, ... 1, 0; this leaves you with the image in pages 1 to N + 1 of slot0
    2) page i of slot1 is taken and placed in page i of slot0
    3) page i + 1 of slot0 is taken and placed in page i of slot1
    4) ++i; goto 2) until the larger image is completely moved.
    Of course, a page is erased between each copy, and an entry is added to the trailer log to record how far the operation went, so that MCUboot can recover if the device resets.
    Note that you will have at least one and at most two copies of the page being moved at any point.
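
    The steps above can be sketched as a small simulation. This is only an illustration, not MCUboot code: pages are modelled as list entries, `None` stands for an erased page, the function name `swap_move` is hypothetical, and the trailer log writes are left out.

```python
def swap_move(slot0, slot1, image_pages):
    """Simulate swap-move on two slots modelled as lists of pages.

    image_pages is the page count of the larger of the two images.
    None marks an erased page. Status/trailer logging is not modelled.
    """
    if len(slot0) < image_pages + 1:
        raise ValueError("primary slot needs one spare page for the move")
    if len(slot1) < image_pages:
        raise ValueError("secondary slot is too small for the image")

    # Step 1: shift the slot0 image up by one page, highest page first,
    # so that no page is overwritten before it has been copied.
    for i in range(image_pages - 1, -1, -1):
        slot0[i + 1] = slot0[i]
        slot0[i] = None  # page i is erased and now free

    # Steps 2-4: walk upwards, swapping one page at a time.
    for i in range(image_pages):
        slot0[i] = slot1[i]      # step 2: slot1 page i -> slot0 page i
        slot1[i] = slot0[i + 1]  # step 3: slot0 page i+1 -> slot1 page i
        slot0[i + 1] = None      # erase the page that was just moved out


primary = ["A0", "A1", "A2", None]    # one spare page at the end
secondary = ["B0", "B1", "B2", None]
swap_move(primary, secondary, image_pages=3)
print(primary)
print(secondary)
```

    Running it with a primary slot that has no spare page raises the `ValueError`, which is the same constraint as the "extra one page in primary slot" mentioned earlier.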

    If the secondary slot is bigger than the primary, you may end up uploading something you will not be able to swap. Or, once you have it swapped, you will not be able to revert it.

    Q3)

    What do you mean by "256 logs take up to 3072"? What logs are you referring to?

    By log I mean what is properly called "Swap status": https://github.com/mcu-tools/mcuboot/blame/268968f8e4a834ea3825d98439e4c321de0c6a78/docs/design.md#L1057 . MCUboot records here each step it has completed, so that it knows where to resume if the device reboots or something else interrupts the swap.
    3072 is 256 sectors * 3 states * 4-byte write block, which is the maximum swap status size; the rest of the trailer takes an additional 80 bytes. But if you define 256 pages of 4096 bytes and have a slot of 256k, then 3/4 of your swap status will never be written, so the additional 80 bytes just merge into it.

    Q4)

    Does CONFIG_BOOT_MAX_IMG_SECTORS affect the "Trailer Move Page / Swap Status Area" Size?

    Yes.

    CONFIG_BOOT_MAX_IMG_SECTORS determines the size of the structures allocated for storing information on the device layout (aka pages), and how far the "Swap status" cuts into the image from the end of the slot. You can basically see it as "biggest slot (partition) size divided by device page size".

    If the actual number of used sectors is smaller than the max image sectors, will the Trailer and Trailer Move Page / Swap Status Area be smaller as well?

    MCUboot will use all CONFIG_BOOT_MAX_IMG_SECTORS entries in RAM when discovering the device layout, but when writing the swap status it may only use a few of them. For example, if you swap two images of ~40k (between slot0 and slot1), only ~10 * 4 * 3 bytes of status are written, because only 10 pages are moved. But you have to have enough space to cover the entire slot.
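
    The numbers in the Q3 and Q4 answers can be checked with a few lines of arithmetic. This is a sketch that assumes the values used in this thread (4-byte write block, 3 states per sector for swap move, 4096-byte pages); the helper name `swap_status_bytes` is made up for illustration.

```python
WRITE_BLOCK_SIZE = 4  # bytes per write block (value used in this thread)
SWAP_STATES = 3       # status entries per sector for swap move
PAGE_SIZE = 4096      # bytes per flash page


def swap_status_bytes(sectors):
    """Bytes of swap status needed to track the given number of sectors."""
    return sectors * SWAP_STATES * WRITE_BLOCK_SIZE


# Q3: 256 configured sectors reserve the full swap status area.
reserved = swap_status_bytes(256)

# A 256k slot only has 64 pages, so only part of that area is reachable.
slot_pages = (256 * 1024) // PAGE_SIZE
reachable = swap_status_bytes(slot_pages)
never_written = 1 - reachable / reserved

# Q4: swapping two ~40k images moves about 10 pages.
small_swap = swap_status_bytes(10)

print(reserved, slot_pages, reachable, never_written, small_swap)
```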

    Basically, how do we determine the size of the "Trailer Move Page / Swap Status Area" ?

    CONFIG_BOOT_MAX_IMG_SECTORS is how many sectors MCUboot will be able to move in the biggest slot (the slots should still be the same size). It has to be able to cover the biggest slot/partition that is involved in the image update.

    Here is what the swap status looks like: https://github.com/mcu-tools/mcuboot/blame/268968f8e4a834ea3825d98439e4c321de0c6a78/docs/design.md#L493

    If you set CONFIG_BOOT_MAX_IMG_SECTORS=2048, the swap status takes at most 24k; with a page size of 4k, that is enough to handle a slot of 8MiB.
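
    To relate a CONFIG_BOOT_MAX_IMG_SECTORS value to the slot size it can cover, something like the following can be used. It is a rough sketch under the same assumptions as this answer (4k pages, 4-byte write block, 3 states); the function names are hypothetical and not part of MCUboot.

```python
PAGE_SIZE = 4096      # bytes per flash page (4k, as in the example)
WRITE_BLOCK_SIZE = 4  # bytes per write block
SWAP_STATES = 3       # status entries per sector for swap move


def min_img_sectors(largest_slot_size, page_size=PAGE_SIZE):
    """Smallest sector count that still covers the largest slot."""
    return -(-largest_slot_size // page_size)  # integer ceiling division


def max_swap_status_size(max_img_sectors):
    """Upper bound on the swap status area, in bytes."""
    return max_img_sectors * SWAP_STATES * WRITE_BLOCK_SIZE


def max_slot_size(max_img_sectors, page_size=PAGE_SIZE):
    """Largest slot that the given sector count can describe, in bytes."""
    return max_img_sectors * page_size


print(max_swap_status_size(2048))          # bytes of swap status
print(max_slot_size(2048) // (1024 * 1024))  # slot size in MiB
print(min_img_sectors(256 * 1024))         # sectors needed for a 256k slot
```

    Going the other way with `min_img_sectors` gives the lowest setting that still covers a given slot, which avoids reserving RAM and trailer space for sectors that do not exist.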

  • Hi, thank you so much for your help with all of this. Just wanted to share where we ended up, in case anyone else is on the same path:

    We ended up adding a script to our build pipeline that calculates our flash usage of the maximum firmware-upgradeable image given the trailer size, etc.

     

    As discussed above, the west build reports flash usage of the primary partition, but staying below 100% alone doesn't ensure firmware upgradeability - because of the necessary space for the trailer and image swap page.

     

    So our new script takes that extra space into account, and analyzes files from the build/ directory in order to tell us the percentage used of the maximum firmware-upgradeable image size.

     

    Here is our final "Build Artifacts" diagram for “Swap Using Move Without Scratch” based on NCS v2.6.x (image not reproduced here).

    And here are the key calculations we use in our script for the app image:

     

    Maximum Image Size Calculations

     

    Block Size: 4 Bytes (nRF5340 write block size)

    Page Size: 4096 Bytes (nRF5340 page erase/write size)

     

    BOOT_MAX_IMG_SECTORS: This is the max number of "sectors" necessary for mcuboot to manage our image slots. Recommended to include some padding. If unnecessarily large, will use unnecessary RAM and slow the bootloader down (more time to zero out the RAM).

    BOOT_MAX_IMG_SECTORS >= ceil [ ( largest image slot size ) / ( page size ) ]

     

    Swap Status Region Size: Part of the trailer. Depends on BOOT_MAX_IMG_SECTORS and on which swap method you are using. For swap with move, we need 3 states, each in a discrete write block. See docs/design.md in mcu-tools/mcuboot at commit 268968f8e4a834ea3825d98439e4c321de0c6a78.

    Swap Status Region Size = BOOT_MAX_IMG_SECTORS * Block Size * s

    s (number of states for swap move without scratch) = 3

     

    Trailer Size: The part of the slot that MCUboot uses to keep track of the upgrade swap. Rounded up to the nearest page, where Page Size is 4096 bytes (determined by nRF5340 hardware).

    Raw Trailer Size = ( Swap Status Region ) + 80 bytes

    Raw Trailer Size = ( BOOT_MAX_IMG_SECTORS * Block Size ) * 3 + 80

    Trailer Size = round up to Page Size ( Raw Trailer Size )

     

    Maximum Upgradeable Image Size: The largest image that fits in the primary slot (internal flash) and is still firmware upgradeable. In the formula below, one page is subtracted because the swap needs one spare page in the primary slot for the move. See here: https://docs.nordicsemi.com/bundle/ncs-latest/page/mcuboot/design.html#swap_using_move_without_using_scratch

    Maximum Image Size = ( N - 1 ) * Page Size - Trailer Size

    Page Size = Sector Size = 4096

    N = floor( Primary Slot Size / Page Size ) = number of full pages in primary slot

    Trailer Size = round up to Page Size ( BOOT_MAX_IMG_SECTORS * Block Size * s + 80 )

    Primary Slot Flash Usage = 100 * Image Size / Maximum Image Size

    This can also be rearranged/described as:

    Necessary Primary Slot Size for a Given App Image:

    Primary Slot Size >= App Image Size + Trailer Size + 1 Move Page (Swap Sector Page)
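
    For anyone who wants to reproduce the numbers, here is a minimal sketch of these calculations. It is not our actual pipeline script: the function names are made up for illustration, the constants are the nRF5340 values listed above, and the slot size and BOOT_MAX_IMG_SECTORS in the example are arbitrary.

```python
PAGE_SIZE = 4096          # nRF5340 page erase size, in bytes
BLOCK_SIZE = 4            # nRF5340 write block size, in bytes
SWAP_MOVE_STATES = 3      # s, states for swap move without scratch
TRAILER_FIXED_BYTES = 80  # rest of the trailer besides the swap status


def round_up(value, multiple):
    """Round value up to the next multiple."""
    return -(-value // multiple) * multiple


def trailer_size(boot_max_img_sectors):
    """Trailer size in bytes, rounded up to a whole page."""
    raw = boot_max_img_sectors * BLOCK_SIZE * SWAP_MOVE_STATES + TRAILER_FIXED_BYTES
    return round_up(raw, PAGE_SIZE)


def max_image_size(primary_slot_size, boot_max_img_sectors):
    """Largest upgradeable image: (N - 1) pages minus the trailer."""
    n = primary_slot_size // PAGE_SIZE  # full pages in the primary slot
    return (n - 1) * PAGE_SIZE - trailer_size(boot_max_img_sectors)


def usage_percent(image_size, primary_slot_size, boot_max_img_sectors):
    """Image size as a percentage of the maximum upgradeable image size."""
    return 100 * image_size / max_image_size(primary_slot_size, boot_max_img_sectors)


# Example with made-up inputs: a 256 KiB primary slot, 128 sectors.
slot = 256 * 1024
limit = max_image_size(slot, 128)
print(limit, round(100 * limit / slot, 2))
print(round(usage_percent(200_000, slot, 128), 1))
```

    With these example inputs the trailer rounds up to a single page, so two of the 64 pages are unavailable to the image. An image can therefore show under 100% in the west build report and still be too large to upgrade.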

     

     

     

    And if you're curious, here's an example of our script's output at the end of a build:
