Direct-XIP MCUboot support seems to be missing from NCS/Zephyr

We are trying to implement Direct-XIP FOTA updates on an nRF52832 with NCS 2.2.0.

I've been able to get around an NCS build issue (see my previous ticket, "Direct-XIP MCUboot both images have the same version"), and I am now able to flash MCUboot plus the slot 0 image, but I cannot consistently update the device using the Device Manager Android app (version 1.6.0). Transferring an image to the second slot works, and the bootloader correctly loads it (if the version number is higher). However, the new image cannot be confirmed and gets deleted on the next reboot.

Some more digging suggests that NCS/Zephyr's img_mgmt just doesn't really support Direct-XIP.

Issues include the baked in assumption that slot0 is always running (which it isn't):

// img_mgmt_config.h
#define IMG_MGMT_BOOT_CURR_SLOT 0

// img_mgmt_state.c:93
	/* Slot 0 is always active. */
	/* XXX: The slot 0 assumption only holds when running from flash. */
	if (query_slot == IMG_MGMT_BOOT_CURR_SLOT) {
		flags |= IMG_MGMT_STATE_F_ACTIVE;
	}
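
For illustration, one way an image could work out which slot it is running from, instead of hard-coding 0, is to compare an address inside the running image against the slot boundaries. This is only a host-side sketch: `slot_region`, `slot_containing`, and the flash layout in `example_slots` are names and numbers I made up, not anything from NCS or MCUboot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one application slot in flash. */
struct slot_region {
    uint32_t start; /* first address of the slot */
    uint32_t size;  /* slot length in bytes */
};

/*
 * Return the index of the slot containing `addr`, or -1 if none does.
 * On a real target `addr` could be the address of a function in the
 * running image; here it is a parameter so the logic runs on a host.
 */
int slot_containing(const struct slot_region *slots, size_t count, uint32_t addr)
{
    assert(slots != NULL);
    for (size_t i = 0; i < count; i++) {
        /* Subtract rather than add so the bounds check cannot overflow. */
        if (addr >= slots[i].start && addr - slots[i].start < slots[i].size) {
            return (int)i;
        }
    }
    return -1;
}

/* Made-up layout: two 0x3A000-byte slots behind a 0xC000-byte bootloader. */
static const struct slot_region example_slots[] = {
    { 0x0000C000u, 0x0003A000u },
    { 0x00046000u, 0x0003A000u },
};
```

With a lookup like this, the "active" flag could follow the slot the code actually executes from rather than a compile-time constant.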

But the issues run deeper than img_mgmt. The application can't even mark itself as confirmed. As far as I can tell, boot_set_confirmed_multi(int image_index) in bootutil_public.c also assumes that the currently running application is in slot 0 and will set that one to confirmed, even if called from an app running in slot 1!
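
To make the consequence concrete, here is a toy model of the behavior I am seeing. It is not MCUboot's code: the `slot_state` fields, `model_boot`, and `model_confirm` are hypothetical, and the revert rule (an unconfirmed image is dropped the second time it would be booted) is my simplified reading of what happens on the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 2

/* Simplified, hypothetical per-slot state; not MCUboot's real trailer layout. */
struct slot_state {
    bool valid;       /* slot holds a bootable image */
    uint32_t version; /* higher version wins */
    bool confirmed;   /* image has been marked as permanent */
    bool tried;       /* image has already been booted once */
};

/* Pick the slot to run, dropping an unconfirmed image on its second boot. */
int model_boot(struct slot_state slots[NUM_SLOTS])
{
    for (;;) {
        int best = -1;
        for (int i = 0; i < NUM_SLOTS; i++) {
            if (slots[i].valid &&
                (best < 0 || slots[i].version > slots[best].version)) {
                best = i;
            }
        }
        if (best < 0) {
            return -1; /* nothing bootable */
        }
        if (!slots[best].confirmed && slots[best].tried) {
            slots[best].valid = false; /* revert: discard the unconfirmed image */
            continue;
        }
        slots[best].tried = true;
        return best;
    }
}

/* Confirm whichever slot the caller names. */
void model_confirm(struct slot_state slots[NUM_SLOTS], int slot)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    slots[slot].confirmed = true;
}
```

In this model, booting the newer image from slot 1 and then calling `model_confirm(slots, 0)` leaves slot 1 unconfirmed, so the next `model_boot` discards it and falls back to slot 0. Passing the slot returned by `model_boot` instead keeps the new image.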

Like I said in the previous ticket, the documentation is rather sparse. The table in https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/app_dev/bootloaders_and_dfu/index.html suggests that Direct-XIP is supported, and https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/mcuboot/readme-ncs.html suggests that all you have to do is enable CONFIG_BOOT_BUILD_DIRECT_XIP_VARIANT.

But judging by the MCUboot discussion about the lack of support in bootutil (https://github.com/mcu-tools/mcuboot/discussions/1474) and the "support XIP in zephyr" issue (https://github.com/zephyrproject-rtos/zephyr/issues/27673), a bunch of work is still required to make Direct-XIP work in practice.

So my question is:

Is Direct-XIP in NCS actually not yet supported, even though the documentation says it is? If it is not supported, can the documentation be updated to warn users like me, and what would the required steps be to support it? If I'm getting this totally wrong and it is well supported, how do I make the image sticky (confirmed) after a DFU?

I hope I'm wrong and haven't just put hours into trying to get something working that just isn't there yet.

  • Hi markuckermann,

    Thank you for another detailed finding. I will also compile this into my internal communication with our bootloader team.

    It looks like the assumption that Slot 0 is always active comes from the non-direct-XIP solution, where the active image is always swapped into the Primary slot, and thus "Slot 0" is always active.

    I wonder if there is an MCUboot Kconfig option that would result in different handling for Direct-XIP cases. In any case, the bootloader team will be able to tell me.

    I hope to hear from them tomorrow. At the latest, I will update you by the end of Monday Mar 6.

    I am sorry for the inconvenience you are experiencing.

    Hieu

  • Hi markuckermann, 

    Today another engineer looked at the findings you listed here and observed the same behavior. They will try to provide a fix. They also think the SMP Server might not work, but that is not confirmed yet.

    However, this engineer is on vacation, so there will not be significant progress for a while.

    Given the situation, I think we cannot expect Direct-XIP to work well in the near future. Thus, I would like to look into other directions instead.

    What is your future plan regarding updating your device? Is it going to be only bug fixes, or will there be feature changes?

    The thought did not cross my mind previously (my apologies for this), but currently your image is almost too big for an application slot. That means that your future application versions will have very little space margin.

    In this situation, I would like to recommend these alternatives instead:

    1. Use an SoC variant with more flash.
    2. Add an external flash.
       There is no official sample for DFU with the secondary image on external flash. However, this is being done successfully on the nRF5340 SoC and the nRF9160 SiP, so the risk of ending up in the same situation as with Direct-XIP is lower.

    I am very sorry for the inconvenience you are having with Direct-XIP and for being late with my follow-up.

    Hieu

  • Hi markuckermann,

    I don't have better news today, but I would like to relay some new information regarding Direct-XIP.

    The bootloader engineer I previously mentioned looked into the problem a little more today. Here are their comments on the current state of things:

    • The current conclusion is that Direct-XIP with revert (CONFIG_BOOT_DIRECT_XIP_REVERT) is overall not well supported in the Zephyr upstream. After that is fixed in the Zephyr upstream, NCS will also inherit the fix after an upmerge.

    • On the other hand, Direct-XIP without revert should work. However, as the SMP Server doesn't seem to be able to detect the active slot, an out-of-the-box complete DFU solution is also not available.

    Hieu

  • Hi Hieu,

    Yes, that's the conclusion I expected. I will try and squeeze the build size further instead.

  • Hi Mark,

    You can consider turning off any NCS features you are not using, if you haven't done so already. I was recently able to reduce a sample application's size by a lot more than 4 kB. You can find the details in this DevZone thread: Add mcuboot to connect SDK BLE sample "central and peripheral"

    In that thread, I have also just added some other alternatives for fitting a bigger application. To my knowledge, none of them is currently supported out of the box, and they likely require as much effort as, or a lot more than, Direct-XIP with revert, so perhaps they are not of interest to you.

    Hieu
