MCUboot warnings + app errors when using external flash

Disclaimer: I'm new to both the Nordic SDK and Zephyr in general, so I apologize in advance for any glaring mistakes.

Using SDK v2.1.2.
I'm trying to implement OTA DFU functionality, hence the need for a bootloader.
I'm using MCUboot (the version bundled with the SDK) for that purpose.
My Zephyr-based app image is too large (>250KB) to fit twice in the internal flash of the nRF52833 (512KB), so I have to place the secondary partition in external SPI-connected NOR flash (Winbond W25Q80DV).

I managed to build both MCUboot and the app, but when I load the merged file, I get warnings from MCUboot.

*** Booting Zephyr OS build v3.1.99-ncs1-1 ***
I: Starting bootloader
W: Failed reading sectors; BOOT_MAX_IMG_SECTORS=128 - too small?
W: Cannot upgrade: not a compatible amount of sectors
I: Bootloader chainload address offset: 0x10000
*** Booting Zephyr OS build v3.1.99-ncs1-1 ***
Starting app...
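
As a rough sanity check on the "BOOT_MAX_IMG_SECTORS=128 - too small?" hint, the sector count of a slot can be estimated by dividing the slot size by the erase-sector size. The sketch below is only an estimate under stated assumptions: the whole internal flash after the 64KB bootloader (offset 0x10000 in the log) is treated as the primary slot, and 4KB sectors are assumed for both the nRF52833 internal flash and the W25Q80DV. The real Partition Manager layout may reserve more space, and the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; check the generated partition report. */
#define INTERNAL_FLASH_SIZE 0x80000u /* nRF52833: 512KB */
#define MCUBOOT_SIZE        0x10000u /* 64KB bootloader partition */
#define SECTOR_SIZE         0x1000u  /* 4KB erase sector (assumed for both flashes) */
#define MAX_IMG_SECTORS     128u     /* value printed in the MCUboot warning */

/* Number of erase sectors needed to cover a slot, rounded up. */
static uint32_t sectors_needed(uint32_t slot_size, uint32_t sector_size)
{
    return (slot_size + sector_size - 1u) / sector_size;
}

/* True if the slot can be described with at most max_sectors sectors. */
static bool slot_fits(uint32_t slot_size, uint32_t sector_size,
                      uint32_t max_sectors)
{
    return sectors_needed(slot_size, sector_size) <= max_sectors;
}
```

With these assumptions, `sectors_needed(INTERNAL_FLASH_SIZE - MCUBOOT_SIZE, SECTOR_SIZE)` gives 112, which is below 128, so the sector limit itself is probably not what triggers the warning.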

Upon further inspection, the chain of failing calls is as follows:
boot_initialize_area --> flash_area_get_sectors --> flash_area_layout --> flash_area_open --> get_flash_area_from_id

static inline struct flash_area const *get_flash_area_from_id(int idx)
{
    for (int i = 0; i < flash_map_entries; i++) {
        if (flash_map[i].fa_id == idx) {
            return &flash_map[i];
        }
    }

    return NULL; /* <-- the lookup ends up here */
}
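
One way to picture the failure: if the secondary slot never makes it into the flash map that the bootloader is built with, the lookup by ID falls through to the final `return NULL`, and every caller up the chain fails. The sketch below is a self-contained mock, not the MCUboot or Zephyr source. The struct, the area IDs, the offsets, and the `mock_` functions are all hypothetical and chosen only to mirror the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a flash map entry (hypothetical layout). */
struct mock_flash_area {
    uint8_t  fa_id;
    uint32_t fa_off;
    uint32_t fa_size;
};

/* Hypothetical area IDs, for illustration only. */
enum { MOCK_ID_MCUBOOT = 0, MOCK_ID_PRIMARY = 1, MOCK_ID_SECONDARY = 2 };

/* A map that only knows about internal flash: no secondary slot entry. */
static const struct mock_flash_area mock_map_internal_only[] = {
    { MOCK_ID_MCUBOOT, 0x00000u, 0x10000u },
    { MOCK_ID_PRIMARY, 0x10000u, 0x70000u },
};
#define MOCK_MAP_LEN \
    (sizeof(mock_map_internal_only) / sizeof(mock_map_internal_only[0]))

/* Same linear search as the snippet above, over the mock map. */
static const struct mock_flash_area *mock_lookup(
    const struct mock_flash_area *map, size_t n, int id)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].fa_id == id) {
            return &map[i];
        }
    }
    return NULL; /* ID not present in the map */
}

/* Open-style wrapper: 0 on success, -1 if the area is unknown. */
static int mock_open(int id, const struct mock_flash_area **out)
{
    const struct mock_flash_area *fa =
        mock_lookup(mock_map_internal_only, MOCK_MAP_LEN, id);

    if (fa == NULL) {
        return -1;
    }
    *out = fa;
    return 0;
}
```

In this mock, opening `MOCK_ID_PRIMARY` succeeds while opening `MOCK_ID_SECONDARY` fails, which matches the symptom of the sector query failing for one slot only.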

After the warnings, MCUboot proceeds to load the app, but the app misbehaves, which it doesn't do when loaded standalone or when the bootloader is built without the external flash configuration.
The misbehavior is strangely confined to the BLE stack (my app implements a BLE peripheral).

E: Too big advertising data
Failed to set advertising data (-22)
W: opcode 0x2039 status 0x42
E: Failed to start advertiser
Failed to start advertising set (-5)
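
For reading the log above: an HCI opcode packs a 6-bit group field (OGF) in the upper bits and a 10-bit command field (OCF) in the lower bits, so `0x2039` can be split by hand. The sketch below only shows that bit arithmetic; the function names are made up. As far as I can tell, OGF 0x08 is the LE controller command group and OCF 0x39 is LE Set Extended Advertising Enable, and status 0x42 is "Unknown Advertising Identifier", but please verify these against the Bluetooth Core Specification. The return codes -22 and -5 are EINVAL and EIO in Zephyr's errno numbering.

```c
#include <stdint.h>

/* Upper 6 bits of a 16-bit HCI opcode: the Opcode Group Field. */
static uint8_t hci_ogf(uint16_t opcode)
{
    return (uint8_t)(opcode >> 10);
}

/* Lower 10 bits of a 16-bit HCI opcode: the Opcode Command Field. */
static uint16_t hci_ocf(uint16_t opcode)
{
    return (uint16_t)(opcode & 0x03FFu);
}
```

For the opcode in the log, `hci_ogf(0x2039)` is 0x08 and `hci_ocf(0x2039)` is 0x39. If that reading is right, the controller is being asked to enable an advertising set it does not know about, which would fit a set that was never successfully created.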


Another piece of information: I tried
CONFIG_SINGLE_APPLICATION_SLOT=y
in the bootloader's config overlay file, which eliminates the need to configure external flash in the device tree and config.
That makes both MCUboot and the app happy, but of course it is not an acceptable workaround, since I need two images for OTA DFU.

Any advice on what I am doing wrong with the external flash configuration (or in general) would be greatly appreciated.

My prj.conf
4628.prj.conf

Main device tree
myboard.dts

Device tree overlay (used by both app and bootloader)
myboard_uart.overlay

Bootloader config overlay
bootloader_uart.conf

Build command line

west build --build-dir $BUILD_DIR $TOP_DIR --pristine --board 32-00008618 -- -DUSE_PARTITION_MANAGER=1 -DNCS_TOOLCHAIN_VERSION:STRING="NONE" -DCONFIG_SIZE_OPTIMIZATIONS=y -DCONFIG_DEBUG_THREAD_INFO=y -DCONF_FILE:STRING="${TOP_DIR}/prj.conf" -DBOARD_ROOT:STRING="$TOP_DIR" -DDTC_OVERLAY_FILE:STRING="${TOP_DIR}/myboard_uart.overlay" -DOVERLAY_CONFIG:STRING="${TOP_DIR}/factory_mode.conf" -Dmcuboot_OVERLAY_CONFIG="${TOP_DIR}/bootloader_uart.conf" -Dmcuboot_DTC_OVERLAY_FILE="${TOP_DIR}/myboard_uart.overlay"

Parents
  • I added
    CONFIG_PM_OVERRIDE_EXTERNAL_DRIVER_CHECK=y
    to both the main prj.conf and the bootloader config overlay file.
    That took care of the bootloader warnings.

    MCUboot seems happy now and launches the app from the primary slot with no warnings:

    *** Booting Zephyr OS build v3.1.99-ncs1-1 ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: none
    I: Bootloader chainload address offset: 0x10000
    *** Booting Zephyr OS build v3.1.99-ncs1-1 ***
    Starting Cyclone app...

    The app is still failing in the BLE code and crashes almost immediately:

    Main thread started
    E: Too big advertising data
    Failed to set advertising data (-22)
    E: ***** USAGE FAULT *****
    E: Unaligned memory access
    E: r0/a1: 0x00000030 r1/a2: 0x00000000 r2/a3: 0x20001bf8
    E: r3/a4: 0x00002125 r12/ip: 0x00000000 r14/lr: 0x0003dddb
    E: xpsr: 0x81000200
    E: s[ 0]: 0x00000000 s[ 1]: 0x00000000 s[ 2]: 0x00000000 s[ 3]: 0x00000000
    E: s[ 4]: 0x00000000 s[ 5]: 0x00000000 s[ 6]: 0x00000000 s[ 7]: 0x00000000
    E: s[ 8]: 0x00000000 s[ 9]: 0x00000000 s[10]: 0x00000000 s[11]: 0x00000000
    E: s[12]: 0x00000000 s[13]: 0x00000000 s[14]: 0x00000000 s[15]: 0x00000000
    E: fpscr: 0x00000000
    E: Faulting instruction address (r15/pc): 0x0003dbd0
    E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
    E: Current thread: 0x20002d40 (sysworkq)
    *** Booting Zephyr OS build v3.1.99-ncs1-1 ***

    The PC points to the kernel's remove_timeout().

    I'll debug this tomorrow

Children
  • Hi,

    Alex_S@Aperia said:
    1. Partition manager report:

    This looks fine 

    Alex_S@Aperia said:
    2. Yes, without MCUboot/DFU support the application runs as expected.

    Thank you for clarifying

    Alex_S@Aperia said:
    3. My firmware (as a standalone Zephyr-based app) runs fine on my board, which has an nRF52833 with sufficient ROM (internal flash) for a single application image.

    Noted

    Alex_S@Aperia said:
    To clarify: I'm not accessing external flash yet, just configuring the bootloader for future use of it as the secondary partition during OTA DFU.
    Right now, I'm simply invoking MCUboot to start my app from the primary partition in internal flash.

    Noted

    Alex_S@Aperia said:
    What I'm not sure about is how to make this call chain
    boot_initialize_area --> flash_area_get_sectors --> flash_area_layout --> flash_area_open --> get_flash_area_from_id
    aware that the secondary slot is placed in external flash and therefore will not be included in the internal flash map.

    The call sequence should be set up more or less automatically if you're using any of the FW update methods present in the SDK. I believe this may be a distraction, and that fixing the external flash setup should resolve it.

    Alex_S@Aperia said:
    default_driver_kconfig: CONFIG_NORDIC_QSPI_NOR <----------------------

    QSPI is the default external flash configuration for all nRF DKs except the nRF91 DKs, and it is chosen when you add your external flash device. The overlay file in the serial_lte_modem application, https://github.com/nrfconnect/sdk-nrf/blob/v2.5.1/applications/serial_lte_modem/boards/nrf9160dk_nrf9160_ns.overlay, shows how to add external flash over SPI instead of QSPI.

    ---

    To summarize: I think the external flash device setup may be causing most of the issues, so I suggest we focus on making sure this device is properly set up over SPI before we move on to debugging the rest.

    Kind regards,
    Andreas

  • Thank you, Andreas!

    Apologies for duplicating my other response (I think I messed up the Q/A sequence in the ticket).

    I think I "fixed" the external flash setup by adding
    CONFIG_PM_OVERRIDE_EXTERNAL_DRIVER_CHECK=y
    (I added it in both prj.conf and the bootloader overlay config).
    I found a couple of places in the SDK documentation that indicate that this flag is exactly for the case where the external flash is NOT QSPI, see
    nrf/scripts/partition_manager/partition_manager.rst
    nrf/doc/nrf/releases/release-notes-2.1.0.rst
    (my SDK version is 2.1.2)

    After this, MCUboot launches the app without any warnings:

    *** Booting Zephyr OS build v3.1.99-ncs1-1 ***
    I: Starting bootloader
    I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Secondary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
    I: Boot source: none
    I: Swap type: none
    I: Bootloader chainload address offset: 0x10000
    *** Booting Zephyr OS build v3.1.99-ncs1-1 ***
    Starting Cyclone app...

    The problem now is that the app fails to set up the BT stack and crashes, which it doesn't do when executed standalone, without the bootloader.

    I: SoftDevice Controller build revision:
    I: 29 5c 92 f1 36 81 92 d1 |)\..6...
    I: b7 a9 f0 f1 99 e9 4c 19 |......L.
    I: 1f 23 83 4a |.#.J
    I: HW Platform: Nordic Semiconductor (0x0002)
    I: HW Variant: nRF52x (0x0002)
    I: Firmware: Standard Bluetooth controller (0x00) Version 41.37468 Build 2457941745
    I: Identity: FA:40:F4:1D:3F:D5 (random)
    I: HCI: version 5.3 (0x0c) revision 0x11d8, manufacturer 0x0059
    I: LMP: version 5.3 (0x0c) subver 0x11d8
    Failed to create advertiser set (-22)
    Failed to set-up connectable BLEble_init() failed
    Set Tx power level to 4
    Actual Tx Power: 4
    NFC Sense started
    UART init
    Starting pump main loop
    Pump main loop started
    E: Too big advertising data
    Failed to set advertising data (-22)
    E: ***** USAGE FAULT *****
    E: Unaligned memory access
    E: r0/a1: 0x00000030 r1/a2: 0x00000000 r2/a3: 0x20001bf8
    E: r3/a4: 0x00002125 r12/ip: 0x00000000 r14/lr: 0x0003dddb
    E: xpsr: 0x81000200
    E: s[ 0]: 0x00000000 s[ 1]: 0x00000000 s[ 2]: 0x00000000 s[ 3]: 0x00000000
    E: s[ 4]: 0x00000000 s[ 5]: 0x00000000 s[ 6]: 0x00000000 s[ 7]: 0x00000000
    E: s[ 8]: 0x00000000 s[ 9]: 0x00000000 s[10]: 0x00000000 s[11]: 0x00000000
    E: s[12]: 0x00000000 s[13]: 0x00000000 s[14]: 0x00000000 s[15]: 0x00000000
    E: fpscr: 0x00000000
    E: Faulting instruction address (r15/pc): 0x0003dbd0
    E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
    E: Current thread: 0x20002d40 (sysworkq)

    The only difference from the app's perspective between the working and non-working cases is that with the bootloader it is linked at an offset of 64KB (this is how much I allocate for the bootloader):
    CONFIG_PM_PARTITION_SIZE_MCUBOOT=0x10000
    The unaligned memory access seems like a clue...

  • Solved the BLE stack problem; it was due to a bug in my app invoking the BT stack APIs incorrectly.

  • Hi,

    Alex_S@Aperia said:
    After this MCUboot launches app without any warnings:

    That's great! It seems this removed one variable from the investigation.

    Alex_S@Aperia said:
    E: ***** USAGE FAULT *****
    E: Unaligned memory access

    As you've stated

    Alex_S@Aperia said:
    Solved the BLE stack problem; it was due to a bug in my app invoking the BT stack APIs incorrectly.

    in your latest reply, this is typically thrown when you're attempting to access something outside the memory range or at an unexpected memory location, which could be caused by invoking the BT stack APIs incorrectly due to the partition change caused by adding the bootloader.

    Nonetheless, I'm glad to hear that you were able to resolve the issue, and apologies for not checking in directly after your previous reply; I was unfortunately caught up with other business.

    Let me know if you have any unanswered questions remaining in this case and I'll have a look! :) If not, I'll mark it as resolved, and you can, as always, feel free to open new cases if you have other questions in the future.

    Kind regards,
    Andreas
