Trouble migrating from NCS 2.6 to 2.8

Due to issues with the wifi stack, I have to move my project from 2.6 to 2.8.

I downloaded the new toolchain (toolchains\cf2149caf2) and nRF Connect SDK v2.8.0 as per

https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/installation/updating.html

I read the migration guides for 2.7 and 2.8, and the 'using sysbuild for multi-image builds' guides:

https://docs.nordicsemi.com/bundle/ncs-latest/page/zephyr/build/sysbuild/index.html#sysbuild

https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/releases_and_maturity/migration/migration_sysbuild.html

https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/releases_and_maturity/migration/migration_guide_2.8.html

Mostly this gave me a headache.

I created sysbuild.conf (at the root where I run west - is this right?), and filled in the settings from the migration guides for mcuboot and nrf70 support:
# MCU boot config now handled by sysbuild
SB_CONFIG_BOOTLOADER_MCUBOOT=y
SB_CONFIG_MCUBOOT_BOOTLOADER_MODE_SWAP_WITHOUT_SCRATCH=y
SB_CONFIG_MCUBOOT_GENERATE_UNSIGNED_IMAGE=y

SB_CONFIG_NETCORE_HCI_IPC=y
SB_CONFIG_NETCORE_APP_UPDATE=y
SB_CONFIG_BOOT_SIGNATURE_TYPE_ECDSA_P256=y
SB_CONFIG_BOOT_SIGNATURE_KEY_FILE="/work/dev/if-device-nrf53/keys/bootloader_priv-ecdsa256.pem"
#CONFIG_MCUBOOT_SIGNATURE_KEY_FILE="/work/dev/if-device-nrf53/keys/bootloader_priv-ecdsa256.pem"

SB_CONFIG_MCUBOOT_BOOTLOADER_MODE_SWAP_WITHOUT_SCRATCH=y

# nrf7002 firmware handled by sysbuild
SB_CONFIG_WIFI_NRF70=y
SB_CONFIG_WIFI_NRF70_SYSTEM_MODE=y
#SB_CONFIG_WIFI_PATCHES_EXT_FLASH_XIP=y     later

# build of dfu packages
SB_CONFIG_DFU_MULTI_IMAGE_PACKAGE_BUILD=y
SB_CONFIG_DFU_MULTI_IMAGE_PACKAGE_APP=y
SB_CONFIG_DFU_MULTI_IMAGE_PACKAGE_NET=y
SB_CONFIG_DFU_MULTI_IMAGE_PACKAGE_WIFI_FW_PATCH=y

SB_CONFIG_DFU_ZIP=y
SB_CONFIG_DFU_ZIP_APP=y
SB_CONFIG_DFU_ZIP_NET=y
SB_CONFIG_DFU_ZIP_WIFI_FW_PATCH=y

# put slot2 in external flash
SB_CONFIG_PARTITION_MANAGER=y
SB_CONFIG_PM_MCUBOOT_PAD=y
SB_CONFIG_PM_EXTERNAL_FLASH_MCUBOOT_SECONDARY=y
I updated my prj.conf for the changes to the nrf70 config option names.
I created sysbuild/mcuboot and copied in, as prj.conf, the previous child_image/mcuboot.conf.
There was already a sysbuild/hci_ipc/prj.conf; I updated it with the changes I had made to child_image/hci_ipc.conf.

Then I set west to use my board and sysbuild, and tried to build:

> west config build.board cc1medv1_nrf5340_cpuapp
> west config build.sysbuild True
> west build --build-dir cc1-med/build cc1-med --board cc1medv1_nrf5340_cpuapp --pristine -DBOARD_ROOT=%CD%
This resulted in CMake failing:
-- west build: making build dir C:\work\dev\if-device-nrf53\cc1-med\build pristine
-- west build: generating a build system
Loading Zephyr module(s) (Zephyr base): sysbuild_default
-- Found Python3: C:/ncs/toolchains/2d382dcd92/opt/bin/python.exe (found suitable version "3.12.4", minimum required is "3.8") found components: Interpreter
-- Cache files will be written to: C:/ncs/v2.8.0/zephyr/.cache
-- Found west (found suitable version "1.2.0", minimum required is "0.14.0")
-- Board: cc1medv1_nrf5340_cpuapp
Parsing C:/work/dev/if-device-nrf53/cc1-med/Kconfig.sysbuild
Loaded configuration 'C:/work/dev/if-device-nrf53/cc1-med/build/_sysbuild/empty.conf'
Merged configuration 'C:/work/dev/if-device-nrf53/cc1-med/build/_sysbuild/empty.conf'
Configuration saved to 'C:/work/dev/if-device-nrf53/cc1-med/build/zephyr/.config'
Kconfig header saved to 'C:/work/dev/if-device-nrf53/cc1-med/build/_sysbuild/autoconf.h'
CMake Error at C:/ncs/v2.8.0/nrf/sysbuild/CMakeLists.txt:117 (list):
  list GET given empty list
Call Stack (most recent call first):
  cmake/modules/sysbuild_extensions.cmake:583 (nrf_PRE_CMAKE)
  cmake/modules/sysbuild_extensions.cmake:583 (cmake_language)
  cmake/modules/sysbuild_images.cmake:16 (sysbuild_module_call)
  cmake/modules/sysbuild_default.cmake:20 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:75 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:92 (include_boilerplate)
  C:/ncs/v2.8.0/zephyr/share/sysbuild-package/cmake/SysbuildConfig.cmake:8 (include)
  template/CMakeLists.txt:10 (find_package)


--
   *****************************
   * Running CMake for cc1-med *
   *****************************

Loading Zephyr default modules (Zephyr base).
-- Application: C:/work/dev/if-device-nrf53/cc1-med
-- CMake version: 3.21.0
-- Using NCS Toolchain 2.8.20241106.194216054162 for building. (C:/ncs/toolchains/2d382dcd92/cmake)
-- Found Python3: C:/ncs/toolchains/2d382dcd92/opt/bin/python.exe (found suitable version "3.12.4", minimum required is "3.8") found components: Interpreter
-- Cache files will be written to: C:/ncs/v2.8.0/zephyr/.cache
-- Zephyr version: 3.7.99 (C:/ncs/v2.8.0/zephyr)
-- Found west (found suitable version "1.2.0", minimum required is "0.14.0")
-- Board: cc1medv1_nrf5340_cpuapp
-- Found host-tools: zephyr 0.16.8 (C:/ncs/toolchains/2d382dcd92/opt/zephyr-sdk)
-- Found toolchain: zephyr 0.16.8 (C:/ncs/toolchains/2d382dcd92/opt/zephyr-sdk)
-- Found Dtc: C:/ncs/toolchains/2d382dcd92/opt/bin/dtc.exe (found suitable version "1.4.7", minimum required is "1.4.6")

-- Found BOARD.dts: C:/work/dev/if-device-nrf53/boards/arm/cc1medv1_nrf5340/cc1medv1_nrf5340_cpuapp.dts
'label' is marked as deprecated in 'properties:' in C:/ncs/v2.8.0/zephyr/dts/bindings\audio\nordic,nrf-pdm.yaml for node /soc/peripheral@50000000/pdm@26000.
devicetree error: gpio controller <Node /soc/peripheral@50000000/i2c@9000/mcp23017@20 in 'C:/ncs/v2.8.0/zephyr/misc/empty_file.c'> for <Node /soc/peripheral@50000000/pdm@26000/en in 'C:/ncs/v2.8.0/zephyr/misc/empty_file.c'> lacks binding
CMake Error at C:/ncs/v2.8.0/zephyr/cmake/modules/dts.cmake:295 (execute_process):
  execute_process failed command indexes:

    1: "Child return code: 1"

Call Stack (most recent call first):
  C:/ncs/v2.8.0/zephyr/cmake/modules/zephyr_default.cmake:133 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:66 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:92 (include_boilerplate)
  CMakeLists.txt:8 (find_package)
-- Configuring incomplete, errors occurred!


CMake Error at cmake/modules/sysbuild_extensions.cmake:514 (message):
  CMake configure failed for Zephyr project: cc1-med

  Location: C:/work/dev/if-device-nrf53/cc1-med
Call Stack (most recent call first):
  cmake/modules/sysbuild_images.cmake:20 (ExternalZephyrProject_Cmake)
  cmake/modules/sysbuild_default.cmake:20 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:75 (include)
  C:/ncs/v2.8.0/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:92 (include_boilerplate)
  C:/ncs/v2.8.0/zephyr/share/sysbuild-package/cmake/SysbuildConfig.cmake:8 (include)
  template/CMakeLists.txt:10 (find_package)


-- Configuring incomplete, errors occurred!
See also "C:/work/dev/if-device-nrf53/cc1-med/build/CMakeFiles/CMakeOutput.log".
FATAL ERROR: command exited with status 1: 'C:\ncs\toolchains\2d382dcd92\opt\bin\cmake.EXE' -DWEST_PYTHON=C:/ncs/toolchains/2d382dcd92/opt/bin/python.exe '-BC:\work\dev\if-device-nrf53\cc1-med\build' -GNinja -DBOARD=cc1medv1_nrf5340_cpuapp '-DBOARD_ROOT=C:\work\dev\if-device-nrf53' '-SC:\ncs\v2.8.0\zephyr\share\sysbuild' '-DAPP_DIR:PATH=C:\work\dev\if-device-nrf53\cc1-med'
The DTS now also seems to fail (it didn't with 2.6) with:
devicetree error: gpio controller <Node /soc/peripheral@50000000/i2c@9000/mcp23017@20 in 'C:/ncs/v2.8.0/zephyr/misc/empty_file.c'> for <Node /soc/peripheral@50000000/pdm@26000/en in 'C:/ncs/v2.8.0/zephyr/misc/empty_file.c'> lacks binding
 
Now I'm stuck. I'll look at the DTS issue tomorrow. I think I already knew that the DTS handling had changed between 2.6 and 2.8, but the migration guide doesn't mention it as far as I can see?
  • So, regarding the build failing at the CMake configure step: it is due to this error (in build/cc1-med/CMakeFiles/CMakeError.log):

    Is it failing to link a CMake tool?

    c:/ncs/toolchains/2d382dcd92/opt/zephyr-sdk/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld.exe: c:/ncs/toolchains/2d382dcd92/opt/zephyr-sdk/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/lib\libc.a(lib_a-exit.o): in function `exit':
    exit.c:(.text.exit+0x34): undefined reference to `_exit'
    collect2.exe: error: ld returned 1 exit status

    It seems this is due to having this in prj.conf:

    CONFIG_NRF_WIFI_PATCHES_BUILTIN=y
    WTF is pretty much all I can say.
    It is very frustrating to find this kind of issue, which has cost me 2 days to hunt down...
    Let's see if any of the wifi stuff works when it finally builds...
  • Just saw your latest reply, so reposting.

    BrianW said:

    It seems this is due to having this in prj.conf:

    CONFIG_NRF_WIFI_PATCHES_BUILTIN=y

    Ah, I'll use this sample as a comparison basis myself then, since it might be more relevant to the audio sample.

    One thing that might explain the errors you saw is using the regular build or "build with pristine" actions in the extension. Sometimes these configure or build with artifacts from your previous build remaining (I can't quite explain it, but the short version is that "pristine" is not the same as "delete the previous build and build fresh with new configurations"). As a sanity check for these types of errors, I typically remove the build folder before building the project.

    BrianW said:
    CONFIG_NRF_WIFI_PATCHES_BUILTIN=y

    I'm not 100% sure, but this might be a child image/sysbuild rooted issue, i.e. that CONFIG_DISK_DRIVER_FLASH should be present within mcuboot.conf as well as within prj.conf

    It could also be that something is missing w.r.t. how the flash device is listed: https://docs.nordicsemi.com/bundle/ncs-latest/page/zephyr/services/storage/disk/access.html (I'm just throwing a link at the problem that I've seen elsewhere for somewhat similar error messages, and I've not verified that it works. Hopefully it is more than a wild guess, but let me know if it was helpful.)

    BrianW said:
    We'll see if any of this actually works once I get the build to fully complete!

    Glad to hear that at least those two warnings were resolved!

    BrianW said:
    CONFIG_NRF_WIFI_PATCHES_BUILTIN=y

    I can't quite see why the Kconfig would cause configuration issues, but if I had to guess, it might be due to the build not knowing where to place the files in the memory map. This config is not present in wifi/sta by default? https://github.com/nrfconnect/sdk-nrf/blob/main/samples/wifi/sta/prj.conf

    As mentioned I will be out of office for a while now, and I wish you good luck if you'll keep working until I'm back. I will pick up the case(s) on the 6th of January.

    Kind regards,
    Andreas

  • I'm not 100% sure, but this might be a child image/sysbuild rooted issue, i.e. that CONFIG_DISK_DRIVER_FLASH should be present within mcuboot.conf as well as within prj.conf

    Well, currently I'm just building in child image mode (I know the sysbuild build is breaking with a failure to find some partition stuff anyway).

    And mcuboot seems to build ok for the CPUNET image, and should not access the external flash as a 'disk' anyway.

    So I'm pretty sure it's because the DTC is not generating the DT_HAS_ZEPHYR_FLASH_DISK_ENABLED symbol.

    I put in the DTS:

    &mx25r64 {
        partitions {
            compatible = "fixed-partitions";
            #address-cells = <1>;
            #size-cells = <1>;

            fatfs_partition: partition@0 {
                label = "storage";
                reg = <0x00000000 0x00600000>;
            };
        };
    };

    / {
        msc_disk0 {
            compatible = "zephyr,flash-disk";
            partition = <&fatfs_partition>;
            disk-name = "NAND";
            cache-size = <4096>;
        };
    };
    Which appears to have fixed it! (To be tested once I have an actually running image.)
    So now the build breaks because:
    1/ the nrfx I2S API has been updated with breaking changes (to v3.7.0?)
    2/ the nrfx PDM API has been updated with breaking changes...
    Andreas said:
    As mentioned I will be out of office for a while now, and I wish you good luck if you'll keep working until I'm back. I will pick up the case(s) on the 6th of January.

    Well, I hope the entire support team isn't just you? Some of us have projects to deliver, and so far this migration has cost me a full week and it's not done yet...

  • Update:

     - updated app code to deal with I2S and PDM API breaking changes

     - updated BLE code to deal with removal of bt_read_static_address()

     The build now completes in child_image mode (sysbuild is for another day).

    Subsequently testing the generated image:

     - increased system heap size from 70kB to 100kB to deal with increased malloc demands (probably wpa_supplicant)

     - increased sysworkq stack size from 4kB to 8kB to avoid stack overflow

     - disabled BLE as this crashes when calling bt_enable(NULL).

    [00:01:32.781,463] <err> os: ***** BUS FAULT *****
    [00:01:32.786,865] <err> os:   Precise data bus error
    [00:01:32.792,602] <err> os:   BFAR Address: 0xb722ca38
    [00:01:32.798,522] <err> os: r0/a1:  0xb722c9b4  r1/a2:  0x3fa470b8  r2/a3:  0x6a22f762
    [00:01:32.807,189] <err> os: r3/a4:  0x0002e335 r12/ip:  0x01010101 r14/lr:  0x0002e35d
    [00:01:32.815,826] <err> os:  xpsr:  0x21000200
    [00:01:32.821,044] <err> os: Faulting instruction address (r15/pc): 0x00075c16
    [00:01:32.828,948] <err> os: >>> ZEPHYR FATAL ERROR 25: Unknown error on CPU 0
    [00:01:32.836,822] <err> os: Current thread: 0x20007ba0 (sysworkq)
    [00:01:32.843,688] <err> os: Halting system
    

    I cannot find what causes this yet; the debugger shows it's in hci_init(). This worked fine on 2.6. I removed all the shared RAM/HCI partition configuration, but it made no difference. I have also tried updating the CPU-NET image with the new build in case the old one was not compatible, but again no difference. (I'm hoping this is not necessary, as it would make DFU for my existing devices hard: I hadn't got CPU-NET updates to work with mcuboot yet.)

    Any ideas?

    - got wifi to run (but not sure if it's working properly as I can't configure it) after changing the DTS setup

     - now stuck with external flash access (as FAT32 filesystem) as this happens:

    [00:00:09.943,542] <err> qspi_nor: nRF5340 anomaly 159 conditions detected
    [00:00:09.951,019] <err> qspi_nor: Set the CPU clock to 64 MHz before starting QSPI operation

    and then the filesystem access (ELM) is broken.

    This appears to be a 'fix' for anomaly 159 (which I was not aware of experiencing). Why does this break FS access? (which stops the app loading the wifi config to test WPA connection!)

    So now I have 2 blocking runtime issues (bt_enable() crash, QSPI 'fix' breaks file system).

  • now stuck with external flash access (as FAT32 filesystem) as this happens:

    [00:00:09.943,542] <err> qspi_nor: nRF5340 anomaly 159 conditions detected
    [00:00:09.951,019] <err> qspi_nor: Set the CPU clock to 64 MHz before starting QSPI operation

    and then the filesystem access (ELM) is broken.

    This appears to be a 'fix' for anomaly 159 (which I was not aware of experiencing). Why does this break FS access? (which stops the app loading the wifi config to test WPA connection!)

    Digging into this, the log comes from zephyr/drivers/flash/nrf_qspi_nor.c (which uses modules/hal/nordic/nrfx/drivers/src/nrfx_qspi.c). The log is actually emitted just where the code translates the underlying NRFX_ERROR_FORBIDDEN error into -ECANCELED:

    #if NRF53_ERRATA_159_ENABLE_WORKAROUND
        case NRFX_ERROR_FORBIDDEN:
            LOG_ERR("nRF5340 anomaly 159 conditions detected");
            LOG_ERR("Set the CPU clock to 64 MHz before starting QSPI operation");
            return -ECANCELED;
    #endif
    The code in nrf_qspi_nor.c deals with forcing the first condition for the anomaly 159 workaround (HCLK_192M divider should be 0), but not the second (CPU clock must be 64 MHz, i.e. divider set to 1). And this is what is detected in nrfx_qspi.c:
    static bool qspi_errata_159_conditions_check(void)
    {
    #if NRF_CLOCK_HAS_HFCLK192M && NRF53_ERRATA_159_ENABLE_WORKAROUND
        if ((nrf_clock_hfclk192m_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_1) ||
            (nrf_clock_hfclk_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_2))
        {
            return true;
        }
        else
    #endif
        {
            return false;
        }
    }
    Given that the QSPI flash driver already changes the HCLK192M divider before and after each access, why doesn't it also change the CPU clock to avoid the issue? Instead it just logs that it detected the condition, and that's why your app is stuffed...
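
    To make the question concrete, here is a minimal host-side sketch of the behaviour being asked for: save both dividers, force the values the quoted check accepts, run the QSPI operation, then restore. It is only a model under stated assumptions, not the driver's code. The names clk_div_t, clock_model_t, run_with_safe_clocks() and fake_qspi_read() are hypothetical and do not exist in nrfx or Zephyr; only the condition itself mirrors the quoted check. On real hardware the divider changes would have to go through the nrfx clock HAL, and whether changing the CPU clock around every flash access is acceptable for a given application is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the nrfx divider enums (not the real nrfx types). */
typedef enum { CLK_DIV_1 = 0, CLK_DIV_2 = 1, CLK_DIV_4 = 2 } clk_div_t;

/* Hypothetical model of the two divider settings that the quoted check reads. */
typedef struct {
    clk_div_t hfclk192m_div; /* divider of the clock that feeds QSPI */
    clk_div_t hfclk_div;     /* CPU clock divider: DIV_1 = 128 MHz, DIV_2 = 64 MHz */
} clock_model_t;

/* Same condition as the quoted check: true means "anomaly 159 conditions present". */
static bool errata_159_conditions(const clock_model_t *clk)
{
    return (clk->hfclk192m_div != CLK_DIV_1) || (clk->hfclk_div != CLK_DIV_2);
}

/* Signature of a QSPI operation in this model; returns 0 on success. */
typedef int (*qspi_op_t)(const clock_model_t *clk, void *ctx);

/* Save both dividers, force the values the check accepts, run the
 * operation, then restore the caller's clock setup.
 */
static int run_with_safe_clocks(clock_model_t *clk, qspi_op_t op, void *ctx)
{
    assert(clk != NULL && op != NULL);

    const clock_model_t saved = *clk;

    clk->hfclk192m_div = CLK_DIV_1;
    clk->hfclk_div = CLK_DIV_2;

    int ret = op(clk, ctx);

    *clk = saved;
    return ret;
}

/* Stand-in for a flash read: fails with -1 (playing the role of
 * -ECANCELED) whenever the anomaly conditions are detected.
 */
static int fake_qspi_read(const clock_model_t *clk, void *ctx)
{
    (void)ctx;
    return errata_159_conditions(clk) ? -1 : 0;
}
```

    In this model, calling fake_qspi_read() directly with the CPU divider at DIV_1 fails, while the same call routed through run_with_safe_clocks() succeeds and leaves the original dividers in place afterwards.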