IPC breaks when enabling Bluetooth Mesh on nRF54L15

Hello,

We are developing a product based on the nRF54L15, using BT Mesh. We also use the FLPR. For communication between the cores, we use ICMSG.

Separately, ICMSG and BT Mesh work well, but for some reason, when we include BT Mesh, cpuapp cannot set up ICMSG properly, even if the ICMSG setup is done before `bt_enable()`. The callback `ipc_ept_cfg.cb.bound` is never called on cpuapp. We are getting serial logging from the FLPR, so we know it is running and that its `ipc_ept_cfg.cb.bound` callback is called.

This question about the nRF53, "The IPC and bt_enable are turned on at the same time, and the chip is not working", is similar, but as far as I can tell there is no IPC Radio Firmware on the nRF54.

Here is the relevant part of our prj.conf for cpuapp:

CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=8192
CONFIG_MAIN_STACK_SIZE=8192
CONFIG_LOG_BUFFER_SIZE=8192
CONFIG_BT_RX_STACK_SIZE=5120

CONFIG_LOG=y
CONFIG_REQUIRES_FLOAT_PRINTF=y

CONFIG_BT_SETTINGS=y
CONFIG_FLASH=y
CONFIG_FLASH_MAP=y
CONFIG_SETTINGS=y
CONFIG_SOC_FLASH_NRF_PARTIAL_ERASE=n
CONFIG_SPI_NOR=n
CONFIG_NVS=n
CONFIG_NVS_LOOKUP_CACHE=n
CONFIG_SETTINGS_NVS_NAME_CACHE=n
CONFIG_ZMS=y
CONFIG_SETTINGS_ZMS_CUSTOM_SECTOR_COUNT=y
CONFIG_SETTINGS_ZMS_SECTOR_COUNT=8
CONFIG_ZMS_LOOKUP_CACHE=y
CONFIG_ZMS_LOOKUP_CACHE_SIZE=512
CONFIG_ZMS_LOOKUP_CACHE_FOR_SETTINGS=y

CONFIG_HWINFO=y
CONFIG_GPIO=y

CONFIG_BT=y
CONFIG_BT_L2CAP_TX_BUF_COUNT=5
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_OBSERVER=y
CONFIG_BT_BROADCASTER=y

CONFIG_BT_CTLR_DUP_FILTER_LEN=0
CONFIG_BT_CTLR_LE_ENC=n
CONFIG_BT_CTLR_LE_PING=n
CONFIG_BT_DATA_LEN_UPDATE=n
CONFIG_BT_PHY_UPDATE=n
CONFIG_BT_CTLR_MIN_USED_CHAN=n
CONFIG_BT_CTLR_PRIVACY=n
CONFIG_BT_CTLR_CHAN_SEL_2=n

CONFIG_BT_MESH=y
CONFIG_BT_MESH_MODEL_EXTENSIONS=y
CONFIG_BT_MESH_RELAY=y
CONFIG_BT_MESH_FRIEND=y
CONFIG_BT_MESH_PB_GATT=y
CONFIG_BT_MESH_PB_ADV=y
CONFIG_BT_MESH_GATT_PROXY=y

CONFIG_BT_MESH_SUBNET_COUNT=2
CONFIG_BT_MESH_APP_KEY_COUNT=2
CONFIG_BT_MESH_MODEL_GROUP_COUNT=2
CONFIG_BT_MESH_LABEL_COUNT=3

CONFIG_BT_MESH_ADV_BUF_COUNT=10
CONFIG_BT_MESH_TX_SEG_MSG_COUNT=3

CONFIG_BT_MESH_RX_SEG_MAX=32
CONFIG_BT_MESH_TX_SEG_MAX=32

CONFIG_MBOX=y
CONFIG_IPC_SERVICE=y
CONFIG_IPC_SERVICE_BACKEND_ICMSG=y
CONFIG_IPC_SERVICE_LOG_LEVEL_DBG=y

Any help is appreciated

- Fridtjof

  • Hi Fridtjof,

     

    The nRF53 architecture has a strict split between the cores in terms of which peripherals and resources each core can access. This split is a lot more configurable in the nRF54L series.

    It sounds like the cpuapp image is not booting, or is overwriting the cpuflpr image.

     

    Q1: Are you building with the "SNIPPET=nordic-flpr" in place?

    i.e. west build -b nrf54l15dk/nrf54l15/cpuapp -- -DSNIPPET=nordic-flpr

     

     

    Q2: Have you checked out the IPC sample, specifically with how the "remote" example is pulled in?

    https://github.com/nrfconnect/sdk-nrf/blob/main/samples/ipc/ipc_service/sysbuild.cmake#L12-L17

     

    If you still see issues, could you share the whole build output?

     

    Kind regards,

    Håkon

  • Hi Håkon,

    We know that both cpuapp and cpuflpr are running, since we have RTT logging from cpuapp and serial logging from cpuflpr. In fact, we know the exact spot where cpuapp halts, which is on the line in our program corresponding to `k_sem_take(&bound_sem, K_FOREVER);` in `zephyr/samples/subsys/ipc/ipc_service/icmsg/src/main.c`. It halts here because the callback `ipc_ept_cfg.cb.bound` is never called by the IPC library.

    To answer your questions:

    A1: We do not directly use the nordic-flpr snippet. What we have done is copy the contents of the snippet and changed some of the values, since we have allocated more SRAM to FLPR than the default. This all works fine until we have both bluetooth mesh and ipc enabled in the config. This means we are using sysbuild and have added the relevant overlay code.

    A2: We based our IPC on the icmsg sample in zephyr (not sdk-nrf) because it had overlay and config files for nRF54L15. Looking at a diff between the two samples, I can't see any relevant differences which could impact our program. We already use the following in our project, which works fine:

    ExternalZephyrProject_Add(
    APPLICATION flpr
    SOURCE_DIR ${APP_DIR}/flpr
    BOARD nrf54l15dk/nrf54l15/cpuflpr
    )

    I will work on creating a minimal sample so I can show the complete build log.

    -Fridtjof

  • Hi Fridtjof,

     

    Fridtjof said:
    We know that both cpuapp and cpuflpr are running, since we have RTT logging from cpuapp and serial logging from cpuflpr. In fact, we know the exact spot where cpuapp halts, which is on the line in our program corresponding to `k_sem_take(&bound_sem, K_FOREVER);` in `zephyr/samples/subsys/ipc/ipc_service/icmsg/src/main.c`. It halts here because the callback `ipc_ept_cfg.cb.bound` is never called by the IPC library.

    Good that both cores are logging and in general are running code.

    Please note that since the icmsg (on both cpuapp and cpuflpr) logs from IRQ handlers, CONFIG_LOG_MODE_DEFERRED=y must be set on both projects.

     

    Fridtjof said:
    A1: We do not directly use the nordic-flpr snippet. What we have done is copy the contents of the snippet and changed some of the values, since we have allocated more SRAM to FLPR than the default. This all works fine until we have both bluetooth mesh and ipc enabled in the config. This means we are using sysbuild and have added the relevant overlay code.

    That sounds good; it means you do not need to add the snippet each time you reconfigure the build.

     

    Fridtjof said:

    A2: We based our IPC on the icmsg sample in zephyr (not sdk-nrf) because it had overlay and config files for nRF54L15. Looking at a diff between the two samples, I can't see any relevant differences which could impact our program. We already use the following in our project, which works fine:

    ExternalZephyrProject_Add(
    APPLICATION flpr
    SOURCE_DIR ${APP_DIR}/flpr
    BOARD nrf54l15dk/nrf54l15/cpuflpr
    )

    I will work on creating a minimal sample so I can show the complete build log.

    Those are the CMake lines I wanted to confirm you had, so it is good to see they are present.

    Based on your description, it sounds like the issue is on the cpuapp side.

    I would be happy to look deeper; please let me know when you have a minimal sample.

     

    Kind regards,

    Håkon

  • When I created a minimal sample, the issue seemed to disappear. This makes me think the issue comes from incorrectly partitioning the RAM. In the minimal sample, cpuapp and cpuflpr use 61% and 25% of their allocated RAM respectively, while in the full code they use 71% and 92%.

    Full code:

    cpuapp:
    Memory region         Used Size  Region Size  %age Used
               FLASH:      424016 B      1404 KB     29.49%
                 RAM:       98606 B       136 KB     70.81%
            IDT_LIST:          0 GB        32 KB      0.00%
    
    cpuflpr:
    Memory region         Used Size  Region Size  %age Used
                 RAM:      112452 B       120 KB     91.51%
            IDT_LIST:          0 GB         2 KB      0.00%

    Minimal sample:

    cpuapp:
    Memory region         Used Size  Region Size  %age Used
               FLASH:      352984 B      1404 KB     24.55%
                 RAM:       85198 B       136 KB     61.18%
            IDT_LIST:          0 GB        32 KB      0.00%
    
    cpuflpr:
    Memory region         Used Size  Region Size  %age Used
                 RAM:       30336 B       120 KB     24.69%
            IDT_LIST:          0 GB         2 KB      0.00%

    It does seem strange, though. In another build, we have the full code on FLPR and no Bluetooth on cpuapp (which means cpuapp has low RAM usage), and that works fine. So it is the increase in RAM usage on cpuapp that causes the failure on cpuapp. If the partitioning were incorrect, I would expect high memory usage on cpuapp to break cpuflpr by overwriting some of its memory.
