Right after BLE is connected, nRF5340 Net core crahes when App core changes tx power via HCI command through IPC service

Hi Nordic team,

We have a project that uses nRF5340 &NCS 2.7.0, and we encountered net core crashing issue each time when ble is connected by central device and 5340 app core tries to set tx power vy sending IPC HCI command to net core.  We did not meet this issue on NCS version before 2.7.0.

So I did similar things to reproduce this issue on 5340DK and use hci_pwr_ctrl sample code in NCS2.7.0.
I can see nRF5340 Net core always crahes when I run the sample code on 
5340DK.

Steps to reproduce:

1. Open the existing sample project C:\ncs\v2.7.0\zephyr\samples\bluetooth\hci_pwr_ctrl by nRF connect extention in VsCode
2. Add a build configuration, select SDK and toolchain v2.7.0, nrf5340DK and 'no sysbuild' as below and do a pristine build.
     

3. After the building is finished, flash the merged hex image into 5340DK.
4. On serial port terminal such as TeralTerm, open the UART port of DK App core, watch the uart logs.
5. Reset the DK board and scan the BLE name "Bynamic test beacon" by nRF connector App on Android phone and Click 'connect' button once it's discovered.
6. Rseult:
    nRF connector App is stuck at connecting, never connected and uart logs also indicate that 5340DK has hanged, very possible in net core.

Here is the screenshot of nRF connector App


Here is the UART log printed out from 5340DK app core:

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-838ff8ecfddd ***
Starting Dynamic Tx Power Beacon Demo
Get Tx power level ->[00:00:00.262,573] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x0c03 timeout with err 0
[00:00:00.263,305] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0f timeout with err 0
--- 7 messages dropped ---
[00:00:00.271,453] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2018 timeout with err 0
[00:00:00.272,399] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2018 timeout with err 0
[00:00:00.275,360] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x0c33 timeout with err 0
[00:00:00.276,123] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x0c31 timeout with err 0
[00:00:00.276,916] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2003 timeout with err 0
[00:00:00.277,679] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2002 timeout with err 0
[00:00:00.278,442] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x201c timeout with err 0
[00:00:00.279,235] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x202f timeout with err 0
[00:00:00.279,998] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2024 timeout with err 0
[00:00:00.280,761] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2001 timeout with err 0
[00:00:00.281,555] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x0c01 timeout with err 0
[00:00:00.282,318] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc01 timeout with err 0
[00:00:00.282,348] <inf> bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
[00:00:00.282,379] <inf> bt_hci_core: HW Variant: nRF53x (0x0003)
[00:00:00.282,409] <inf> bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 214.51162 Build 1926957230
[00:00:00.283,233] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc02 timeout with err 0
[00:00:00.283,996] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x1009 timeout with err 0
[00:00:00.284,790] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc09 timeout with err 0
[00:00:00.285,552] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2005 timeout with err 0
[00:00:00.285,705] <inf> bt_hci_core: Identity: F2:D1:8F:D5:69:8A (random)
[00:00:00.285,736] <inf> bt_hci_core: HCI: version 5.4 (0x0d) revision 0x21fb, manufacturer 0x0059
[00:00:00.285,766] <inf> bt_hci_core: LMP: version 5.4 (0x0d) subver 0x21fb
Bluetooth initialized
[00:00:00.286,621] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2006 timeout with err 0
[00:00:00.287,414] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2008 timeout with err 0
[00:00:00.288,238] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x2009 timeout with err 0
[00:00:00.289,184] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0x200a timeout with err 0
Dynamic Tx power Beacon started
Set Tx power level to 4
[00:00:05.264,343] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0e timeout with err 0
Actual Tx Power: 4
Get Tx power level -> [00:00:10.265,167] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0f timeout with err 0
TXP = 4
Set Tx power level to 0
[00:00:10.265,899] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0e timeout with err 0
Actual Tx Power: 0
Get Tx power level -> [00:00:15.266,754] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0f timeout with err 0
TXP = 0
Set Tx power level to -3
[00:00:15.267,517] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0e timeout with err 0
Actual Tx Power: -3
Get Tx power level -> [00:00:20.268,371] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0f timeout with err 0
TXP = -3
Set Tx power level to -8
[00:00:20.269,226] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0e timeout with err 0
Actual Tx Power: -8
Connected via connection (15) at 60:9D:4A:18:E8:AD (random)
[00:00:21.603,302] <wrn> bt_hci_core: bt_hci_cmd_send_sync: command opcode 0xfc0f timeout with err 0
Connection (15) - Initial Tx Power = -8
Get Tx power level -> Actual Tx Power: 2
TXP = 0
Connection (15) - Tx Power = 15
Connected (15) - RSSI = 15
Adaptive Tx power selected = -20
Actual Tx Power: 15
Connection (15) TXP = 15
Connected (15) - RSSI = 15
Adaptive Tx power selected = -20
Actual Tx Power: 15

As you can see UART logs stopped at line 65, but the sample is expected to adjust the tx power continuously, right? so I suspect net core is crashed.

7.  Go back to step 5-6, but this time attach net core's elf file to the running target by Ozone Debugger via Jlink,
I loaded hci_ipc efl file by command:  File.Open("C:\ncs\v2.7.0\zephyr\samples\bluetooth\hci_pwr_ctrl\child-parent-build\hci_ipc\zephyr\zephyr.elf"), and selected net core like below:


I observed that nRF5340 net core was crashed when nRF connector App is connecting to "Bynamic test beacon" in step 6, this aligns with the UART log I saw on App core. And the callstack in net core shows that it is BusFault exception in softdevice_controller_s140


And here is the zip file of the code C:\ncs\nick\v2.7.0\zephyr\samples\bluetooth\hci_pwr_ctrl, I did not touch its source code, but in case you want to review it.
I just renamed child_image/hci_rpmsg.conf -> child_image/hci_ipc.conf under C:\ncs\v2.7.0\zephyr\samples\bluetooth\hci_pwr_ctrl as I said in another tickect  simple gpio to control external FEM in NCS 2.7.0 & nrf5340
I did not change zephyr\samples\bluetooth\hci_ipc code either.

7827.hci_pwr_ctrl.zip

Please note I am not using sysbuild in NCS2.7.0, here is the build command I scratched from VSC:
west build --build-dir c:/ncs/v2.7.0/zephyr/samples/bluetooth/hci_pwr_ctrl/child-parent-build c:/ncs/v2.7.0/zephyr/samples/bluetooth/hci_pwr_ctrl --pristine --board nrf5340dk/nrf5340/cpuapp --no-sysbuild -- -DNCS_TOOLCHAIN_VERSION=NONE

Can you look into this issue? And can you give me some guidance about how to debug the network core of 5340?

Thanks

Related