Hci_core:329 semaphore

I have an application that runs on an nRF9160 and uses an nRF52811 for Bluetooth in a client-host setup. The problem is that it sometimes crashes on startup and goes into an endless reboot loop.
In my code I first reboot nrf52 using the reboot pin, wait a bit then attempt to call bt_enable(). It sometimes works when I reset the nRF52 manually, but I am not sure where the problem could be.

I am using sdk-nrf v2.2.0 and sdk-zephyr v3.2.99-ncs1. Below is the debug output from a sample where I test Bluetooth with mcumgr enabled.


[00:00:00.472,473] <inf> hci_host: main: Starting bt host
[00:00:00.472,534] <dbg> bt_hci_core: hci_tx_thread: Started
[00:00:00.472,564] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 2 events
[00:00:00.472,595] <dbg> bt_driver: h4_open:  
[00:00:00.473,083] <dbg> bt_driver: rx_thread: started
[00:00:00.473,083] <dbg> bt_driver: rx_thread: rx.buf (nil)
[00:00:00.473,114] <dbg> bt_hci_core: bt_hci_cmd_create: opcode 0x0c03 param_len 0
[00:00:00.473,144] <dbg> bt_hci_core: bt_hci_cmd_create: buf 0x2001bf80
[00:00:00.473,144] <dbg> bt_hci_core: bt_hci_cmd_send_sync: buf 0x2001bf80 opcode 0x0c03 len 3
[00:00:00.473,175] <dbg> bt_hci_core: process_events: count 2
[00:00:00.473,205] <dbg> bt_hci_core: process_events: ev->state 4
[00:00:00.473,205] <dbg> bt_hci_core: send_cmd: calling net_buf_get
[00:00:00.473,236] <dbg> bt_hci_core: send_cmd: calling sem_take_wait
[00:00:00.473,236] <dbg> bt_hci_core: send_cmd: Sending command 0x0c03 (buf 0x2001bf80) to driver
[00:00:00.473,266] <dbg> bt_hci_core: bt_send: buf 0x2001bf80 len 3 type 0
[00:00:00.473,266] <dbg> bt_driver: h4_send: buf 0x2001bf80 type 0 len 3
[00:00:00.473,327] <dbg> bt_hci_core: process_events: ev->state 0
[00:00:00.473,358] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 2 events
[00:00:01.473,297] <wrn> lpuart: tx_timeout: Tx timeout
[00:00:10.473,449] <err> bt_hci_core: bt_hci_cmd_send_sync: k_sem_take failed with err -11
ASSERTION FAIL [err == 0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/hci_core.c:332
       k_sem_take failed with err -11
[00:00:10.473,541] <err> os: esf_dump: r0/a1:  0x00000003  r1/a2:  0x00000000  r2/a3:  0x00000003
[00:00:10.473,541] <err> os: esf_dump: r3/a4:  0x2000d1b0 r12/ip:  0x0000000c r14/lr:  0x00025467
[00:00:10.473,571] <err> os: esf_dump:  xpsr:  0x41000000
[00:00:10.473,571] <err> os: esf_dump: Faulting instruction address (r15/pc): 0x00025472
[00:00:10.473,602] <err> os: z_fatal_error: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
[00:00:10.473,663] <err> os: z_fatal_error: Current thread: 0x2000f110 (main)
[00:00:10.847,290] <err> os: k_sys_fatal_error_handler: Halting system

err = k_sem_take(&sync_sem, HCI_CMD_TIMEOUT);
BT_ERR("k_sem_take failed with err %d", err);
BT_ASSERT_MSG(err == 0, "k_sem_take failed with err %d", err);
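
For readers decoding the log above: the -11 is -EAGAIN, which Zephyr's k_sem_take() returns when the timeout expires before the semaphore is given. The gap between the command being sent (00:00:00.473) and the failure (00:00:10.473) is 10 seconds, which matches HCI_CMD_TIMEOUT. In other words, the controller never answered the HCI Reset command (opcode 0x0c03). A minimal host-side sketch of how the return values map, with the helper and enum names being hypothetical and the numeric errno values assumed to be Zephyr's (EAGAIN = 11, EBUSY = 16):

```c
/* Hypothetical helper, not part of Zephyr: classifies a k_sem_take() result. */
enum sem_outcome {
    SEM_TAKEN,      /* 0: semaphore was taken */
    SEM_TIMED_OUT,  /* -EAGAIN (-11): waiting period timed out */
    SEM_BUSY,       /* -EBUSY (-16): returned without waiting (K_NO_WAIT) */
    SEM_OTHER       /* anything else is unexpected for k_sem_take() */
};

enum sem_outcome classify_sem_take(int err)
{
    switch (err) {
    case 0:
        return SEM_TAKEN;
    case -11: /* assumed value of -EAGAIN on Zephyr */
        return SEM_TIMED_OUT;
    case -16: /* assumed value of -EBUSY on Zephyr */
        return SEM_BUSY;
    default:
        return SEM_OTHER;
    }
}

const char *sem_outcome_name(enum sem_outcome o)
{
    switch (o) {
    case SEM_TAKEN:     return "taken";
    case SEM_TIMED_OUT: return "timed out waiting for the controller";
    case SEM_BUSY:      return "busy, returned without waiting";
    default:            return "unexpected error";
    }
}
```

Calling classify_sem_take(-11) gives SEM_TIMED_OUT, which is the case that triggers the assertion in the log.
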
  • In my code I first reboot nrf52 using the reboot pin, wait a bit then attempt to call bt_enable().

    How long is "a bit"? If the nRF52 uses an external 32 kHz crystal, the startup time could be several hundred ms.

    Make sure to check that the nRF52 actually resets when you do a pin reset.

    Kenneth

  • I wait for 1 second. I have tried increasing this time to fix the issue, but it has no effect on the outcome.
    As you suggested, I connected to the nRF52 to see if it gets reset after the signal, and apparently it does not. I have CONFIG_GPIO_AS_PINRESET=y, and when I check the output with a PicoScope the signal is there.

    As mentioned in "nrf52811 reset pin", I reflashed it and erased the memory. This fixed the problem on the first run, but I am still experiencing this problem when changing the version over DFU. Is there a solution like this for when the FW gets updated over DFU?

  • I assume you can turn on logging (e.g. RTT) from the nRF52811 and also set CONFIG_RESET_ON_FATAL_ERROR=n and CONFIG_LOG_MODE_MINIMAL=y, such that you can catch why it's in an endless reset loop?

    Kenneth

  • Yes, the same error I described in my question happens each time the nRF91 attempts to start, causing it to crash and reboot. Currently I think the problem is somewhere inside the nRF52, causing it to not reset properly and keeping it in a state that crashes the nRF91, but I can't work out what that could be from the RTT logs.

  • Regarding my comment about turning on logging to catch any errors: you can do this on both the nRF52 and the nRF91.

    Upon startup you can also read out the NRF_POWER->RESETREASON register on the nRF52; this will tell you a bit about the cause of the reset on the nRF52.

    Maybe you should consider adding a logic analyzer trace on the pins between the nRF91 and nRF52 to check if the pins have the states you expect, also check the power supply to both chips. 

    Kenneth
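
As a rough illustration of the RESETREASON suggestion above, here is a minimal sketch of a decoder for the register value. The function name describe_reset_reason and the flag table are hypothetical, and the bit positions are assumptions based on the nRF52811 product specification, so check them against the datasheet for your device. It is written as plain C so it can be tried on a host; on the target you would pass in the value read from NRF_POWER->RESETREASON.

```c
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* Assumed RESETREASON bit positions for the nRF52811; verify in the datasheet. */
static const struct {
    uint32_t mask;
    const char *name;
} reset_flags[] = {
    { 1u << 0,  "RESETPIN" }, /* pin reset detected */
    { 1u << 1,  "DOG" },      /* watchdog reset */
    { 1u << 2,  "SREQ" },     /* soft reset (e.g. NVIC_SystemReset) */
    { 1u << 3,  "LOCKUP" },   /* CPU lock-up */
    { 1u << 16, "OFF" },      /* wake-up from System OFF via GPIO DETECT */
    { 1u << 18, "DIF" },      /* wake-up from System OFF via debug interface */
};

/* Writes a '|'-separated list of set flags into out.
 * Returns 0 on success, -1 if out is too small. */
int describe_reset_reason(uint32_t reason, char *out, size_t out_len)
{
    size_t used = 0;

    if (out == NULL || out_len == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(reset_flags) / sizeof(reset_flags[0]); i++) {
        if (reason & reset_flags[i].mask) {
            int n = snprintf(out + used, out_len - used, "%s%s",
                             used ? "|" : "", reset_flags[i].name);
            if (n < 0 || (size_t)n >= out_len - used) {
                return -1; /* output truncated */
            }
            used += (size_t)n;
        }
    }

    if (used == 0) {
        /* No known flag set: zero means the on-chip reset generator
         * (power-on reset or brownout); otherwise the bits are unknown here. */
        int n = snprintf(out, out_len, "%s",
                         reason == 0 ? "POWER_ON_OR_BROWNOUT" : "UNKNOWN");
        if (n < 0 || (size_t)n >= out_len) {
            return -1;
        }
    }
    return 0;
}
```

On the nRF52 the register is cumulative, so after logging it the firmware would normally clear it by writing the read value back (bits are cleared by writing '1'). If RESETPIN never shows up after the nRF91 toggles the line, that would support the observation that the pin reset is not taking effect.
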
