HCI command tx timeout

Host side:

Hardware: Torizon Verdin iMX8MP

System: custom Yocto project distribution

Linux kernel: 6.6.54

BlueZ version: 5.79

Controller side:

Hardware: custom nRF52 840 board

Firmware: HCI uart sample

nRF Connect SDK version: 2.5.1

UART baudrate: 460800

prj.conf

CONFIG_CONSOLE=n
CONFIG_STDOUT_CONSOLE=n
CONFIG_UART_CONSOLE=n
CONFIG_GPIO=y
CONFIG_SERIAL=y
CONFIG_UART_INTERRUPT_DRIVEN=y
CONFIG_BT=y
CONFIG_BT_HCI_RAW=y
CONFIG_BT_HCI_RAW_H4=y
CONFIG_BT_HCI_RAW_H4_ENABLE=y
CONFIG_BT_BUF_ACL_RX_SIZE=255
CONFIG_BT_BUF_CMD_TX_SIZE=255
CONFIG_BT_BUF_EVT_DISCARDABLE_SIZE=255
CONFIG_BT_CTLR_ASSERT_HANDLER=y
CONFIG_BT_MAX_CONN=16
CONFIG_BT_TINYCRYPT_ECC=n
CONFIG_BT_CTLR_DTM_HCI=y

CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=512

# Workaround: Unable to allocate command buffer when using K_NO_WAIT since
# Host number of completed commands does not follow normal flow control.
CONFIG_BT_BUF_CMD_TX_COUNT=10

CONFIG_CLOCK_CONTROL_NRF_K32SRC_500PPM=y
CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y
CONFIG_BT_SMP=y

devicetree.overlay

/* SPDX-License-Identifier: Apache-2.0 */

&uart0 {
	compatible = "nordic,nrf-uart";
	current-speed = <460800>;
	status = "okay";
	hw-flow-control;
};

&uart0_default {
	group1 {
		psels = <NRF_PSEL(UART_TX, 0, 23)>, <NRF_PSEL(UART_RTS, 0, 20)>;
	};

	group2 {
		psels = <NRF_PSEL(UART_RX, 0, 22)>, <NRF_PSEL(UART_CTS, 0, 21)>;
	};
};

&uart0_sleep {
	group1 {
		psels = <NRF_PSEL(UART_TX, 0, 23)>,
				<NRF_PSEL(UART_RX, 0, 22)>,
				<NRF_PSEL(UART_RTS, 0, 20)>,
				<NRF_PSEL(UART_CTS, 0, 21)>;
	};
};

&qspi {
	status = "disabled";
};

/ {
	nrf_radio_fem: rf_path {
		compatible = "generic-fem-two-ctrl-pins";
		ctx-gpios = <&gpio0 14 0>;
       	crx-gpios = <&gpio0 13 0>;
       	tx-gain-db = < 0 >;
       	rx-gain-db = < 0 >;
       	ctx-settle-time-us = < 10 >;
       	crx-settle-time-us = < 10 >;
	};
};

&radio {
	fem = <&nrf_radio_fem>;
};


&led0 { 	status = "disabled";};
&led1 { 	status = "disabled";};
&pwm0 { 	status = "disabled";};

Normal working conditions:

Scanning multiple peripheral sensor devices (based on nRF52833), collecting data from advertisements, connecting periodically to read characteristic data.

Currently we are testing stability of scan and connection by connecting much faster, once every 5-20 seconds. Also our test environment is noisy with lots of other BLE devices.

Issue description:

Spontaneously this message appears and device is unable to start scanning.

[ 3916.269822] Bluetooth: hci1: command 0x200c tx timeout
[ 3916.269822] Bluetooth: hci1: Opcode 0x200c failed: -110
[ 3916.269837] Bluetooth: hci1: Unable to disable scanning: -110
[ 3916.285978] Bluetooth: hci1: Opcode 0x200c failed: -110
[ 3916.291229] Bluetooth: hci1: Unable to disable scanning: -110
[ 3916.296977] Bluetooth: hci1: Unable to disable scanning: -110
[ 3918.349861] Bluetooth: hci1: command 0x200c tx timeout
[ 3918.349863] Bluetooth: hci1: Opcode 0x200c failed: -110
[ 3918.349873] Bluetooth: hci1: Unable to disable scanning: -110
[ 3918.349877] Bluetooth: hci1: stop background scanning failed: -110
[ 3920.429905] Bluetooth: hci1: command 0x200c tx timeout

Sometimes Nordic is stuck in scan, while BlueZ crashes and says scan is disabled.

Only way to recover is to power-cycle the BLE controller, attach again and restart bluetooth service.

Logging from Nordic controller side:

<dbg> bt_sdc_hci_driver: hci_driver_send: Exit: 0
<dbg> bt_sdc_hci_driver: event_packet_process: Command Status (0x2016) status: 0x00
<dbg> bt_hci_raw: bt_recv: buf 0x2000e994 len 6
<dbg> bt_hci_raw: bt_send: buf 0x2000ddac len 28
<dbg> bt_sdc_hci_driver: hci_driver_send: 
<dbg> bt_sdc_hci_driver: cmd_handle: 
<dbg> bt_sdc_hci_driver: hci_driver_send: Exit: 0
<dbg> bt_sdc_hci_driver: event_packet_process: Command Status (0x200d) status: 0x0b
<dbg> bt_hci_raw: bt_recv: buf 0x2000e994 len 6
<dbg> bt_sdc_hci_driver: event_packet_process: LE Meta Event (0x04), len (12)
<dbg> bt_hci_raw: bt_recv: buf 0x2000e994 len 14
<dbg> bt_sdc_hci_driver: event_packet_process: Event (0x05) len 4
<dbg> bt_hci_raw: bt_recv: buf 0x2000e994 len 6

Is there anything I can change in my configuration to remedy this?

Would greatly appreciate any suggestions or directions.

Parents Reply Children
  • Migrated to 2.9.0, unfortunately same issue persists.

    Here are logs after Nordic Controller becomes unresponsive:

    Thread analyze:
     HCI uart TX         : STACK: unused 1048 usage 488 / 1536 (31 %); CPU: 0 %
                         : Total CPU cycles used: 544115
     thread_analyzer     : STACK: unused 528 usage 496 / 1024 (48 %); CPU: 0 %
                         : Total CPU cycles used: 91685
     sysworkq            : STACK: unused 320 usage 192 / 512 (37 %); CPU: 0 %
                         : Total CPU cycles used: 1
     MPSL Work           : STACK: unused 704 usage 320 / 1024 (31 %); CPU: 0 %
                         : Total CPU cycles used: 12555467
     logging             : STACK: unused 480 usage 544 / 1024 (53 %); CPU: 0 %
                         : Total CPU cycles used: 153741
     idle                : STACK: unused 256 usage 64 / 320 (20 %); CPU: 99 %
                         : Total CPU cycles used: 2642233638
     main                : STACK: unused 1440 usage 608 / 2048 (29 %); CPU: 0 %
                         : Total CPU cycles used: 2804754
     ISR0                : STACK: unused 1120 usage 928 / 2048 (45 %)

  • Hi,

    A colleague suggested you try to merge this pull request: https://github.com/nrfconnect/sdk-nrf/pull/19869

    Can you try that? They suspect lost HCI Command Buffers, the PR will detect misconfigurations.

    Kenneth

Related