btattach command hanging when using HCI UART sample nrf52840

When flashing the chip with a build from the samples/bluetooth/hci_uart via west build -b nrf52840dk/nrf52840 samples/bluetooth/hci_uart --pristine and west flash off the current main of sdk-zephyr, a btattach command from a linux host connected to the Zephyr device over UART hangs indefinitely:


btattach -B /dev/ttyMSM` -S 1000000 -P h4

Attaching Primary controller to /dev/ttyMSM2
Switched line discipline from 0 to 15
= New Index: 00:00:00:00:00:00 (Primary,UART,hci1) #0 [hci1] 4311.006931
= Open Index: 00:00:00:00:00:00 [hci1] 4311.007519
< HCI Command: Reset (0x03|0x0003) plen 0 #1 [hci1] 4311.007671
Device index 1 attached

The bluez version is 5.54, zephyr-sdk is the latest.


Looking at the backtrace on the device via gdb shows the CPU as idle, and no attempts to access the UART are seen.

When attempting to disable the config value CONFIG_BT_CTLR as suggested here, the build fails with the below error. What must be done to enable HCI UART support on the Zephyr device?

-- Zephyr version: 4.0.0-rc2 (/Users/samantha.cho/clones/zephyrproject/zephyr), build: v4.0.0-rc2-53-gb9fc4cc4151f
[139/144] Linking C executable zephyr/zephyr_pre0.elf
FAILED: zephyr/zephyr_pre0.elf zephyr/zephyr_pre0.map /Users/samantha.cho/clones/zephyrproject/build/zephyr/zephyr_pre0.map
: && ccache /Users/samantha.cho/zephyr-sdk-0.17.0/arm-zephyr-eabi/bin/arm-zephyr-eabi-gcc  -gdwarf-4 -gdwarf-4 zephyr/CMakeFiles/zephyr_pre0.dir/misc/empty_file.c.obj -o zephyr/zephyr_pre0.elf  zephyr/CMakeFiles/offsets.dir/./arch/arm/core/offsets/offsets.c.obj  -T  zephyr/linker_zephyr_pre0.cmd  -Wl,-Map=/Users/samantha.cho/clones/zephyrproject/build/zephyr/zephyr_pre0.map  -Wl,--whole-archive  app/libapp.a  zephyr/libzephyr.a  zephyr/arch/common/libarch__common.a  zephyr/arch/arch/arm/core/libarch__arm__core.a  zephyr/arch/arch/arm/core/cortex_m/libarch__arm__core__cortex_m.a  zephyr/arch/arch/arm/core/mpu/libarch__arm__core__mpu.a  zephyr/lib/libc/picolibc/liblib__libc__picolibc.a  zephyr/lib/libc/common/liblib__libc__common.a  zephyr/lib/net_buf/liblib__net_buf.a  zephyr/soc/soc/nrf52840/libsoc__nordic.a  zephyr/subsys/bluetooth/common/libsubsys__bluetooth__common.a  zephyr/subsys/bluetooth/host/libsubsys__bluetooth__host.a  zephyr/drivers/clock_control/libdrivers__clock_control.a  zephyr/drivers/gpio/libdrivers__gpio.a  zephyr/drivers/pinctrl/libdrivers__pinctrl.a  zephyr/drivers/serial/libdrivers__serial.a  zephyr/drivers/timer/libdrivers__timer.a  modules/hal_nordic/nrfx/libmodules__hal_nordic__nrfx.a  modules/segger/libmodules__segger.a  -Wl,--no-whole-archive  zephyr/kernel/libkernel.a  -L/Users/samantha.cho/clones/zephyrproject/build/zephyr  zephyr/arch/common/libisr_tables.a  -mcpu=cortex-m4  -mthumb  -mabi=aapcs  -mfp16-format=ieee  -mtp=soft  -fuse-ld=bfd  -Wl,--gc-sections  -Wl,--build-id=none  -Wl,--sort-common=descending  -Wl,--sort-section=alignment  -Wl,-u,_OffsetAbsSyms  -Wl,-u,_ConfigAbsSyms  -nostdlib  -static  -Wl,-X  -Wl,-N  -Wl,--orphan-handling=warn  -Wl,-no-pie  -specs=picolibc.specs  -DPICOLIBC_LONG_LONG_PRINTF_SCANF -L"/Users/samantha.cho/zephyr-sdk-0.17.0/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/thumb/v7e-m/nofp" -lc -lgcc && cd /Users/samantha.cho/clones/zephyrproject/build/zephyr && /opt/homebrew/Cellar/cmake/3.30.5/bin/cmake -E true
/Users/samantha.cho/zephyr-sdk-0.17.0/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld.bfd: zephyr/subsys/bluetooth/host/libsubsys__bluetooth__host.a(hci_raw.c.obj):/Users/samantha.cho/clones/zephyrproject/zephyr/subsys/bluetooth/host/hci_raw.c:60: undefined reference to `__device_dts_ord_133'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

  • Yes, I verified the pins are connected for the HW flow control.

    ```

    [00:02:03.923,492] <wrn> hci_uart: rx is ready
    [00:02:03.923,522] <wrn> hci_uart: read 1 req 1
    [00:02:03.923,522] <wrn> hci_uart: read 3 req 3
    [00:02:03.923,553] <wrn> hci_uart: read 0 req 0
    [00:02:03.923,583] <wrn> hci_uart: putting RX packet in queue.
    [00:02:03.923,614] <wrn> hci_uart: spurious interrupt
    [00:02:03.923,889] <wrn> hci_uart: buf 0x2000e620 type 1 len 7

    ```

    I am seeing when using the RTT debugger that when the `hciconfig hci0 up` command is initiated from the host, although the DK receives packages over the UART, it never attempts to transmit via `tx_isr`. I also do not see the device moving the tx to enabled.

  • Upon further debugging efforts, what I am seeing is that it seems like the BT controller is not responding the way we are expected.
    - On the first issue of the `btattach` command from the host, the DK responds with `0e 04 01 03 0c 00`, which I believe to be the successful completion of a reset of the BT controller.

    - the `tx_isr` function is triggered via interrupt, and the transmit buffer currently contains 7 bytes, `04 0e 04 01 03 0c 00`, which is the reset response.

    - the buffer is left with 6 bytes (the value of the rest of the reset command) and these are never transmitted

    - subsequent requests are propagated to the BT controller, but the interrupt to attempt to send them is never triggered, thus the tx_isr is never called and no data is attempted to be transmitted over uart.

    Should the whole message be sent in tx_isr via the `uart_fifo_fill` function? If not, when are the rest of the bytes supposed to be transmitted? Also, when data is added via `h4_send`, why is the interrupt to send the data via tx_isr not being triggered?

    Also is this statement in hci_uart/src/main.c correct?
    ```

    if (!(uart_irq_rx_ready(hci_uart_dev) ||
    uart_irq_tx_ready(hci_uart_dev))) {
    LOG_DBG("spurious interrupt");
    }

    ```

    I am wondering if it should be `if (!(uart_irq_rx_ready(hci_uart_dev) || uart_irq_tx_ready(hci_uart_dev))`?

  • Hi,

    samanthacho said:
    he `btattach` command from the host, the DK responds with `0e 04 01 03 0c 00`, which I believe to be the successful completion of a reset of the BT controller.

    You are right. I made a HCI capture with btmon showing the command exchange when I run btattach and I see the same response to the reset command. 

    HCI capture (can be viewed in Wireshark):

    btmon_trace_attach.pcapng

    samanthacho said:

    - subsequent requests are propagated to the BT controller, but the interrupt to attempt to send them is never triggered, thus the tx_isr is never called and no data is attempted to be transmitted over uart.

    Should the whole message be sent in tx_isr via the `uart_fifo_fill` function? If not, when are the rest of the bytes supposed to be transmitted? Also, when data is added via `h4_send`, why is the interrupt to send the data via tx_isr not being triggered?

    Also is this statement in hci_uart/src/main.c correct?

    I see the same "spurious interrupt" message when enabling debug logging over RTT:

    00:01:04.220,855] <dbg> hci_uart: h4_send: buf 0x2000c544 type 1 len 7
    [00:01:04.222,137] <dbg> hci_uart: h4_read: read 1 req 1
    [00:01:04.222,137] <dbg> hci_uart: h4_read: read 1 req 3
    [00:01:04.222,167] <dbg> hci_uart: h4_read: read 2 req 2
    [00:01:04.222,198] <dbg> hci_uart: h4_read: read 0 req 2
    [00:01:04.222,229] <dbg> hci_uart: bt_uart_isr: spurious interrupt rx:0 tx:0
    [00:01:04.223,419] <dbg> hci_uart: h4_read: read 1 req 2
    [00:01:04.223,449] <dbg> hci_uart: h4_read: read 1 req 1
    [00:01:04.223,449] <dbg> hci_uart: rx_isr: putting RX packet in queue.
    [00:01:04.223,480] <dbg> hci_uart: h4_read: read 0 req 1
    [00:01:04.223,510] <dbg> hci_uart: bt_uart_isr: spurious interrupt rx:0 tx:0
    [00:01:04.223,632] <dbg> hci_uart: h4_send: buf 0x2000c544 type 1 len 7
    [00:01:04.232,543] <dbg> hci_uart: h4_read: read 1 req 1
    [00:01:04.232,574] <dbg> hci_uart: h4_read: read 2 req 3
    [00:01:04.232,604] <dbg> hci_uart: h4_read: read 0 req 1
    [00:01:04.232,635] <dbg> hci_uart: bt_uart_isr: spurious interrupt rx:0 tx:0
    [00:01:04.235,870] <dbg> hci_uart: h4_read: read 1 req 1
    [00:01:04.235,900] <dbg> hci_uart: h4_read: read 0 req 0
    [00:01:04.235,931] <dbg> hci_uart: rx_isr: putting RX packet in queue.
    [00:01:04.236,053] <dbg> hci_uart: h4_send: buf 0x2000c544 type 1 len 15

    I assume maybe the code is triggering the TX interrupt when it is not supposed to, but have not been able to confirm that yet. However, it does not seem to affect the HCI communication. I can still scan for devices in btmgr (with 'find -l')

    But it is strange that the tx fifo is not being emptied in your case. It makes me suspect that could be an issue with the flow control. Could you try updating your sdk to the v2.8.0-rc2 tag and try the HCI 3-wire (H:5) sample without flow control?

    Note that this sample requires you to use the hciattach command to attach the device.

    sudo hciattach -n /dev/ttyACM0 3wire 1000000

    v2.7.0/nrf$ git fetch
    v2.7.0/nrf$ git checkout v2.8.0-rc2
    v2.7.0/nrf$ west update

    Edit: Another way to check if this issue is caused by flow control is to lower the baud rate in the board overlay from 1M to 115200 and disable hardware flow control in your existing project. Although it may not be stable without flow control, you should be able to receive command responses if flow control is the problem.

  • Thank you!! I did the second option, and am seeing the chip respond as expected!

    A follow up question -- is there a way to disable the LE Privacy operation? I am seeing the bluetoothd daemon error when attempting to get the address due to not being able to open the crypto module `Failed to open crypto` -- It seems like this could be mitigated by enabling CONFIG_CRYPTO_USER, but this is not an option for us due to incompatibilities with other kernel modules.

    When looking at the bluetoothd/bluez5 code, it seems like we can skip it all together if we disable LE privacy in the BT controller:
    ```

    /*
    * If the controller does not support LE Privacy operation,
    * there is no support for loading identity resolving keys
    * into the kernel.
    */

    ```

    ^ That is from bluez/src/adapter.c, 5.54 version.

    I have patched other sections to avoid using the crypto library callouts in this fashion, and to instead use openssl, but I am wondering if there is a way to disable the LE privacy mode as I don't need to support it.

  • I'm glad to hear that it worked. This likely means that there is a problem with the UART flow control signals. This problem should be addressed if you want to use the H:4 protocol. Otherwise, the new 3-wire HCI sample may be an alternative.

    Regarding privacy, this can be disabled in the controller by adding CONFIG_BT_CTLR_PRIVACY=n to your project configuration file.

Related