NRF5340 HCI_USB & HCI_UART Stop Scanning

I am developing a custom nRF5340 board that uses the BL5340PA external-antenna module to integrate the nRF5340 into the design. The BL5340PA provides all of the base functionality of the nRF5340 and adds an external FEM to increase the TX power. I am using Zephyr/nRF Connect SDK version 2.4.1 (received from Laird for the updated TX power setting rules: GitHub - LairdCP/bl5340pa_manifest: Manifest for the Laird Connectivity fork of the nRF Connect SDK with support for the BL5340PA). I am attempting to use either the HCI_USB or HCI_UART sample projects that come with Zephyr to make this custom board a reliable BLE adapter for Linux. My Linux device runs kernel version 5.4.0-174-generic and is connected to the custom board through the USB interface on the nRF5340. My BlueZ version is 5.66. This is what I am trying to accomplish:

  1. Reliable BLE Adapter over USB or CDC-ACM USB
  2. Application Core Firmware Update over CDC-ACM USB
    1. Currently using MCUMGR
  3. Network Core Firmware Update over CDC-ACM USB
    1. Not functional, as no external flash on custom board. Need method to use internal flash

I would like to complete the items above in that order of priority, so I will focus on the BLE adapter functionality and mention the others where relevant.

My Issue:

  • I am using the HCI_USB or HCI_UART project. HCI_UART is sent through the CDC-ACM driver and set up in Linux with btattach (hciattach was tried too, with the same effect, and it is now deprecated) as H4 protocol at a speed of 1000000, whereas HCI_USB is automatically recognized and set up by the btusb driver in Linux. With both projects I can communicate, connect, pair, and perform all necessary BT actions (this has been tested using bluetoothctl and HCI_BUS commands in C++ code). After some time spent scanning for devices (a few minutes to an hour with HCI_UART, one to two hours with HCI_USB), the adapter stops functioning and communication halts on a timeout (shown in the image below as output from dmesg; below that is what appears in btmon). In some debug messages from the custom board I have received a "<wrn> bt_hci_driver: Couldn't allocate a buffer after waiting 10 seconds." warning when this occurs. After this happens, there are a few different methods of resetting the board that recover BLE functionality with BlueZ.
    • Recovery Methods
      • HCI_USB
        • bluetoothctl.power off -> bluetoothctl.power on
        • Or any hard power reset of board
          • Details: Default Zephyr Project
      • HCI_UART
        • Hard power reset of board
          • Details: Default Zephyr Project, with addition of CDC-ACM configuration for bt-c2h-uart redirection. 
      • HCI_USB with MCUMGR
        • MCUMGR reset
        • or any hard power reset of board
          • Details:
            • This project is implemented with a change in the SDK so that the CDC-ACM interface is always second, allowing HCI_USB to use the first interface so that btusb recognizes it. Both USB interfaces are active at the same time, as seen when plugging into Windows, but Linux only allows one to be active because the btusb and CDC drivers conflict on assignment. This means that, in order to switch, the USB interface of the board has to be unbound.
      • HCI_UART with MCUMGR
        • MCUMGR reset
        • or any hard power reset of board
          • Details:
            • 2 CDC-ACM ports, one for MCUMGR and one for HCI_UART
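As background on the H4 transport mentioned above: each HCI packet on the serial link is prefixed with a single packet-type byte (0x01 command, 0x02 ACL data, 0x03 SCO data, 0x04 event). Below is a minimal Python sketch of that framing, which can help when comparing raw bytes against btmon output. The helper names `h4_command` and `parse_h4_event` are hypothetical and are not part of BlueZ or Zephyr.

```python
import struct

# H4 packet-type indicators (Bluetooth Core Specification, UART transport layer).
H4_COMMAND = 0x01
H4_ACL = 0x02
H4_SCO = 0x03
H4_EVENT = 0x04


def h4_command(ogf: int, ocf: int, params: bytes = b"") -> bytes:
    """Build an H4-framed HCI command (hypothetical helper)."""
    opcode = (ogf << 10) | ocf  # 6-bit OGF in the top bits, 10-bit OCF below
    # Type byte, little-endian opcode, parameter length, then parameters.
    return struct.pack("<BHB", H4_COMMAND, opcode, len(params)) + params


def parse_h4_event(frame: bytes):
    """Split an H4-framed HCI event into (event_code, params)."""
    if len(frame) < 3 or frame[0] != H4_EVENT:
        raise ValueError("not an H4 event frame")
    code, length = frame[1], frame[2]
    params = frame[3:3 + length]
    if len(params) != length:
        raise ValueError("truncated event")
    return code, params


# HCI_Reset: OGF 0x03 (Controller & Baseband), OCF 0x0003, no parameters.
reset = h4_command(0x03, 0x0003)
print(reset.hex())
```

A Command Complete event (code 0x0E) answering that reset would arrive as `04 0e 04 01 03 0c 00`, which `parse_h4_event` splits into the event code and its four parameter bytes.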

I have seen this issue in every attempt with these firmware projects, with every configuration I have tried. Eventually the board stops functioning as a BT adapter and some, possibly extensive, action has to be taken to recover it. From my investigation, this appears on the surface to be a memory issue or an issue with the speed of communication, since the failure occurs sooner when there are many more Bluetooth devices available to be scanned. I ran the same tests with another Linux BT adapter (Intel AX210) and the failure did not occur, which suggests that the issue is within my Zephyr firmware. The issue also seems to occur sooner with the HCI_UART version of the firmware than with HCI_USB.
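To illustrate why a busier RF environment could bring the failure on sooner if buffers are the bottleneck, here is a toy Python model of a fixed-size buffer pool that is filled by incoming events and drained by the host link. It is only a sketch of the hypothesis above: the function name and the per-tick rates are made up for illustration, and it does not model Zephyr's actual allocator or flow control.

```python
def ticks_until_exhausted(pool_size, produced_per_tick, consumed_per_tick,
                          max_ticks=10_000):
    """Return the first tick at which demand exceeds the pool, or None.

    Toy model: each tick the transport first frees up to
    `consumed_per_tick` buffers, then the controller needs
    `produced_per_tick` new ones.
    """
    in_use = 0
    for tick in range(1, max_ticks + 1):
        in_use = max(0, in_use - consumed_per_tick)  # host side drains
        in_use += produced_per_tick                  # new events arrive
        if in_use > pool_size:
            return tick                              # allocation would fail
    return None


# Consumer keeps up: the pool never runs dry within max_ticks.
print(ticks_until_exhausted(16, 2, 2))
# Slightly more events than the link drains: the pool eventually runs out.
print(ticks_until_exhausted(16, 3, 2))
# Stalled consumer: the pool runs out almost immediately.
print(ticks_until_exhausted(16, 1, 0))
```

In this model a small sustained imbalance only delays exhaustion, while a stalled consumer exhausts the pool within a handful of events, so both a slow link and a stuck transport would be consistent with the symptom.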

I have researched this issue extensively and cannot find a solution. Below are some of the links I have consulted:

https://lists.zephyrproject.org/g/devel/topic/hci_interface_stopped_working/71746540?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,640,71746540

https://github.com/zephyrproject-rtos/zephyr/issues/20250

https://github.com/zephyrproject-rtos/zephyr/issues/37731

Code from my HCI_UART_MCUMGR project:

-  prj.conf

"prj.conf"

#################################################################################
#
#  Custom-Board Project Configuration 
#
#################################################################################


##########################################################################
## HCI_UART Project Configuration --------------------------------------
# --- Sets up the UART interface for HCI control of the BT Carrier Board
# -- From default HCI_UART project with few removals due to conflicts

CONFIG_STDOUT_CONSOLE=n
CONFIG_UART_CONSOLE=n
CONFIG_GPIO=y
CONFIG_SERIAL=y
CONFIG_UART_INTERRUPT_DRIVEN=y
CONFIG_BT=y
CONFIG_BT_HCI_RAW=y
CONFIG_BT_HCI_RAW_H4=y
CONFIG_BT_HCI_RAW_H4_ENABLE=y
CONFIG_BT_BUF_ACL_RX_SIZE=255
CONFIG_BT_BUF_CMD_TX_SIZE=255
CONFIG_BT_BUF_EVT_DISCARDABLE_SIZE=255
CONFIG_BT_MAX_CONN=16
CONFIG_BT_TINYCRYPT_ECC=n

# Workaround: Unable to allocate command buffer when using K_NO_WAIT since
# Host number of completed commands does not follow normal flow control.
CONFIG_BT_BUF_CMD_TX_COUNT=10

#=========================================================================

##########################################################################
## HCI -> USB Project Configuration --------------------------------------
# --- Sets up the USB interface to be used for communication
# --- USB Interface is used for MCUMGR and HCI_UART
# --- Composite USB configuration allowing multiple USB interfaces

CONFIG_USB_DEVICE_STACK=y
CONFIG_USB_DEVICE_PRODUCT="Custom-Board"

#CONFIG_USB_DEVICE_PID=
#CONFIG_USB_DEVICE_VID=

CONFIG_USB_CDC_ACM=y
CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n
CONFIG_UART_LINE_CTRL=y

#=========================================================================

##########################################################################
## MCUMGR/MCUBOOT Project Configuration --------------------------------------
# --- Enables MCUMGR that is used for Image Management
# --- Enables MCUBOOT bootloader to handle the images on the carrier board, as well as booting

# Enable MCUmgr and dependencies.
CONFIG_NET_BUF=y
CONFIG_ZCBOR=y
CONFIG_CRC=y
CONFIG_MCUMGR=y
CONFIG_STREAM_FLASH=y
CONFIG_FLASH_MAP=y

# Some command handlers require a large stack.
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=2304
CONFIG_MAIN_STACK_SIZE=2048

# Ensure an MCUboot-compatible binary is generated.
CONFIG_BOOTLOADER_MCUBOOT=y

# Enable flash operations.
CONFIG_FLASH=y

# Required by the `taskstat` command.
CONFIG_THREAD_MONITOR=y

# Support for taskstat command
CONFIG_MCUMGR_GRP_OS_TASKSTAT=y

# Enable statistics and statistic names.
CONFIG_STATS=y
CONFIG_STATS_NAMES=y

# Enable most core commands.
CONFIG_FLASH=y
CONFIG_IMG_MANAGER=y
CONFIG_MCUMGR_GRP_IMG=y
CONFIG_MCUMGR_GRP_OS=y
CONFIG_MCUMGR_GRP_STAT=y

# Enable logging
CONFIG_LOG=y
CONFIG_MCUBOOT_UTIL_LOG_LEVEL_WRN=y

# Disable debug logging
CONFIG_LOG_MAX_LEVEL=3

CONFIG_MCUMGR_TRANSPORT_UART=y
CONFIG_BASE64=y

#=========================================================================
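The fragment above assigns CONFIG_FLASH twice. Both assignments are `y`, so it should be harmless here, but when several fragments are merged a repeated symbol can hide a conflicting value. A small Python sketch for spotting repeated assignments in a .conf file follows; `repeated_symbols` is a hypothetical helper, not a Zephyr or west tool.

```python
import re
from collections import defaultdict

# Matches "CONFIG_FOO=value", tolerating surrounding whitespace.
_ASSIGN = re.compile(r"^\s*(CONFIG_[A-Za-z0-9_]+)\s*=\s*(.*?)\s*$")


def repeated_symbols(conf_text: str) -> dict:
    """Return {symbol: [values, ...]} for symbols assigned more than once."""
    seen = defaultdict(list)
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comments, including commented-out options
        match = _ASSIGN.match(line)
        if match:
            seen[match.group(1)].append(match.group(2))
    return {sym: vals for sym, vals in seen.items() if len(vals) > 1}


# A few lines in the style of the fragment above.
sample = """
CONFIG_FLASH=y
#CONFIG_USB_DEVICE_PID=
CONFIG_BT_BUF_CMD_TX_COUNT=10
CONFIG_FLASH=y
"""
print(repeated_symbols(sample))
```

To check a real file, pass `open("prj.conf").read()` instead of `sample`.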

- mcuboot.conf

"mcuboot.conf"

#################################################################################
#
#  MCUBOOT Child Image Configuration 
#
#################################################################################


#--------------------------------------------------------------------------------
# Enable Pin Control
#--------------------------------------------------------------------------------
CONFIG_PINCTRL=y


#--------------------------------------------------------------------------------
# Enable code size optimization on the compiler
#--------------------------------------------------------------------------------
CONFIG_SIZE_OPTIMIZATIONS=y


#--------------------------------------------------------------------------------
# Enable multi threading.
#--------------------------------------------------------------------------------
CONFIG_MULTITHREADING=y


#--------------------------------------------------------------------------------
# Add private key for MCUboot. Refer to these sites for more information:
# https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/app_dev/bootloaders_and_dfu/bootloader_adding.html#id11
# https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/app_dev/bootloaders_and_dfu/fw_update.html#ug-fw-update-keys-python
#--------------------------------------------------------------------------------
CONFIG_BOOT_SIGNATURE_KEY_FILE="custom_file_and_path_here.pem"

- hci_rpmsg.conf (to setup FEM)

"hci_rpmsg.conf"

#################################################################################
#
#  HCI_RPMSG Child Image Configuration 
#
#################################################################################

#--------------------------------------------------------------------------------
# Enable Pin Control
#--------------------------------------------------------------------------------
CONFIG_PINCTRL=y


#--------------------------------------------------------------------------------
# Enable SPI
#--------------------------------------------------------------------------------

# Enabled for FEM control
CONFIG_SPI=y


#--------------------------------------------------------------------------------
# Enable MPSL and FEM
#--------------------------------------------------------------------------------
CONFIG_MPSL=y
CONFIG_MPSL_FEM=y
CONFIG_MPSL_FEM_NRF21540_GPIO=y
CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=y

- bl5340pa_dvk_cpuapp.overlay (I have custom board files made for my board, but for ease of testing it also works with the bl5340pa DVK default board files. Both configurations give the same results)

"bl5340pa_dvk_cpuapp.overlay"

/ {
	chosen {
	   zephyr,uart-mcumgr = &cdc_acm1;
	   zephyr,bt-c2h-uart = &cdc_acm0;
	};
 };
 

&zephyr_udc0 {
	cdc_acm0: cdc_acm0 {
		compatible = "zephyr,cdc-acm-uart";
	};
	cdc_acm1: cdc_acm1 {
		compatible = "zephyr,cdc-acm-uart";
	};
};

&uart0 {
	current-speed = <1000000>;
	status = "disabled";
	hw-flow-control;
};

 

The main.c source file of HCI_USB and HCI_UART is unedited from the respective samples.

As a note on the above, this has also been tested without any of the FEM or MCUBOOT/MCUMGR configurations to ensure that they are not affecting the projects. The same effect was observed.

Questions:

  1. Is there something that I am missing that is causing this issue?
    1. On the Zephyr Configuration side?
    2. On the Linux/BlueZ setup side?
  2. Is there another method to perform my desired actions that would avoid these issues?
  3. Is this the right place to investigate this?
  4. Or any other advice would be greatly appreciated.

Thank you to whoever can help with this issue, as it has been causing me a lot of trouble. Let me know if I can supply any additional information that would assist in resolving this.

  • Hi Devin

    There are some configuration changes you could try on the nRF side in order to try and improve the reliability, in particular the CONFIG_BT_RX_STACK_SIZE configuration. 

    Could you try to add the following config settings both for the appcore (hci_usb/hci_uart) and netcore (hci_rpmsg/hci_ipc), and see if it works better?

    CONFIG_BT_BUF_CMD_TX_COUNT=10
     
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    CONFIG_BT_BUF_CMD_TX_SIZE=255
     
    CONFIG_BT_RX_STACK_SIZE=2048
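For a rough sense of what the suggested buffer settings reserve, here is a short Python sketch that multiplies count by size for the pools whose count and size both appear above. It is a payload-only estimate under stated assumptions: per-buffer header overhead and alignment are ignored, and the ACL pools are left out because only their sizes are given.

```python
# (count, size) pairs taken from the suggested settings above.
# ACL pools are omitted: only their sizes, not their counts, are listed.
SUGGESTED_POOLS = {
    "BT_BUF_CMD_TX": (10, 255),
    "BT_BUF_EVT_RX": (16, 255),
}


def payload_bytes(pools):
    """Payload bytes per pool, ignoring buffer headers and alignment."""
    return {name: count * size for name, (count, size) in pools.items()}


per_pool = payload_bytes(SUGGESTED_POOLS)
total = sum(per_pool.values())
print(per_pool, total)
```

Since the settings are applied to both cores, each core would carry its own copy of that cost.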
     

    If it is still unreliable, would you be able to provide the full btmon log, and ideally also the nRF side log, for analysis? 

    Best regards
    Torbjørn

  •    Thanks for your response!

    I have attached the log files from dmesg, btmon, and the Zephyr console. This test used HCI_UART with the CDC-ACM driver, plus a second CDC-ACM interface for the Zephyr console log output.
    7624.btmon.txt

    4643.dmesg.txt

    The bottom of the dmesg file applies to the current test.

    5224.log.txt

    I switched back to the default HCI_UART sample, added configuration for FEM, LOG, CDC-ACM, and the suggested additions from you as well. I also then tested with additions for Clock and DCDC, with the same results. (I also tested the additions on HCI_USB sample with additions for FEM, with similar results).

    Below are the new config files for reference: (CLOCK and DCDC included)

    --- PRJ.CONF ---
    
    CONFIG_USB_DEVICE_STACK=y
    CONFIG_USB_DEVICE_PRODUCT="Zephyr HCI UART sample"
    CONFIG_USB_CDC_ACM=y
    CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n
    
    
    CONFIG_GPIO=y
    CONFIG_SERIAL=y
    CONFIG_UART_INTERRUPT_DRIVEN=y
    CONFIG_BT=y
    CONFIG_BT_HCI_RAW=y
    CONFIG_BT_HCI_RAW_H4=y
    CONFIG_BT_HCI_RAW_H4_ENABLE=y
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_CMD_TX_SIZE=255
    CONFIG_BT_BUF_EVT_DISCARDABLE_SIZE=255
    CONFIG_BT_CTLR_ASSERT_HANDLER=y
    CONFIG_BT_MAX_CONN=16
    CONFIG_BT_TINYCRYPT_ECC=n
    CONFIG_BT_CTLR_DTM_HCI=y
    
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=512
    
    # Workaround: Unable to allocate command buffer when using K_NO_WAIT since
    # Host number of completed commands does not follow normal flow control.
    CONFIG_BT_BUF_CMD_TX_COUNT=10
    
    #---------Nordic
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_CMD_TX_COUNT=10  
    
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_ACL_RX_SIZE=255   
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_CMD_TX_SIZE=255    
     
    CONFIG_BT_RX_STACK_SIZE=2048
    
    #LOG---------------------------
    
    CONFIG_LOG=y
    
    CONFIG_THREAD_NAME=y
    CONFIG_THREAD_ANALYZER=y
    CONFIG_THREAD_ANALYZER_AUTO=y
    CONFIG_THREAD_ANALYZER_RUN_UNLOCKED=y
    
    CONFIG_HW_STACK_PROTECTION=y
    
    CONFIG_LOG_MODE_DEFERRED=y
    CONFIG_LOG_MODE_OVERFLOW=y
    CONFIG_CONSOLE=y
    CONFIG_STDOUT_CONSOLE=y
    CONFIG_LOG_BACKEND_UART=y
    CONFIG_LOG_DEFAULT_LEVEL=3
    
    # INDV
    CONFIG_USB_DRIVER_LOG_LEVEL_INF=y
    CONFIG_USB_DEVICE_LOG_LEVEL_INF=y
    CONFIG_BT_DEBUG_LOG=y
    CONFIG_BT_HCI_DRIVER_LOG_LEVEL_DBG=y
    CONFIG_BT_HCI_CORE_LOG_LEVEL_DBG=y
    
    #--------------------------------------------------------------------------------
    # Enable External Crystals
    #--------------------------------------------------------------------------------
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_20PPM=y
    
    # Enable the External Low Frequency Oscillator
    CONFIG_SOC_ENABLE_LFXO=y
    CONFIG_SOC_LFXO_CAP_INT_7PF=y
    
    # Enable High Frequency Crystal internal capacitors
    CONFIG_SOC_HFXO_CAP_INTERNAL=y
    CONFIG_SOC_HFXO_CAP_INT_VALUE_X2=27
    
    
    
    #--------------------------------------------------------------------------------
    # Enable DC/DC regulators
    #--------------------------------------------------------------------------------
    # Enable App DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_APP=y
    
    # Enable Net DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_NET=y
    
    # Disable High Voltage DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_HV=n

    --- HCI_RPMSG.CONF ---
    
    #################################################################################
    #
    #  HCI_RPMSG Child Image Configuration 
    #
    #################################################################################
    
    ##########################################################################
    ## LOGGING Configuration --------------------------------------
    CONFIG_LOG=y
    CONFIG_BT_DEBUG_LOG=y
    CONFIG_LOG_MODE_DEFERRED=y
    CONFIG_LOG_MODE_OVERFLOW=y
    CONFIG_LOG_DEFAULT_LEVEL=3
    
    #--------------------------------------------------------------------------------
    # Disable network cpu serial port, console, and UART console for low power
    #--------------------------------------------------------------------------------
    #CONFIG_SERIAL=n
    #CONFIG_CONSOLE=n
    #CONFIG_UART_CONSOLE=n
    #CONFIG_LOG_BACKEND_UART=n
    
    #=========================================================================
    #--------------------------------------------------------------------------------
    # Enable Pin Control
    #--------------------------------------------------------------------------------
    CONFIG_PINCTRL=y
    #--------------------------------------------------------------------------------
    # Enable Pin Control
    #--------------------------------------------------------------------------------
    
    #--------------------------------------------------------------------------------
    # Enable SPI
    #--------------------------------------------------------------------------------
    
    # Enabled for FEM control
    CONFIG_SPI=y
    
    
    #--------------------------------------------------------------------------------
    # Enable MPSL and FEM
    #--------------------------------------------------------------------------------
    CONFIG_MPSL=y
    CONFIG_MPSL_FEM=y
    CONFIG_MPSL_FEM_NRF21540_GPIO=y
    CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
    CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
    CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=y
    
    CONFIG_BT_PHY_UPDATE=n
    
    
    ##---------------NORDIC
    CONFIG_BT_BUF_CMD_TX_COUNT=10
     
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    CONFIG_BT_BUF_CMD_TX_SIZE=255
     
    CONFIG_BT_RX_STACK_SIZE=2048

     --- HCI_RPMSG.overlay ---
    //Disable for FEM use, as conflicts with SPI on netcore
    &uart0{
        status = "disabled";
    };

    --- bl5340pa_dvk_cpuapp.overlay ---
    
    / {
    	chosen {
    	   zephyr,bt-c2h-uart = &cdc_acm0;
    	   zephyr,console = &cdc_acm1;
    	};
     };
     
    
    &zephyr_udc0 {
    	cdc_acm0: cdc_acm0 {
    		compatible = "zephyr,cdc-acm-uart";
    	};
    	cdc_acm1: cdc_acm1 {
    		compatible = "zephyr,cdc-acm-uart";
    	};
    };
    &uart0 {
    	compatible = "nordic,nrf-uarte";
    	current-speed = <1000000>;
    	status = "okay";
    	hw-flow-control;
    };

    The test I ran set up the board as a serial BT adapter using btattach (pictured below) and then scanned with bluetoothctl until it failed. When it failed, I ran some more bluetoothctl commands to examine the failure (pictured below).
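When a long scan is left running unattended, the failure point can be picked out of the kernel log automatically. Below is a minimal Python sketch, assuming the kernel reports the stall as a line of the form `Bluetooth: hciN: command 0x.... tx timeout`. The sample line is illustrative and is not taken from the attached logs, and `find_timeouts` is a hypothetical helper.

```python
import re

# Assumed dmesg line format for an HCI command timeout.
_TIMEOUT = re.compile(
    r"Bluetooth: (hci\d+): command (0x[0-9a-fA-F]{4}) tx timeout"
)


def find_timeouts(dmesg_text):
    """Return one dict per timeout line, with the opcode split into OGF/OCF."""
    results = []
    for match in _TIMEOUT.finditer(dmesg_text):
        opcode = int(match.group(2), 16)
        results.append({
            "dev": match.group(1),
            "opcode": opcode,
            "ogf": opcode >> 10,     # opcode group (0x08 = LE controller)
            "ocf": opcode & 0x3FF,   # command within the group
        })
    return results


# Illustrative line only; 0x200c is LE Set Scan Enable.
sample = "[ 1234.567890] Bluetooth: hci1: command 0x200c tx timeout\n"
print(find_timeouts(sample))
```

Feeding it the output of `dmesg` shows which HCI command was outstanding when the adapter stopped answering, which can then be matched against the btmon trace.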

    Let me know if there are any questions, if you need different LOG settings, or any other information from me. Thank you for your assistance!

    Edit:

    It seems that after the failure any HCI command gets this response from the BL5340PA USB, or some version of the "Final HCI Buffer". The above output was logged after trying to bring the HCI device back online; the board never freezes, the HCI interface just stops functioning. I used hciconfig hci1 up, which timed out.

     

    Edit: I also tested this across Linux kernel versions 5.4.0, 5.15.0, and 5.16.0 (all generic) to rule out the kernel as the cause, and the same issue occurred in all tests.

  • Hi 

    Thanks for sharing the additional details. The Bluetooth experts didn't have any immediate ideas unfortunately, this doesn't appear to be an issue we have encountered before. 

    Is there any chance you could test this in the latest SDK (v2.6.0), or would you be dependent on an update from the Laird side? 

    On that note, would you be able to test this on a standard nRF5340DK to see if the issue could somehow be related to the Laird module and/or SDK fork? 

    If you were to reproduce the issue on the nRF5340DK it would also be a lot easier for us to investigate this, since it would allow us to reproduce it internally. 

    Best regards
    Torbjørn

  • No problem, thanks for asking around on this issue. 

    I will test on the same SDK today with the nRF5340 DK. I also have a BL5340 DK without the FEM that I will try. Then, if the issue persists, I will move on to the newer version of the SDK.

    This branch of the 2.4.1 SDK is required for the BL5340PA in order to ensure TX power compliance, but testing on the DKs will help narrow down the issue. I may need to reach out to Laird, depending on the results of this test.

    Thanks, will update with results and logs.

  •  

    I did some further testing today, but got a little busy and my results were confusing and conflicting. I will reach out when I have more information on this either Friday or Monday. Thanks for your help so far!
