NRF5340 HCI_USB & HCI_UART Stop Scanning

I am developing a custom nRF5340 board that uses the BL5340PA external-antenna module to implement the nRF5340 chip in the design. The module provides all of the base functionality of the nRF5340 and adds an external FEM to increase the TX power. I am using the Zephyr/nRF SDK version 2.4.1 (received from Laird for updated TX power setting rules: GitHub - LairdCP/bl5340pa_manifest: Manifest for the Laird Connectivity fork of the nRF Connect SDK with support for the BL5340PA). I am attempting to use either the HCI_USB or HCI_UART firmware project that comes with Zephyr to make this custom board a reliable BLE adapter for Linux. My Linux device runs kernel version 5.4.0-174-generic and is connected to the custom board through the USB interface of the nRF5340. My BlueZ version is 5.66. Below is what I am trying to accomplish:

  1. Reliable BLE Adapter over USB or CDC-ACM USB
  2. Application Core Firmware Update over CDC-ACM USB
    1. Currently using MCUMGR
  3. Network Core Firmware Update over CDC-ACM USB
    1. Not functional, as no external flash on custom board. Need method to use internal flash

I would like to complete the items above in that order of priority, so I will focus on the BLE adapter functionality and only mention the others.

My Issue:

  • With either the HCI_USB or HCI_UART project, the adapter eventually stops working. HCI_UART is sent through the CDC-ACM driver and set up in Linux with btattach (hciattach, now deprecated, was tried too with the same effect) as H4 protocol at a speed of 1000000, whereas HCI_USB is automatically recognized and set up by the btusb driver in Linux. With both projects I can communicate, connect, pair, and perform all necessary BT actions (tested using bluetoothctl and HCI_BUS commands in C++ code). After some time of scanning for devices (a few minutes to an hour for HCI_UART, one to two hours for HCI_USB), the adapter stops functioning and communication halts on a timeout (shown in the image below as output from dmesg; below that is what appears in btmon). When this occurs, some debug messages from the custom board have included a "<wrn> bt_hci_driver: Couldn't allocate a buffer after waiting 10 seconds." error. After this error occurs, there are a few different methods of resetting the board that recover BLE functionality with BlueZ.
    • Recovery Methods
      • HCI_USB
        • bluetoothctl.power off -> bluetoothctl.power on
        • Or any hard power reset of board
          • Details: Default Zephyr Project
      • HCI_UART
        • Hard power reset of board
          • Details: Default Zephyr Project, with addition of CDC-ACM configuration for bt-c2h-uart redirection. 
      • HCI_USB with MCUMGR
        • MCUMGR reset
        • or any hard power reset of board
          • Details:
            • This project is implemented with a change in the SDK that makes the CDC-ACM interface always second, so that HCI_USB uses the first interface and btusb recognizes it. Both USB interfaces are active at the same time, as seen when plugging into Windows, but Linux only allows one to be active because the btusb and CDC drivers conflict on assignment. Switching between them therefore requires unbinding the board's USB interface.
      • HCI_UART with MCUMGR
        • MCUMGR reset
        • or any hard power reset of board
          • Details:
            • 2 CDC-ACM ports, one for MCUMGR and one for HCI_UART

I have seen this issue in every attempt with these firmware projects and in every configuration I have tried. Eventually the board stops functioning as a BT adapter, and some, possibly extensive, actions have to be taken to recover it. From my investigation this appears, on the surface, to be a memory issue or an issue with the speed of communication, as the failure occurs sooner when many more Bluetooth devices are available to be scanned. I ran the same tests with another Linux BT adapter (Intel AX210) and the failure did not occur, which tells me this is an issue within my Zephyr firmware. The issue also seems to occur sooner with the HCI_UART version of the firmware than with HCI_USB.

I have researched this issue extensively and cannot find a solution. Below are some of the links I have gone through:

https://lists.zephyrproject.org/g/devel/topic/hci_interface_stopped_working/71746540?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,640,71746540

https://github.com/zephyrproject-rtos/zephyr/issues/20250

https://github.com/zephyrproject-rtos/zephyr/issues/37731

Code from my HCI_UART_MCUMGR project:

-  prj.conf

"prj.conf"

#################################################################################
#
#  Custom-Board Project Configuration 
#
#################################################################################


##########################################################################
## HCI_UART Project Configuration --------------------------------------
# --- Sets up the UART interface for HCI control of the BT Carrier Board
# -- From default HCI_UART project with few removals due to conflicts

CONFIG_STDOUT_CONSOLE=n
CONFIG_UART_CONSOLE=n
CONFIG_GPIO=y
CONFIG_SERIAL=y
CONFIG_UART_INTERRUPT_DRIVEN=y
CONFIG_BT=y
CONFIG_BT_HCI_RAW=y
CONFIG_BT_HCI_RAW_H4=y
CONFIG_BT_HCI_RAW_H4_ENABLE=y
CONFIG_BT_BUF_ACL_RX_SIZE=255
CONFIG_BT_BUF_CMD_TX_SIZE=255
CONFIG_BT_BUF_EVT_DISCARDABLE_SIZE=255
CONFIG_BT_MAX_CONN=16
CONFIG_BT_TINYCRYPT_ECC=n

# Workaround: Unable to allocate command buffer when using K_NO_WAIT since
# Host number of completed commands does not follow normal flow control.
CONFIG_BT_BUF_CMD_TX_COUNT=10

#=========================================================================

##########################################################################
## HCI -> USB Project Configuration --------------------------------------
# --- Sets up the USB interface to be used for communication
# --- USB Interface is used for MCUMGR and HCI_UART
# --- Composite USB configuration allowing multiple USB interfaces

CONFIG_USB_DEVICE_STACK=y
CONFIG_USB_DEVICE_PRODUCT="Custom-Board"

#CONFIG_USB_DEVICE_PID=
#CONFIG_USB_DEVICE_VID=

CONFIG_USB_CDC_ACM=y
CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n
CONFIG_UART_LINE_CTRL=y

#=========================================================================

##########################################################################
## MCUMGR/MCUBOOT Project Configuration --------------------------------------
# --- Enables MCUMGR that is used for Image Management
# --- Enables MCUBOOT bootloader to handle the images on the carrier board, as well as booting

# Enable MCUmgr and dependencies.
CONFIG_NET_BUF=y
CONFIG_ZCBOR=y
CONFIG_CRC=y
CONFIG_MCUMGR=y
CONFIG_STREAM_FLASH=y
CONFIG_FLASH_MAP=y

# Some command handlers require a large stack.
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=2304
CONFIG_MAIN_STACK_SIZE=2048

# Ensure an MCUboot-compatible binary is generated.
CONFIG_BOOTLOADER_MCUBOOT=y

# Enable flash operations.
CONFIG_FLASH=y

# Required by the `taskstat` command.
CONFIG_THREAD_MONITOR=y

# Support for taskstat command
CONFIG_MCUMGR_GRP_OS_TASKSTAT=y

# Enable statistics and statistic names.
CONFIG_STATS=y
CONFIG_STATS_NAMES=y

# Enable most core commands.
CONFIG_FLASH=y
CONFIG_IMG_MANAGER=y
CONFIG_MCUMGR_GRP_IMG=y
CONFIG_MCUMGR_GRP_OS=y
CONFIG_MCUMGR_GRP_STAT=y

# Enable logging
CONFIG_LOG=y
CONFIG_MCUBOOT_UTIL_LOG_LEVEL_WRN=y

# Disable debug logging
CONFIG_LOG_MAX_LEVEL=3

CONFIG_MCUMGR_TRANSPORT_UART=y
CONFIG_BASE64=y

#=========================================================================

- mcuboot.conf

"mcuboot.conf"

#################################################################################
#
#  MCUBOOT Child Image Configuration 
#
#################################################################################


#--------------------------------------------------------------------------------
# Enable Pin Control
#--------------------------------------------------------------------------------
CONFIG_PINCTRL=y


#--------------------------------------------------------------------------------
# Enable code size optimization on the compiler
#--------------------------------------------------------------------------------
CONFIG_SIZE_OPTIMIZATIONS=y


#--------------------------------------------------------------------------------
# Enable multi threading.
#--------------------------------------------------------------------------------
CONFIG_MULTITHREADING=y


#--------------------------------------------------------------------------------
# Add private key for MCUboot. Refer to these sites for more information:
# https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/app_dev/bootloaders_and_dfu/bootloader_adding.html#id11
# https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/app_dev/bootloaders_and_dfu/fw_update.html#ug-fw-update-keys-python
#--------------------------------------------------------------------------------
CONFIG_BOOT_SIGNATURE_KEY_FILE="custom_file_and_path_here.pem"

- hci_rpmsg.conf (to setup FEM)

"hci_rpmsg.conf"

#################################################################################
#
#  HCI_RPMSG Child Image Configuration 
#
#################################################################################

#--------------------------------------------------------------------------------
# Enable Pin Control
#--------------------------------------------------------------------------------
CONFIG_PINCTRL=y


#--------------------------------------------------------------------------------
# Enable SPI
#--------------------------------------------------------------------------------

# Enabled for FEM control
CONFIG_SPI=y


#--------------------------------------------------------------------------------
# Enable MPSL and FEM
#--------------------------------------------------------------------------------
CONFIG_MPSL=y
CONFIG_MPSL_FEM=y
CONFIG_MPSL_FEM_NRF21540_GPIO=y
CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=y

- bl5340pa_dvk_cpuapp.overlay (I have custom board files made for my board, but for ease of testing it also works with the bl5340pa DVK default board files. Both configurations give the same results)

"bl5340pa_dvk_cpuapp.overlay"

/ {
	chosen {
	   zephyr,uart-mcumgr = &cdc_acm1;
	   zephyr,bt-c2h-uart = &cdc_acm0;
	};
 };
 

&zephyr_udc0 {
	cdc_acm0: cdc_acm0 {
		compatible = "zephyr,cdc-acm-uart";
	};
	cdc_acm1: cdc_acm1 {
		compatible = "zephyr,cdc-acm-uart";
	};
};

&uart0 {
	current-speed = <1000000>;
	status = "disabled";
	hw-flow-control;
};

 

The main.c source file of HCI_USB and HCI_UART is unedited from the examples.

As a note on the above, this has also been tested without any of the FEM or MCUBOOT/MCUMGR configuration to ensure that it is not affecting the projects. The same effect was observed.

Questions:

  1. Is there something that I am missing that is causing this issue?
    1. On the Zephyr Configuration side?
    2. On the Linux/BlueZ setup side?
  2. Is there another method to perform my desired actions that would avoid these issues?
  3. Is this the right place to investigate this?
  4. Or any other advice would be greatly appreciated.

Thank you to whoever can help with this issue, as it has been causing me a lot of trouble. Let me know if I can supply any additional information that would assist in resolving this.

  • Hi Devin

    There are some configuration changes you could try on the nRF side in order to try and improve the reliability, in particular the CONFIG_BT_RX_STACK_SIZE configuration. 

    Could you try to add the following config settings both for the appcore (hci_usb/hci_uart) and netcore (hci_rpmsg/hci_ipc), and see if it works better?

    CONFIG_BT_BUF_CMD_TX_COUNT=10
     
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    CONFIG_BT_BUF_CMD_TX_SIZE=255
     
    CONFIG_BT_RX_STACK_SIZE=2048
     

    If it is still unreliable, would you be able to provide the full btmon log, and ideally also the nRF side log, for analysis? 

    Best regards
    Torbjørn

  • Thanks for your response!

    I have attached the log files from dmesg and btmon, and the log from the Zephyr console. This test used HCI_UART with the CDC-ACM driver and a second CDC-ACM interface for output of the Zephyr console logs.
    7624.btmon.txt

    4643.dmesg.txt

    The bottom of the dmesg file applies to the current test.

    5224.log.txt

    I switched back to the default HCI_UART sample, added configuration for FEM, LOG, CDC-ACM, and the suggested additions from you as well. I also then tested with additions for Clock and DCDC, with the same results. (I also tested the additions on HCI_USB sample with additions for FEM, with similar results).

    Below are the new config files for reference: (CLOCK and DCDC included)

    --- PRJ.CONF ---
    
    CONFIG_USB_DEVICE_STACK=y
    CONFIG_USB_DEVICE_PRODUCT="Zephyr HCI UART sample"
    CONFIG_USB_CDC_ACM=y
    CONFIG_USB_DEVICE_INITIALIZE_AT_BOOT=n
    
    
    CONFIG_GPIO=y
    CONFIG_SERIAL=y
    CONFIG_UART_INTERRUPT_DRIVEN=y
    CONFIG_BT=y
    CONFIG_BT_HCI_RAW=y
    CONFIG_BT_HCI_RAW_H4=y
    CONFIG_BT_HCI_RAW_H4_ENABLE=y
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_CMD_TX_SIZE=255
    CONFIG_BT_BUF_EVT_DISCARDABLE_SIZE=255
    CONFIG_BT_CTLR_ASSERT_HANDLER=y
    CONFIG_BT_MAX_CONN=16
    CONFIG_BT_TINYCRYPT_ECC=n
    CONFIG_BT_CTLR_DTM_HCI=y
    
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=512
    
    # Workaround: Unable to allocate command buffer when using K_NO_WAIT since
    # Host number of completed commands does not follow normal flow control.
    CONFIG_BT_BUF_CMD_TX_COUNT=10
    
    #---------Nordic
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_CMD_TX_COUNT=10  
    
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_ACL_RX_SIZE=255   
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    # Already in the Example HCI_UART
    CONFIG_BT_BUF_CMD_TX_SIZE=255    
     
    CONFIG_BT_RX_STACK_SIZE=2048
    
    #LOG---------------------------
    
    CONFIG_LOG=y
    
    CONFIG_THREAD_NAME=y
    CONFIG_THREAD_ANALYZER=y
    CONFIG_THREAD_ANALYZER_AUTO=y
    CONFIG_THREAD_ANALYZER_RUN_UNLOCKED=y
    
    CONFIG_HW_STACK_PROTECTION=y
    
    CONFIG_LOG_MODE_DEFERRED=y
    CONFIG_LOG_MODE_OVERFLOW=y
    CONFIG_CONSOLE=y
    CONFIG_STDOUT_CONSOLE=y
    CONFIG_LOG_BACKEND_UART=y
    CONFIG_LOG_DEFAULT_LEVEL=3
    
    # INDV
    CONFIG_USB_DRIVER_LOG_LEVEL_INF=y
    CONFIG_USB_DEVICE_LOG_LEVEL_INF=y
    CONFIG_BT_DEBUG_LOG=y
    CONFIG_BT_HCI_DRIVER_LOG_LEVEL_DBG=y
    CONFIG_BT_HCI_CORE_LOG_LEVEL_DBG=y
    
    #--------------------------------------------------------------------------------
    # Enable External Crystals
    #--------------------------------------------------------------------------------
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_20PPM=y
    
    # Enable the External Low Frequency Oscillator
    CONFIG_SOC_ENABLE_LFXO=y
    CONFIG_SOC_LFXO_CAP_INT_7PF=y
    
    # Enable High Frequency Crystal internal capacitors
    CONFIG_SOC_HFXO_CAP_INTERNAL=y
    CONFIG_SOC_HFXO_CAP_INT_VALUE_X2=27
    
    
    
    #--------------------------------------------------------------------------------
    # Enable DC/DC regulators
    #--------------------------------------------------------------------------------
    # Enable App DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_APP=y
    
    # Enable Net DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_NET=y
    
    # Disable High Voltage DC/DC regulator
    CONFIG_BOARD_ENABLE_DCDC_HV=n

    --- HCI_RPMSG.CONF ---
    
    #################################################################################
    #
    #  HCI_RPMSG Child Image Configuration 
    #
    #################################################################################
    
    ##########################################################################
    ## LOGGING Configuration --------------------------------------
    CONFIG_LOG=y
    CONFIG_BT_DEBUG_LOG=y
    CONFIG_LOG_MODE_DEFERRED=y
    CONFIG_LOG_MODE_OVERFLOW=y
    CONFIG_LOG_DEFAULT_LEVEL=3
    
    #--------------------------------------------------------------------------------
    # Disable network cpu serial port, console, and UART console for low power
    #--------------------------------------------------------------------------------
    #CONFIG_SERIAL=n
    #CONFIG_CONSOLE=n
    #CONFIG_UART_CONSOLE=n
    #CONFIG_LOG_BACKEND_UART=n
    
    #=========================================================================
    #--------------------------------------------------------------------------------
    # Enable Pin Control
    #--------------------------------------------------------------------------------
    CONFIG_PINCTRL=y
    #--------------------------------------------------------------------------------
    # Enable Pin Control
    #--------------------------------------------------------------------------------
    
    #--------------------------------------------------------------------------------
    # Enable SPI
    #--------------------------------------------------------------------------------
    
    # Enabled for FEM control
    CONFIG_SPI=y
    
    
    #--------------------------------------------------------------------------------
    # Enable MPSL and FEM
    #--------------------------------------------------------------------------------
    CONFIG_MPSL=y
    CONFIG_MPSL_FEM=y
    CONFIG_MPSL_FEM_NRF21540_GPIO=y
    CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
    CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
    CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=y
    
    CONFIG_BT_PHY_UPDATE=n
    
    
    ##---------------NORDIC
    CONFIG_BT_BUF_CMD_TX_COUNT=10
     
    CONFIG_BT_BUF_EVT_RX_COUNT=16
     
    CONFIG_BT_BUF_EVT_RX_SIZE=255
    CONFIG_BT_BUF_ACL_RX_SIZE=255
    CONFIG_BT_BUF_ACL_TX_SIZE=251
    CONFIG_BT_BUF_CMD_TX_SIZE=255
     
    CONFIG_BT_RX_STACK_SIZE=2048

     --- HCI_RPMSG.overlay ---
    //Disable for FEM use, as conflicts with SPI on netcore
    &uart0{
        status = "disabled";
    };

    --- bl5340pa_dvk_cpuapp.overlay ---
    
    / {
    	chosen {
    	   zephyr,bt-c2h-uart = &cdc_acm0;
    	   zephyr,console = &cdc_acm1;
    	};
     };
     
    
    &zephyr_udc0 {
    	cdc_acm0: cdc_acm0 {
    		compatible = "zephyr,cdc-acm-uart";
    	};
    	cdc_acm1: cdc_acm1 {
    		compatible = "zephyr,cdc-acm-uart";
    	};
    };
    &uart0 {
    	compatible = "nordic,nrf-uarte";
    	current-speed = <1000000>;
    	status = "okay";
    	hw-flow-control;
    };

    The test consisted of setting up the board as a serial BT adapter using btattach (pictured below) and scanning with bluetoothctl until it failed. After it failed, I ran some more bluetoothctl commands to test the failure (pictured below).

    Let me know if there are any questions, if you need different LOG settings, or any other information from me. Thank you for your assistance!

    Edit:

    It seems that after the failure, any HCI command gets this response from the bl5340pa USB, or some version of the "Final HCI Buffer". The above output was logged after trying to bring the HCI device back online; the board never freezes, the HCI interface just stops functioning. I used hciconfig hci1 up, which timed out.

     

    Edit: I also tested this across Linux Kernel Versions 5.4.0, 5.15.0, 5.16.0 (all generic), in order to rule out this possibility, and the same issue occurred in all tests.

  • Devin,

    I am trying to reproduce this issue, and just wanted to clarify a few things. Can you confirm or refute whether these combinations fail?

    1. Unmodified HCI_UART sample with your custom board on Laird 2.4.1 fork.
    2. Unmodified HCI_UART sample with your custom board on stock nRF Connect SDK 2.4.1.
    3. Unmodified HCI_UART sample with nRF5340DK on Laird 2.4.1 fork.
    4. Unmodified HCI_UART sample with nRF5340DK on stock nRF Connect SDK 2.4.1.

    Did you run all of them with no FEM enabled, without coded PHY or any other changes?

    Regards,
    Knut

  • Thanks for your response. I have run these tests a few times on the custom board, the BL5340PA DK, and the BL5340 DK; my nrf5340DK was acting up, but I can give that another try as well. As a sanity check for me and others supporting this issue, I will run each of these tests today, but they will all still need the CDC-ACM modification made to the projects in order to use the USB interface.

    I will create fresh copies of the 2.4.1 fork and stock version of the SDK to ensure nothing else has been changed.

  • HCI_BLE_TEST.zip

    An update on the tests performed today: I tested the HCI_USB and HCI_UART example projects on the BL5340DK, BL5340PADK, and nrf5340DK on a fresh copy of the Laird fork of the 2.4.1 SDK. Below are my findings per device per project. Note that the maximum runtime I allowed was one hour; if no failure had occurred by then, I stopped the test. I have also attached a .zip of all the projects used for these tests. They are named after the example they were taken from and include the build directories for the boards that were tested with them, in case looking through the generated configs is desired.

    Time Ran: amount of time run without issue; a "+" means that no failure occurred

    Devices Scanned: count of devices listed by the bluetoothctl "devices" command after scanning was complete

    HCI_USB_NO_PA (no modification)

    - BL5340DK: Time Ran: 1+ hours, Devices Scanned: 1259
    - BL5340PADK: Time Ran: 2.5 min, 30 sec, Devices Scanned (30 sec): 114
    - nrf5340DK: Time Ran: 1+ hours, Devices Scanned: 901

    HCI_USB_WITH_PA (added child_image/hci_rpmsg.conf with MPSL PA config)

    - BL5340PADK: Time Ran: 45 sec, 20 sec, 30 sec, Devices Scanned (30 sec): 134

    HCI_UART_NO_PA (added CDC-ACM redirection)

    - BL5340DK: Time Ran: 1+ hours, Devices Scanned: 1520
    - BL5340PADK: Time Ran: 1 min, 1.5 min, Devices Scanned (1 min): 107
    - nrf5340DK: Time Ran: 1+ hours, Devices Scanned: 902

    HCI_UART_WITH_PA (added CDC-ACM redirection and child_image/hci_rpmsg.conf)

    - BL5340PADK: Time Ran: 45 sec, 45 sec, Devices Scanned: 156

    I did not run this test with the original 2.4.1 Zephyr/nrf SDK as I did not have any device fail in these hour long tests besides the BL5340PADK. I have had the same failure on the BL5340DK without the PA before, but I will be running that test tonight to give it more time to fail and to confirm this. I did not find it useful testing the BL5340PADK on the original 2.4.1 zephyr/nrf SDK as scanning is really slow/inoperable due to not having the configuration to deal with the FEM/External antenna. 

    I used only the DKs, as I have seen the same behavior on the BL5340PADK and the custom board, and this should be easier to test for others following along.

    No other changes were made in these projects besides what is listed next to the project name, but as we move forward we can edit them further to test. I am also able to add SEGGER RTT logs for the netcore and either CDC-ACM or SEGGER logs for the app core if desired.
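
    For anyone following along, a minimal sketch of what RTT logging on the netcore could look like in child_image/hci_rpmsg.conf. These are standard Zephyr logging options, but this exact fragment is an assumption and was not part of the tested projects; the UART backend is turned off because uart0 is disabled on the netcore when the FEM is in use.

    ```
    # Sketch only (assumed, untested here): route netcore logs to SEGGER RTT
    CONFIG_LOG=y
    CONFIG_USE_SEGGER_RTT=y
    CONFIG_LOG_BACKEND_RTT=y
    # uart0 is disabled on the netcore for FEM use, so do not log over UART
    CONFIG_LOG_BACKEND_UART=n
    CONFIG_LOG_DEFAULT_LEVEL=3
    ```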


    As a reminder, the same error occurs with both HCI_USB and HCI_UART.


    Thanks!

  • Thanks a lot for taking the time to do these tests. We will look into your findings and get back to you.

  • No problem, thank you.

    This morning my BL5340DK on the default HCI_USB project is still running with no issues. It has been running for about 16 hours in my very busy BT environment, with only a single "scan on" request to start it. The DK had a little under 7000 MAC addresses stored from scanned devices, as listed by the "bluetoothctl devices" command.


    The above logs are from dmesg on the Linux device running the DK, as these are really all I can retrieve with the current project settings. I see the normal "advertising data len corrected" that seems to appear on any BT adapter with enough runtime, and the "bt_err_ratelimited" which I have seen on the functional tests of the HCI projects. 



    Since the failure I saw before was on the BL5340DK with HCI_UART, but it had more features enabled, I will run a long run test with the Laird SDK and the default HCI_UART today to see if I can catch any failure to compare to the normal Zephyr/nrf 2.4.1 SDK. 

  • The default HCI_UART project on the BL5340DK also ran without issues for 23 hours of continuous scanning on the Laird fork of the SDK. Thanks!

  • At the end of my day on Friday, I tried disabling SPI communication from the netcore on the BL5340PA, which in turn disables the MPSL FEM SPI communication. From my understanding, the FEM on the BL5340PA functions with GPIO control alone, but this does not allow the FEM gain to be changed from its default value of 20 dB (assumed from the BL5340PA datasheet). I then started scanning and let it run for a longer period, as I am currently not in an environment with a large number of BT devices to scan (in addition to the BT devices already in my environment, I placed an advertiser that advertises for 30 seconds every minute to speed up testing). The DK scanned continuously without interaction for 2.5 days until I stopped it today.

    Seeing this I did some further testing, with default projects to confirm. I made a default HCI_USB project and added in the HCI_RPMSG configs:

    CONFIG_MPSL=y
    CONFIG_MPSL_FEM=y
    CONFIG_MPSL_FEM_NRF21540_GPIO=y
    CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
    CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
    CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=y

    This enables the MPSL SPI communication. In my new environment, this project ran for 6-10 minutes on each of a few attempts before it encountered the problem.

    Then I used the same HCI_RPMSG configs, except with CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=n. This new image ran for 40 minutes without issue before I stopped it.

    I then repeated the above test with the default HCI_UART project (adding in the CDC-ACM configs as well). I saw the same results: with SPI enabled it stopped in 6-10 minutes, and without it, it has continued running for over an hour without issue.
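
    For clarity, the GPIO-only variant used in these comparison runs is the fragment listed earlier with only the last option changed. The comments are added here for illustration:

    ```
    # hci_rpmsg.conf: FEM driven over GPIO only, SPI control of the FEM disabled
    CONFIG_MPSL=y
    CONFIG_MPSL_FEM=y
    CONFIG_MPSL_FEM_NRF21540_GPIO=y
    CONFIG_MPSL_FEM_NRF21540_TX_GAIN_DB=20
    CONFIG_BT_CTLR_TX_PWR_ANTENNA=20
    # Only difference from the failing configuration:
    CONFIG_MPSL_FEM_NRF21540_GPIO_SPI=n
    ```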

    To me this seems to confirm that the issue lies with SPI communication to the FEM. Looking at the assert that was raised in the original netcore error log, I would think this is due to a thread priority issue between SPI and the radio or other processes running on the netcore, or an issue with the communication speed to the FEM. I also noticed that the BL5340PA datasheet states that the SPI interface cannot be used while the radio is in use, so possibly some conflict here is causing the issue.

    I will be doing some more tests with logging enabled on the net and app cores to see if I can catch any differences between the two, SPI enabled/disabled. But, let me know if there are any ideas or suggestions around this. Thanks again!

    General note: In a low-power device that uses the BL5340PA (not the one in question here), I found that without SPI communication enabled there were power consumption issues at low power, which matters less for this BLE adapter. Laird told me this communication is required for proper functionality of the module. It may only be needed for certifications other than FCC, which I will confirm, but CE is a requirement for me, which makes this SPI communication necessary.

  • Thanks again for the detailed report. We will look into this and get back to you.

  • I just want to add that the team of Laird FAEs assisting with this have previously found issues with the Nordic FEM driver when SPI communication is included (not relating specifically to the hci_usb/hci_uart projects we are looking into).

    The ticket number is: 311698

    However, it seems I am unable to access this ticket. I just wanted to put this here in case it helps the Nordic team reference any information, as I was told this was tested with Nordic hardware.

    Thanks!

  • Hey, just wanted to check in to see whether you were able to reproduce this issue on your side, and whether there are any updates, progress, or other issues you have run into.

    To me it seems that the SPI behavior of the nRF FEM driver needs to be edited slightly to do less SPI communication, or to use only GPIO control except at power-up or on a TX power change, where SPI is needed to set the FEM gain, if that matches the nRF FEM behavior. For my use case I only need a static TX power, which only needs to change to adhere to FCC, CE, and similar regulations.

    Thanks!
