MPSL Fault upon connection when using nRF Connect Desktop BLE App

Hi,

Our team has encountered an intermittent issue when attempting to connect to our nRF device (nRF52833) using the nRF Connect Desktop BLE app. This issue does not occur when connecting with the nRF Connect Mobile app on either Android or iOS. Our application is using the nRF Connect SDK v2.5.1.

This error occurs as soon as the device connects, but before the BT connected callback is triggered. We have tried increasing the MPSL work queue stack size (CONFIG_MPSL_WORK_STACK_SIZE) to 8K, but the issue persists. We set the Bluetooth logs to debug level, but saw no difference between the logs of failing and successful connection attempts.

Below are the relevant Kconfigs:

CONFIG_BT=y
CONFIG_BT_ID_MAX=1
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_BAS=y
CONFIG_BT_HRS=n
CONFIG_BT_DEVICE_APPEARANCE=31368
CONFIG_BT_LIM_ADV_TIMEOUT=32
# seconds
# BT Pairing
# CONFIG_BT_MAX_CONN Has to be 2 when calling "bt_le_adv_start" from disconnected() callback
CONFIG_BT_SMP=y
CONFIG_BT_SMP_APP_PAIRING_ACCEPT=y
CONFIG_BT_MAX_CONN=2
CONFIG_BT_MAX_PAIRED=2
CONFIG_BT_FIXED_PASSKEY=y
CONFIG_BT_FILTER_ACCEPT_LIST=y
# The use of RPA is not enabled
CONFIG_BT_PRIVACY=n
# DIS Items
CONFIG_BT_DIS=y
CONFIG_BT_DIS_PNP=n
CONFIG_BT_DIS_HW_REV=y
CONFIG_BT_DIS_HW_REV_STR="HW V1"
CONFIG_BT_DIS_FW_REV=y
CONFIG_BT_DIS_FW_REV_STR="FW V0.4.1"
CONFIG_BT_DIS_SETTINGS=y
CONFIG_BT_DIS_STR_MAX=20
# Below is setup to let DIS information be read from settings
# CONFIG_SETTINGS_NONE=y
# Enable BT Bonding and DIS info read from settings
CONFIG_BT_SETTINGS=y
CONFIG_SETTINGS=y
CONFIG_SETTINGS_NVS=y
CONFIG_SETTINGS_RUNTIME=y
CONFIG_FLASH=y
CONFIG_FLASH_PAGE_LAYOUT=y
CONFIG_FLASH_MAP=y
CONFIG_NVS=y
CONFIG_NVS_LOG_LEVEL_DBG=y
CONFIG_BT_BONDABLE=y
CONFIG_BT_ID_UNPAIR_MATCHING_BONDS=y
# BT size settings
#CONFIG_BT_L2CAP_TX_MTU=247
#CONFIG_BT_BUF_ACL_RX_SIZE=251
#CONFIG_BT_BUF_ACL_TX_SIZE=251
#CONFIG_BT_CTLR_DATA_LENGTH_MAX=251
# Stack sizes
CONFIG_MPU_STACK_GUARD=y
CONFIG_BT_RX_STACK_SIZE=2048
# Note: This stack must be 2K or higher to avoid stack overflow when pairing to Android Devices
CONFIG_MPSL_WORK_STACK_SIZE=2048
CONFIG_BT_HCI_TX_STACK_SIZE_WITH_PROMPT=y
CONFIG_BT_HCI_TX_STACK_SIZE=1024

Here are the RTT error logs from the nRF device:

[00:00:22.271,484] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 2
[00:00:22.272,033] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.272,888] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.273,803] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.274,353] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 2
[00:00:22.274,963] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.275,787] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.276,641] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.277,252] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 16
[00:00:22.278,289] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.279,449] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.280,609] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 2
[00:00:22.281,219] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.282,104] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 2
[00:00:22.282,714] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.283,569] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.284,179] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.285,095] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.286,224] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 5
[00:00:22.286,743] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 16
[00:00:22.287,628] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.289,672] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 16
[00:00:22.290,374] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.291,259] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.292,175] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 16
[00:00:22.292,846] <dbg> bt_gatt: bt_gatt_attr_read: handle 0x0000 offset 0 length 19
[00:00:22.293,731] <dbg> bt_gatt: db_hash_gen: Hash:
                                  54 6d 2d c6 e1 e6 d3 79  22 5b 9d 50 ba a9 7f 92 |Tm-....y "[.P....
[00:00:22.323,242] <dbg> bt_hci_core: bt_recv: buf 0x2001c4e0 len 68
[00:00:22.323,669] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
[00:00:22.324,096] <dbg> bt_hci_core: rx_work_handler: buf 0x2001c4e0 type 1 len 68
[00:00:22.324,523] <dbg> bt_hci_core: hci_event: event 0x3e
[00:00:22.324,890] <dbg> bt_hci_core: hci_le_meta_event: subevent 0x08
[00:00:22.325,256] <dbg> bt_ecc: bt_hci_evt_le_pkey_complete: status: 0x00
[00:00:22.325,653] <dbg> bt_smp: bt_smp_pkey_ready:
[00:00:25.279,876] <dbg> bt_sdc_hci_driver: event_packet_process: LE Meta Event (0x0a), len (31)
[00:00:25.280,548] <dbg> bt_hci_core: bt_recv: buf 0x2001c4e0 len 33
[00:00:25.281,158] <err> os: ***** BUS FAULT *****
[00:00:25.281,463] <err> os:   Imprecise data bus error
[00:00:25.281,829] <err> os: r0/a1:  0x0014043e  r1/a2:  0x2000410b  r2/a3:  0x0000000a
[00:00:25.282,318] <err> os: r3/a4:  0x0014043f r12/ip:  0x00000007 r14/lr:  0x0000366b
[00:00:25.282,775] <err> os:  xpsr:  0x21000000
[00:00:25.283,142] <err> os: s[ 0]:  0x200040e4  s[ 1]:  0x2000230c  s[ 2]:  0x00000000  s[ 3]:  0x2000237c
[00:00:25.283,782] <err> os: s[ 4]:  0x20004107  s[ 5]:  0x200040fe  s[ 6]:  0x00000000  s[ 7]:  0x000035ab
[00:00:25.284,393] <err> os: s[ 8]:  0x20004106  s[ 9]:  0x00000004  s[10]:  0x2000be89  s[11]:  0x2000230c
[00:00:25.285,034] <err> os: s[12]:  0x20007744  s[13]:  0x00000000  s[14]:  0x00054519  s[15]:  0x00000000
[00:00:25.285,614] <err> os: fpscr:  0x00000000
[00:00:25.285,949] <err> os: Faulting instruction address (r15/pc): 0x00011540
[00:00:25.286,376] <err> os: >>> ZEPHYR FATAL ERROR 26: Unknown error on CPU 0
[00:00:25.286,804] <err> os: Current thread: 0x200049c0 (MPSL Work)
[00:00:25.297,119] <err> os: Halting system

Here are the logs from the nRF Connect Desktop BLE app:

15:01:54.723	Scan started
15:02:02.547	Connecting to device
15:02:02.578	Connected to device F7:5A:42:28:61:C8: interval: 12.5ms, timeout: 4000ms, latency: 0
15:02:02.626	Data length updated for device F7:5A:42:28:61:C8, new value is 69
15:02:07.035	Disconnected from device F7:5A:42:28:61:C8, reason: BLE_HCI_CONNECTION_TIMEOUT

Is this a known issue? We are hoping you could shed some light on what could be going wrong.

Any help is appreciated.

  • Hello,

    So does the issue occur every time you try to connect using nRF Connect for Desktop -> Bluetooth Low Energy, or only some of the time?

    What hardware are you running on? Is it an nRF52833 DK, or is it custom HW? If it is custom, does it have an LFXTAL? If so, which one? And does it also happen if you run the same application on a DK? 

    The obvious answer is that the application on the nRF crashes, and the connection from nRF Connect on Desktop times out. But we need to find the reason for the crash. 

    There is a clue in the log from the nRF: the line just above "ZEPHYR FATAL ERROR 26: ..." reports the faulting instruction address (r15/pc). Can you check what is at that address? Please note that this address may change every time you build the application, so please run it again to get the latest, up-to-date address, and then run the following command using your build folder:

    arm-none-eabi-addr2line -e build\zephyr\zephyr.elf 0x<address> 

    Please note that arm-none-eabi-addr2line is an external tool, but you may have it if you have installed the GNU Arm Embedded Toolchain on your computer. If not, you can try to run arm-zephyr-eabi-addr2line using the same parameters, but you need to run it from an environment where the Zephyr toolchain is loaded. Does it point to somewhere in the MPSL part of your application? 
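
As a rough illustration of the lookup described above, the following Python sketch pulls the faulting PC and the link register out of a fault dump and assembles the addr2line command line. The helper names (fault_addresses, addr2line_command) are hypothetical, the two sample log lines are copied from the RTT log in this thread, and the extra -f, -i and -p flags are an addition to the command above (they print function names, inlined frames, and one line per address). Including LR is also an assumption on my part: because the log reports an imprecise bus error, the PC may not be the exact instruction that caused the fault, so the caller's address can be a useful second data point.

```python
import re

# Matches the PC line Zephyr prints on a fatal error, e.g.
# "Faulting instruction address (r15/pc): 0x00011540"
PC_RE = re.compile(r"Faulting instruction address \(r15/pc\):\s*(0x[0-9a-fA-F]+)")
# Matches the link register in the register dump, e.g. "r14/lr:  0x0000366b"
LR_RE = re.compile(r"r14/lr:\s*(0x[0-9a-fA-F]+)")


def fault_addresses(log_text):
    """Return the PC and LR values found in a fault dump, in that order."""
    addresses = []
    for pattern in (PC_RE, LR_RE):
        match = pattern.search(log_text)
        if match:
            # Clear bit 0 (the Thumb state bit), which is set in LR values
            addresses.append(hex(int(match.group(1), 16) & ~1))
    return addresses


def addr2line_command(elf_path, addresses, tool="arm-none-eabi-addr2line"):
    """Build the argument list for addr2line without running it."""
    return [tool, "-e", elf_path, "-f", "-i", "-p", *addresses]


sample_log = """\
[00:00:25.282,318] <err> os: r3/a4:  0x0014043f r12/ip:  0x00000007 r14/lr:  0x0000366b
[00:00:25.286,376] <err> os: Faulting instruction address (r15/pc): 0x00011540
"""

addresses = fault_addresses(sample_log)
command = addr2line_command("build/zephyr/zephyr.elf", addresses)
print(" ".join(command))
```

The printed command can be pasted into a shell where the toolchain is on the PATH. The addresses only resolve correctly against the zephyr.elf from the same build that produced the log.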

    Best regards,

    Edvin

  • Our application is using LVGL and we were running the lvgl_task_handler in the system workqueue. After moving the lvgl_task_handler to a dedicated thread this problem went away. We are still not entirely sure what the cause of the issue was.
