Intermittent BLE connection

I am working with nRF Connect SDK 2.6.1 on a Central and a Peripheral and I am running the BAP similar to the nRF Audio Example. I am using out of band pairing over UART for the initial secure connection. The OOB pairing works fine.

After connection is established and bonds are saved, the peripheral basically always advertises if bonds exist. The central device is turned on and off and scans when it turns on. I know this is a little different than traditionally, but the Basic Audio Profile forced which devices are the Central and the Peripheral.

If I continuously cycle power to the Central and watch the log output, quite frequently I can see that sometimes it needs multiple attempts to connect. It seems to happen more frequently when the area is a little more ACTIVE in the BLE band. 

I'm using Wireshark with the nRF Sniffer to look at the BLE traffic, and it seems like the Peripheral (Slave on the Wireshark log), stops responding to the Centrals (Master) packets. I'm using security level 4 so Wireshark is unable to decrypt the message, but I should still see a response from the Peripheral in order to maintain the connectivity.

It seems quite frequently the peripheral stops responding after it sends the OP Code LL_START_ENC_REQ.  Here's the log of the connection and when the peripheral stops responding.

Here is a successful connection where you can see constant communication between the central and the peripheral:

On my log output on the Central and the Peripheral when the failure occurs I can see the connection occur but then a "Security failed" notification. It then seems to struggle with the disconnection but eventually does, it then starts advertising again and then successfully connects for the most part on the 2nd attempt. Sometimes it takes a couple of attempts but it seems to always reconnect.


-- [00:04:21.230,010] <dbg> event: thread_status_event_supv: Conn evnt
-- [00:04:21.580,444] <wrn> ble_supv: Security failed: level 1 err 9
-- [00:04:21.580,474] <wrn> ble_supv: Failed to disconnect -128
-- [00:04:21.580,718] <dbg> event: thread_status_event_supv: Disc Evnt
-- [00:04:21.640,258] <inf> ble_supv: Advertising to xx:xx:xx:xx:xx:xx (random) started

-- [00:04:25.829,498] <dbg> event: thread_status_event_supv: Conn evnt

I am wondering about the procedure when the Central device is powered down and disconnects. I'm basically just calling the functions to stop streaming and disconnect and then powering down. The peripheral does seem to take a couple of seconds to recognize the disconnection and receive the "disconnect" callback. But even if I ensure to wait for this callback on the peripheral to occur before turning the central back on, I still get the occasional failure. Is there still something hanging up in the peripheral device from the connection that is occuring due to my procedure for the disconnection? What should the procedure be to properly stop streaming and disconnect from a peripheral? Is it possible its somehow using the "conn" handle from the previous connection?

The current BLE disconnection procedure when the central is shut down. There is a little more error checking, but basically if we're streaming, stop the streaming and then disconnect:

if (!bt_content_ctlr_media_state_playing()){
        return 0; // we are already paused
    }
bt_content_ctrl_stop(default_conn);
bt_conn_disconnect(default_conn, BT_HCI_ERR_REMOTE_USER_TERM_CONN);
Its also possible the disconnection procedure has nothing to do with this failure. I'm using whitelisting on the peripheral so it only connects to the bonded device. But I am just using regular advertising rather than directed advertising. I didn't see the need for directed advertising since we're using out of band pairing. 
  • Could you please share how the v2.7.0 change item could result in the issue? I don't quite follow.

    One thing stands out in the Peripheral log to me:

    <dbg> bt_conn: bt_conn_set_state: connected -> disconnect-complete

    This is odd because this log can only be produced here: https://github.com/nrfconnect/sdk-zephyr/blob/v3.5.99-ncs1/subsys/bluetooth/host/conn.c#L800

    Does this consistently appear before a failure?

    Could you please set a breakpoint there and use debug mode to get the call stack leading up to that?

    While doing that, could you please also enable CONFIG_BT_HCI_CORE_LOG_LEVEL_DBG?

    Also, regarding the initial suspicion that the connection handle is somehow being reused and mixed up, do you see this being the case in the full log? The log so far is only for the attempt where the issue happens, so I don't see what the connection handle is before that.

    Another thing is, if you want to upload large amount of log text, you might want to use the Insert > Code function:

    Unfortunately, I am out of office tomorrow. Another engineer will follow-up with you. We are all facing abnormally high loading, so please excuse us if there is any delay in following up.

  • Regarding the v2.7.0 change, I don't know what exactly is going on in the BLE stack during the connection procedure, but I was speculating if the Central tried to connect a 2nd time to the Peripheral in the middle of the connection process, it could cause a disconnection from the 1st attempt or possibly a failed 2nd connection attempt because its now already connected. I think this would be a question for someone more familiar with whoever addressed this "issue" that was resolved in 2.7.0. Etc-were other people seeing similar issues that I'm seeing?

    I will follow up with the responses to the remainder of your questions.

  • -- [00:08:27.978,240] <inf> ble_supv: Advertising to CB:80:08:52:DA:8D (random) started
    
    -- [00:08:31.791,351] <dbg> bt_hci_core: bt_recv: buf 0x20023f5c len 33
    -- [00:08:31.791,381] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
    -- [00:08:31.791,381] <dbg> bt_hci_core: rx_work_handler: buf 0x20023f5c type 1 len 33
    -- [00:08:31.791,381] <dbg> bt_hci_core: hci_event: event 0x3e
    -- [00:08:31.791,412] <dbg> bt_hci_core: hci_le_meta_event: subevent 0x0a
    -- [00:08:31.791,534] <dbg> bt_hci_core: bt_hci_le_enh_conn_complete: status 0x00 handle 119                                                                                                 role 1 peer CB:80:08:52:DA:8D (random) peer RPA 00:00:00:00:00:00
    -- [00:08:31.791,595] <dbg> bt_hci_core: bt_hci_le_enh_conn_complete: local RPA 00:00:00:00:0                                                                                                0:00
    -- [00:08:31.791,595] <dbg> bt_conn: bt_conn_ref: handle 0 ref 1 -> 2
    -- [00:08:31.791,625] <dbg> bt_conn: bt_conn_unref: handle 0 ref 2 -> 1
    -- [00:08:31.791,625] <dbg> bt_conn: bt_conn_ref: handle 0 ref 1 -> 2
    -- [00:08:31.791,656] <dbg> bt_conn: bt_conn_set_state: connecting-adv -> connected
    -- [00:08:31.791,687] <dbg> bt_smp: bt_smp_accept: conn 0x20005778 handle 119
    -- [00:08:31.791,687] <dbg> bt_smp: bt_smp_connected: chan 0x200066d4 cid 0x0006
    -- [00:08:31.792,022] <dbg> bt_hci_core: bt_hci_cmd_create: opcode 0x2016 param_len 2
    -- [00:08:31.792,053] <dbg> bt_hci_core: bt_hci_cmd_create: buf 0x200252e8
    -- [00:08:31.792,053] <dbg> bt_hci_core: bt_hci_cmd_send_sync: buf 0x200252e8 opcode 0x2016 l                                                                                                en 5
    -- [00:08:31.792,083] <dbg> bt_hci_core: process_events: count 2
    -- [00:08:31.792,083] <dbg> bt_hci_core: process_events: ev->state 4
    -- [00:08:31.792,114] <dbg> bt_hci_core: send_cmd: calling net_buf_get
    -- [00:08:31.792,114] <dbg> bt_hci_core: send_cmd: calling sem_take_wait
    -- [00:08:31.792,114] <dbg> bt_hci_core: send_cmd: Sending command 0x2016 (buf 0x200252e8) to                                                                                                 driver
    -- [00:08:31.792,114] <dbg> bt_hci_core: bt_send: buf 0x200252e8 len 5 type 0
    -- [00:08:31.792,144] <dbg> bt_hci_core: process_events: ev->state 1
    -- [00:08:31.792,175] <dbg> bt_conn: bt_conn_prepare_events:
    -- [00:08:31.792,175] <dbg> bt_conn: conn_prepare_events: Adding conn 0x20005778 to poll list
    -- [00:08:31.792,175] <dbg> bt_conn: conn_prepare_events: wait on host fifo
    -- [00:08:31.792,205] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 3 events
    -- [00:08:31.792,236] <dbg> bt_hci_core: bt_recv: buf 0x20023f7c len 8
    -- [00:08:31.792,327] <dbg> event: thread_status_event_supv: Conn evnt
    -- [00:08:31.792,327] <inf> oob: Push:20
    -- [00:08:31.792,419] <dbg> event: thread_status_event_supv: Bonds:Conn Evnt
    -- [00:08:31.792,419] <dbg> event: thread_status_event_supv: No phy conn
    -- [00:08:31.792,449] <inf> oob: MSG 20
    -- [00:08:31.792,449] <inf> oob: OOB msgq during BLE
    -- [00:08:31.792,572] <dbg> bt_hci_core: bt_recv: buf 0x200252e8 len 6
    -- [00:08:31.792,572] <dbg> bt_hci_core: hci_cmd_status: opcode 0x2016
    -- [00:08:31.792,572] <dbg> bt_hci_core: hci_cmd_done: opcode 0x2016 status 0x00 buf 0x200252                                                                                                e8
    -- [00:08:31.792,602] <dbg> bt_hci_core: bt_hci_cmd_send_sync: rsp 0x200252e8 opcode 0x2016 l                                                                                                en 0
    -- [00:08:31.792,633] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:31.792,633] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
    -- [00:08:31.792,663] <dbg> bt_hci_core: rx_work_handler: buf 0x20023f7c type 1 len 8
    -- [00:08:31.792,663] <dbg> bt_hci_core: hci_event: event 0x3e
    -- [00:08:31.792,663] <dbg> bt_hci_core: hci_le_meta_event: subevent 0x12
    -- [00:08:31.792,694] <dbg> bt_adv: bt_hci_le_adv_set_terminated: status 0x00 adv_handle 0 co                                                                                                nn_handle 0x77 num 0
    -- [00:08:31.792,694] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:31.792,694] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:31.842,376] <dbg> bt_hci_core: bt_recv: buf 0x200246cc len 11
    -- [00:08:31.842,407] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
    -- [00:08:31.842,407] <dbg> bt_hci_core: rx_work_handler: buf 0x200246cc type 3 len 11
    -- [00:08:31.842,437] <dbg> bt_hci_core: hci_acl: buf 0x200246cc
    -- [00:08:31.842,437] <dbg> bt_hci_core: hci_acl: handle 119 len 7 flags 2
    -- [00:08:31.842,437] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:31.842,468] <dbg> bt_conn: tx_notify: conn 0x20005778
    -- [00:08:31.842,468] <dbg> bt_conn: bt_conn_recv: handle 119 len 7 flags 02
    -- [00:08:31.842,468] <dbg> bt_conn: bt_acl_recv: First, len 7 final 3
    -- [00:08:31.842,468] <dbg> bt_conn: bt_acl_recv: Successfully parsed 7 byte L2CAP packet
    -- [00:08:31.842,498] <dbg> bt_conn: bt_conn_send_cb: conn handle 119 buf len 7 cb 0x59169 us                                                                                                er_data 0x20014954
    -- [00:08:31.842,529] <dbg> bt_conn: bt_conn_ref: handle 119 ref 2 -> 3
    -- [00:08:31.842,529] <dbg> bt_conn: bt_conn_unref: handle 119 ref 3 -> 2
    -- [00:08:31.842,529] <dbg> bt_hci_core: bt_hci_host_num_completed_packets: Reporting complet                                                                                                ed packet for handle 119
    -- [00:08:31.842,559] <dbg> bt_hci_core: bt_hci_cmd_create: opcode 0x0c35 param_len 5
    -- [00:08:31.842,559] <dbg> bt_hci_core: bt_hci_cmd_create: buf 0x200252e8
    -- [00:08:31.842,559] <dbg> bt_hci_core: bt_hci_cmd_send: opcode 0x0c35 len 8
    -- [00:08:31.842,590] <dbg> bt_hci_core: bt_send: buf 0x200252e8 len 8 type 0
    -- [00:08:31.842,620] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:31.842,651] <dbg> bt_hci_core: process_events: count 3
    -- [00:08:31.842,651] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:31.842,651] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:31.842,651] <dbg> bt_hci_core: process_events: ev->state 4
    -- [00:08:31.842,681] <dbg> bt_conn: bt_conn_process_tx: conn 0x20005778
    -- [00:08:31.842,681] <dbg> bt_conn: send_buf: conn 0x20005778 buf 0x20025820 len 7
    -- [00:08:31.842,681] <dbg> bt_conn: send_buf: send single
    -- [00:08:31.842,681] <dbg> bt_conn: do_send_frag: conn 0x20005778 buf 0x20025820 len 7 flags                                                                                                 0x02
    -- [00:08:31.842,712] <dbg> bt_hci_core: bt_send: buf 0x20025820 len 11 type 2
    -- [00:08:31.842,742] <dbg> bt_conn: bt_conn_prepare_events:
    -- [00:08:31.842,742] <dbg> bt_conn: conn_prepare_events: Adding conn 0x20005778 to poll list
    -- [00:08:31.842,773] <dbg> bt_conn: conn_prepare_events: wait on host fifo
    -- [00:08:31.842,773] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 3 events
    -- [00:08:31.992,156] <dbg> bt_hci_core: bt_recv: buf 0x200248a8 len 7
    -- [00:08:31.992,156] <dbg> bt_hci_core: hci_num_completed_packets: num_handles 1
    -- [00:08:31.992,156] <dbg> bt_hci_core: hci_num_completed_packets: handle 119 count 1
    -- [00:08:31.992,187] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:31.992,218] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:31.992,218] <dbg> bt_conn: tx_complete_work: conn 0x20005778
    -- [00:08:31.992,248] <dbg> bt_conn: tx_notify: conn 0x20005778
    -- [00:08:31.992,248] <dbg> bt_conn: tx_notify: tx 0x20014588 cb 0x59169 user_data 0x20014954
    -- [00:08:31.992,584] <dbg> bt_hci_core: bt_recv: buf 0x20023f7c len 15
    -- [00:08:31.992,614] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
    -- [00:08:31.992,614] <dbg> bt_hci_core: rx_work_handler: buf 0x20023f7c type 1 len 15
    -- [00:08:31.992,645] <dbg> bt_hci_core: hci_event: event 0x3e
    -- [00:08:31.992,645] <dbg> bt_hci_core: hci_le_meta_event: subevent 0x05
    -- [00:08:31.992,645] <dbg> bt_hci_core: le_ltk_request: handle 119
    -- [00:08:31.992,675] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:31.992,675] <dbg> bt_hci_core: bt_hci_cmd_create: opcode 0x201a param_len 18
    -- [00:08:31.992,675] <dbg> bt_hci_core: bt_hci_cmd_create: buf 0x200252e8
    -- [00:08:31.992,706] <dbg> bt_hci_core: bt_hci_cmd_send: opcode 0x201a len 21
    -- [00:08:31.992,706] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:31.992,736] <dbg> bt_hci_core: process_events: count 3
    -- [00:08:31.992,736] <dbg> bt_hci_core: process_events: ev->state 4
    -- [00:08:31.992,767] <dbg> bt_hci_core: send_cmd: calling net_buf_get
    -- [00:08:31.992,767] <dbg> bt_hci_core: send_cmd: calling sem_take_wait
    -- [00:08:31.992,767] <dbg> bt_hci_core: send_cmd: Sending command 0x201a (buf 0x200252e8) to                                                                                                 driver
    -- [00:08:31.992,797] <dbg> bt_hci_core: bt_send: buf 0x200252e8 len 21 type 0
    -- [00:08:31.992,797] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:31.992,828] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:31.992,828] <dbg> bt_conn: bt_conn_prepare_events:
    -- [00:08:31.992,828] <dbg> bt_conn: conn_prepare_events: Adding conn 0x20005778 to poll list
    -- [00:08:31.992,858] <dbg> bt_conn: conn_prepare_events: wait on host fifo
    -- [00:08:31.992,858] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 3 events
    -- [00:08:31.993,438] <dbg> bt_hci_core: bt_recv: buf 0x200252e8 len 8
    -- [00:08:31.993,438] <dbg> bt_hci_core: hci_cmd_complete: opcode 0x201a
    -- [00:08:31.993,438] <dbg> bt_hci_core: hci_cmd_done: opcode 0x201a status 0x00 buf 0x200252                                                                                                e8
    -- [00:08:32.192,291] <dbg> bt_hci_core: bt_recv: buf 0x20023f7c len 6
    -- [00:08:32.192,291] <dbg> bt_hci_core: hci_disconn_complete_prio: status 0x00 handle 119 re                                                                                                ason 0x3d
    -- [00:08:32.192,291] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:32.192,321] <dbg> bt_conn: bt_conn_set_state: connected -> disconnect-complete
    -- [00:08:32.192,321] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:32.192,352] <dbg> bt_hci_core: rx_work_handler: Getting net_buf from queue
    -- [00:08:32.192,352] <dbg> bt_hci_core: rx_work_handler: buf 0x20023f7c type 1 len 6
    -- [00:08:32.192,382] <dbg> bt_hci_core: hci_event: event 0x05
    -- [00:08:32.192,382] <dbg> bt_hci_core: hci_disconn_complete: status 0x00 handle 119 reason                                                                                                 0x3d
    -- [00:08:32.192,382] <dbg> bt_conn: bt_conn_ref: handle 119 ref 1 -> 2
    -- [00:08:32.192,413] <dbg> bt_conn: bt_conn_set_state: disconnect-complete -> disconnected
    -- [00:08:32.192,413] <dbg> bt_conn: tx_notify: conn 0x20005778
    -- [00:08:32.192,443] <dbg> bt_conn: bt_conn_unref: handle 119 ref 2 -> 1
    -- [00:08:32.192,474] <dbg> bt_hci_core: process_events: count 3
    -- [00:08:32.192,474] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:32.192,474] <dbg> bt_hci_core: process_events: ev->state 1
    -- [00:08:32.192,474] <dbg> bt_hci_core: process_events: ev->state 0
    -- [00:08:32.192,504] <dbg> bt_conn: bt_conn_prepare_events:
    -- [00:08:32.192,535] <dbg> bt_hci_core: hci_tx_thread: Calling k_poll with 2 events
    -- [00:08:32.192,565] <dbg> bt_conn: deferred_work: conn 0x20005778
    -- [00:08:32.192,565] <dbg> bt_smp: bt_smp_disconnected: chan 0x200066d4 cid 0x0006
    -- [00:08:32.192,565] <dbg> bt_smp: smp_pairing_complete: got status 0x8
    -- [00:08:32.192,596] <dbg> bt_smp: bt_smp_encrypt_change: chan 0x200066d4 conn 0x20005778 ha                                                                                                ndle 119 encrypt 0x00 hci status 0x1f
    -- [00:08:32.192,596] <wrn> ble_supv: Security failed: level 1 err 9
    -- [00:08:32.192,596] <wrn> ble_supv: Failed to disconnect -128
    

  • Above is the log for turning on the HCI core debug logging.
    Also, on the peripheral, that bt_conn_set_state occurs every time there is a disconnection. But on the Central it only occurs when there is a failure.

    I was unable to set a breakpoint at that location on the peripheral. I think because its a delayed work queue. The breakpoint just advances to the next function. 

    My suspicion's regarding the connection handle was based on a previous post I had seen regarding not waiting for the disconnect callback before attempting another connection but I was unable to find that issue in my logs.

Related