nRF5340 LE disconnection issue

Hi,

We are using nRF5340 DK and the nRF Connect SDK Version 2.6.1.

We are experiencing an issue with disconnections. We use bt_conn_ref() when receiving the connected event and bt_conn_unref() when receiving the disconnection event.

However, sometimes after a disconnection, when we retry, we encounter the error: "bt_conn: bt_conn_exists_le: Found valid connection (0x20001a18) with address FF:F5:55:5A:D6:6A (random) in disconnected state".

Why does this issue occur even after the disconnection event has been processed?

If we call bt_conn_unref() twice in the disconnected callback, as shown in the code below (that is, incrementing the reference count once on connection and decrementing it twice on disconnection), the issue does not occur.

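void connected(struct bt_conn *conn, uint8_t conn_err)
{
    /* Take one application reference when the connection is established. */
    bt_conn_ref(conn);
}

void disconnected(struct bt_conn *conn, uint8_t reason)
{
    /* Two releases for one bt_conn_ref(): the workaround described above. */
    bt_conn_unref(conn);
    bt_conn_unref(conn);
}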

Could this behavior be due to an internal counter issue?

  • Hi,

    I'm not aware of any known issues where the Bluetooth host fails to release its own reference. Are you developing a peripheral or central application?

    Thanks,

    Vidar

  • Hi, 

    Thanks for the response.

    We are developing a mesh network. Our requirement is that one device can connect to three other devices, using one connection as a peripheral and up to two connections as a central for the mesh links.
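One possible explanation for the double bt_conn_unref() observation, offered as an assumption that this thread does not confirm: for connections initiated as a central, bt_conn_le_create() hands the caller its own reference, which must be released with bt_conn_unref(). If the application also calls bt_conn_ref() in the connected callback, it owns two references on that connection and a single unref leaves one behind. The sketch below is a plain C model of that bookkeeping, not Zephyr code; the model_* names and struct app_refs are hypothetical.

```c
/* Hypothetical model: references the application owns on one connection. */
struct app_refs {
    int count;
};

/* Models bt_conn_le_create(): the caller receives one reference. */
static void model_le_create(struct app_refs *r)
{
    r->count++;
}

/* Models bt_conn_ref() called from the connected callback. */
static void model_ref(struct app_refs *r)
{
    r->count++;
}

/* Models bt_conn_unref() called from the disconnected callback. */
static void model_unref(struct app_refs *r)
{
    r->count--;
}

/* The connection object can only be reused once the application owns no references. */
static int model_app_released(const struct app_refs *r)
{
    return r->count == 0;
}
```

In this model a central-initiated connection needs two releases, while a peripheral connection (no bt_conn_le_create() call) needs only one, so an unconditional double unref would over-release in the peripheral role.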

  • I tried to run the sample on two nRF5340 DKs and noticed that the call to bt_conn_unref() was commented out in the disconnected callback. Also, broadcastMeshData() seems to be called regardless of the connection state.

  • Hi,

    Thanks for the response.

    I commented out bt_conn_unref() to check if the issue was related to that. However, the issue persists whether or not bt_conn_unref() is used in the disconnected callback. You can try enabling it as well.

    As I mentioned earlier, the issue occurs only when data is being sent, so I'm using the function broadcastMeshData(); to send the data every 20ms.

  • Hi,

    Yes, but it seems that broadcastMeshData() is being called even when there is no connection. Do you still encounter the problem if you only call this function when connected? Also, did you try to fix the build warnings as I suggested?

  • Hi,

    In the broadcastMeshData() function, I only send packets after a connection is established. If there is no connection, no data is sent.

    However, I can't stop sending data after a disconnection because the disconnection event is missing.

    "Also, did you try to fix the build warnings as I suggested?"

    Not yet, I still have to work on it. But the warnings are in the application files, not in the config files, correct?

  • Hi,

    Just a quick update on the status on my end: I have managed to reproduce the missing disconnect issue using two nRF5340 DK boards. It also appears that the bt_gatt_write_without_response_cb() function becomes blocking and never returns, which prevents the main thread from running again. However, the SPI, logger, and idle threads continue to run. 

  • Hi,

    Thanks for the response.

    Could you suggest a workaround, if you have one?

  • You should be able to use the WDT (Task Watchdog or WD driver) as a temporary workaround for the problem. At least one of the WD reload registers needs to be reserved for the main loop to catch the scenario where bt_gatt_write_without_response_cb() does not return.
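The idea behind the watchdog workaround can be sketched in portable C: the main loop feeds a dedicated channel after each send attempt, and the channel expires if the send call never returns. In an NCS application the Zephyr Task Watchdog subsystem (task_wdt_add(), task_wdt_feed()) or the hardware WDT driver would play this role and reset the SoC on expiry. The wd_channel_* names below are hypothetical and call neither API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software watchdog channel for one loop (illustration only). */
struct wd_channel {
    uint32_t period_ms;    /* maximum allowed gap between feeds */
    uint32_t last_feed_ms; /* timestamp of the most recent feed */
};

static void wd_channel_init(struct wd_channel *ch, uint32_t period_ms, uint32_t now_ms)
{
    ch->period_ms = period_ms;
    ch->last_feed_ms = now_ms;
}

/* Called once per main-loop iteration, after the send call has returned. */
static void wd_channel_feed(struct wd_channel *ch, uint32_t now_ms)
{
    ch->last_feed_ms = now_ms;
}

/* Unsigned subtraction keeps the comparison valid across 32-bit wraparound. */
static bool wd_channel_expired(const struct wd_channel *ch, uint32_t now_ms)
{
    return (uint32_t)(now_ms - ch->last_feed_ms) > ch->period_ms;
}
```

The period should be comfortably longer than the 20 ms send interval so that normal scheduling jitter does not trigger a reset.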

  • Hi,

    The bt_gatt_write_without_response_cb() function becomes blocking when we continue sending data after the stack buffer is full. To prevent this blocking behavior, we will stop sending data before the buffer reaches its limit.

    The maximum supported number of connections is 3, and CONFIG_BT_BUF_ACL_TX_COUNT is set to 30, allowing 10 buffers for each connection. I'm using a write-complete callback to monitor buffer usage. If the number of outstanding buffers for any connection reaches 10 because the stack buffer is full, I stop writing to that connection until it is cleared.

    So our application will avoid entering a blocking state. You can review the logic that implements this in the broadcastMeshData() function.

    If you'd like to test without blocking the bt_gatt_write_without_response_cb() function, you can modify the code as shown below. I've added comments indicating which lines need to be commented out and which ones should be uncommented.

    if ((!StackBufferManage.BufferFull[i]) && (ConnectionData.node_conn[i].HandshakeDone)) { // uncomment this line
    //if ((ConnectionData.node_conn[i].HandshakeDone)) { // comment out this line
    #if (BLE_TXMTION_TYPE == WRITE_WITHOUT_RES)
        err = ble_gatt_write_txmtion(&ConnectionData.node_conn[i], data, len);
    #elif (BLE_TXMTION_TYPE == NOTIFICATION)
        if (bt_gatt_is_subscribed(ConnectionData.node_conn[i].conn, notify_attr, BT_GATT_CCC_NOTIFY)) {
            err = bt_gatt_notify_cb(ConnectionData.node_conn[i].conn, &tx_notify_params);
        }
    #endif
        if (!err) {
            /* Record the send time and advance the count of queued packets. */
            StackBufferManage.time[i] = k_uptime_get_32();
            StackBufferManage.InIndex[i]++;
            // printk("TX[%d] \n", i);
        } else {
            printk("bt_gatt_write error : %d\n", err);
        }
    }

    The disconnection event is still missing, even when the bt_gatt_write_without_response_cb() function does not block.
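The per-connection buffer accounting described in this post can be sketched as a small counter per connection: reserve a slot before queuing a write, release it in the write-complete callback, and skip the connection while its share is used up. This is a minimal, self-contained sketch under those assumptions, not the project's actual StackBufferManage code; the tx_budget_* names are hypothetical, and the constants mirror the values quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CONN        3   /* connections supported, as stated in the post */
#define TX_BUF_TOTAL    30  /* mirrors CONFIG_BT_BUF_ACL_TX_COUNT from the post */
#define TX_BUF_PER_CONN (TX_BUF_TOTAL / MAX_CONN)

/* Hypothetical per-connection accounting of writes queued but not yet completed. */
struct tx_budget {
    uint8_t in_flight[MAX_CONN];
};

/* Reserve a slot before queuing a write; false means skip this connection for now. */
static bool tx_budget_try_reserve(struct tx_budget *b, int idx)
{
    if (idx < 0 || idx >= MAX_CONN || b->in_flight[idx] >= TX_BUF_PER_CONN) {
        return false;
    }
    b->in_flight[idx]++;
    return true;
}

/* Call from the write-complete callback, or when queuing the write failed. */
static void tx_budget_release(struct tx_budget *b, int idx)
{
    if (idx >= 0 && idx < MAX_CONN && b->in_flight[idx] > 0) {
        b->in_flight[idx]--;
    }
}

/* Call on disconnect so a reused slot starts with an empty budget. */
static void tx_budget_reset(struct tx_budget *b, int idx)
{
    if (idx >= 0 && idx < MAX_CONN) {
        b->in_flight[idx] = 0;
    }
}
```

Note that if the disconnected event never arrives, as reported in this thread, the reset never runs and the affected connection stays throttled, so this accounting avoids the blocking call but does not recover the connection.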

  • Hi,

    I understand, but bt_gatt_write_without_response_cb() should not remain blocking after the connection has been lost, so I suspect it may be a symptom of the same root cause that prevents the disconnect event from coming through. I'm currently working with R&D to find out if this may be caused by a bug in the BT stack.

  • Hi, I'm also seeing this issue on NCS v2.7.0.
    The nRF53 misses the disconnect event specifically when I have a pending GATT write and the peer device disconnects ungracefully.
    Even the supervision timeout does not cause a disconnect event.

    When the ATT timeout hits (30 s later), it unsubscribes from the characteristic. If I call disconnect from this context and call bt_conn_unref() on my indexes, the internal conn state stays at Disconnecting, and new scans show the device as already having a conn, so a new connection cannot be made.

    I am wondering if there is any progress here, or if I should work towards the newer Bluetooth host controller, which does not use the system workqueue. :)