BLE error when calling bt_conn_disconnect() inside the connected callback

Hi,

I am developing a BLE peripheral, and I get the errors below when I disconnect a new connection from within the connected callback using this call:

bt_conn_disconnect(conn, BT_HCI_ERR_REMOTE_USER_TERM_CONN);

[00:00:07.533,966] <wrn> bt_att: att_get: Not connected
[00:00:07.631,561] <err> bt_conn: bt_conn_send_cb: not connected!

I have also seen this error when I repeatedly connect to the peripheral:

[02:11:23.832,550] <err> bt_conn: bt_conn_send_cb: not connected!

The connected callback simply reads the connection info (bt_conn_get_info) and then disconnects (bt_conn_disconnect).
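
For anyone trying to reproduce this, the callback described above might look roughly like the sketch below. It is a reconstruction under assumptions, not the actual application code: the include paths and the BT_CONN_CB_DEFINE registration follow recent Zephyr releases (older trees use bt_conn_cb_register and un-prefixed headers), the printk logging is illustrative, and bt_enable() plus advertising setup are left out.

```c
#include <zephyr/kernel.h>
#include <zephyr/sys/printk.h>
#include <zephyr/bluetooth/bluetooth.h>
#include <zephyr/bluetooth/conn.h>
#include <zephyr/bluetooth/hci.h>

/* Called by the stack when a connection attempt completes. */
static void connected(struct bt_conn *conn, uint8_t err)
{
	struct bt_conn_info info;
	int rc;

	if (err) {
		printk("Connection failed (err 0x%02x)\n", err);
		return;
	}

	/* Look at the connection info, as described in the post. */
	rc = bt_conn_get_info(conn, &info);
	if (rc == 0) {
		printk("Connected: type %u, role %u\n", info.type, info.role);
	}

	/* Disconnect immediately, still inside the connected callback. */
	rc = bt_conn_disconnect(conn, BT_HCI_ERR_REMOTE_USER_TERM_CONN);
	printk("bt_conn_disconnect returned %d\n", rc);
}

/* Called by the stack once the link is actually torn down. */
static void disconnected(struct bt_conn *conn, uint8_t reason)
{
	printk("Disconnected (reason 0x%02x)\n", reason);
}

/* Static registration of the callbacks (assumes a Zephyr version that has it). */
BT_CONN_CB_DEFINE(conn_callbacks) = {
	.connected = connected,
	.disconnected = disconnected,
};
```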

The errors are inconsistent.  They don't always happen, and they're not always the same.

Often I see the bt_att issue on the first connection, but not always.

I am repeatedly connecting to the peripheral at a fast rate using the nRF Connect phone app.

Interestingly, when the peripheral disconnects the phone as described, the phone sometimes gets stuck in a re-connection loop.  This is how the subscriptions are coming in so quickly in some cases.

To be clear though, I do frequently see the very simple case described above on the first connection and no looping.

Notably, the return value from bt_conn_disconnect is 0 (success) even when these errors appear.

Question - what is the right way to use the API to avoid these errors?  I want to be able to disconnect new connections within the connected callback.

Thanks.

  • Hello,

    Without digging too deep into the call stack, I guess that this is just caused by the connection not being set in the correct instance before you call bt_conn_disconnect(). 

    Remember that there are many handlers that subscribe to the BT events, such as the connected event. If you disconnect from within the connected event, and that triggers an internal event whose handler checks the connection state before it has itself received the connected event, you will get these errors.

    A couple of suggestions that you can check:

    1: Make sure to call current_conn = bt_conn_ref(conn); before you disconnect. This will tell the Bluetooth stack that it is connected, and update some internal connection handle. I am not sure whether this is the one that it is complaining about or not.

    2: Try to add a delay from the connected event. Add a delay of e.g. 50ms by starting a timer in the connected event, and then disconnect in the timeout handler.

    3: Try to set a breakpoint in bt_conn_send_cb() on line 381 in conn.c (the one printing "not connected!"), and look at the callstack. See if you can see what the conn pointer is, and what the conn->state is.

    4: Try to ignore this error message. You are trying to disconnect either way. Does it trigger the disconnected event? Are you in a connection at a later point in time, or does it start advertising again?

    Best regards,
    Edvin
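
Suggestion 2 above (disconnecting from a timeout handler instead of from the connected event) could be sketched as follows. This is only one possible shape under stated assumptions, not a statement of what the Zephyr API requires: it uses a delayable work item rather than a raw timer so the disconnect runs in thread context, it assumes recent Zephyr include paths and BT_CONN_CB_DEFINE, it handles one pending connection at a time, and the names pending_conn, disconnect_work and DISCONNECT_DELAY are made up for the example.

```c
#include <zephyr/kernel.h>
#include <zephyr/sys/atomic.h>
#include <zephyr/sys/printk.h>
#include <zephyr/bluetooth/conn.h>
#include <zephyr/bluetooth/hci.h>

#define DISCONNECT_DELAY K_MSEC(50) /* the 50 ms mentioned in suggestion 2 */

/* Referenced connection waiting to be dropped (hypothetical name). */
static atomic_ptr_t pending_conn;

/* Runs on the system workqueue after DISCONNECT_DELAY. */
static void disconnect_work_handler(struct k_work *work)
{
	/* Take ownership of the pending reference, leaving NULL behind. */
	struct bt_conn *conn = atomic_ptr_clear(&pending_conn);
	int rc;

	ARG_UNUSED(work);

	if (conn == NULL) {
		return;
	}

	rc = bt_conn_disconnect(conn, BT_HCI_ERR_REMOTE_USER_TERM_CONN);
	if (rc) {
		/* Expected if the peer already dropped the link in the meantime. */
		printk("bt_conn_disconnect failed (%d)\n", rc);
	}

	/* Release the reference taken in connected(). */
	bt_conn_unref(conn);
}

static K_WORK_DELAYABLE_DEFINE(disconnect_work, disconnect_work_handler);

static void connected(struct bt_conn *conn, uint8_t err)
{
	struct bt_conn *ref;
	struct bt_conn *old;

	if (err) {
		return;
	}

	/* Hold a reference so the object stays valid until the handler runs. */
	ref = bt_conn_ref(conn);
	if (ref == NULL) {
		return;
	}

	/* Single-connection sketch: a still-pending older entry is only released. */
	old = atomic_ptr_set(&pending_conn, ref);
	if (old != NULL) {
		bt_conn_unref(old);
	}

	k_work_reschedule(&disconnect_work, DISCONNECT_DELAY);
}

BT_CONN_CB_DEFINE(deferred_disconnect_callbacks) = {
	.connected = connected,
};
```

Between the connected event and the work handler the link is still up, so the application will keep receiving callbacks for that connection during the delay.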
  • Hi,

    1 - I already do call bt_conn_ref(conn).

    3 - conn->state == BT_CONN_DISCONNECTING.  When the connected callback fired, the state was BT_CONN_CONNECTED and err=0.

    The Zephyr disconnected callback does fire after I call bt_conn_disconnect within the connected callback.

    Regarding points 2 and 4 - This sounds like "work around or ignore" the problem.  My specific question on this case is what the right way to use the API is.  I cannot find documentation that explains when I'm allowed to call which functions under which scenarios.  Can you please direct me?

    Adding a timer leads to many negative side effects, like my code now having to track that a connected conn "isn't really supposed to be connected so ignore any callbacks from it."  It is the Zephyr API's job to present an interface that I can reliably build an application on.
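
To make the bookkeeping concern concrete, the extra state a delayed disconnect would push into the application might look something like this. It is a plain C sketch with no Zephyr dependency; every name in it (MAX_TRACKED, drop_mark, drop_is_pending, drop_clear) is hypothetical, and connections are tracked only by pointer identity.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 4 /* hypothetical; would match the configured connection limit */

/* Connections the application has decided to drop but that are still up. */
static const void *pending_drop[MAX_TRACKED];

/* Is this connection flagged as "about to be dropped"? */
static bool drop_is_pending(const void *conn)
{
	if (conn == NULL) {
		return false;
	}
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (pending_drop[i] == conn) {
			return true;
		}
	}
	return false;
}

/* Flag a connection; returns false if it is NULL or the table is full. */
static bool drop_mark(const void *conn)
{
	if (conn == NULL) {
		return false;
	}
	if (drop_is_pending(conn)) {
		return true;
	}
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (pending_drop[i] == NULL) {
			pending_drop[i] = conn;
			return true;
		}
	}
	return false;
}

/* Forget a connection, e.g. from the disconnected callback. */
static void drop_clear(const void *conn)
{
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (conn != NULL && pending_drop[i] == conn) {
			pending_drop[i] = NULL;
		}
	}
}
```

Every other callback that receives a conn would then have to start with a drop_is_pending(conn) check, which is exactly the kind of added state I would rather not carry.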

    Regarding ignoring the problem, should I ship a commercial product with outright errors being produced by the underlying library?  Is there library state corruption occurring?  What specifically is the consequence of this error such that I can safely ignore it?  My concern is that non-deterministic errors are being produced when my code executes in exactly the same sequence, and it is highly unsettling.

    Thanks.

  • Hello,

    I suggest that you look into the "Using a Timer Expiry function" section in the NCS documentation. 

    douglas.malnati said:
    That's the zephyr API's job to present an interface which I can reliably work with to build an application.

    In BLE, as a peripheral, you are supposed to allow anyone to connect. Whether you take action on the events generated from the connected device is up to you. There are mechanisms (bonding and so on) to prevent malicious devices from doing the wrong things.

    douglas.malnati said:
    Regarding ignoring the problem, should I ship a commercial product with outright errors being produced by the underlying library?

    I am just saying that the error says that you are not connected, and it is triggered from the action of trying to disconnect before the connection event has propagated through the operating system (Zephyr).

    Sorry for the short answer, but I need to leave the office, and will be out for the weekend.

    Best regards,

    Edvin

  • Thank you for your response. However, I feel the API needs to be documented and clarified.

    I'm not seeing concrete statements about what I can and cannot do. Defining that is the function of an API, and it should be documented.  I have not found documentation that describes it, and I asked to be directed to any, which didn't happen either.

    In BLE, as a peripheral, you are supposed to allow anyone to connect

    As a peripheral, I did let anyone connect, and subsequently chose to disconnect them.  There is a disconnect function I used to do that, so clearly this is a concept that is supported.

    I am just saying that the error says that you are not connected, and it is triggered from the action of trying to disconnect before the connection event has propagated through the operating system

    I understand you are saying that.  But what you aren't saying is when it is supported for me to disconnect, and you aren't pointing me to documentation that tells me.  It has been suggested that I "try" waiting, as though there isn't a definitive supported answer to that question, which is exactly the point I'm struggling with here.

    The API as I'm using it is already inconsistent: there is an intermittent issue where sometimes there is a warning and other times not, for the exact same code path.  How would I know, when I "try" waiting, that I'm not simply missing another instance of an intermittent issue?

    The answer to all of these things is having an API which conclusively explains the right way to operate the system.

    Please think if there is something specific you can say about how to use this API correctly.

    Also please pass along a report that the API is not clear in this (and so many other) places.

    Thank you.
