LL_CONNECTION_UPDATE_IND before response to Connection Parameters Update Request causes disconnect

Our peripheral application sends an L2CAP Connection Parameters Update Request 30 seconds after connecting, to change the connection interval. With most hosts this works fine, but with an Android 6 tablet the link drops right after the switch to the new connection interval. With the hosts that work, Wireshark shows the following packet sequence:

  • Slave sends L2CAP Connection Parameters Update Request
  • Master sends L2CAP Connection Parameters Update Response (Accepted)
  • Slave sends empty PDU in response
  • Master sends LL_CONNECTION_UPDATE_IND with details of the new parameters and when to switch
  • At the proper time, the two sides change to the new connection interval and everything is fine

But with the Android 6 tablet, the order of the Master's packets is reversed, and the slave (our device) disconnects right after the interval switch:

  • Slave sends L2CAP Connection Parameters Update Request
  • Master sends LL_CONNECTION_UPDATE_IND with details of the new parameters and when to switch
  • Slave sends empty PDU in response
  • Master sends L2CAP Connection Parameters Update Response (Accepted)
  • At the proper time, the two sides change to the new connection interval for just the first interval
  • After that the slave immediately disconnects, stops responding and starts advertising again
  • The host eventually gives up due to the lack of responses and also disconnects

We are using an nRF52833 with the S113 SoftDevice and version 17.0.2 of the SDK. We get the same results when we change the requested connection interval range (e.g. 100-200 ms, 200-400 ms, 20-200 ms), and the Android system always picks an interval within the requested range, so that doesn't seem to be the problem.
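For reference, connection intervals are carried on air in units of 1.25 ms, so the millisecond ranges above map to unit counts before they go into the request. The following is a minimal sketch of that arithmetic; the helper names are hypothetical and are not part of the SDK (the nRF5 SDK has its own conversion macro for this).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BLE connection intervals are expressed in units of 1.25 ms (1250 us). */
#define CONN_INTERVAL_UNIT_US   1250u

/* Limits from the Bluetooth Core Specification: 7.5 ms to 4 s. */
#define CONN_INTERVAL_MIN_UNITS 6u
#define CONN_INTERVAL_MAX_UNITS 3200u
#define CONN_INTERVAL_MAX_MS    4000u

/* Hypothetical helper: whole milliseconds to 1.25 ms units (rounds down). */
uint16_t conn_interval_ms_to_units(uint32_t ms)
{
    assert(ms <= CONN_INTERVAL_MAX_MS); /* keeps the result within 16 bits */
    return (uint16_t)((ms * 1000u) / CONN_INTERVAL_UNIT_US);
}

/* Hypothetical helper: 1.25 ms units back to microseconds. */
uint32_t conn_interval_units_to_us(uint16_t units)
{
    return (uint32_t)units * CONN_INTERVAL_UNIT_US;
}

/* Hypothetical helper: true if [min_ms, max_ms] is a well-formed range. */
bool conn_interval_range_is_valid(uint32_t min_ms, uint32_t max_ms)
{
    if (min_ms > CONN_INTERVAL_MAX_MS || max_ms > CONN_INTERVAL_MAX_MS)
    {
        return false;
    }

    uint16_t min_units = conn_interval_ms_to_units(min_ms);
    uint16_t max_units = conn_interval_ms_to_units(max_ms);

    return min_units >= CONN_INTERVAL_MIN_UNITS
        && max_units <= CONN_INTERVAL_MAX_UNITS
        && min_units <= max_units;
}
```

With these helpers, a 100-200 ms request corresponds to 80-160 units, and intervals of 31 and 318 units correspond to 38.75 ms and 397.5 ms.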

Does the order of the master's response matter?  The order that works matches what you show in your Peripheral Connection Parameter Update message sequence chart (https://infocenter.nordicsemi.com/topic/com.nordic.infocenter.s113.api.v7.2.0/group___b_l_e___g_a_p___c_p_u___m_s_c.html).  Does the BLE stack get confused and disconnect if the LL_CONNECTION_UPDATE_IND comes before the L2CAP acceptance of the request?

We don't seem to have this problem with later versions of Android (such as 9.x), but I don't have one here to try with Wireshark to see if its order is different.

[Wireshark screenshot: normal order, which works]

[Wireshark screenshot: reversed order from the master; the slave disconnects right after switching to the new connection interval]

  • Can you post the Wireshark pcap file for the failed connection? You should be able to save the capture file and post it here. Additionally, apply the profile shipped with the nRF Sniffer so you get a better view in Wireshark. See section 2.3 "Add a Wireshark profile to nRF Sniffer" in the nRF Sniffer v3.2 User guide.

    It looks as if the slave radio stopped or lost the sync with the master at the instant the connection parameters were updated. It does not appear to be a normal Disconnect.

    Please post the pcap file and it will be easier to help.

  • 4606.LS drop connection with Android 6 tablet on Conn Interval change.pcapng

    Hi David.  Here is the pcap file from when it fails.  It was captured with the nRF Sniffer profile in Wireshark.  The connection starts on line 92 and the timing change and disconnect starts at 1486, 30 seconds later.  I've marked relevant packets.

    My first thought was that the slave lost sync as well, but when the master switches from 38.75 msec connection intervals to the new 397.5 msec timing, the slave responds one last time.

    Here is some additional information.  The connection is always lost immediately after the connection interval change which happens 30 seconds after connecting.  So I changed the FIRST_CONN_PARAMS_UPDATE_DELAY to 60 seconds to verify that it was related to the parameter change.  Now it disconnects (but still just from the Android 6 master, not others) after 37 seconds consistently, long before the parameter update would have happened.

    I've attached a 2nd pcap file with the conn_params update delayed until 60 seconds after connect. The connection is at line 92, 6.85 seconds. After the GATT scan the connection just idles, and then the slave stops responding and disconnects at line 2034, 44.33 seconds, or about 37.5 seconds after connection. This leads me to believe that the disconnection right after the 30 second update was just a coincidence and that something else is going on here. Does it make sense that the Android tablet's clock might be so far off that the timing of the two sides drifts apart quickly enough to get out of sync that fast? Doesn't the timing get resynced with each communication somehow? (Unfortunately, I don't have any other Android tablets handy to test that theory.)

    LS drop connection with Android 6 tablet with NO Interval change.pcapng

    Finally, just to be sure it wasn't an issue with the RTC crystal or something on the Nordic board, I tried another one and got the same results, with the connection being dropped after 37 seconds.  (Our prototypes use our circuits on a daughterboard on top of the nRF52833 Dev Kit boards.)
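On the clock drift question above: in a BLE connection the slave re-anchors its timing on each packet it receives from the master, so drift only accumulates since the last anchor point rather than over the life of the connection. The slave covers that drift by widening its receive window. The sketch below works through the window widening formula from the Bluetooth Core Specification, windowWidening = ((masterSCA + slaveSCA) / 1000000) * timeSinceLastAnchor. The function name is hypothetical, and the 500 ppm figures in the usage note are the worst-case sleep clock accuracy values, assumed here for illustration rather than taken from the captures.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case sleep clock accuracy allowed for either side, in ppm. */
#define SCA_WORST_CASE_PPM 500u

/*
 * Window widening in microseconds, rounded up:
 *   ((masterSCA + slaveSCA) / 1000000) * timeSinceLastAnchor
 * Hypothetical helper, written for illustration only.
 */
uint32_t window_widening_us(uint16_t master_sca_ppm,
                            uint16_t slave_sca_ppm,
                            uint32_t time_since_anchor_us)
{
    assert(master_sca_ppm <= SCA_WORST_CASE_PPM);
    assert(slave_sca_ppm <= SCA_WORST_CASE_PPM);

    /* 64-bit intermediate so the multiplication cannot overflow. */
    uint64_t ppm_sum = (uint64_t)master_sca_ppm + slave_sca_ppm;
    uint64_t scaled  = ppm_sum * time_since_anchor_us;

    return (uint32_t)((scaled + 999999u) / 1000000u);
}
```

Under these assumptions, `window_widening_us(500, 500, 38750)` gives about 39 µs for one 38.75 ms interval and `window_widening_us(500, 500, 397500)` gives about 398 µs for one 397.5 ms interval, which is small relative to the interval itself. That suggests plain clock drift is an unlikely cause of a drop after 37 seconds of regular connection events.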

  • Let's first list the oddities in this trace that could make the slave radio stop, and examine the timing as a later step.

    1. The LL_LENGTH_REQ from the slave does not seem to get the required LL_LENGTH_RSP from the master. This can trigger a disconnect where the radio stops. I see this in both traces, so it seems something is broken on the master.

    2. The master starts the PING procedure by sending an LL_PING_RSP instead of an LL_PING_REQ, and the slave responds with LL_UNKNOWN_RSP. This should not trigger a radio stop, but the LL code on the master seems to be buggy.

    Let's try to stop the slave from sending the LL_LENGTH_REQ. Can you set the max packet length to the 27 byte legacy limit?

    If that is stopped, then we can try to stop the PING_RSP by sending a PING_REQ as soon as the link is connected.
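For readers wondering what setting the max packet length to 27 might look like in an nRF5 SDK project: the thread does not say which setting was changed, so the fragment below is an assumption. It presumes the application uses the SoftDevice handler and the `nrf_ble_gatt` module, which read the desired data length from `sdk_config.h`. With the value at the 27 byte legacy limit, the GATT module should have no reason to start a data length update.

```c
/* sdk_config.h fragment (assumed location; check your own project's file). */

/* Requested link layer data length in bytes. 27 is the legacy limit,
 * so no Data Length Extension procedure should be needed. */
#define NRF_SDH_BLE_GAP_DATA_LENGTH 27
```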

  • Thanks David!  I changed the max packet length to 27 and that cleaned up the strange initial communications, and the connection is no longer lost later.  Removing the LL_LENGTH_REQ also caused the Android master to no longer send the unsolicited LL_PING_RSP.  Perhaps the old Android stack was misinterpreting the length request as a ping request???  Here is the new trace if you'd like to see it:

    LS with Android 6 tablet - max packet size 27 makes it work.pcapng

    Most of the time, the legacy packet limit is ok, but we had hoped to use the longer length to optimize the speed for some infrequent large file transfers.  Perhaps when our Android app is further along we can try changing the length from that side where we'll know the Android version and whether it can handle it properly.

  • Good to hear that.
    Sending a PING_RSP instead of an LL_LENGTH_RSP is usually a controller-level bug, so checking only the Android OS version may be a bit tricky.

    Can you let me know the phone/tablet model, the specific OS version including minor numbers, and the Bluetooth IC in it if possible?

    Please accept my answer and give it an upvote.

  • The tablet is a Samsung 8" Galaxy Tab E, model SM-T377V running Android 6.0.1 (we have many like this in the field, so I didn't upgrade for testing).  The hardware version is T377V.01, but I don't know how to find the Bluetooth IC used.  At least some Samsung Galaxy models use Broadcom chips.  Websites say this model is Bluetooth 4.1 in the US and 4.0 elsewhere, but I can't confirm that and don't know if that is up to date.

  • You can also take control of the sd_ble_gap_data_length_update procedure yourself. If the resulting update event never arrives and the link disconnects with a reason code as described below, fall back to a new connection without Data Length Extension.

    Check the reason code that BLE_GAP_EVT_DISCONNECTED provides. I would hope for 0x22 (LL Response Timeout) rather than a generic Connection Timeout (0x08).

    You are correct that the controller appears to be Bluetooth 4.1 or 4.0, based on the LL_FEATURE_REQ from the master. It does not actually support the Data Length Extension, since the feature bit for it is not set.

    As you have surmised, you may be able to control the procedure better from the Android side, and perhaps use a handshake to signal the nRF whether it is OK to start the DLE or not to do so at all.
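The fallback described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: the constants are defined locally with the values given in the thread, the `dle_state_t` bookkeeping struct and both function names are hypothetical, and the feature mask is assumed to be read off the LL_FEATURE_REQ in a sniffer trace. In a real application the reason would come from `p_ble_evt->evt.gap_evt.params.disconnected.reason` in the BLE_GAP_EVT_DISCONNECTED handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HCI status codes named in the thread. */
#define HCI_CONNECTION_TIMEOUT  0x08u /* generic supervision timeout */
#define HCI_LL_RESPONSE_TIMEOUT 0x22u /* an LL procedure got no response */

/* Bit 5 of the LL feature set is "LE Data Packet Length Extension". */
#define LL_FEATURE_BIT_DLE 5u

/* True if the peer's LL feature set advertises Data Length Extension. */
bool peer_supports_dle(uint64_t ll_feature_set)
{
    return ((ll_feature_set >> LL_FEATURE_BIT_DLE) & 1u) != 0u;
}

/* Hypothetical per-link bookkeeping kept by the application. */
typedef struct
{
    bool dle_requested; /* a data length update was started on this link */
    bool dle_completed; /* the data length update event arrived */
} dle_state_t;

/*
 * Decide, on disconnect, whether the next connection to this peer
 * should skip the data length update: the procedure was started,
 * never completed, and the link died with an LL response timeout.
 */
bool should_reconnect_without_dle(const dle_state_t *state,
                                  uint8_t disconnect_reason)
{
    assert(state != NULL);

    return state->dle_requested
        && !state->dle_completed
        && disconnect_reason == HCI_LL_RESPONSE_TIMEOUT;
}
```

A generic 0x08 timeout is deliberately not treated as evidence of a DLE problem in this sketch, since it can also come from ordinary range or interference issues.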
