Unreliable BLE connection for continuous high-bandwidth data stream

I want to send a continuous stream of sensor data over BLE. I am currently using the NUS GATT service, connecting directly to a third-party device (e.g. a computer or phone).

The required bandwidth is about 256 kbit/s, so in theory there is plenty of headroom below the practical limit of around 1.4 Mbit/s (2M PHY) that I have observed so far with the throughput sample.

So far I have only been able to get a reliable stream going when heavily reducing the sensor sampling rate to 128-180 kbit/s, and even then only very specific combinations of MTU size and connection interval give a reliable transfer, without any clear reason why: both a larger and a smaller MTU, as well as longer and shorter connection intervals, can make the connection better or worse. By trial and error I sometimes hit magic numbers that yield stable results.

Tuning these connection variables towards a stable result with reasonable latency (<100 ms) has therefore been very unintuitive.
Battery consumption is not a high priority right now, so even the most wasteful configuration would be acceptable as long as the bandwidth can be reached.

Code-wise, the sensor (DMIC) currently reads into a memory slab. I pass the block pointer to the BLE write thread and let the DMIC continue sampling into the next block of the slab. For debugging I currently log the memory slab's block utilization.
The BLE write thread then sends the data out in one or more notifications, depending on MTU size and memory block size.

Sometimes things go badly immediately and the memory blocks are used up right away; other times it is stable for a while and then suddenly goes sideways, the connection can't keep up with the sensor stream, and the slab fills up.

How do I go about debugging this, and are there deeper guides for this scenario, in particular regarding the PHY settings available to me?
The only thing I have found on the topic so far is: Building a Bluetooth application on nRF Connect SDK - Part 3 Optimizing the connection

I have looked at what comes through the air with the BLE sniffer but could not spot anything obvious like excessive retransmissions.
Would a custom GATT service be better suited? NUS seemed like a useful basis for this application, but maybe I am overlooking a constraint here.

  • Hi Timon

    It is odd that you would get such low rates. Are you running Nordic devices on both sides of the link, or are you communicating with something else, like a smart phone? 

    Are you using custom hardware or standard devkits? 

    Did you confirm from the sniffer trace that the data length and large MTU are properly utilized? 
    Could you see many packets sent back to back within a connection event, or only a single packet per connection event? 

    For testing purposes I would recommend starting with dummy data just to rule out an issue in the handover between the DMIC reading and the Bluetooth transmission. 

    Best regards
    Torbjørn

  • I'm mostly testing with my Windows machine as the central. It has a Mediatek chipset for Bluetooth.
    I had also tried out the NUS Central sample, but similar behaviour occurred, though I have not dug deeper there other than increasing the UART baud rate.
    In the end the application has to work at least with an average Windows or Mac machine for now, and later with phones too, so it has to be reasonably reliable with third-party chipsets in any case.
    For bandwidth testing I stuck to the nRF5340DK instead of the custom hardware.

    Good call on testing with dummy data, I don't think I have done that so far. From the profiling so far it didn't look like a data handling issue, as it was usually hanging in nus_send, but it's good to double-check.
    Below is a short excerpt of what I captured while transmitting at a 16 Hz sampling rate.

    Both the MTU size and 2M PHY were acknowledged by the central during the connection handshake.
    But there are multiple L2CAP fragments between each event. I'm not knowledgeable enough about the BLE PHY layer to know whether that is how a transmission normally works, or whether it should show up as a single packet equal to the MTU size.

  • Hi Timon

    Looking at the trace excerpt it seems that the data length is not increased, which means all the data needs to be segmented into 27-byte packets (of which only 20 bytes are user data), and this will severely impact the overall throughput. 

    This could either be caused by a limitation on the Windows side (hopefully not), or a configuration issue after establishing the connection. 

    Would you be able to save the full trace to file and share it in the case?

    Then I can take a look at it and see if I can spot an issue with the data length request procedure. 

    If you can share your project configuration this could also be useful (prj.conf). 

    Best regards
    Torbjørn

  • Yeah sure, is there a way to privately share it? I would like to avoid sharing the MAC addresses of the BLE devices publicly.

    It's odd: in Zephyr, bt_nus_get_mtu() confirms the MTU, and sniffing the traffic also shows the central acknowledging the MTU.

  • heh, okay, I found the issue. There was a
    CONFIG_BT_USER_DATA_LEN_UPDATE=y
    CONFIG_BT_USER_PHY_UPDATE=y
    lurking after my
    CONFIG_BT_GAP_AUTO_UPDATE_CONN_PARAMS=y

    which put the PHY and data length updates under application control, and unlike the throughput example my app never requests them explicitly, so they simply never happened. I'm guessing this fragment made it in when I copied some connection parameters over from the throughput example.
    I also noticed that while both sides advertised 2M PHY initially, they never actually switched to it.
    After removing those config lines everything updates automatically and the stream runs just fine at the full sampling bandwidth :)
    Thanks for your pointers, they put me on the right track. I compared captures between the throughput sample and my app and saw that that exchange was missing.
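
For anyone hitting the same thing, a sketch of the relevant prj.conf fragment after the fix (the full project configuration wasn't shared, so treat this as illustrative):

```conf
# Connection parameter auto-update stays enabled:
CONFIG_BT_GAP_AUTO_UPDATE_CONN_PARAMS=y

# Removed: these hand PHY / data length update control to the application,
# so no update ever happens unless the app requests one itself.
# CONFIG_BT_USER_PHY_UPDATE=y
# CONFIG_BT_USER_DATA_LEN_UPDATE=y
```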
