
Inconsistent BLE data rates

Hello,

I'm working on a project where we are doing the following:

  • Collecting sensor data and storing it in an external SPI flash
  • Retrieving that sensor data from the SPI flash, reading it out into a module global array
  • Sending that sensor data from the array, in <23 byte chunks, over BLE to a phone using notifications

The condensed version of our transfer protocol (a fairly standard one) is as follows (app == phone, device == nRF52):

  1. App sends message to device to begin recording data.
  2. App sends a message to the device indicating that it is ready to receive the next page of data (whenever that page is ready to be sent).
  3. Device collects data and when it has 1 SPI-flash-page's-worth of data, writes that data to SPI flash.
  4. If the app has sent a message to the device saying that it is ready for the next page of data, read that page out of SPI flash into an array and send it in chunks over BLE to the app.
  5. If the app has not yet sent a message to the device saying it is ready for the next page of data, wait until it does so, then repeat from 4.
  6. Repeat from 3. 
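In device-side terms, the loop in steps 3-6 amounts to gating each send on two independent conditions. A minimal sketch (all names here are illustrative, not from the nRF SDK):

```c
#include <stdbool.h>

/* Minimal sketch of the device-side paging logic above. All names
   (pager_t, pager_poll, pages_sent) are illustrative, not SDK APIs. */
typedef struct {
    bool app_ready;   /* app has said "ready for next page"      */
    bool page_ready;  /* a full flash page is waiting to be sent */
    int  pages_sent;
} pager_t;

/* Called whenever either flag changes; a page is handed off for
   transmission only when both conditions hold (steps 4 and 5). */
static void pager_poll(pager_t *p)
{
    if (p->app_ready && p->page_ready) {
        p->app_ready  = false;  /* consume the app's request */
        p->page_ready = false;  /* page goes out over BLE    */
        p->pages_sent++;
    }
}
```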

The device always sends a page when it is available if the app has said that it is ready for it. Due to a variety of factors, sometimes the BLE data throughput is slower than the rate at which we are collecting data. In this case, "backlogged" pages are sent as soon as the app has indicated it is ready to receive another page (i.e. as soon as it has finished receiving the previous page).

Onto the interesting part:

  • If, at the time the app sends the "ready for next page" command to the device, there is a backlog of pages to be sent, the page is retrieved from SPI flash immediately and sent over BLE at a rate of ~5000B/s.
  • If, at the time the app sends the "ready for next page" command to the device, there is NOT a new page ready to be sent (i.e. the device is still collecting data and hasn't collected one page's worth yet), then when the page becomes ready it is read out of SPI flash and sent over BLE at a rate of ~750B/s.

The mechanism by which the page is retrieved from SPI flash is exactly the same in both cases (the same function, even), and I've started my measurements at the moment AFTER the page has already been loaded from SPI flash into the array, so I don't believe this has anything to do with SPI transfer rates or priorities.

I've tested this with both our proprietary app and nRFConnect, on iOS and Android, on multiple phones.

What I don't understand is why there is such a disparity between the two scenarios described above. As I said, the code path from the moment I start measuring is exactly the same in both cases. The only difference is that in the fast case there is very little time between the "I'm ready for another page" command and the page being sent, while in the slow case there are often some hundreds of milliseconds between when the command is received by the device and when the next page is actually ready to be sent. As far as I can tell, the negotiated connection parameters (the connection interval in particular) are not updated or changed between the two scenarios. All else being equal, if the connection interval remains the same and the size of the packets being sent remains the same, I don't see why I'm getting such inconsistent transfer rates, or why they are so reliably different in the two scenarios.

Is there another BLE connection setting or nrf52 configuration that might give a clue as to why this is happening? Has anyone seen something similar? Are there any other tools I can use to get to the bottom of this issue? 

Thank you

Matt

  • Hi Matt,

    Could you provide more details: what SoftDevice are you using, what OS is on the client side, and what are your connection parameters? What is the size of a page? How fast is a page read from SPI, and is it preloaded into a buffer or does reading start only after a command is received? How did you implement flow control on the device side when sending notifications? Also, a sniffer trace of both cases would be very helpful.

  • Hi Dmitry,

    Thank you for your response.

    I'll work on getting a sniffer trace, but here are some answers to your other questions:

    • Softdevice/SDK: SDK 15.3.0, s132 6.1.1
    • OS at client side: I've tested this on both iOS and Android, but for the sake of our debugging let's say I'm using a Google Pixel 2 with Android 9, running nRFConnect
    • Connection parameters: 
      • Interval: 37.5ms
      • Latency: 0
      • Timeout: 5000ms
      • MTU: 247, but I've limited the packet size to 23 bytes to keep the device compatible with older phones that don't support a larger MTU.
    • Page size: up to 2048 bytes, data being sent in each individual packet is <= 23 bytes
    • How fast is page read from SPI/are they preloaded into a buffer or reading starts just after receiving a command: Measurements for the rates I've described in my original message were taken AFTER the read from SPI flash, so page read from SPI shouldn't matter? In both cases, when the device knows it is time to send a new page of data, it retrieves the data from SPI flash and places it in a buffer. Once the buffer is loaded, I start measuring transfer rates, and unspool the buffer until all the bytes of the page are sent. If it is relevant, the actual read from SPI can happen either immediately after receiving a command (i.e. there is a new page available when the "ready for next page" command is received - this is the fast transfer rate case), or some hundreds of milliseconds after receiving a command (i.e. a new page was not ready when the "ready for next page" command is received, but became available sometime after - this is the slow transfer rate case). 
    • In what way did you implement a flow control on device side when sending notifications? Once the device is made aware that the app is ready to receive data, and the data has been retrieved from SPI flash into a buffer, I send the first packet from the buffer. Subsequent packets are triggered by the TX complete event (BLE_GATTS_EVT_HVN_TX_COMPLETE) being received by my *service*_on_ble_evt event handler - i.e. after the first packet is successfully sent, TX_COMPLETE is issued, I advance the pointer in my data buffer, send next packet, repeat. This is done so that I can ensure the previous notification was received before sending the next one (no packet loss). 
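The unspooling described above boils down to a chunker plus a TX_COMPLETE-driven advance. A minimal sketch of just the buffer logic (the names are mine, not the SDK's; in real firmware each chunk would go out via sd_ble_gatts_hvx, and the next call would happen in the BLE_GATTS_EVT_HVN_TX_COMPLETE handler):

```c
#include <string.h>

#define CHUNK_MAX 20u  /* data bytes per notification: 23-byte MTU minus 3-byte ATT header */

/* Illustrative unspooling state: one page loaded from SPI flash,
   sent one chunk at a time. */
typedef struct {
    const unsigned char *page;   /* page buffer loaded from SPI flash */
    size_t               len;    /* page length, up to 2048           */
    size_t               offset; /* bytes already notified            */
} tx_stream_t;

/* Copies the next chunk into `out` and advances the read pointer;
   returns the chunk size, or 0 when the whole page has been sent.
   In the real code this runs once per TX-complete event. */
static size_t stream_next_chunk(tx_stream_t *s, unsigned char out[CHUNK_MAX])
{
    size_t n = s->len - s->offset;
    if (n == 0) return 0;
    if (n > CHUNK_MAX) n = CHUNK_MAX;
    memcpy(out, s->page + s->offset, n);
    s->offset += n;
    return n;
}
```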
  • Hi Dmitry,

    That diagram is accurate. 

    I attempted to increase the number of TX buffers in two ways:

    1. Manually changing BLE_GATTS_HVN_TX_QUEUE_SIZE_DEFAULT in ble_gatts.h to 4, 8, and 12
    2. Increasing NRF_SDH_BLE_GAP_EVENT_LENGTH to 80 (100 ms in 1.25 ms units), which is the maximum expected connection interval, though it is sometimes lower. I followed this ticket

    I did see an overall improvement using method 2 in the data throughput rates for both the fast transfer and the slow transfer scenarios, but still the same overall behavior (as shown in your diagram). 
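For reference, method 2 above corresponds to a change like the following in sdk_config.h (assuming the SDK 15.3.0 option name; the value is in 1.25 ms units):

```c
/* sdk_config.h: BLE connection event length, in 1.25 ms units.
   80 * 1.25 ms = 100 ms, i.e. the radio may use the entire maximum
   expected connection interval for this link. */
#define NRF_SDH_BLE_GAP_EVENT_LENGTH 80
```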

    Am I understanding correctly that the issue here seems to be related to number of packets sent per connection - i.e. in the fast case we are getting multiple packets per connection interval and in the slow case we are not?
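For what it's worth, the numbers are consistent with that reading. A back-of-envelope check, assuming the 37.5 ms interval above and 20 data bytes per notification (23-byte MTU minus the 3-byte ATT header; both figures are assumptions taken from the parameters discussed earlier):

```c
/* Rough BLE notification throughput for a given number of packets
   per connection interval. Interval and payload size are assumed
   from the connection parameters discussed above. */
static double rate_bps(int packets_per_interval)
{
    const double interval_s    = 0.0375; /* 37.5 ms connection interval */
    const double bytes_per_pkt = 20.0;   /* 23-byte MTU - 3-byte header */
    return packets_per_interval * bytes_per_pkt / interval_s;
}
```

One packet per interval comes out to roughly 533 B/s, the same order as the ~750 B/s slow case; around ten packets per interval gives roughly 5333 B/s, close to the ~5000 B/s fast case.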

    I've attached a wireshark/sniffer log which demonstrates one page transfer in both the fast and slow transfer scenarios.

    datacapture-matt-20190910.pcapng

  • Hi Matt,

    BLE_GATTS_HVN_TX_QUEUE_SIZE_DEFAULT is hardcoded in the stack, so changing it in the header will have no effect. Use sd_ble_cfg_set with BLE_CONN_CFG_GATTS to change the queue size.

    Increasing the event length helps as long as your app is smart enough to provide the next packet every time the previous one is acknowledged by the host (imagine the path from the TX_COMPLETE event to the call to sd_ble_gatts_hvx takes 50 usec: you lose those 50 usec on every two-packet exchange). Also, a very long connection event increases power consumption, because the receiver is on for the whole connection event. Increasing the TX queue size gives you another boost: the stack always has buffers ready to send, so the delay is minimal and does not depend on your handler.
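The keep-the-queue-full idea can be simulated without hardware. In this toy model, sim_hvx stands in for sd_ble_gatts_hvx, and the queue depth and error code mirror hvn_tx_queue_size and NRF_ERROR_RESOURCES; everything here is a simulation, not SDK code:

```c
#define SIM_QUEUE_SIZE       4   /* stands in for hvn_tx_queue_size   */
#define SIM_SUCCESS          0
#define SIM_ERROR_RESOURCES 19   /* stands in for NRF_ERROR_RESOURCES */

static int sim_queued; /* packets currently held by the simulated stack */

/* Stand-in for sd_ble_gatts_hvx(): accepts a packet while there is room. */
static int sim_hvx(void)
{
    if (sim_queued >= SIM_QUEUE_SIZE) return SIM_ERROR_RESOURCES;
    sim_queued++;
    return SIM_SUCCESS;
}

/* Keep pushing until the stack reports its buffers are full; returns
   the number of packets accepted. */
static int fill_tx_queue(void)
{
    int accepted = 0;
    while (sim_hvx() == SIM_SUCCESS) accepted++;
    return accepted;
}

/* On (simulated) HVN_TX_COMPLETE, `count` buffers free up; top the
   queue back up immediately so the radio never idles. */
static int on_tx_complete(int count)
{
    sim_queued -= count;
    return fill_tx_queue();
}
```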

    From your trace I would assume the source of your issue is the host: in the slow case you can see long delays (up to 75 msec) between a notification and the ACK from the host.

  • Hi Dmitry,

    Thanks for the clarification. I was able to update the tx_queue size via the following code:

    ble_cfg_t ble_cfg;
    memset(&ble_cfg, 0, sizeof(ble_cfg)); // zero-init so unused config fields don't carry garbage
    ble_cfg.conn_cfg.conn_cfg_tag = APP_BLE_CONN_CFG_TAG;
    ble_cfg.conn_cfg.params.gatts_conn_cfg.hvn_tx_queue_size = 4;
    err_code = sd_ble_cfg_set(BLE_CONN_CFG_GATTS, &ble_cfg, ram_start);
    APP_ERROR_CHECK(err_code);

    But this had no effect on the issue at hand (i.e. still seeing the fast/slow cases). I tested this with the extended event length and the default event length with the same results. I also tried a tx_queue_size of 8. Is there something else I need to do to configure my device to use the additional tx_queue size, or is setting it in the above way sufficient?

    I see the same thing you're seeing in the trace. For clarity, I was using the nRFConnect app on Android as the host for this test. Is there a reason why it would respond so slowly (~75 ms) in the "slow case" and so quickly (~5 us) in the "fast case"? Is this something that is fixable on the device/firmware side, or are we at the mercy of the phones here? Is there a Bluetooth configuration parameter that governs these response times?

    EDIT: Come to think of it, should the tx_queue_size even matter, since I'm not adding any new packets to the queue until I've already received BLE_GATTS_EVT_HVN_TX_COMPLETE?

  •  Is there something else I need to do to configure my device to use the additional tx_queue size, or is setting it in the above way sufficient?

    As you figured out yourself: you need to fill the whole TX queue to get any gain from these settings.

    It's hard to say why the response time slows down (I'm not an Android programmer; maybe there are some settings in the Android API). AFAIK many phones share one radio between BT and Wi-Fi, and the OS may raise BT priority for some time after a host-initiated operation, then shrink the BT timeslots to a minimum to give maximum resources to Wi-Fi... just my thoughts. You could try disabling Wi-Fi and seeing what happens.

  • I think Dmitry is right: either the Handle Value Notification/Indication (HVX) TX buffers are not being filled when ready, or the smartphone stack lowers the priority of BLE tasks after inactivity.

    I suggest you test a few different smartphones and see how they behave, you might be testing with a phone that has a poor BLE stack. 
     
    I also suggest you study how you queue your HVX tx buffers in your two cases. The ble_app_uart server example should serve as a template on how to properly queue your data. 


  • In general I've found Dmitry's final conclusion to be true: packet streaming after a host-initiated communication is very quick, while packet streaming after a client-initiated communication is slower. I was able to verify this on a variety of phones and apps. In fact, I found that during the "slow case" (client-initiated communication prior to packet streaming), if I sent any data at all from the host to the client, the transfer speed returned to the higher rate for the remainder of that page. Practically speaking, the solution was to adjust our protocol as follows:

    1. Client sends packet to Host indicating it is ready to send a new page

    2. (This is the important one) Host sends packet to Client ACK'ing the previous message (and functionally claiming priority on the Host's radio stack for subsequent BLE activity)

    3. Client streams page of data to Host

    4. Repeat for every page
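    In device firmware the revised handshake reduces to a small state machine. A minimal sketch (all names are illustrative; in practice the step 2 ACK would arrive as a GATT write from the phone):

```c
#include <stdbool.h>

/* Sketch of the revised protocol. Client = nRF52 device, Host = phone,
   matching the terminology above. All names are illustrative. */
typedef enum { XFER_IDLE, XFER_WAIT_ACK, XFER_STREAMING } xfer_state_t;

static xfer_state_t xfer_state = XFER_IDLE;

/* Step 1: device notifies the phone that a new page is ready. */
static void notify_page_ready(void)
{
    if (xfer_state == XFER_IDLE) xfer_state = XFER_WAIT_ACK;
}

/* Step 2: phone ACKs with a write, claiming radio priority on its side.
   Returns true if streaming (step 3) may begin. */
static bool on_host_ack(void)
{
    if (xfer_state != XFER_WAIT_ACK) return false;
    xfer_state = XFER_STREAMING;
    return true;
}

/* Step 4: page fully streamed; return to idle for the next page. */
static void on_page_done(void)
{
    xfer_state = XFER_IDLE;
}
```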

    All in all, an interesting journey into the interoperability of SoCs, commercial cell phones, and Bluetooth connection configuration.

    Just to address the TX buffer/queue issue: this did not come into play for me, since in both the fast and slow cases I was never adding more than one packet to the queue at a time. That is, I only add a new packet once I've received the BLE_GATTS_EVT_HVN_TX_COMPLETE event, at which point the queue should be empty.
