Peripheral -> Central Streaming Packet Loss

I am using NUS to stream sensor data from an nRF5340 peripheral to an nRF5340 central device. Each packet carries an index that lets me track whether any packets are missed, and at higher sensor bandwidths I'm intermittently missing a packet. This is generally to be expected (reaching a throughput limit), but I'm trying to determine where the packet is getting lost or dropped. When sniffing the traffic, I can clearly see an Empty PDU packet where the packet with the missing index should be.

Within the peripheral device, I am able to confirm that the packet of the (eventually) missing index is present and correct at the time bt_nus_send() is called. I am checking for returned errors, and none come back.

What could account for these missing packets if the data makes it into bt_nus_send() with no errors returned?

Thanks in advance

  • Hi,

    I will bring your questions up with the BLE team, but in the meantime, could you state which version of the SDK you're working with? Are you using the SoftDevice Controller or the Zephyr BLE controller?

    I assume you've based your application on the central-/peripheral_uart samples and added your sensor, but in case you have not done this already, could you verify if you see the same thing with the pristine samples+sensor functionality?

    Kind regards,
    Andreas

  • Hi Andreas,

    We are using SDK version 2.0.0, and we are using the SoftDevice controller. That's correct that the code is based on those samples.

    Sorry, can you elaborate on what functionality I should verify with regards to the "pristine samples+sensor"? If relevant, I have verified that the samples that eventually "go missing" do make it to the point of the bt_nus_send() call.

    Thanks for your help

  • Hi,

    brushlow said:
    Sorry, can you elaborate on what functionality I should verify with regards to the "pristine samples+sensor"? If relevant, I have verified that the samples that eventually "go missing" do make it to the point of the bt_nus_send() call.

    Yeah, no worries, I see that the question could be a bit ambiguous. To be clear: if you were running this on a fully custom application that you created yourself, I wanted to know whether you could reproduce the error on one of the samples we provide in the SDK. If your application is already based on those samples, you don't have to run that test.

    But disregard that for now. Based on what you describe, it seems that there is nothing wrong with the BLE part of your application, but rather with the UART part. Could you explain what you mean by "higher sensor bandwidths"? Could you check which baud rates the sensor supports? My guess is that the higher rates you refer to might not be compatible with what the sensor supports.

    Kind regards,
    Andreas

  • We manipulate the sample rate of the sensor itself, and at higher sample rates (in this case 1kHz) I see more dropped samples: around 3-5 per minute. At 500Hz, I see significantly fewer (but still some) dropped samples, around 1 per minute.

    Just to be clear, we're able to get good data out of the sensor at 1kHz (and 2kHz, for that matter). I have code checking for gaps in sample indexes instrumented throughout my Peripheral and Central, and these gaps are only detected on the Central side. On the Peripheral side, there are no missing samples up to the point of the bt_nus_send() call, so I'm quite sure it's not an issue with the sensor producing data.
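A minimal sketch of the kind of index-gap check described above, assuming a 16-bit sample index that wraps at 65535. The `gap_tracker` struct and `gap_tracker_feed()` are hypothetical names for illustration and are not taken from the poster's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracker for a 16-bit sample index that wraps at 65535. */
struct gap_tracker {
    bool     started;   /* false until the first sample is seen */
    uint16_t expected;  /* index the next sample should carry */
    uint32_t missed;    /* running total of missing samples */
};

/* Feed one received index; returns how many samples were skipped before it. */
static uint16_t gap_tracker_feed(struct gap_tracker *t, uint16_t index)
{
    assert(t != NULL);

    uint16_t skipped = 0;
    if (t->started) {
        /* Truncating to 16 bits makes the difference wrap-safe (65535 -> 0). */
        skipped = (uint16_t)(index - t->expected);
    }
    t->started = true;
    t->expected = (uint16_t)(index + 1u);
    t->missed += skipped;
    return skipped;
}
```

Running the same check at each stage (after the sensor read, before and after the queue, and on the central) narrows down where a sample disappears. Note that this sketch treats a duplicated or reordered index as a very large gap, so it only suits a strictly in-order stream.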

    As for the data itself, we are sending 6 bytes per sample with some added messaging overhead. I would estimate the throughput need at around 55kbps.
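As a rough check of that estimate, the raw payload rate at 1 kHz can be worked out as follows. Attributing the remainder of the 55 kbps figure to messaging overhead is an assumption, since the thread does not break the overhead down:

```latex
\begin{align*}
R_{\text{payload}} &= 6\ \tfrac{\text{bytes}}{\text{sample}} \times 8\ \tfrac{\text{bits}}{\text{byte}} \times 1000\ \tfrac{\text{samples}}{\text{s}} = 48\ \text{kbit/s} \\
R_{\text{overhead}} &\approx 55\ \text{kbit/s} - 48\ \text{kbit/s} = 7\ \text{kbit/s}
\end{align*}
```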

    Thanks

  • Noted! 

    Thanks for the follow-up. I'm just about to head out the door for the weekend, so this is just a heads-up that I will most likely not be able to comment on your reply before early next week. I will get back to you by then.

    Kind regards,
    Andreas

  • Hi again,

    Thank you for your patience.

    So 55kbps should be manageable for both BLE and UART. But just as you know, BLE is not 100% lossless w.r.t package loss. ~5 lost samples per minute might occur and is not an unreasonable number of lost packets. I am fairly certain that the packets are lost over the air and not in your hardware/software. If you don't do this already, could you add an ack for every received packet and resend every packet that does not go through (if you need every single packet)?

    Kind regards,
    Andreas

  • Hi Andreas,

    No problem.

    Understood. I thought I read somewhere that the NUS resends undelivered packets by default and guarantees delivery (implying use of the link layer ACKs, I'd assume), except in the event of a disconnection. Is there a separate ACK I can enable?

    I'm a definite newcomer to BLE, so thanks for all your help and patience

  • Hi,

    I see that I expressed myself quite ambiguously in the Monday rush, so I will restate my previous reply with some more details.

    AHaug said:
    But just as you know, BLE is not 100% lossless w.r.t package loss

    This is not true: BLE is lossless. More specifically, connections are lossless, while advertising/scanning is lossy (which is what I based my previous comment on without elaborating). BLE connections are reliable and do not accept packet loss. That is, if a packet is not received after x attempts, the connection will be terminated.

    Connections, notifications and indications are lossless. This means that packets sent in any of these three relations are in fact acked.

    Apologies for the confusion!

    The acknowledgement works by incrementing a packet sequence number based on the previous transmission you received from the peer, so that you always know the previous packet was received when you get the next one. You can then retransmit if the number is not as expected. This goes for all packets sent in a connection, not only notifications (a notification is just a permission for the peripheral to send packets to a central unprompted).
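The sequence-number scheme described above can be sketched as a simplified model of the link layer's one-bit SN (sequence number) and NESN (next expected sequence number) fields. The struct and function names here are hypothetical, and the model ignores CRC checks, the MD bit, and timing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-device link-layer state; both fields are single bits (0 or 1). */
struct ll_state {
    uint8_t sn;    /* sequence number of the PDU we are currently sending */
    uint8_t nesn;  /* next sequence number we expect from the peer */
};

struct ll_result {
    bool new_data;    /* peer's PDU is new, not a retransmission */
    bool last_acked;  /* our previous PDU was acknowledged by the peer */
};

/* Process the SN/NESN bits of a PDU received from the peer. */
static struct ll_result ll_on_rx(struct ll_state *s, uint8_t peer_sn,
                                 uint8_t peer_nesn)
{
    assert(s != NULL);

    struct ll_result r = { false, false };

    if ((peer_sn & 1u) == s->nesn) {
        /* The PDU we were waiting for: accept it and expect the next one. */
        r.new_data = true;
        s->nesn ^= 1u;
    }
    if ((peer_nesn & 1u) != s->sn) {
        /* Peer now expects our next PDU, so the last one got through. */
        r.last_acked = true;
        s->sn ^= 1u;
    }
    /* If last_acked is false, the same PDU must be sent again. */
    return r;
}
```

Because an unacknowledged PDU keeps being retransmitted until it is acked or the connection times out, a payload that was handed to the controller should not silently vanish inside a live connection.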

    Going back to the issue at hand: you are observing packet loss. This narrows it down to the two most likely scenarios:

    1. The peripheral sensor and central are only advertising and scanning and not connected -> loss is to be expected
      1. Changing to a connection or using notifications should fix this
    2. Something happens when queuing the data

    Narrowing down the second item is a bit more complicated than the first one, but here are some thoughts nonetheless

    1. Start by checking exactly how your sensor data is queued when bt_nus_send() is called
      1. If it is a Zephyr queue of some sort (and not direct input into any SoftDevice buffers), there could be some differences in implementation that lead to the data being lost here. For example, if it is a ring buffer of some kind (not likely, but it could happen).
      2. One way to check is to see whether the call to bt_nus_send() returns a "no memory" error (for example -ENOMEM) or similar at any point in time.
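A minimal sketch of that return-value check, written so it builds without the SDK. The `send_fn` callback is a stand-in for the real send call (in the application it could be a thin wrapper around bt_nus_send()); `send_checked()`, `send_stats`, and `stub_send()` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters, one per outcome we want to tell apart. */
struct send_stats {
    uint32_t ok;
    uint32_t no_mem;     /* -ENOMEM: no buffer was available */
    uint32_t other_err;  /* any other non-zero return value */
};

/* Stand-in for the real send call, e.g. a wrapper around bt_nus_send(). */
typedef int (*send_fn)(const uint8_t *data, uint16_t len);

/* Call the send function and record what it returned. */
static int send_checked(send_fn send, const uint8_t *data, uint16_t len,
                        struct send_stats *st)
{
    assert(send != NULL && st != NULL);

    int err = send(data, len);

    if (err == 0) {
        st->ok++;
    } else if (err == -ENOMEM) {
        st->no_mem++;
    } else {
        st->other_err++;
    }
    return err;
}

/* Demo stub: accepts payloads up to 20 bytes, reports -ENOMEM otherwise. */
static int stub_send(const uint8_t *data, uint16_t len)
{
    (void)data;
    return (len <= 20u) ? 0 : -ENOMEM;
}
```

Comparing the `ok` counter against the number of samples produced and the number of sent callbacks shows whether any call failed without being noticed.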

    Let me know if this clarifies things for you, and once again, apologies for the confusion regarding the "not lossless" statement.

    Kind regards,
    Andreas

  • Hi Andreas,

    No worries, thanks for the clarification.

    Here's where I'm at, first in regards to the sensor data and then in regards to the NUS portion:

    - Sensor data:

    I am using a Zephyr FIFO to queue the data and send it via bt_nus_send(). I have tried instrumenting the peripheral throughout the sensor data pipeline, and I can find no evidence that we are losing samples. I've looked at the data pre-queuing and post-queuing, and also ensured there are an equal number of calls of the pre-queue function, bt_nus_send() calling function, and the nus sent callback. All match up, even as I am getting missed samples on the Central device.

    - NUS/BLE:

    I have been monitoring error code returns from bt_nus_send(), and it has never returned one. With regards to notifications, my impression was that NUS used notifications by default. And I am using NUS on the Central side, as well. Can you point me to how I can be sure notifications are enabled on both sides, such that every packet is ACKed before proceeding?

    Thanks again

  • Actually, what I said about the sensor data is inaccurate. I was only looking at the data itself that goes on the FIFO and not the length field for the data. For some reason, that is ending up 0 on the consumer end of the FIFO, which is the root of my issue. No issue with the BLE/NUS, it seems.

    Not sure if you can provide any insights on the Zephyr FIFO or if that's a separate matter, but appreciate your help in getting to this point!
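A check like the following, run on both the producer and consumer side of the queue, would flag a zero length field as soon as it appears. This is a sketch under stated assumptions: the `sensor_item` layout, the 20-byte payload size, and `item_check()` are hypothetical and not taken from the poster's code, and the thread does not establish why the length ended up 0:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_DATA_MAX 20u  /* hypothetical maximum payload size */

/* Hypothetical queue item carrying one sample plus its bookkeeping. */
struct sensor_item {
    void    *fifo_reserved;  /* k_fifo reserves the first word of each item */
    uint16_t index;          /* sample index used for gap detection */
    uint16_t len;            /* number of valid bytes in data[] */
    uint8_t  data[ITEM_DATA_MAX];
};

enum item_status {
    ITEM_OK,
    ITEM_LEN_ZERO,     /* length field is 0, nothing would be sent */
    ITEM_LEN_TOO_BIG,  /* length field exceeds the payload buffer */
};

/* Validate the metadata of an item, not only its payload bytes. */
static enum item_status item_check(const struct sensor_item *it)
{
    assert(it != NULL);

    if (it->len == 0u) {
        return ITEM_LEN_ZERO;
    }
    if (it->len > ITEM_DATA_MAX) {
        return ITEM_LEN_TOO_BIG;
    }
    return ITEM_OK;
}
```

Calling `item_check()` just before the item is queued and again right after it is dequeued shows on which side of the FIFO the length field changes.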
