Packet loss during ISO transmission when advertising

Hi,

Bug:

With a connected ISO connection between two devices X and Y:

  • X acts as a peripheral. It is the ISO server and it is transmitting ISO data to Y.
  • Y is multirole. It connects to X and establishes an ISO connection to receive ISO data.

As long as Y only receives data, the communication is stable (no packet loss). As soon as Y advertises (bt_le_adv_start), Y does not receive all packets.

  • Increasing the advertising interval decreases packet loss.
  • Decreasing the advertising interval increases packet loss.

Environment:

  • OS: Linux
  • Toolchain: zephyr-sdk-0.16.8
  • Board: nRF5340 DK
  • NCS 2.7 (sdk-nrf v2.7.0 and zephyr v3.6.99-ncs2)

How can I solve this issue, which causes too much packet loss?

Thank you

  • Hi,

    When advertising while in an ISO connection, there will be collisions from time to time, and in that case the stack prioritizes the advertising packets. To reduce the ISO packet loss you could increase the number of ISO retransmissions and ensure you use an advertising interval that is as high as possible.

  • Hello,

    We are migrating from NCS 2.1 with Packet Craft firmware (CPU NET) and we did not have this problem (at least it was not so significant).

    With the configuration described in my first message, I advertised using different parameters (provided to bt_le_adv_start):

    • No advertising: No packet loss => 0% loss
    • BT_GAP_ADV_FAST_INT_MIN_1 (30 ms) and BT_GAP_ADV_FAST_INT_MAX_1 (60 ms): 568 packets lost (out of 3425) => ~17% loss
    • BT_GAP_ADV_FAST_INT_MIN_2 (100 ms) and BT_GAP_ADV_FAST_INT_MAX_2 (150 ms): 226 packets lost (out of 3400) => ~7% loss
    • BT_GAP_ADV_SLOW_INT_MIN (1 s) and BT_GAP_ADV_SLOW_INT_MAX (1.2 s): 21 packets lost (out of 2800) => ~1% loss

    Looking at these numbers it seems that the advertising almost always causes an ISO packet loss.
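The observation above can be checked with a rough back-of-the-envelope estimate. The sketch below is only illustrative and rests on assumptions that are not part of the measurements: one ISO packet per 10 ms ISO interval (the interval mentioned later in this thread), an advertiser running at the midpoint of the min/max interval, and an average of 5 ms of random advDelay (0 to 10 ms) per advertising event. The function names are made up for this example.

```c
/* Assumptions for this estimate (not measured values):
 *  - one ISO packet per 10 ms ISO interval
 *  - advertising interval = midpoint of [min, max]
 *  - mean advDelay of 5 ms added to every advertising event */
#define ISO_INTERVAL_MS   10.0
#define MEAN_ADV_DELAY_MS 5.0

/* Packet loss in percent. */
double loss_percent(unsigned lost, unsigned packets)
{
    return 100.0 * lost / packets;
}

/* Estimated number of advertising events during a run of `packets` packets. */
double estimated_adv_events(unsigned packets, double adv_min_ms, double adv_max_ms)
{
    double run_ms = packets * ISO_INTERVAL_MS;
    double mean_adv_period_ms = (adv_min_ms + adv_max_ms) / 2.0 + MEAN_ADV_DELAY_MS;

    return run_ms / mean_adv_period_ms;
}

/* Lost ISO packets per estimated advertising event. */
double losses_per_adv_event(unsigned lost, unsigned packets,
                            double adv_min_ms, double adv_max_ms)
{
    return lost / estimated_adv_events(packets, adv_min_ms, adv_max_ms);
}
```

Under these assumptions, calling `losses_per_adv_event()` with the three measured cases (568 of 3425 at 30/60 ms, 226 of 3400 at 100/150 ms, 21 of 2800 at 1000/1200 ms) gives roughly 0.8 to 0.9 lost packets per advertising event in each case, which is consistent with the impression that nearly every advertising event costs one ISO packet.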


    Even if we used the best-case scenario (advertising interval between 1 and 1.2 s), we would still have at least 1% loss, which is too much for our use case.


    With Packet Craft, we used the advertising parameters between 30 and 60ms, and we were clearly below 1% packet loss. So I don't think it's normal to have this today, and it seems more like a regression.

    What do you think?

    Thanks

  • I have updated my diagram according to your previous message:



    If I follow correctly, with 2 retransmissions and a latency of 20 ms, we shouldn't have any loss. This is what is represented on the diagram.

    So, can you explain to me why 2 retransmissions and a latency of 20 ms do not solve this packet loss issue?

    Thank you

  • Hi,

    Sorry for chiming in. As you already know, with an advertising interval set to 20ms, the actual advertising interval could range from 10ms to 30ms due to the random offset. So in the updated diagram, it is possible that the retransmitted Frame B is also blocked by the advertising activity when the actual advertising interval is ~10ms. Since the retransmission number (RTN) is set to 2, and Frame B has been transmitted twice without success, Frame B will be flushed, resulting in a packet loss.

    For reference, the Basic Audio Profile (BAP) for Bluetooth Low Energy defines two sets of quality of service configurations: low latency and high reliability (see 5.6.2 QoS Configurations in https://www.bluetooth.com/specifications/specs/basic-audio-profile-1-0-1/). There you can see that an RTN of 2 is still in the realm of "low latency". In your application, to ensure high reliability without a high packet loss rate, I'd recommend a much higher RTN.

    As a side note, the SoftDevice Controller selects the flush timeout (FT) and number of subevents (NSE) based on the given RTN and max transport latency. I won't go into details about FT and NSE here, to keep the answer short. However, the SoftDevice Controller always prioritizes max transport latency over RTN because, per the Core Specification, the RTN is only a recommendation, not a mandatory requirement. Therefore, I would suggest first setting a max transport latency that your application can accept, and then increasing the RTN to achieve better reliability.

    Hope this helps!


    Cheers,

    Yuxuan

  • Hello,

    My advertising is 1 second, not 20ms.

    Sorry for the confusion. I only used that value in one post to exaggerate the part I didn't understand. You can forget this value of 20ms.

    So since my advertising is (about) every second, my diagram is valid (I think) and I don't understand why the parameters (rtn: 2, latency: 20ms) do not work. Can you provide me with an example where this doesn't work?

    If possible, can you provide me with the correct parameters to have no packet loss with 1s advertising, 10ms ISO interval and 128 bytes payload? (In my opinion, the correct parameters are predictable.)

    Thank you

  • Hi,

    Here is another possibility.

    The ISO reception could be blocked by the ACL connection. Per the spec, you must have an ACL connection in order to establish a CIS. These two connections have individual intervals, and it is possible (and expected) that they interfere with each other. What connection interval are you using for the ACL connection? It is recommended to use an ACL connection interval larger than the ISO interval (say 60ms or 70ms when the ISO interval is 10ms) to reduce the interference. Such interference might already have existed before without being noticed, because the retransmissions covered it. With the advertising activity of the peripheral, the radio becomes busier and thus the packet loss is noticed.

    It is not guaranteed to work, but could you please try RTN = 13 with a max transport latency >= 50 ms? This allows the same ISO packet to be transmitted across 5 ISO intervals, with up to 3 transmissions of the packet in each interval.
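The arithmetic behind this suggestion can be sketched as follows. This is only an illustration under stated assumptions: burst number BN = 1, a flush timeout (FT) of 5 ISO intervals and 3 subevents (NSE) per interval, as described in the reply above. The controller may pick different values, and the product is an upper bound because a payload that is still pending shares subevents with newer payloads. The function names are hypothetical.

```c
/* Upper bound on transmission opportunities for one payload, assuming
 * BN = 1: NSE subevents in each of the FT ISO intervals before flush. */
unsigned tx_opportunities(unsigned ft, unsigned nse)
{
    return ft * nse;
}

/* Total attempts requested by the host: the first transmission
 * plus RTN retransmissions. */
unsigned attempts_requested(unsigned rtn)
{
    return rtn + 1U;
}

/* Time span covered by FT ISO intervals, in milliseconds. */
unsigned flush_span_ms(unsigned ft, unsigned iso_interval_ms)
{
    return ft * iso_interval_ms;
}
```

With FT = 5 and NSE = 3, `tx_opportunities(5, 3)` is 15, which covers the 14 attempts of `attempts_requested(13)`, and `flush_span_ms(5, 10)` is 50 ms, which matches the suggested minimum transport latency.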

    Also, it would be nice if you have any sniffer logs that we could look into. 


    Cheers,

    Yuxuan

  • Hello,

    So this is my current configuration:

    • Advertising: 1/1.2s
    • ACL: 450/750 ms
    • RTN: 13
    • Latency: 100 ms
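For readers reproducing this configuration: the controller takes advertising intervals in units of 0.625 ms and ACL connection intervals in units of 1.25 ms, while the ISO max transport latency is given directly in milliseconds. A minimal conversion sketch follows. The helper names are made up for this example and are not Zephyr APIs.

```c
#include <stdint.h>

/* Advertising interval: milliseconds -> units of 0.625 ms. */
uint16_t adv_interval_units(uint32_t ms)
{
    return (uint16_t)((ms * 1000U) / 625U);
}

/* ACL connection interval: milliseconds -> units of 1.25 ms. */
uint16_t conn_interval_units(uint32_t ms)
{
    return (uint16_t)((ms * 100U) / 125U);
}
```

For the values listed above, 1 s and 1.2 s of advertising interval become 1600 (0x0640) and 1920 (0x0780) units, which are the values behind BT_GAP_ADV_SLOW_INT_MIN and BT_GAP_ADV_SLOW_INT_MAX, and the 450/750 ms ACL interval becomes 360/600 units.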

    In theory, we should have no packet loss with this configuration (related to Soft Device internal mechanism/scheduling).

    The result of my tests shows:

    • Increasing latency and RTN reduces packet loss. I now see 0.1% packet loss instead of 1%. However, that is still too much loss for a "best case scenario" (2 boards next to each other).
    • Increasing the ACL connection interval does not reduce packet loss.

    I'm sorry to insist, but I need a configuration with 0% packet loss in the "best case scenario". This is mandatory for our product to have the packet loss to a minimum (even though we know we can't have 0% packet loss when deployed).

    To my knowledge, nRF Sniffer does not support isochronous connections. If that is the case, I can provide you with the code I run so you can reproduce it and sniff the traffic on your side.

    As a reminder, if I remove advertising I see absolutely no packet loss. (Even with 30/60ms ACL connection time)

  • Hi,

    The ACL connection interval you are using seems a bit too large. Could you try a smaller value of 60 or 70ms?

    Due to clock drift between devices, the peripheral performs window widening to ensure it can receive packets from the central. Simply put, the larger the connection interval, the larger the window widening. This window widening on the ACL connection may block the CIS packet reception.
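To give a sense of scale, the Core Specification defines window widening as the combined sleep clock accuracy of both devices multiplied by the time since the last anchor point. The sketch below assumes 50 ppm on each side, which is an illustrative value and not something measured in this thread. The function name is hypothetical.

```c
/* windowWidening = ((centralSCA + peripheralSCA) / 1,000,000)
 *                  * time since last anchor point
 * SCA values are in ppm, the interval in milliseconds, and the result
 * in microseconds. The receive window is extended by this amount on
 * each side of the expected anchor point. */
double window_widening_us(double central_sca_ppm, double peripheral_sca_ppm,
                          double interval_ms)
{
    return (central_sca_ppm + peripheral_sca_ppm) / 1e6 * (interval_ms * 1000.0);
}
```

With 50 ppm on both sides, a 750 ms connection interval gives about 75 µs of widening, while a 70 ms interval gives about 7 µs.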


    Cheers,

    Yuxuan

  • Hi,

    I tried with 60/70ms ACL connection time, I still have the same packet loss.

    This is my understanding of the current problem:

    Advertising takes 4.5ms, so in our case this is the duration of about 3 ISO frames (3 * 1.5ms). It means that 4 retransmissions should be sufficient to cover the collisions. I don't know how long an ACL connection event takes, but let's say 1.5ms (feel free to correct me); that would mean 5 retransmissions in the worst case scenario.

    So, with 10 RTN and a latency of 50ms, I don't see what can cause packet loss. I think it should cover every collision.

    I don't understand why the worst case scenario (the longest duration during which ISO frames can't be sent because of collisions) isn't known, and why no configuration (retransmission and latency) can remove this packet loss entirely.
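The reasoning above can be written down as a small calculation. It is a sketch under the same assumptions as the paragraph: the 4.5 ms and 1.5 ms durations are the estimates given there, subevents are treated as back-to-back slots, and the blocking activity is assumed to start on a slot boundary. The function names are made up for this example.

```c
/* Number of back-to-back subevent slots of length slot_us that are
 * covered by a blocking activity of length blocked_us (rounded up),
 * assuming the activity starts on a slot boundary. */
unsigned blocked_subevents(unsigned blocked_us, unsigned slot_us)
{
    return (blocked_us + slot_us - 1U) / slot_us;
}

/* Attempts needed so that at least one falls outside the blocked span. */
unsigned attempts_needed(unsigned blocked_us, unsigned slot_us)
{
    return blocked_subevents(blocked_us, slot_us) + 1U;
}
```

`attempts_needed(4500, 1500)` gives 4, and adding 1.5 ms for the ACL event gives `attempts_needed(6000, 1500)` = 5. If the blocking activity does not start on a slot boundary, it can overlap one additional subevent, so one more attempt would be needed in that case.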

    We don't have specific requirements on latency and retransmission (as far as it's reasonable), if it allows no packet loss. We are in connected mode (CIS).

    Thank you.

  • Hi,

    I have tested it a bit locally and managed to reproduce the packet loss you are seeing. It does seem to be an implementation issue on our side. Although the root cause has not been identified yet, it seems to work when NSE > 2 and FT > 1. NSE and FT are selected by the controller from the provided RTN and max transport latency, and so far the configurations you have tried unfortunately do not meet both conditions at the same time. The parameter selection algorithm of the controller is not publicly exposed, so some trial and error is needed. I am sorry for the inconvenience.

    You mentioned RTN=10 and max transport latency=50ms - have you tested this configuration? It had no packet loss locally for me with an advertising interval=1s, ACL connection interval=70ms, max SDU size=120bytes, ISO interval=10ms.

    Or alternatively, could you try RTN=5, max transport latency=20ms? This is 48_4_1 in the quality of service configuration, and it had no packet loss for me as well.
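Since the controller's selection algorithm is not public, the following is only a naive guess at how RTN and max transport latency might map to FT and NSE, useful for reasoning about which configurations could satisfy NSE > 2 and FT > 1. It assumes FT is the number of whole ISO intervals that fit in the max transport latency and NSE is the smallest count that spreads RTN + 1 attempts over FT intervals. The real controller may choose differently, and every name here is hypothetical.

```c
#include <stdbool.h>

struct cis_estimate {
    unsigned ft;  /* flush timeout, in ISO intervals */
    unsigned nse; /* subevents per ISO interval */
};

/* Naive estimate, NOT the SoftDevice Controller's algorithm. */
struct cis_estimate naive_estimate(unsigned rtn, unsigned max_latency_ms,
                                   unsigned iso_interval_ms)
{
    struct cis_estimate e;

    e.ft = max_latency_ms / iso_interval_ms;
    if (e.ft == 0U) {
        e.ft = 1U;
    }
    /* ceil((rtn + 1) / ft) */
    e.nse = (rtn + 1U + e.ft - 1U) / e.ft;

    return e;
}

/* Conditions under which no loss was observed in the local test. */
bool meets_observed_conditions(struct cis_estimate e)
{
    return e.nse > 2U && e.ft > 1U;
}
```

Under this guess, with a 10 ms ISO interval, RTN=10 with 50 ms and RTN=5 with 20 ms both yield NSE = 3 and FT > 1, whereas RTN=2 with 20 ms and RTN=13 with 100 ms yield NSE = 2 and would not meet the conditions.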

    Let us know if this is helpful to you!

    Cheers,

    Yuxuan

  • Hi,

    In the following configuration:

    • ACL: 70ms
    • SDU: 120 bytes
    • ISO interval: 10ms
    • Advertising: Every second

    I tried:

    • RTN:10 and latency: 50ms: I still have packet loss
    • RTN:5 and latency: 20ms: I still have packet loss

    So unfortunately, I see no improvement on my side.

    Could you provide me with your sample code (the one that gave you no packet loss) so I can reproduce it on my end?

    Thank you

  • Hi,

    I was using an internal framework for quick testing, and it will take some time before I can convert it into something you could run. Would you mind sharing your code?

    Cheers,

    Yuxuan
