Packet loss during ISO transmission when advertising

Hi,

Bug:

With a connected ISO connection between two devices X and Y:

  • X acts as a peripheral. It is the ISO server and it is transmitting ISO data to Y.
  • Y is multirole. It connects to X and establishes an ISO connection to receive ISO data.

As long as Y only receives data, the communication is stable (no packet loss). As soon as Y advertises (bt_le_adv_start), Y does not receive all packets.

  • Increasing the advertising interval decreases packet loss.
  • Decreasing the advertising interval increases packet loss.

Environment:

  • OS: Linux
  • Toolchain: zephyr-sdk-0.16.8
  • Board: nRF5340 DK
  • NCS 2.7 (sdk-nrf v2.7.0 and zephyr v3.6.99-ncs2)

How can I solve this issue, which causes too much packet loss?

Thank you

  • Hi,

    When advertising while in an ISO connection, there will be collisions from time to time, and in that case the stack prioritizes the advertising packets. To reduce the ISO packet loss you could increase the number of ISO retransmissions, and ensure you use an advertising interval that is as high as possible.

  • Hello,

    We are migrating from NCS 2.1 with Packet Craft firmware (CPU NET) and we did not have this problem (at least it was not so significant).

    With the configuration described in my first message, I advertised using different parameters (provided to bt_le_adv_start):

    • No advertising: No packet loss => 0% loss
    • BT_GAP_ADV_FAST_INT_MIN_1 (30 ms) and BT_GAP_ADV_FAST_INT_MAX_1 (60 ms): 568 packets lost (out of 3425) => ~17% loss
    • BT_GAP_ADV_FAST_INT_MIN_2 (100 ms) and BT_GAP_ADV_FAST_INT_MAX_2 (150 ms): 226 packets lost (out of 3400) => ~7% loss
    • BT_GAP_ADV_SLOW_INT_MIN (1 s) and BT_GAP_ADV_SLOW_INT_MAX (1.2 s): 21 packets lost (out of 2800) => ~1% loss

    Looking at these numbers it seems that the advertising almost always causes an ISO packet loss.
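The arithmetic behind that observation can be sketched as follows. This is only a rough estimate: it takes the lost/total counts listed above and ASSUMES one ISO packet per 10 ms ISO interval and an effective advertising interval at the midpoint of the configured min and max. Neither assumption is confirmed for these runs, and the names `adv_run`, `loss_percent` and `losses_per_adv_event` are made up for this illustration.

```c
#include <stdio.h>

/* One row per test run, using the figures reported above. */
struct adv_run {
    const char *label;
    double adv_min_ms;   /* configured advertising interval minimum */
    double adv_max_ms;   /* configured advertising interval maximum */
    unsigned lost;       /* ISO packets lost */
    unsigned total;      /* ISO packets expected */
};

const struct adv_run runs[] = {
    { "FAST_INT_1 (30-60 ms)",   30.0,   60.0,   568, 3425 },
    { "FAST_INT_2 (100-150 ms)", 100.0,  150.0,  226, 3400 },
    { "SLOW_INT (1-1.2 s)",      1000.0, 1200.0, 21,  2800 },
};

/* ASSUMPTION: one ISO packet is expected every 10 ms. */
#define ASSUMED_ISO_INTERVAL_MS 10.0

/* Packet loss as a percentage of the expected packets. */
double loss_percent(const struct adv_run *r)
{
    return 100.0 * r->lost / r->total;
}

/*
 * Lost packets per advertising event.
 * ASSUMPTION: the advertiser runs at the midpoint of min and max,
 * and the run lasted total * ASSUMED_ISO_INTERVAL_MS.
 */
double losses_per_adv_event(const struct adv_run *r)
{
    double duration_ms = r->total * ASSUMED_ISO_INTERVAL_MS;
    double adv_interval_ms = (r->adv_min_ms + r->adv_max_ms) / 2.0;
    double adv_events = duration_ms / adv_interval_ms;

    return r->lost / adv_events;
}

/* Print both figures for every run. */
void print_runs(void)
{
    for (unsigned i = 0; i < sizeof(runs) / sizeof(runs[0]); i++) {
        printf("%-24s loss %.2f%%, %.2f lost per adv event\n",
               runs[i].label, loss_percent(&runs[i]),
               losses_per_adv_event(&runs[i]));
    }
}
```

Under those assumptions each run works out to roughly 0.75 to 0.83 lost packets per advertising event, which is in line with the observation that an advertising event almost always costs an ISO packet.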


    Even if we were to use the best-case scenario (advertising interval between 1 and 1.2 s), we would still have at least 1% loss, which is too much for our use case.


    With Packet Craft, we used the advertising parameters between 30 and 60ms, and we were clearly below 1% packet loss. So I don't think it's normal to have this today, and it seems more like a regression.

    What do you think?

    Thanks

  • Hi,

    I tried with a 60/70 ms ACL connection interval, and I still have the same packet loss.

    This is my understanding of the current problem:

    An advertising event takes 4.5 ms, which in our case is the duration of about 3 ISO frames (3 * 1.5 ms). That means 4 retransmissions should be sufficient to cover the collisions. I don't know how long an ACL connection event takes, but let's say 1.5 ms (feel free to correct me); that would mean 5 retransmissions in the worst-case scenario.

    So, with an RTN of 10 and a latency of 50 ms, I don't see what can cause packet loss. I think it should cover every collision.
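The worst-case estimate above can be written down as a small calculation. This is a minimal sketch of the poster's reasoning only, not of what the controller actually does. It ASSUMES that ISO subevents are scheduled back to back, each lasting 1.5 ms, and that the blocking window (advertising, optionally plus an ACL event) starts on a subevent boundary; an unaligned window could overlap one more subevent. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Number of consecutive ISO subevents covered by a blocking window.
 * ASSUMPTION: back-to-back subevents, window aligned to a subevent start.
 * Computed as ceil(block_us / subevent_us).
 */
uint32_t blocked_subevents(uint32_t block_us, uint32_t subevent_us)
{
    if (subevent_us == 0) {
        return 0; /* avoid division by zero on invalid input */
    }
    return (block_us + subevent_us - 1) / subevent_us;
}

/*
 * Transmission attempts needed so that at least one attempt
 * falls outside the blocking window.
 */
uint32_t attempts_needed(uint32_t block_us, uint32_t subevent_us)
{
    return blocked_subevents(block_us, subevent_us) + 1;
}
```

With the figures from the post, `attempts_needed(4500, 1500)` gives 4 (advertising only) and `attempts_needed(6000, 1500)` gives 5 (advertising plus an assumed 1.5 ms ACL event), matching the counts quoted above.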

    I don't understand why the worst-case scenario (the longest duration during which ISO frames can't be sent because of a collision) isn't known, and why no configuration (retransmissions and latency) can remove this packet loss entirely.

    We don't have specific requirements on latency and retransmissions (as long as they are reasonable), if that allows no packet loss. We are in connected mode (CIS).

    Thank you.

  • Hi,

    I have tested it a bit locally and managed to reproduce the packet loss you are seeing. It does seem to be an implementation issue on our side. Although the root cause has not been identified yet, it seems to work when NSE > 2 and FT > 1. NSE and FT are selected by the controller based on the provided RTN and max transport latency, and so far the configurations you have tried unfortunately do not meet both conditions at the same time. The controller's parameter selection algorithm is not exposed publicly, so some trial and error is needed. I am sorry for the inconvenience.

    You mentioned RTN=10 and max transport latency=50ms - have you tested this configuration? It had no packet loss locally for me with an advertising interval=1s, ACL connection interval=70ms, max SDU size=120bytes, ISO interval=10ms.

    Or alternatively, could you try RTN=5, max transport latency=20ms? This is 48_4_1 in the quality of service configuration, and it had no packet loss for me as well.
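To get a feel for how the max transport latency limits FT (flush timeout), here is a rough sketch. It ASSUMES the unframed CIS relation from the Bluetooth Core Specification, Transport_Latency = CIG_Sync_Delay + FT * ISO_Interval - SDU_Interval, and a made-up CIG_Sync_Delay of 3 ms. It only gives an upper bound on FT; the value the controller actually selects is not public, as noted above. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Largest FT that still fits in the requested max transport latency.
 * ASSUMPTION: unframed CIS, where
 *   latency = cig_sync_delay + FT * iso_interval - sdu_interval
 * so FT <= (latency + sdu_interval - cig_sync_delay) / iso_interval.
 * Returns 0 if no FT fits or the ISO interval is invalid.
 */
uint32_t max_flush_timeout(uint32_t max_latency_us,
                           uint32_t cig_sync_delay_us,
                           uint32_t iso_interval_us,
                           uint32_t sdu_interval_us)
{
    uint64_t budget = (uint64_t)max_latency_us + sdu_interval_us;

    if (iso_interval_us == 0 || budget < cig_sync_delay_us) {
        return 0;
    }
    return (uint32_t)((budget - cig_sync_delay_us) / iso_interval_us);
}
```

With a 10 ms ISO interval, a 10 ms SDU interval and the assumed 3 ms sync delay, `max_flush_timeout(50000, 3000, 10000, 10000)` gives 5 and `max_flush_timeout(20000, 3000, 10000, 10000)` gives 2, so both suggested configurations leave room for FT > 1 under these assumptions.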

    Let us know if this is helpful to you!

    Cheers,

    Yuxuan

  • Hi,

    In the following configuration:

    • ACL: 70ms
    • SDU: 120 bytes
    • ISO interval: 10ms
    • Advertising: Every second

    I tried:

    • RTN:10 and latency: 50ms: I still have packet loss
    • RTN:5 and latency: 20ms: I still have packet loss

    So unfortunately, I see no improvement on my side.

    Could you provide your sample code (the one that gave you no packet loss) so that I can reproduce it on my end?

    Thank you

  • Hi,

    I was using the internal framework for quick testing, and it will take some time before I can convert it into something you could run. Would you mind sharing your code?

    Cheers,

    Yuxuan

  • Hi,

    You can find enclosed a patch to apply on Zephyr (v3.6.99-ncs2) used by NCS 2.7. It's only a rework of peripheral_iso and central_iso samples (central becomes RX and peripheral becomes TX).

    Once the patch is applied, you just have to build and flash these 2 samples on 2 different targets (once flashed, the targets must be close to each other to connect, since the connection is based on RSSI).

    0001-Peripheral-TX-ISO-Central-RX-ISO.patch

    There are multiple defines at the beginning of each main.c to customize the parameters we talked about.

    Let me know if you have any issue with this.

  • Hi,

    Sorry that it took some time.

    I've managed to reproduce the packet loss you mentioned, and after discussions with the team, I believe it has to do with how the data are provided.

    In your application, the SDUs are provided to the controller with sequence numbers. When an SDU is provided too late, the controller will either drop the data or send it in the next event. An SDU can be provided too late due to the asynchronous nature of the HCI interface, or in your case, the interference from an ongoing advertiser.

    To address this, you could use the "Time of Arrival" mode mentioned here by setting the sequence number to 0 when calling bt_iso_chan_send(). With this I observed no missing packets locally.
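A minimal sketch of that suggestion is shown below. It assumes the three-argument `bt_iso_chan_send(chan, buf, seq_num)` form used in this thread. The helper `send_sdu_time_of_arrival` and the macro `SEQ_NUM_TIME_OF_ARRIVAL` are hypothetical names, and the `#else` branch holds stand-in stubs (NOT the real Zephyr definitions) so that the snippet also compiles off-target.

```c
#include <stdint.h>

#ifdef __ZEPHYR__
#include <zephyr/bluetooth/iso.h>
#else
/* Stand-ins so the sketch compiles off-target; NOT the real Zephyr types. */
struct bt_iso_chan { int unused; };
struct net_buf { int unused; };

/* Records the last sequence number passed to the stub below. */
uint16_t stub_last_seq_num = 0xFFFF;

/* Stub with the same argument list as the call used in this thread. */
int bt_iso_chan_send(struct bt_iso_chan *chan, struct net_buf *buf,
                     uint16_t seq_num)
{
    (void)chan;
    (void)buf;
    stub_last_seq_num = seq_num;
    return 0;
}
#endif

/* Hypothetical name: sequence number 0 selects "Time of Arrival" mode. */
#define SEQ_NUM_TIME_OF_ARRIVAL 0U

/*
 * Hand one SDU to the controller without tying it to a specific
 * sequence number, and return whatever bt_iso_chan_send() returns.
 */
int send_sdu_time_of_arrival(struct bt_iso_chan *chan, struct net_buf *buf)
{
    return bt_iso_chan_send(chan, buf, SEQ_NUM_TIME_OF_ARRIVAL);
}
```

In the application, the existing send call would be replaced by this helper (or simply by passing 0 as the sequence number); buffer allocation and error handling stay as they are in the sample.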

    If you'd like to learn about the "Timestamp" mode, please check out the iso time sync sample.

    Cheers,

    Yuxuan

  • Hi,

    We did some tests with

    ret = bt_iso_chan_send(&iso_cis_chan, buf, 0);

    and this is much better. We don't see any packet loss for the moment (we will continue our tests in the following days).

    Could you tell me whether the SDU issue you mention will be fixed in a later version, or is this behavior expected?

    Thank you for your support.

  • Hi,

    Glad to know it helps!

    The fact that the controller drops outdated SDUs is a design decision, so it will most likely not be changed.

    Cheers,

    Yuxuan
