Packet loss during ISO transmission when advertising

Hi,

Bug:

With an established ISO connection between two devices, X and Y:

  • X acts as a peripheral. It is the ISO server and transmits ISO data to Y.
  • Y is multirole. It connects to X and establishes an ISO connection to receive ISO data.

As long as Y only receives data, the communication is stable (no packet loss). As soon as Y advertises (bt_le_adv_start), Y does not receive all packets.

  • Increasing the advertising interval decreases packet loss.
  • Decreasing the advertising interval increases packet loss.

Environment:

  • OS: Linux
  • Toolchain: zephyr-sdk-0.16.8
  • Board: nRF5340 DK
  • NCS 2.7 (sdk-nrf v2.7.0 and zephyr v3.6.99-ncs2)

How can we solve this issue? It causes too much packet loss.

Thank you

Parents
  • Hi,

    When advertising while in an ISO connection, there will be collisions from time to time, and in that case the stack prioritizes the advertising packets. To reduce ISO packet loss, you could increase the number of ISO retransmissions and ensure you use an advertising interval that is as high as possible.

  • Hello,

    We are migrating from NCS 2.1 with Packet Craft firmware (CPU NET) and we did not have this problem (at least it was not so significant).

    With the configuration described in my first message, I advertised using different parameters (provided to bt_le_adv_start):

    • No advertising: No packet loss => 0% loss
    • BT_GAP_ADV_FAST_INT_MIN_1 (30 ms) and BT_GAP_ADV_FAST_INT_MAX_1 (60 ms): 568 packets lost (out of 3425) => ~17% loss
    • BT_GAP_ADV_FAST_INT_MIN_2 (100 ms) and BT_GAP_ADV_FAST_INT_MAX_2 (150 ms): 226 packets lost (out of 3400) => ~7% loss
    • BT_GAP_ADV_SLOW_INT_MIN (1 s) and BT_GAP_ADV_SLOW_INT_MAX (1.2 s): 21 packets lost (out of 2800) => ~1% loss

    Looking at these numbers it seems that the advertising almost always causes an ISO packet loss.
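    As a rough cross-check (a sketch, not a measurement): assuming the ISO interval is 10 ms, so that each run lasted about (number of packets x 10 ms), and ignoring the random advertising delay, the number of advertising events in each run can be bounded from the two interval limits. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many advertising events fit into a test run.
 * Assumption: the run lasted iso_packets * iso_interval_ms, and the
 * advertiser fired exactly once per adv_interval_ms (the random
 * advertising delay is ignored).
 */
static uint32_t adv_events_in_run(uint32_t iso_packets,
                                  uint32_t iso_interval_ms,
                                  uint32_t adv_interval_ms)
{
    assert(adv_interval_ms > 0U);

    uint32_t run_ms = iso_packets * iso_interval_ms;

    return run_ms / adv_interval_ms; /* rounded down */
}
```

    Under these assumptions the estimate is roughly 570 to 1141 advertising events for the 30/60 ms run, 226 to 340 for the 100/150 ms run, and 23 to 28 for the 1/1.2 s run. The loss counts (568, 226, 21) sit close to the lower bound each time.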


    Even if we used the best-case scenario (advertising interval between 1 and 1.2 s), we would still have at least 1% loss, which is too much for our use case.


    With Packet Craft, we used an advertising interval between 30 and 60 ms and were clearly below 1% packet loss. So I don't think the current behavior is normal; it looks more like a regression.

    What do you think?

    Thanks

Children
  • We did another test with NCS 2.6, Packet Craft for CPU NET (ble5-ctr-rpmsg_18929.hex), and "aggressive" advertising (30/60 ms).

    => We have almost no packet loss (a few packets from time to time), clearly below 0.1% in the same test conditions.

    So I still think this is a regression from previous NCS versions. Can you confirm this? How can we fix it?

  • Hi,

    This is not due to a regression, but to a difference in priorities between the Packetcraft controller and the SoftDevice Controller. The Packetcraft controller always prioritizes ISO. The advantage is that ISO packet loss is less likely. The disadvantage is that other activities are affected more, which may result in link loss, no packets being sent at all, or reduced randomness in the advertiser's random offset.

    The priority between ISO and other activities is not configurable.

  • Hi,

    I understand what you are saying, but I think the packet loss is excessive.

    I advertised with the 1-1.2 s interval for 10 minutes:

    • Maximum possible advertising: 1 * 10 * 60 = 600 advertisements.
    • Number of packets lost: 393

    Without advertising there is no packet loss, so I can say that the packet loss is caused by advertising.

    This means that 393/600 = ~66% of the advertising events cause a packet loss. (In the worst case, with a 1.2 s interval and therefore 500 advertising events, it would be 393/500 = ~79%.)
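    The arithmetic above as a small sketch. It assumes exactly one advertising event per advertising interval and ignores the random advertising delay; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Advertising events in a test of duration_s seconds, one per interval. */
static uint32_t adv_event_count(uint32_t duration_s, uint32_t adv_interval_ms)
{
    assert(adv_interval_ms > 0U);

    return (duration_s * 1000U) / adv_interval_ms;
}

/* Share of advertising events that coincide with a lost ISO packet,
 * in percent, rounded to the nearest integer.
 */
static uint32_t loss_per_adv_percent(uint32_t lost_packets, uint32_t adv_events)
{
    assert(adv_events > 0U);

    return (lost_packets * 100U + adv_events / 2U) / adv_events;
}
```

    For the 10-minute test, adv_event_count(600, 1000) gives 600 events and adv_event_count(600, 1200) gives 500, which leads to 66% and 79% for 393 lost packets.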

    If I try to keep it simple, with a 2M PHY (and retransmission set to 1):

    We send an ISO packet every 10 ms. Let's say that sending an ISO frame takes 1 ms (128-byte payload); this means ISO uses 10% of the bandwidth. I have trouble understanding how, with 90% of the bandwidth available, 66% of the advertising events cause ISO frames to be lost.


    Could you explain why there are so many collisions?

    Increasing the number of retransmissions does not reduce the packet loss. Why?

    Thank you

    FYI, I use the following parameters with one CIS channel (taken from the Zephyr samples peripheral_iso and central_iso):

    static uint16_t latency_ms = 10U; /* 10ms */
    static uint32_t interval_us = 10U * USEC_PER_MSEC; /* 10 ms */
    
    param.sca = BT_GAP_SCA_UNKNOWN;
    param.packing = 0;
    param.framing = 0;
    param.c_to_p_latency = latency_ms; /* ms */
    param.p_to_c_latency = latency_ms; /* ms */
    param.c_to_p_interval = interval_us; /* us */
    param.p_to_c_interval = interval_us; /* us */

  • Hi,

    The numbers you are seeing make sense. A legacy advertising event takes about 4.5 ms in our implementation (including time for a potential scan request and scan response). An ISO event with 128 bytes takes about 1 ms, as you write, but about 1.5 ms including overhead. So with an ISO packet every 10 ms, the likelihood of a collision is above 50%, which matches what you are seeing. In other words, this is expected.
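    The estimate above can be reproduced with a back-of-the-envelope model. This is a minimal sketch, not the SoftDevice Controller's actual scheduler: it assumes the advertising event starts at a uniformly random offset relative to the ISO interval, and that any overlap costs the ISO event. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough collision likelihood in percent.
 * An advertising event of length adv_event_us overlaps an ISO event of
 * length iso_event_us if it starts anywhere in a window of
 * (adv_event_us + iso_event_us) within each ISO interval.
 */
static uint32_t collision_percent(uint32_t adv_event_us,
                                  uint32_t iso_event_us,
                                  uint32_t iso_interval_us)
{
    assert(iso_interval_us > 0U);

    uint32_t window_us = adv_event_us + iso_event_us;

    if (window_us >= iso_interval_us) {
        return 100U; /* every advertising event overlaps an ISO event */
    }

    return (window_us * 100U) / iso_interval_us;
}
```

    With the figures from this thread, collision_percent(4500, 1500, 10000) returns 60, which is in the same range as the observed 66%.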

    You may want to consider increasing the retransmission count and transport latency to better handle the packet loss.

  • Hi,

    I tried the following configurations:

    • retransmissions: 1, latency: 30 ms or 100 ms: no change
    • retransmissions: 3, latency: 10 ms: no change
    • retransmissions: 3, latency: 30 ms: packet loss reduced by a factor of 2.5 (so packet loss when advertising is ~25% instead of 65%)

    So increasing retransmissions and latency improves the situation. However, with my configuration I don't understand why we don't get zero packet loss. If advertising takes 4.5 ms, then retransmissions set to 2 and latency set to 20 ms should be enough to cover every collision with advertising, shouldn't it?


    Moreover, I found a few interesting things in the SoftDevice Controller scheduling documentation (docs.nordicsemi.com/.../scheduling.html):

    In "BIS timing" (docs.nordicsemi.com/.../scheduling.html), there are a few parameters that seem to carry over to CIS:

    • BT_CTLR_SDC_BIG_RESERVED_TIME_US: the CIS equivalent is BT_CTLR_SDC_CIG_RESERVED_TIME_US
    • BT_CTLR_SDC_PERIODIC_ADV_EVENT_LEN_DEFAULT: is it used by CIS?

    In this section it is stated that "For optimal scheduling, the periodic advertising interval and ISO interval should have a common factor, and the sum of the periodic and extended advertising timing-event lengths should be less than the BIG reserved time".

    So, do you think we could reduce the impact of advertising by configuring these?

    "CONFIG_BT_CTLR_SDC_PERIODIC_ADV_EVENT_LEN_DEFAULT: The time set aside for periodic advertising each periodic advertising interval in microseconds. The event length is the primary parameter for how much data can be transmitted by the periodic advertiser without scheduling conflicts occurring.".
    My understanding is that we can reduce scheduling conflicts (with advertising) using this parameter. If so, which value should we use? If not, I'm not sure I understand what this value does; could you give me more insight?


    Thanks
