Packet loss during ISO transmission when advertising

Hi,

Bug:

With a connected ISO connection between two devices X and Y:

  • X acts as a peripheral. It is the ISO server and transmits ISO data to Y.
  • Y is multirole. It connects to X and establishes an ISO connection to receive ISO data.

As long as Y only receives data, the communication is stable (no packet loss). As soon as Y advertises (bt_le_adv_start), Y does not receive all packets.

  • Increasing the advertising interval decreases packet loss.
  • Decreasing the advertising interval increases packet loss.

Environment:

  • OS: Linux
  • Toolchain: zephyr-sdk-0.16.8
  • NRF5340DK
  • NCS 2.7 (sdk-nrf v2.7.0 and zephyr v3.6.99-ncs2)

How can I solve this issue, which causes too much packet loss?

Thank you

  • Hi,

    When advertising while in an ISO connection, there will be collisions from time to time, and in that case the stack prioritizes the advertising packets. To reduce the ISO packet loss, you could increase the number of ISO retransmissions and ensure you use an advertising interval that is as high as possible.

  • Hello,

    We are migrating from NCS 2.1 with Packet Craft firmware (CPU NET) and we did not have this problem (at least it was not so significant).

    With the configuration described in my first message, I advertised using different parameters (provided to bt_le_adv_start):

    • No advertising: No packet loss => 0% loss
    • BT_GAP_ADV_FAST_INT_MIN_1 (30 ms) and BT_GAP_ADV_FAST_INT_MAX_1 (60 ms): 568 packets lost (out of 3425) => ~17% loss
    • BT_GAP_ADV_FAST_INT_MIN_2 (100 ms) and BT_GAP_ADV_FAST_INT_MAX_2 (150 ms): 226 packets lost (out of 3400) => ~7% loss
    • BT_GAP_ADV_SLOW_INT_MIN (1 s) and BT_GAP_ADV_SLOW_INT_MAX (1.2 s): 21 packets lost (out of 2800) => ~1% loss

    Looking at these numbers it seems that the advertising almost always causes an ISO packet loss.
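
    The "almost always" impression can be checked with a rough estimate of lost packets per advertising event. This is a minimal sketch, assuming one ISO packet every 10 ms (the interval quoted later in this thread) and ignoring the random advertising delay; the function names are illustrative only.

```c
#include <assert.h>

/* Assumption: one ISO packet every 10 ms, as quoted later in the thread. */
#define ISO_INTERVAL_MS 10.0

/* Approximate test duration, derived from the number of ISO packets sent. */
double test_duration_ms(unsigned packets_sent)
{
    return packets_sent * ISO_INTERVAL_MS;
}

/*
 * Lost ISO packets per advertising event, for an assumed average
 * advertising period (hypothetical input: the controller picks the real
 * period somewhere between the min and max passed to bt_le_adv_start).
 */
double losses_per_adv_event(unsigned lost, unsigned packets_sent,
                            double adv_period_ms)
{
    assert(adv_period_ms > 0.0);
    double adv_events = test_duration_ms(packets_sent) / adv_period_ms;
    return lost / adv_events;
}
```

    With the figures above, losses_per_adv_event(21, 2800, 1000.0) gives 0.75 and losses_per_adv_event(226, 3400, 100.0) gives about 0.66. Taking the minimum and maximum interval of each configuration, the ratio stays between roughly 0.5 and 1.0.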


    Even if we used the best-case scenario (advertising interval between 1 and 1.2 s), we would still have at least 1% loss, which is too much for our use case.


    With Packet Craft, we used the advertising parameters between 30 and 60ms, and we were clearly below 1% packet loss. So I don't think it's normal to have this today, and it seems more like a regression.

    What do you think ?

    Thanks

  • Hi,

    This is not due to a regression, but to a difference in priorities between the Packetcraft controller and the SoftDevice Controller. The Packetcraft controller always prioritizes ISO. The advantage is that ISO packet loss is less likely. The disadvantage is that other activities are affected more, which may result in link loss, no packets being sent at all, or reduced randomness in the advertiser's random offset.

    The priority between ISO and other activities is not configurable.

  • Hi,

    I understand what you are saying but I think the packet loss is excessive.

    I used the advertising parameters 1-1.2s during 10 minutes:

    • Maximum possible advertising: 1 * 10 * 60 = 600 advertisements.
    • Number of packets lost: 393

    If there is no advertising, there is no packet loss, so I can say that the packet loss is caused by advertising.

    It means that 393/600 = ~66% of the advertising events cause a packet loss. (If I take the worst-case scenario, meaning 500 advertisements, it would be 393/500 = ~79%.)

    If I try to keep it simple, with a 2M PHY (and retransmission set to 1):

    We send an ISO packet every 10 ms. Say that sending an ISO frame takes 1 ms (128-byte payload); ISO then uses 10% of the bandwidth. I have trouble understanding how, with 90% of the bandwidth available, 66% of the advertising events cause ISO frames to be lost.


    Could you explain why there is so much collision ?

    Increasing retransmission is not reducing the packet loss. Why ?

    Thank you

    FYI, I use the following parameters, with one CIS channel (extracted from Zephyr sample peripheral_iso and central_iso):

    static uint16_t latency_ms = 10U; /* 10ms */
    static uint32_t interval_us = 10U * USEC_PER_MSEC; /* 10 ms */
    
    param.sca = BT_GAP_SCA_UNKNOWN;
    param.packing = 0;
    param.framing = 0;
    param.c_to_p_latency = latency_ms; /* ms */
    param.p_to_c_latency = latency_ms; /* ms */
    param.c_to_p_interval = interval_us; /* us */
    param.p_to_c_interval = interval_us; /* us */

  • Hi,

    The numbers you are seeing make sense. The reason is that a legacy advertising event takes about 4.5 ms in our implementation (including time for a potential scan request and scan response). An ISO event with 128 bytes takes about 1 ms as you write, but including overhead it is 1.5 ms. So in this case, with an ISO packet every 10 ms, the likelihood of a collision is above 50%, as you are seeing. In other words, what you are seeing is expected.
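
    The "above 50%" figure follows from simple geometry. As a minimal sketch, assume the advertising event starts at a uniformly random phase relative to the ISO events and that each collision costs exactly one ISO packet (no retransmission recovers it). The function names are illustrative only and are not part of any Nordic or Zephyr API.

```c
#include <assert.h>

/*
 * Probability that an activity of length adv_us, starting at a uniformly
 * random phase, overlaps a periodic activity of length iso_us that repeats
 * every iso_interval_us. The two overlap when the start falls inside a
 * window of (adv_us + iso_us) out of every iso_interval_us.
 */
double collision_probability(double adv_us, double iso_us,
                             double iso_interval_us)
{
    assert(iso_interval_us > 0.0);
    double p = (adv_us + iso_us) / iso_interval_us;
    return p > 1.0 ? 1.0 : p;
}

/*
 * Expected fraction of ISO packets lost, assuming each collision costs
 * exactly one packet and there is one advertising event per adv_period_us.
 */
double expected_loss_fraction(double p_collision, double iso_interval_us,
                              double adv_period_us)
{
    assert(adv_period_us > 0.0);
    return p_collision * iso_interval_us / adv_period_us;
}
```

    With the values from this thread, collision_probability(4500, 1500, 10000) gives 0.6. For an advertising period of about 1 s, expected_loss_fraction(0.6, 10000, 1005000) gives roughly 0.6% of ISO packets, which is in the same range as the 393 losses reported for the 10-minute test.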

    You may want to consider increasing the retransmission count and transport latency to better handle the packet loss.

  • Hi,

    I tried the following configuration:

    • retransmission: 1, latency: 30ms or 100ms: No change
    • retransmissions: 3, latency: 10ms: No change
    • retransmissions: 3, latency: 30ms: Reduce packet loss by factor 2.5 (so packet loss when advertising is ~25% instead of 65%)

    So, increasing retransmissions and latency improves the situation. However, with my configuration I don't understand why we don't reach 0 packets lost. If advertising takes 4.5 ms, retransmissions set to 2 and latency set to 20 ms should be enough to cover every collision with advertising, shouldn't it?


    Moreover, I found a few interesting things in the SoftDevice scheduling documentation (docs.nordicsemi.com/.../scheduling.html):

    In "BIS timing" (docs.nordicsemi.com/.../scheduling.html), there are a few parameters that seem to transpose to CIS:

    • BT_CTLR_SDC_BIG_RESERVED_TIME_US: Equivalent of BT_CTLR_SDC_CIG_RESERVED_TIME_US for CIS
    • BT_CTLR_SDC_PERIODIC_ADV_EVENT_LEN_DEFAULT: Is it used by CIS ?

    In this section it is stated that "For optimal scheduling, the periodic advertising interval and ISO interval should have a common factor, and the sum of the periodic and extended advertising timing-event lengths should be less than the BIG reserved time".

    So, do you think we could reduce the impact of advertising by configuring these ?

    "CONFIG_BT_CTLR_SDC_PERIODIC_ADV_EVENT_LEN_DEFAULT: The time set aside for periodic advertising each periodic advertising interval in microseconds. The event length is the primary parameter for how much data can be transmitted by the periodic advertiser without scheduling conflicts occurring.".
    My understanding is that we can reduce scheduling conflicts (advertising) using this parameter. If so, which value should we use? If not, I'm not sure I understand what this value does; could you give me more insight?


    Thanks

  • Hi,

    thomas_hexploy said:
    increasing retransmission and latency improve the situation however with my configuration

    That is good.

    thomas_hexploy said:
    If advertising takes 4.5ms, with retransmission set to 2 and latency set to 20ms, it should be enough to cover all collision with advertising. Isn't it ?

    In principle, yes. However, legacy advertising packets have a random offset of 0-10 ms per the Bluetooth specification, so you cannot schedule them exactly where they would fit in between the ISO packets.

    thomas_hexploy said:

    In this section it is stated that "For optimal scheduling, the periodic advertising interval and ISO interval should have a common factor, and the sum of the periodic and extended advertising timing-event lengths should be less than the BIG reserved time".

    So, do you think we could reduce the impact of advertising by configuring these ?

    This applies to periodic advertising only, but as that was not mentioned before, I assumed you are using only legacy (normal) advertising. If that is the case(?), this is not relevant.

  • Hello,

    Thank you for your response. However, there are two points I'd like to clarify:

    I don't quite understand why these 10 ms would cause a collision. From my perspective, this is only a delay and should not interfere with transmission or reception. If we consider the minimum advertising interval allowed by the specification, which is 20 ms, this would imply that up to 33% of the bandwidth might be unavailable (10 ms random delay divided by 30 ms, the total advertising duration: 20 ms + random delay). Therefore, in my understanding, this delay should not cause a collision with any reception or transmission. (I used the information provided in the "Legacy advertising" section of https://www.bluetooth.com/blog/periodic-advertising-sync-transfer/)

    How is this implemented in the SoftDevice ? Can you confirm that the ~10ms delay is blocking any transmission/reception ?

    Even if we treat the 10ms delay as part of the advertising, it would suggest that advertising takes 14.5ms, which is still under 20ms (the time between 3 retransmissions). So, I don’t understand how a packet could be lost with 3 retransmissions and a 40ms latency (as I tested). The following diagram illustrates my perspective (I simplified it, focusing on lost packet):


    Could you please point out where I might be mistaken?

    Thank you.

  • thomas_hexploy said:
    How is this implemented in the SoftDevice ? Can you confirm that the ~10ms delay is blocking any transmission/reception ?

    No, it is not. Advertising blocks for about 4.5 ms per advertising event. But due to the random offset this is not consistent, so you cannot schedule the advertising events to always fall between other activity. The ISO transmissions happen at a fixed interval. So with one fixed interval (ISO) and one gliding, varying interval (advertising), these will collide from time to time in a non-deterministic way.
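
    The effect of a fixed interval against a gliding one can be reproduced with a small simulation. This is a minimal sketch, not the SoftDevice scheduler: it assumes each advertising event starts one advertising interval plus a uniform 0-10 ms delay after the previous one, blocks the radio for the 4.5 ms quoted above, and collides with any ISO event (1.5 ms every 10 ms) it overlaps. All names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32) so the sketch is repeatable. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Fraction of advertising events that overlap an ISO event.
 * ISO events occupy [n * iso_interval_us, n * iso_interval_us + iso_us).
 * Each advertising event starts adv_interval_us plus a pseudo-random
 * 0..10 ms delay after the previous one and occupies adv_us.
 */
double simulate_collision_fraction(uint32_t events, uint32_t adv_interval_us,
                                   uint32_t adv_us, uint32_t iso_interval_us,
                                   uint32_t iso_us, uint32_t seed)
{
    assert(events > 0 && iso_interval_us > 0 && seed != 0);
    uint32_t rng = seed;
    uint64_t t = 0;
    uint32_t collisions = 0;

    for (uint32_t i = 0; i < events; i++) {
        t += adv_interval_us + xorshift32(&rng) % 10000u;
        /* Position of the advertising start within the current ISO interval. */
        uint32_t phase = (uint32_t)(t % iso_interval_us);
        /* Overlaps the ISO event just started, or runs into the next one. */
        if (phase < iso_us || phase + adv_us > iso_interval_us) {
            collisions++;
        }
    }
    return (double)collisions / events;
}
```

    Calling simulate_collision_fraction(100000, 1000000, 4500, 10000, 1500, 1) should return a value close to 0.6, in line with the "above 50%" estimate given earlier in the thread. Because the random delay spans a full ISO interval here, the advertising phase is effectively uniform, whatever the nominal advertising interval is.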

  • I have updated my diagram according to your previous message:



    If I follow correctly, with 2 retransmissions and a latency of 20 ms, we shouldn't have any loss. This is what is represented on the diagram.

    So, can you explain why 2 retransmissions and a latency of 20 ms do not solve this packet loss issue?

    Thank you

  • Hi,

    Sorry for chiming in. As you already know, with an advertising interval set to 20ms, the actual advertising interval could range from 10ms to 30ms due to the random offset. So in the updated diagram, it is possible that the retransmitted Frame B is also blocked by the advertising activity, when the actual advertising interval is ~10ms. Since the retransmission number (RTN) is set to 2, and Frame B has been transmitted twice without a success, Frame B will be flushed, resulting in a packet loss.

    For reference, the Basic Audio Profile (BAP) for Bluetooth Low Energy defines two sets of quality of service configurations: low latency and high reliability (see 5.6.2 QoS Configurations in https://www.bluetooth.com/specifications/specs/basic-audio-profile-1-0-1/). There you can see that an RTN of 2 is still in the realm of "low latency". In your application, to ensure high reliability without a high packet loss rate, I'd recommend a much higher RTN.

    As a side note, the SoftDevice Controller selects the flush timeout (FT) and number of subevents (NSE) based on the given RTN and max transport latency. I won't go into details about FT and NSE here, to keep the answer short. However, the SoftDevice Controller always prioritizes max transport latency over RTN because, per the Core Specification, the RTN is only a recommendation, not a mandatory requirement. Therefore, I would suggest first setting a max transport latency that your application can accept, and then increasing the RTN to achieve better reliability.
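
    To illustrate why the placement of retransmissions matters more than their count, here is a minimal sketch. It assumes that a transmission attempt overlapping the advertising event is simply missed, and it ignores every other cause of loss. The two layouts in the usage note are hypothetical: the thread does not state how the controller actually places subevents for a given RTN and latency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if half-open intervals [a, a + a_len) and [b, b + b_len) overlap. */
static bool overlaps(double a, double a_len, double b, double b_len)
{
    return a < b + b_len && b < a + a_len;
}

/*
 * A frame is lost only if every transmission opportunity (the first
 * attempt plus all retransmissions) overlaps the advertising event.
 */
bool frame_lost(const double *tx_start_us, size_t tx_count, double tx_len_us,
                double adv_start_us, double adv_len_us)
{
    assert(tx_start_us != NULL && tx_count > 0);
    for (size_t i = 0; i < tx_count; i++) {
        if (!overlaps(tx_start_us[i], tx_len_us, adv_start_us, adv_len_us)) {
            return false; /* at least one attempt gets through */
        }
    }
    return true;
}
```

    With three 1.5 ms attempts spread over three ISO intervals (starts at 0, 10000 and 20000 us), a 4.5 ms advertising event starting at 0 blocks only one attempt and frame_lost returns false. With the same three attempts packed back to back in one interval (starts at 0, 1500 and 3000 us), the same advertising event covers all of them and frame_lost returns true.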

    Hope this helps!


    Cheers,

    Yuxuan

  • Hello,

    My advertising is 1 second, not 20ms.

    Sorry for the confusion it was only to amplify what I didn't understand in one post. You can forget this value of 20ms.

    So since my advertising is (about) every second, my diagram is valid (I think) and I don't understand why the parameters (rtn:2, latency: 20ms) do not work. Can you provide me with an example where this doesn't work ?

    If possible, can you provide me with the correct parameters to have no packet loss with 1s advertising, 10ms ISO interval and 128 bytes payload ?  (In my opinion, the correct parameters are predictable)

    Thank you
