Multirole (C and P) device misses connection events

Hi all, I am observing some strange behaviour on my two nRF52840 devices. They are running NCS v2.1.0 (Zephyr 3.1.99): one acts as both central and peripheral, the other as peripheral only. Both use BLE parameters similar to those in the nRF throughput example.

The multirole device (B) is connected as central to the peripheral (A), and as peripheral to another central (C, an nRF52 DK). The connection parameters on both connections are: interval 100 ms, timeout 4 s, latency 0. Data is sent from the peripheral (A) to the central (B), and from there to the DK (C). With a logic analyser I am measuring two signals on A and B: radio activity (RX_READY to DISABLED on A, TX_READY to DISABLED on B), and a pulse whenever a BLE message is confirmed as sent or received (let's call it the BLE confirmation), depending on the role (sent callback on A, received callback on B).

Most of the time the connection seems healthy: on the logic analyser, the BLE confirmation signal and the radio activity signal occur very close to each other on each A-B connection event, apart from a missed packet roughly every 40 packets. However, after some time of continuous data transfer between the three devices (sometimes ~20 minutes, it varies), the rate of BLE confirmations on A and B seems to halve (synced confirmations on both happen roughly every 200 ms rather than on every connection event), and the two devices miss packets and start queueing them up. This continues for a while (maybe 10 minutes), and then the connection returns to a healthy state on its own.

Looking closely at what goes on during this unhealthy state, I noticed that it is B that isn't showing up to the connection events with A. A shows up (I can see its radio activity), but on B there is no radio activity during that event. This happens on roughly every second A-B connection event, sometimes more often. However, B still shows up to every connection event with C and maintains what seems to be a reliable connection with C throughout this state.

What could be causing this?

In the meantime I will run a similar setup with A and B replaced by two nRF DKs, to rule out the antenna design as the root cause.

Parents
  • Hi David, 
    I assume you are using the Nordic BLE controller?

    If you have a look at the scheduling documentation here, you can find a partial explanation for the behavior you observed. By default, BLE connection events have the same priority, and when one connection is about to time out because it has been preempted too often, it gets first priority.

    However, I'm not so certain about the drift explanation. If there is drift, both connections should drift. And why would B-C have higher priority than A-B?

    I agree with Emil's suggestion that you try making the connection intervals slightly different to see if you still have the same issue. I would assume collisions would still occur, but they would not last as long as what you are seeing in your case.

  • Consider the case when there are three devices A, B, C with two connections:

    1. A (peripheral) <-> B (central)

    2. B (peripheral) <-> C (central)

    and both connections have equal connection interval.

    Since the timing in a connection is based on the central's clock, B's clock will drive the first connection and C's clock will drive the second connection. These two clocks are not perfectly synchronised but will drift with respect to each other. It is therefore possible that the connection events at B for the two connections clash at every event for some time, until they drift apart. Having unrelated connection intervals solves this issue, since if two connection events from two different connections happen to overlap in that case, they will certainly not overlap at the next connection event.

    This can be compared to the case when there is only one central, but multiple peripherals connected to this central. Since the central's clock drives all connections in this case, the timings will always drift together, causing no overlaps.
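
The drift argument above can be put into rough numbers. The following is only a back-of-the-envelope sketch in Python, not a model of the controller's scheduler: the function names are made up for illustration, and the 10 ppm relative clock drift and 2.5 ms event length are assumed values, not measurements from this thread. It estimates how long two equal-interval connections keep colliding at B, and then simulates both the equal-interval and the different-interval case.

```python
def overlap_duration_s(interval_ms, event_a_ms, event_b_ms, drift_ppm):
    """Estimate how long two equal-interval event trains keep colliding.

    The start offset between the two trains slides by
    interval * drift_ppm * 1e-6 per interval, and the events overlap
    while that offset is inside a window of width event_a + event_b.
    """
    slide_per_event_ms = interval_ms * drift_ppm * 1e-6
    events_in_window = (event_a_ms + event_b_ms) / slide_per_event_ms
    return events_in_window * interval_ms / 1000.0


def longest_collision_run(interval_ab_ms, interval_bc_ms, event_ms, sim_s,
                          drift_ppm=0.0, offset_ms=0.0):
    """Longest run of consecutive A-B events that overlap a B-C event.

    A-B events start at k * interval_ab_ms (B's clock is the reference).
    B-C events start at offset_ms + m * interval_bc_ms * (1 + drift),
    i.e. C's clock runs drift_ppm fast or slow relative to B's.
    Every event is assumed to last event_ms.
    """
    period_bc = interval_bc_ms * (1.0 + drift_ppm * 1e-6)
    n_events = int(sim_s * 1000 / interval_ab_ms)
    longest = run = 0
    for k in range(n_events):
        t_ab = k * interval_ab_ms
        # Start of the B-C event nearest to this A-B event.
        m = round((t_ab - offset_ms) / period_bc)
        t_bc = offset_ms + m * period_bc
        if abs(t_ab - t_bc) < event_ms:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest


if __name__ == "__main__":
    # Assumed values: 100 ms interval, 2.5 ms events, 10 ppm relative drift.
    print("estimated collision period [s]:",
          overlap_duration_s(100.0, 2.5, 2.5, 10.0))
    print("equal intervals, longest run of colliding events:",
          longest_collision_run(100.0, 100.0, 2.5, 1200.0,
                                drift_ppm=10.0, offset_ms=-5.0))
    print("100 ms vs 110 ms, longest run of colliding events:",
          longest_collision_run(100.0, 110.0, 2.5, 1200.0, drift_ppm=10.0))
```

With these assumed numbers the equal-interval case stays in collision for several minutes, which is the same order of magnitude as the unhealthy period described in the question. In this simple model, two different intervals avoid back-to-back collisions only when the intervals differ by more than the combined event length, so a very small difference may still produce short runs of collisions.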
