sniffer detects LL infinite loop

tr4x_87dB_0xc7_1024_7.5ms_おかしい.psd

We are using a transmission scheme where we tag each of our packets with a block number and fire them off using write without response, so there is no ATT-level acknowledgement. The receiving side looks for "holes" (missing block numbers) and requests re-transmission of those missing blocks.
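
The hole detection on the receiving side can be sketched roughly as follows. This is only an illustration of the idea, not the actual implementation: the `BlockTracker` class and its method names are hypothetical, and it assumes block numbers start at 0 and never wrap around.

```python
class BlockTracker:
    """Tracks received block numbers and reports the holes (hypothetical helper)."""

    def __init__(self):
        self.received = set()
        self.highest = -1  # highest block number seen so far

    def on_block(self, block_no):
        """Record one received block; duplicates are harmless."""
        self.received.add(block_no)
        self.highest = max(self.highest, block_no)

    def missing(self):
        """Block numbers below the highest one seen that never arrived."""
        return [n for n in range(self.highest + 1) if n not in self.received]


tracker = BlockTracker()
for block_no in (0, 1, 3, 4, 7):  # blocks 2, 5 and 6 were lost on the way
    tracker.on_block(block_no)

print(tracker.missing())  # [2, 5, 6]
```

The list returned by `missing()` is what the receiver would send back as its re-transmission request.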

While testing with the central app running on a PC and using the MasterEmulator.dll, we noticed that the Central sometimes "locked up" and resent the same packet "forever" (in practice, hundreds of times) until some watchdog timer kicked in and the connection was dropped.

The packet is sent at an interval of ~7.5ms, and the channel always changes, so the radio timing sync is working OK. But there is a vicious circle where the SN and NESN get out of sync.
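
For readers who have not looked at SN and NESN before, here is a simplified model of how the two 1-bit counters drive re-transmission, as I understand the link-layer acknowledgement rules. The `LinkEnd` class and the `can_accept` flag are made up for this sketch; CRC errors, empty PDUs and the MD bit are not modelled.

```python
class LinkEnd:
    """One side of a connection, holding the two 1-bit link-layer counters."""

    def __init__(self):
        self.sn = 0    # sequence number of the packet we are currently sending
        self.nesn = 0  # sequence number we expect from the peer next

    def header(self):
        return self.sn, self.nesn

    def on_receive(self, peer_sn, peer_nesn, can_accept=True):
        """Apply the ACK rules; True means the peer's payload was new and accepted."""
        # Peer's NESN differs from our SN: our last packet was ACKed, move on.
        if peer_nesn != self.sn:
            self.sn ^= 1
        # Peer's SN matches what we expect: new data. Advance NESN (the ACK) only
        # if there is room for it; an unchanged NESN acts as a NACK.
        if peer_sn == self.nesn and can_accept:
            self.nesn ^= 1
            return True
        return False


master, slave = LinkEnd(), LinkEnd()
sent_sn = []
for event in range(6):
    m_sn, m_nesn = master.header()
    sent_sn.append(m_sn)
    # Assumption for the demo: the slave has no room during events 1..4.
    slave.on_receive(m_sn, m_nesn, can_accept=not (1 <= event <= 4))
    s_sn, s_nesn = slave.header()
    master.on_receive(s_sn, s_nesn)

print(sent_sn)  # [0, 1, 1, 1, 1, 1]
```

In this model the master keeps sending the packet with SN = 1 in every connection event until the slave finally advances its NESN, which is the same pattern as a packet being resent on each interval.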

I have not delved into the details of the link layer, but I would imagine there should be some sort of retry timeout mechanism, who knows?
I believe the connection interval used was 7.5ms, and the peripheral is using S130 2.0.0-7 Alpha with app_sched_execute(); to process the scheduler queue. The peripheral may well have been operating near its limit, and that may have caused a critical delay which sparked this sequence. But no matter how you get into it, this infinite loop is an LL state machine problem, whether bug or feature.

We will be transitioning to the S130 release version soon enough, and the problem may just go away. Or then again, it may not.

Hope someone has time to address the issue. Thanks. Karel.

Minor edit: added the trace file (sorry, it's a TI trace file).

  • Hi Karel,

    This happens on the link layer, not the ATT layer, so it doesn't matter whether it's a notification or a write command. If the packet is not ACKed on the link layer, it will be re-transmitted until it is ACKed or until the connection times out.

    From your trace, it seems that the Slave NACKed the packet because it could not take more data (buffer full). It may be that the application didn't read the events out of the softdevice as it should, so the event buffer filled up. You will also get a buffer-full condition if you use write authorization but fail to approve the write.

    Which firmware do you use on the Slave side?

    If you test with one of our SDK examples, do you see the same issue?
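
To illustrate the buffer-full case, here is a toy model of a receiver with a small event buffer that the application drains only now and then. It is a sketch under made-up assumptions, not the softdevice's actual behaviour or API: `EventQueue`, `retransmissions`, the capacity and the drain interval are all hypothetical.

```python
from collections import deque


class EventQueue:
    """Fixed-size queue standing in for the stack's event buffer (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def offer(self, payload):
        """Store a received payload; False means 'buffer full', i.e. a NACK."""
        if len(self.items) >= self.capacity:
            return False
        self.items.append(payload)
        return True

    def drain(self):
        """What the application should do regularly: read out all events."""
        drained = list(self.items)
        self.items.clear()
        return drained


def retransmissions(payloads, capacity, drain_every):
    """Count connection events in which a payload had to be sent again."""
    queue, retries, event = EventQueue(capacity), 0, 0
    pending = deque(payloads)
    while pending:
        event += 1
        if queue.offer(pending[0]):
            pending.popleft()   # accepted, so the sender moves to the next one
        else:
            retries += 1        # NACKed, so the same payload goes out again
        if event % drain_every == 0:
            queue.drain()       # the application finally reads its events
    return retries


print(retransmissions(range(8), capacity=4, drain_every=1))   # 0
print(retransmissions(range(8), capacity=4, drain_every=10))  # 6
```

With the application draining in every event there are no retries. When it drains only every tenth event, the buffer fills after four packets and the fifth one is resent until the next drain.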
