
Random BLE Disconnects / NRF52832 stops responding to Central

We are currently experiencing a strange BLE disconnect issue that I haven't managed to sort out for a long time now.

We have a device that acts as a peripheral and is connected to an app over BLE. The connection is established normally and keeps running without any problems. On some of the devices, however, after a period of 10-25 minutes on average, the BLE connection dies without any indication of a reason.

Some details:

  • We have a custom board based on NRF52832
  • We are using SDK V15.3
  • We are using SoftDevice S132 6.1

The general symptoms are as follows:

  • After a certain random period of time (usually from 10 to 25 minutes) the NRF52832 board will stop responding to the packets coming from central
  • The central will try to retransmit the packet 23-24 times (this is in line with our connection params latency = 25)
  • The device will enter advertising mode
  • The app will connect again, exchange some information, discover services, etc., and then the pattern will repeat.
  • The connection will be there for a couple of seconds, some data exchange will happen but afterwards the device will again fail to respond to 23-24 packets from central and go into advertising mode
  • repeat ad infinitum with the periods of sustained connection becoming shorter and shorter.
  • after the first several disconnects the connection will be able to be reestablished for ~20-30 seconds or so
  • later on the connected phase will shrink to virtually nothing - the device will stop responding right after the CONNECT_IND
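
As a cross-check on the retry count above: at the Link Layer, a connection is dropped once no valid packet has been received for the supervision timeout, so the number of unanswered central packets seen in a sniffer trace should be roughly the supervision timeout divided by the connection interval. The sketch below only illustrates that arithmetic. The interval and timeout values are hypothetical, since the post does not state them, and the function names are made up for this example.

```python
import math


def events_before_timeout(conn_interval_ms: float, supervision_timeout_ms: float) -> int:
    """Number of whole connection events that fit in one supervision timeout."""
    if conn_interval_ms <= 0 or supervision_timeout_ms <= 0:
        raise ValueError("interval and timeout must be positive")
    return math.floor(supervision_timeout_ms / conn_interval_ms)


def implied_timeout_ms(conn_interval_ms: float, missed_events: int) -> float:
    """Supervision timeout suggested by a count of unanswered events in a trace."""
    if conn_interval_ms <= 0 or missed_events < 0:
        raise ValueError("interval must be positive and count non-negative")
    return conn_interval_ms * missed_events


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the post: 30 ms interval, 720 ms timeout.
    print(events_before_timeout(30.0, 720.0))
    # Going the other way: 24 unanswered events at a 30 ms interval.
    print(implied_timeout_ms(30.0, 24))
```

If the count from the trace matches the negotiated supervision timeout rather than the slave latency, that would suggest the peripheral has gone completely silent on the radio instead of legitimately skipping events.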

some additional remarks:

  • There doesn't seem to be a particular request from the central that we're not handling as the sniffer logs don't seem to show anything relevant
  • most often the device will fail to respond to an EMPTY_PDU but it will also fail on: LL_FEATURE_REQ, LL_PHY_REQ, LL_LENGTH_REQ, Sent Read By Group Type Request, Sent Read By Type Request, Sent Find Information Request, Sent Exchange MTU Response, and the list goes on
  • There's a suspicion that this issue is more prevalent on iPhones; however, a bigger portion of our user base has an iPhone, so this is probably just statistics
  • The issue is present on our production app, on our internal test app as well as nrf_connect, so it's extremely unlikely that it's coming from the app side of things
  • The firmware isn't going through a soft reset when this issue occurs, as we have a custom app_error check that logs errors to RAM (unless the SD asserts silently somewhere). Even if there was a silent reset, there are certain variables which get zeroed out on reset, and we have ways of reading them out independent of BLE (3G modem + MQTT server)
  • A pinreset will in general resolve this issue so if there was indeed a reset it would be highly likely that it would clear this BLE Loop of Death (this is our cute nickname for this behavior)
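
Since the connection dies "without any indication", one thing that may help is recording the disconnect reason on the peripheral. In the nRF5 SDK the `BLE_GAP_EVT_DISCONNECTED` event carries an HCI status code in `p_ble_evt->evt.gap_evt.params.disconnected.reason`. Assuming that byte is stored in RAM and read out over the existing MQTT channel (an assumption, not something the post confirms), a small lookup like the one below can turn it into readable text. The table is deliberately partial and the function name is hypothetical.

```python
# Partial table of Bluetooth HCI error codes that commonly show up as
# disconnect reasons. Codes not listed here are reported as unknown.
HCI_DISCONNECT_REASONS = {
    0x08: "Connection Timeout",
    0x13: "Remote User Terminated Connection",
    0x16: "Connection Terminated By Local Host",
    0x22: "LL Response Timeout",
    0x28: "Instant Passed",
    0x3B: "Unacceptable Connection Parameters",
    0x3D: "Connection Terminated Due To MIC Failure",
    0x3E: "Connection Failed To Be Established",
}


def describe_disconnect_reason(code: int) -> str:
    """Format a raw HCI reason byte as 'hex: description'."""
    if not 0 <= code <= 0xFF:
        raise ValueError("reason code must fit in one byte")
    name = HCI_DISCONNECT_REASONS.get(code, "unknown (not in this partial table)")
    return f"0x{code:02X}: {name}"


if __name__ == "__main__":
    for raw in (0x08, 0x22, 0x3E):
        print(describe_disconnect_reason(raw))
```

A steady stream of 0x08 would point at a plain supervision timeout, while 0x22 or 0x3E would hint at an unanswered LL control procedure or a connection that never really got going after the CONNECT_IND.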

Things we have tried:

  • We had a suspicion that it had something to do with PHY requests, since newer iPhones try to negotiate 2M PHY, but forcing 1M PHY didn't solve the issue
  • We've experimented with several different connection params, and though this had some effect on the quality & range of the signal, it didn't affect this particular issue whatsoever. Our connection params are well within the spec for iOS, which seems to be way more stringent than Android's spec
  • We disabled peer manager and this didn't affect the behavior
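
For reference, a set of connection parameters can be sanity-checked against the generic Bluetooth Core Specification limits with a few lines of code. This is a minimal sketch under stated assumptions: it encodes only the Core spec ranges and the rule that the supervision timeout must be larger than (1 + latency) * interval * 2, and it does not encode Apple's stricter accessory guidelines. The function name and the example values (apart from latency = 25, which comes from the post) are hypothetical.

```python
def check_conn_params(interval_min_ms: float, interval_max_ms: float,
                      latency: int, timeout_ms: float) -> list:
    """Return a list of rule violations; an empty list means no issue found."""
    problems = []
    # Connection interval: 7.5 ms to 4 s, with min <= max.
    if not (7.5 <= interval_min_ms <= interval_max_ms <= 4000.0):
        problems.append("interval must satisfy 7.5 ms <= min <= max <= 4000 ms")
    # Peripheral (slave) latency: 0 to 499 connection events.
    if not (0 <= latency <= 499):
        problems.append("latency must be between 0 and 499 events")
    # Supervision timeout: 100 ms to 32 s.
    if not (100.0 <= timeout_ms <= 32000.0):
        problems.append("supervision timeout must be between 100 ms and 32 s")
    # The timeout must leave room for the allowed number of skipped events.
    if timeout_ms <= (1 + latency) * interval_max_ms * 2:
        problems.append("supervision timeout must exceed (1 + latency) * interval_max * 2")
    return problems


if __name__ == "__main__":
    # Hypothetical parameter sets using the latency of 25 mentioned above.
    print(check_conn_params(30.0, 50.0, 25, 4000.0))
    print(check_conn_params(30.0, 50.0, 25, 2000.0))
```

Passing this check does not guarantee that a given phone will accept the parameters; it only rules out combinations that are invalid by definition.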

It's been a struggle to capture this behavior while sniffing packets with the NRF dongle (since we're still not sure exactly how to reproduce this issue), but we've finally been able to do it. Sadly though, the Wireshark logs didn't make the reason for this behavior directly evident to me.

Here are some snippets of what the sniffer sees during one of these episodes, as well as the full wireshark log for deeper investigation.

1#

2#

3#

The whole log. The "Loop of Death" starts around 14:54:30 and it keeps disconnecting until almost the end of the log, where a pin reset has been triggered:

test_324_1062.pcapng

In addition to this, here's a screenshot from our internal fleet management system which shows the RSSI of the connection and the dynamics of this behavior. BLE is shown in yellow and the value axis for it is on the right - the RSSI is not stellar, hovering around -65 dBm, but I see no reason why this should be the cause of this behavior:

Any tip or suggestion would be most appreciated as I'm completely stumped. The most recent discovery, that some iPhones negotiate for 2M PHY, was really promising, but after forcing 1M PHY without success I really don't know how to continue my investigation, as the Wireshark files didn't reveal any new & interesting info to me...

I'd be more than happy to provide additional info if it helps clarify what the hell is going on
