We are using SDK 15.3.0 and S140 v6.1.1 SoftDevice. We are using an external LF crystal with tolerance of +/-20ppm, and are therefore using
- Set 1:
- Min CI: 15ms
- Max CI: 30ms
- SL: 0
- Supervision Timeout: 6s
- Set 2:
- Min CI: 30ms
- Max CI: 45ms
- SL: 30
- Supervision Timeout: 6s
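For reference, here is a minimal sketch of how the two sets map onto on-air units (1.25 ms steps for the connection interval, 10 ms steps for the supervision timeout). The type `conn_param_set_t`, the macros, and both helper functions are hypothetical names for illustration, not SDK API. The check in `sup_timeout_is_valid` assumes the usual rule that the supervision timeout must exceed (1 + SL) * CI * 2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: convert milliseconds to on-air units. */
#define MS_TO_CI_UNITS(ms)      ((uint16_t)(((ms) * 1000u) / 1250u)) /* 1.25 ms steps */
#define MS_TO_TIMEOUT_UNITS(ms) ((uint16_t)((ms) / 10u))             /* 10 ms steps   */

/* Hypothetical container for one parameter set (not an SDK type). */
typedef struct {
    uint16_t min_ci_units;      /* minimum connection interval, 1.25 ms units */
    uint16_t max_ci_units;      /* maximum connection interval, 1.25 ms units */
    uint16_t slave_latency;     /* number of connection events that may be skipped */
    uint16_t sup_timeout_units; /* supervision timeout, 10 ms units */
} conn_param_set_t;

/* Set 1: CI 15-30 ms, SL 0, timeout 6 s. */
const conn_param_set_t SET_1 = {
    MS_TO_CI_UNITS(15u), MS_TO_CI_UNITS(30u), 0u, MS_TO_TIMEOUT_UNITS(6000u)
};

/* Set 2: CI 30-45 ms, SL 30, timeout 6 s. */
const conn_param_set_t SET_2 = {
    MS_TO_CI_UNITS(30u), MS_TO_CI_UNITS(45u), 30u, MS_TO_TIMEOUT_UNITS(6000u)
};

/* Longest time the peripheral may go without listening, in microseconds,
 * if the central picks the maximum interval: (1 + SL) * CI_max. */
uint32_t max_listen_gap_us(const conn_param_set_t *p)
{
    return (1u + (uint32_t)p->slave_latency) * (uint32_t)p->max_ci_units * 1250u;
}

/* Assumed rule: supervision timeout > (1 + SL) * CI_max * 2. */
bool sup_timeout_is_valid(const conn_param_set_t *p)
{
    uint32_t timeout_us = (uint32_t)p->sup_timeout_units * 10000u;
    return timeout_us > 2u * max_listen_gap_us(p);
}
```

With these values, set 2 gives (1 + 30) * 45 ms = 1395 ms between mandatory listens in the worst case, which is well under half of the 6 s supervision timeout, so the parameters themselves look valid.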
When first connecting, parameter set 1 is used. Set 2 is requested 30 seconds after ATT communication stops. For example, a phone connects to the nRF52, they exchange ATT commands for 10 seconds, then 30 seconds later, set 2 is requested.
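The switching rule described above can be sketched roughly as follows. This is not our actual firmware; `param_set_id_t`, `should_request_set2`, and the millisecond tick inputs are hypothetical names used only to show the logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Idle time after the last ATT exchange before set 2 is requested. */
#define ATT_IDLE_BEFORE_SET2_MS 30000u

/* Hypothetical identifiers for the two parameter sets. */
typedef enum { PARAM_SET_1 = 1, PARAM_SET_2 = 2 } param_set_id_t;

/* Returns true when set 1 is active and at least 30 s have passed since the
 * last ATT activity. Unsigned subtraction keeps the comparison correct even
 * if the millisecond counter has wrapped around. */
bool should_request_set2(param_set_id_t active, uint32_t now_ms, uint32_t last_att_ms)
{
    if (active != PARAM_SET_1) {
        return false; /* already on set 2, nothing to request */
    }
    return (uint32_t)(now_ms - last_att_ms) >= ATT_IDLE_BEFORE_SET2_MS;
}
```

In the example from the paragraph above, if ATT traffic ends 10 seconds after connecting, the function first returns true at the 40 second mark.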
We tested this setup on at least a dozen phones and a dozen nRF52s and it worked great. We even ran a test for hours that switched between set 1 and set 2 at random intervals, and that worked great too.
Recently, new testers have found that a disconnection due to a supervision timeout occurs 6 seconds (the supervision timeout) after set 2 is accepted. The timing seems to be relative to when the params are accepted rather than requested, because both the phone and the nRF52 log use of the new parameters. While we know that interference could cause this, for certain individuals it is 100% reproducible. They have tested in multiple physical locations (different interference profiles), and the issue can be reproduced on a variety of iOS and Android devices and a variety of nRF52 units. However, we have multiple individuals with the same model of phone, and only one of them reproduces the issue.
We tried lowering the slave latency for set 2, and this eventually resolves the issue, but the value at which it is resolved varies case by case. Sometimes SL=10 fixes the disconnects. Sometimes it needs to be as low as 5.
I also tried a special nRF52 build with NRF_SDH_CLOCK_LF_ACCURACY set to 500ppm. I know this is recommended (required by the SDK asserts) if using the internal RC clock source, but we're not. The value of 500ppm did indeed resolve a consistent disconnect with SL=30.
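To put rough numbers on why the accuracy setting might matter, here is a small sketch of the receive window widening calculation as I understand it from the Bluetooth Core Specification: widening = ((central SCA + peripheral SCA) / 1,000,000) * time since the last anchor. The function names are hypothetical, any extra jitter allowance is ignored, and the 50 ppm central accuracy used in the note below is purely an assumption, since the real value depends on what each phone declares.

```c
#include <stdint.h>

/* Worst-case time between listens in microseconds: (1 + SL) * CI. */
uint64_t listen_gap_us(uint32_t conn_interval_ms, uint32_t slave_latency)
{
    return (uint64_t)(1u + slave_latency) * conn_interval_ms * 1000u;
}

/* One-sided window widening in nanoseconds.
 * ppm * microseconds yields picoseconds, so divide by 1000 for nanoseconds. */
uint64_t window_widening_ns(uint32_t central_sca_ppm,
                            uint32_t peripheral_sca_ppm,
                            uint64_t time_since_anchor_us)
{
    uint64_t total_ppm = (uint64_t)central_sca_ppm + peripheral_sca_ppm;
    return (total_ppm * time_since_anchor_us) / 1000u;
}
```

For set 2 at CI = 45 ms and SL = 30 (a 1395 ms gap), an assumed 50 ppm central plus our 20 ppm setting gives about 97.65 µs of widening on each side, while 50 ppm plus 500 ppm gives about 767.25 µs. At SL = 10 the gap drops to 495 ms and the 20 ppm case needs only about 34.65 µs. If the real combined drift is larger than the 20 ppm setting allows for, the nRF52 could miss the anchor point at high SL but still catch it at lower SL, which would match what we observe.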
Overall, it seems like we might be experiencing a clock tolerance issue, but I don't know how to prove this. We test in climate-controlled environments and our nRF52 board does not generate much heat. Unless the manufacturer is provided out
We've also already released our nRF52 FW, and we are hoping to only make changes to the phone app (we have a way for the phone to request that the nRF52 use new connection parameters).
In summary, we have a small set of phones and nRF52s that consistently disconnect after a connection parameter update with a CI/SL combo over a few hundred milliseconds. Most devices do not have an issue. This is resolved by lowering the CI/SL duration, but the duration is variable. We're curious whether there is any known issue with the connection parameter update procedure or RX window widening. Does this situation ring any obvious bells?