nRF52 dual role: while connected as central and scanning, connection as peripheral fails

We have an nRF52 with dual roles: peripheral for the connection to an app, and central, connecting to multiple third-party peripheral devices.

We have come across a very weird issue - it took a while to reproduce, but here it is:

If our device, as central, is connected to a Holux GPS receiver and is currently scanning for more devices, and we then connect our app to our device's peripheral role, the connection fails. Actually, it's worse: it does connect, but during the next few steps (MTU and parameter negotiation) it suddenly disconnects with reason 0x08.

What's odd is that, everything else being the same, with a QStarz GPS receiver connected instead, this is absolutely no problem. The only difference is the connection parameters (shorter connection interval).

Holux:

min: 320 (400ms)
max: 520 (650ms)
latency: 0
sup_timeout: 400 (4000ms)

They agree on 520 (650ms)

The Holux, btw has a Nordic Chipset.

QStarz:

min: 16 (20ms)
max: 60 (75ms)
latency: 0
sup_timeout: 400 (4000ms)

They agree on 60 (75ms)
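
The raw values above are in BLE units: connection intervals count 1.25 ms steps and the supervision timeout counts 10 ms steps. A minimal sketch of the conversion follows. The helper names are hypothetical (they are not SDK functions), and the last helper only estimates how many connection events fit into one supervision timeout when latency is 0:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not part of the nRF5 SDK. */

/* Connection interval: 1.25 ms per unit, returned in microseconds to stay in integers. */
static uint32_t conn_interval_units_to_us(uint16_t units)
{
    return (uint32_t)units * 1250u;
}

/* Supervision timeout: 10 ms per unit. */
static uint32_t sup_timeout_units_to_ms(uint16_t units)
{
    return (uint32_t)units * 10u;
}

/* Whole connection events that fit into one supervision timeout (latency 0). */
static uint32_t events_per_sup_timeout(uint16_t interval_units, uint16_t timeout_units)
{
    uint32_t interval_us = conn_interval_units_to_us(interval_units);
    uint32_t timeout_us  = sup_timeout_units_to_ms(timeout_units) * 1000u;
    return (interval_us == 0u) ? 0u : timeout_us / interval_us;
}
```

With the values from this post, the Holux link (520 units) leaves room for 6 connection events per 4000 ms supervision timeout, against 53 for the QStarz link (60 units).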

Our ble_gap_scan_params_t settings remain the same in both cases:

.active = 1,
.interval = MSEC_TO_UNITS(100, UNIT_0_625_MS), 
.window = MSEC_TO_UNITS(50, UNIT_0_625_MS),
.timeout = MSEC_TO_UNITS(30000, UNIT_10_MS),
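
For reference, the SoftDevice takes the scan interval and window in 0.625 ms units and the scan timeout in 10 ms units, and the window may not be longer than the interval. A minimal sketch of a host-side sanity check follows. It defines its own conversion macro (modelled on the SDK's MSEC_TO_UNITS) so it compiles without the SDK; scan_cfg_t, scan_cfg_is_valid and scan_duty_cycle_percent are hypothetical names, not SDK APIs, and the 50 ms window and 30 s timeout are assumed values for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins modelled on the SDK's app_util.h so this builds off-target. */
#define UNIT_0_625_MS_US 625u    /* 0.625 ms expressed in microseconds */
#define UNIT_10_MS_US    10000u  /* 10 ms expressed in microseconds */
#define MS_TO_UNITS(ms, unit_us) ((uint32_t)(((uint32_t)(ms) * 1000u) / (unit_us)))

/* Hypothetical container for the three timing fields discussed above. */
typedef struct
{
    uint16_t interval; /* 0.625 ms units */
    uint16_t window;   /* 0.625 ms units */
    uint16_t timeout;  /* 10 ms units */
} scan_cfg_t;

/* The scan window must be non-zero and must not exceed the scan interval. */
static bool scan_cfg_is_valid(const scan_cfg_t *cfg)
{
    return cfg->window != 0u && cfg->window <= cfg->interval;
}

/* Share of each scan interval spent listening, in percent. */
static uint32_t scan_duty_cycle_percent(const scan_cfg_t *cfg)
{
    return (cfg->interval == 0u) ? 0u : (100u * cfg->window) / cfg->interval;
}
```

With a 100 ms interval (160 units) and an assumed 50 ms window (80 units), the check passes and the duty cycle is 50%. A window of 3000 units against an interval of 160 units would be rejected.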

When we receive a connection on our peripheral role, we immediately (try to) stop scanning and restart it once the connection is established, but this does not seem to work as expected when we're connected as central with long connection intervals.
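
The stop-then-restart sequence described above can be sketched as a small state model. This is only an illustration that runs off-target, not our actual firmware: scan_stop_stub and scan_start_stub are hypothetical stand-ins for the real scan calls (for example sd_ble_gap_scan_stop), and on_connected / on_link_setup_done are made-up hook names for the points where the application sees the connection event and the end of MTU and parameter negotiation.

```c
#include <stdbool.h>

typedef enum { ROLE_CENTRAL, ROLE_PERIPHERAL } link_role_t;

typedef struct
{
    bool scanning;    /* true while the scanner is running */
    bool resume_scan; /* true if scanning was paused for a peripheral link */
} scan_state_t;

/* Hypothetical stand-ins for the SoftDevice/SDK scan calls. */
static void scan_stop_stub(scan_state_t *s)  { s->scanning = false; }
static void scan_start_stub(scan_state_t *s) { s->scanning = true; }

/* Call as early as possible when a link is established. */
static void on_connected(scan_state_t *s, link_role_t role)
{
    if (role == ROLE_PERIPHERAL && s->scanning)
    {
        scan_stop_stub(s);     /* intent: stop competing for radio time during link setup */
        s->resume_scan = true; /* remember to restart later */
    }
}

/* Call once MTU exchange and connection parameter negotiation have finished. */
static void on_link_setup_done(scan_state_t *s)
{
    if (s->resume_scan)
    {
        scan_start_stub(s);
        s->resume_scan = false;
    }
}
```

In this model a new central link leaves the scanner untouched, while a new peripheral link pauses it until link setup is reported as done.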

Sniffing the traffic with the nRF sniffer does not reveal anything special - our device simply stops responding in the middle of service discovery, about 1.2 seconds after the initial connection. 
I've attached a sniffing session of the problem (two successful connections while not scanning, one failed connection while scanning) and two sessions of connecting the Holux accessory and the QStarz accessory.

Do you have any ideas what we should try or look into? 

Thanks already! 


3124.unleasshed-holux-app-connect-scan-disconnect-app-reconnect-fail.pcapng

1070.unleashed-holux.pcapng

2133.unleashed-qstarz.pcapng

  • Hi,

    Looking at unleasshed-holux-app-connect-scan-disconnect-app-reconnect-fail.pcapng I see the intended disconnect in #3228, and then a re-connection, data exchange, and a new sudden disconnect at #3486. In the packets just before that, you can see that there are a lot of retransmissions, so the disconnect with reason 8 (timeout) is expected. The trace does not tell us why there are all these retransmissions, though. Are you able to debug both devices? It seems clear that the peripheral stopped receiving and acknowledging the packets. Perhaps it is in a bad state? The sniffer trace does not give information about this, so I suggest you debug on the peripheral side to see what state it is in.

  • Hi,

    Yes, exactly. That's what we were observing too. Our device is not showing any signs of getting stuck, so we're assuming it's the SoftDevice. After all, it stops responding in the middle of service discovery. Can we poll the SoftDevice state for debugging? Is there any specific info we should try to get?

    The odd thing is that this only happens if our device is simultaneously connected as central with a 650ms connection interval and is scanning as central while we're trying to connect to its peripheral instance.

    If it's connected as central with a 75ms connection interval, and is scanning, then there's no problem! 

  • We're using a module with a built-in XTAL. It's a crystal with max 40 ppm, but I don't know the part number or load caps.


    sdk_config.h:

    NRF_SDH_CLOCK_LF_SRC 1 //XTAL
    NRF_SDH_CLOCK_LF_ACCURACY 5 //50ppm


    We're using SDK 17.0 and SD s132_nrf52_7.0.1

  • I think I actually just managed to work around the bug by stopping scanning as soon as possible when we get a connection to our peripheral role. 

    Still I'd like to understand if this is a bug in the softdevice, and/or what the reasoning behind this is. 

    I guess it would be worth checking whether you can reproduce the setup with example projects.
    Instead of our Holux: one Blinky with preferred connection parameters: min: 320 (400ms), max: 520 (650ms), latency: 0, sup_timeout: 400 (4000ms).
    Instead of our Unleashed: one dual-role device with preferred connection parameters: 8 (10ms) to 12 (15ms), 2 Mbps PHY and MTU/data length: 247/251.
    Instead of our App: nRF Connect on an iPhone. 

    I already tested using nRF connect with our devices and I see the same error, so it has nothing to do with our app.

  • That is interesting. Can you upload the example projects with your modifications here so that I can test them on my end?

  • I haven't actually tested with example projects. All our DKs are currently with employees working from home, or configured as sniffers. I was suggesting you might be able to test it ;-)

    I did test the nRF Connect app with our devices, just to confirm it has nothing to do with our iOS app. 

  • That seems like looking for a needle in a haystack, as I have no indication of the conditions needed for the issue nor of what causes it. But if you are able to reproduce it with code that I can run on a DK, let me know and share the code, and I will dig into it.
