
Central device claims to have Connected but no Connection event is fired on the peripheral

I'm developing a multirole device to connect to a peripheral device that we developed earlier.

When acting as a central, the multirole device scans for the peripheral and, if it finds it, connects, secures the connection, and bonds. After several successful connections, the central reports that it has connected to the peripheral, but it disconnects before it can secure the connection or discover the services.

I've enabled the logger on both devices and added logs to see the progress of the connection and the events.

I've seen that when the problem happens, the central shows BLE_GAP_EVT_CONNECTED but the peripheral doesn't show anything. It's as if the central device is firing a false BLE_GAP_EVT_CONNECTED event.

I've tried with scan_params.extended both on and off, and the behavior is the same.

I've confirmed that when another central connects (a phone running the nRF Connect application), everything works fine and the peripheral shows the events as expected.

Both devices use the nRF52832.

The peripheral uses SDK 15.0.0 and SoftDevice S132 6.0.0.

The device acting as a central uses SDK 16.0.0 and SoftDevice S132 7.0.1.

  • You have not said how frequently this occurs, but it is possible that the connection request packet from the central to the peripheral is occasionally lost (e.g. due to interference). In that case the central will first get a BLE_GAP_EVT_CONNECTED event, then shortly afterwards a BLE_GAP_EVT_DISCONNECTED event with disconnect reason BLE_HCI_CONN_FAILED_TO_BE_ESTABLISHED.

    Best regards,
    Kenneth
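
To make the sequence described above concrete, here is a minimal C sketch of how a central's disconnect handling could tell a link that was never established apart from one that came up and then dropped. The numeric values are the standard Bluetooth HCI error codes, mirrored locally so the sketch builds without the SDK headers; the enum and the function name `classify_disconnect` are hypothetical and only for illustration.

```c
#include <stdint.h>

/* Standard Bluetooth HCI error codes, mirrored locally so this builds
 * without the SDK. The SoftDevice headers expose 0x3E as
 * BLE_HCI_CONN_FAILED_TO_BE_ESTABLISHED. */
#define HCI_CONNECTION_TIMEOUT            0x08
#define HCI_REMOTE_USER_TERMINATED        0x13
#define HCI_LOCAL_HOST_TERMINATED         0x16
#define HCI_CONN_FAILED_TO_BE_ESTABLISHED 0x3E

/* Hypothetical verdicts an application might log on disconnect. */
typedef enum {
    LINK_NEVER_ESTABLISHED, /* peer never answered the connection request */
    LINK_LOST,              /* link was up, then the supervision timeout hit */
    LINK_CLOSED_BY_PEER,
    LINK_CLOSED_LOCALLY,
    LINK_OTHER
} link_verdict_t;

/* Map the disconnect reason to a verdict. */
link_verdict_t classify_disconnect(uint8_t hci_reason)
{
    switch (hci_reason) {
    case HCI_CONN_FAILED_TO_BE_ESTABLISHED: return LINK_NEVER_ESTABLISHED;
    case HCI_CONNECTION_TIMEOUT:            return LINK_LOST;
    case HCI_REMOTE_USER_TERMINATED:        return LINK_CLOSED_BY_PEER;
    case HCI_LOCAL_HOST_TERMINATED:         return LINK_CLOSED_LOCALLY;
    default:                                return LINK_OTHER;
    }
}
```

In an application built on the SoftDevice, the reason would presumably be read from `p_ble_evt->evt.gap_evt.params.disconnected.reason` in the BLE_GAP_EVT_DISCONNECTED branch of the event handler. Logging the verdict on the central is a quick way to confirm whether every failed attempt ends with the "failed to be established" reason.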

  • Hello Kenneth, as I said before, the connection works fine for a while: I can turn the devices off and on, disconnect, and reconnect several times. Then it suddenly stops working, and the central can no longer connect to that particular peripheral. I don't believe it is interference. Both devices are close together, and it is not that it sometimes connects and sometimes fails: once the issue appears, it never connects again. However, without moving anything, just deleting the bonds or doing a mass erase makes it work again.

    I'm going to try to sniff the packets to check whether there is something wrong that I'm not seeing in the logs, but I need to get an nRF52 DK or the dongle to do it.

    Summarizing:

    1. After a fresh mass erase or after deleting the bonds, everything works fine: the central always finds the peripheral, connects, bonds, and reconnects if disconnected. You can power the devices off and on several times and everything works.

    2. Suddenly the central can no longer connect to the peripheral. The central still sees the peripheral's advertising packets, but as soon as it tries to connect, it receives BLE_GAP_EVT_CONNECTED while the peripheral receives nothing. Shortly afterwards the central disconnects.

    3. The peripheral is still connectable from another central.

    4. Scanning for and connecting to another peripheral works fine (creating and using a new bond and a new peer entry).

    5. Deleting all bonds on the central and allowing the peripheral to refresh its bond information fixes the connection issue.

    6. Repeat from step 1.

    I haven't counted how many times the connection works before the issue appears, but I'm sure it is at least 10 times, and probably a lot more.

  • I was finally able to get the sniffer traces. I can't see any connection response from the peripheral; however, the master still tries to negotiate the MTU.

    traces_with_issue.pcapng

    The peripheral address is: e6:9d:93:ce:2d:c0

    and the master address is: e1:2e:95:29:49:6c

  • Everything points to the connection request packet not being received (the advertising timing indicates that advertising is not stopped at any time). The MTU exchange is just the first packet from the central, so it does not contradict the theory that the connection request packet is not received.

    Presuming that the whitelist is not enabled on the peripheral, the only other explanation is noise preventing the connection request packet from being received by the peripheral. Is the peripheral an nRF52-DK or your own hardware? Are there any serial interfaces active near the antenna that may prevent the packet from being received?

    Best regards,
    Kenneth

  • Shouldn't the MTU exchange wait for the connection response?

    The whitelist is enabled on the peripheral, but the central is a known device and is in the peripheral's whitelist.

    The peripheral is our own hardware. It works fine with another central that uses a different SoftDevice, S132 6.0.0.

    I doubt it has anything to do with signal interference. The problem seems to be related to the flash storage on the central, because as soon as you do a mass erase on the central and disable the whitelist on the peripheral so that the bond can be updated, it connects again. All this happens without changing anything in the environment.

    Before you ask, I've already tried disabling the whitelist on the peripheral. The peripheral doesn't even know there is a connection request.

    I've noticed that the problem happens when the central has more than one device in the whitelist.

  • nvelozsavino said:
    I've noticed that the problem happens when the central has more than one device in the whitelist.

    Do you have sniffer logs of both a failing and a working connection? The next step would be to look at the content of the connection request packet from the central. The peripheral does not have any knowledge of a whitelist that may be present on the central, so this is indeed very odd.

    Kenneth

  • Here's the trace of a successful connection. The packet ID of the connection request is 11199.

    connection_successful.pcapng

  • But these are not the same devices?

    It seems that both the central and peripheral GAP addresses differ between the working and failing traces.

    Are you changing the GAP addresses of the peers here?

    Kenneth

  • Hi Kenneth, no, they are not the same devices. I have several peripherals and 4 master devices. The issue doesn't seem to be tied to specific hardware; I've seen it in all combinations.

    However, I'll try to reproduce both a successful connection and the issue with the same pair of devices.

  • Hi Kenneth, today I got the error again. I'm attaching the sniffer traces where it fails.

    Within the file:  connection_unsuccessful_2.pcapng

    At line 854 you can see the central trying to connect and the peripheral not responding, with both devices using their whitelists.

    At line 2012 there is another connection attempt, this time with the whitelist disabled on both devices. Here I can see the peripheral disconnecting, at line 2020.

    After that, you will find lots of attempts with the whitelist enabled. Then I did a factory reset (mass erase) on the peripheral, and starting at line 4163 you can see the connection attempt and the peripheral disconnecting.

    The problem is solved when I perform a mass erase on the central, as you can see at line 981 of the successful file connection_successful_3.pcapng.

  • I assume that the central is f5:c6:75:d8:88:50 and the peripheral is e5:fa:95:ec:dc:61.

    From both logs I can see that the central sends several scan requests, but the peripheral does not answer any of them with a scan response. According to the sniffer logs, all the scan requests from this specific central have an invalid CRC, which leads me to believe you may have marginal hardware and/or some noise in the system. Are there any serial interfaces and/or GPIOs active with high drive strength that may affect the modulation of the scan request packet on the central? Have you done any DTM testing on your central hardware to verify modulation characteristics etc.? Can you use an nRF52-DK for comparison?

    I can also see that other central devices (e.g. 74:33:b4:05:18:b9 and 76:d6:47:c6:8e:94) seem to be able to send scan request packets successfully (some with errors, but not all) and receive scan responses from the peripheral during this time. Have you tried to call connect from any of these centrals for comparison?

    The connection request packets don't have any CRC errors according to the logs, which leads me to believe that the whitelist on the peripheral may be wrong and/or that there is some noise in the system, but of course it is difficult to know. Can you set a breakpoint in sd_ble_gap_whitelist_set() to check that addr_ptrs contains the central address (f5:c6:75:d8:88:50)?

    Have you tried calling ble_advertising_restart_without_whitelist() just to check?

    I still find it inconclusive whether the issue is on the peripheral or the central side.

    Best regards,
    Kenneth
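
The whitelist check suggested above can also be done in code instead of at a breakpoint. The following is a minimal sketch, assuming the address bytes are stored least significant byte first, as in the SoftDevice's ble_gap_addr_t, while a sniffer prints them most significant byte first. The type `peer_addr_t` is a local stand-in reduced to the address bytes (it omits the address type), and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 6 /* same length as BLE_GAP_ADDR_LEN */

/* Local stand-in for ble_gap_addr_t, address bytes only (LSB first). */
typedef struct {
    uint8_t addr[ADDR_LEN];
} peer_addr_t;

/* Convert an address written as a sniffer prints it (MSB first)
 * into LSB-first storage order. */
void addr_from_display(const uint8_t display[ADDR_LEN], peer_addr_t *out)
{
    for (size_t i = 0; i < ADDR_LEN; i++) {
        out->addr[i] = display[ADDR_LEN - 1 - i];
    }
}

/* Return true if the pointer list about to be passed to the whitelist
 * call contains the target address. NULL entries are skipped. */
bool whitelist_contains(const peer_addr_t *const *addr_ptrs, size_t count,
                        const peer_addr_t *target)
{
    for (size_t i = 0; i < count; i++) {
        if (addr_ptrs[i] != NULL &&
            memcmp(addr_ptrs[i]->addr, target->addr, ADDR_LEN) == 0) {
            return true;
        }
    }
    return false;
}
```

For the central in these traces, f5:c6:75:d8:88:50 would be stored as the bytes 50 88 d8 75 c6 f5, so a byte-order mix-up is easy to miss when reading memory in a debugger. A real check would also compare the address type, which this sketch leaves out.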

  • Hi Kenneth, I believe that the problem is definitely the central, and I strongly doubt that it has anything to do with noise. If that were the case, the probability of a successful connection should be higher than 0 after the problem first appears. What I've seen is that once the problem occurs, the central device never reconnects, while other centrals can connect successfully to the same peripheral.

    I've tested deleting all bonds on the central without success. The only way to make it work again without re-flashing the MCU is to erase all the data in flash (only the data stored by the peer manager and the app, not the running program itself).

    My guess is that the flash somehow gets corrupted, specifically the peer information saved by the peer manager. One possible cause is that the central device goes to System OFF when the accelerometer registers no activity. The device might shut down while a flash operation is in progress, which could cause the corruption, perhaps between invalidating a file and adding the new, updated one.

    A while ago we had another issue with different hardware, also a central device. After about 100 connections to different peripherals (allowing only 1 at a time in the whitelist), the central stopped connecting and the flash had to be erased. We haven't had any new reports of this problem from users, but I don't know whether that is because we added a factory reset option to the central, which solves the issue. Our guess at the time was that the flash had filled up and a problem with the garbage collector had damaged the peer information. Back then we were using a different SDK and SoftDevice version (S132 v5) and different hardware, but now that I think about it, the behavior was the same.
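
If shutting down while a flash operation is in progress is indeed the cause, one way to rule it out is to postpone System OFF until all queued flash operations have completed. The following is a minimal sketch of such a gate, not the peer manager's or the SDK's own mechanism. Every name in it is hypothetical, and where the hooks attach (the FDS event handler, peer manager events, the System OFF call site) is an assumption that depends on the application.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for flash work that has been queued but
 * has not yet reported completion. */
typedef struct {
    uint32_t pending_flash_ops;
    bool     shutdown_requested;
} power_gate_t;

/* Call when a flash write, update, delete, or garbage collection is queued. */
void gate_flash_op_queued(power_gate_t *g)
{
    g->pending_flash_ops++;
}

/* Call from the flash completion event. Returns true if a postponed
 * shutdown may proceed now. */
bool gate_flash_op_done(power_gate_t *g)
{
    if (g->pending_flash_ops > 0) {
        g->pending_flash_ops--;
    }
    return g->shutdown_requested && g->pending_flash_ops == 0;
}

/* Call when the inactivity timer wants to enter System OFF. Returns true
 * if it is safe to power down immediately, false if the shutdown has
 * been postponed until the pending operations finish. */
bool gate_request_shutdown(power_gate_t *g)
{
    g->shutdown_requested = true;
    return g->pending_flash_ops == 0;
}
```

With this gate in place, the accelerometer timeout would call `gate_request_shutdown()` and only enter System OFF when it returns true. Otherwise the device would power down from the completion path once `gate_flash_op_done()` returns true. If the problem disappears with such a gate, that would support the corruption theory.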
