Proper usage of Peer Manager + advertising, and what to do after disconnect

I am using nRF5 SDK 17.1.0.

I inherited some legacy code that was a mess, and rewrote it.

Part 1 - 

First I skipped the peer manager thinking it was not necessary.
Then I re-added the peer manager to a peripheral to allow faster reconnects in the event we lose connection. 
It happens frequently. The device is frequently submerged in water. Without peer manager, reconnection was slow. 

At first I manually stopped advertising on connect and restarted it on disconnect. With multiple centrals nearby, I noticed some cross-pairing issues.
Later, I noticed three things:

  • The legacy code I inherited never turned off advertising, though I don't see it advertising in nRF Connect while connected
  • The legacy code I inherited was skipping pm_handler_disconnect_on_sec_failure() in pm_evt_handler(); I re-enabled it
  • I set config.ble_adv_on_disconnect_disabled to false

Between the last two changes, the cross-pairing issue seems to be gone.

Questions 1 & 2 - Is this the correct way to handle advertising on disconnect? Should I never turn off advertising after a connection is established?

Part 2 -

If a central is scanning and sees my peripheral, it tries to connect.
The central hammers the peripheral and I get an endless loop of:

<warning> app: BLE_GAP_EVT_CONNECTED
<info> peer_manager_handler: Connection security failed: role: Peripheral, conn_handle: 0x0, procedure: Encryption, error: 4102
<warning> peer_manager_handler: Disconnecting conn_handle 0.
<warning> app: BLE_GAP_EVT_DISCONNECTED

I expect the security failure given that it isn't paired/bonded. 

Question 3 - If I delete the bonds on the peripheral, start advertising (undirected), and want this new central to connect, it seems like it stops scanning and fails to connect.
I am trying to determine if this is a bug in my central or the peripheral. Still gathering data. Open to suggestions. 

Thanks!

  • Hi ms360,

    Question 1&2 - is this the correct way to handle advertise on disconnect? Should I never be turning off advertising after a connection is established? 

    There are multiple things at play here so let me go into it a bit slowly.

    I set config.ble_adv_on_disconnect_disabled to false

    It seems you are using the Advertising Module here rather than direct SoftDevice GAP API. No problem with that, just noting details for later.

    The exact behavior depends on your requirements. However, most projects want the device to restart advertising upon disconnection. Either way, this setting should not cause any pairing-related issue.

    The legacy code I inherited never turned off advertising, though I don't see it advertising in nRF Connect while connected

    Connectable advertising is stopped once the connection request is received. This is the default behavior.

    Do you expect the connectable advertising to continue? Is the device expected to support multiple concurrent centrals?

    The legacy code I inherited was skipping pm_handler_disconnect_on_sec_failure() in pm_evt_handler(); I re-enabled it

    No comment on this. This depends on your requirement.

    Between these 2 actions, it seems like the cross pairing issue is gone. 

    I need some clarifications here. What exactly is the cross-pairing issue? 

    Does the issue disappear only after you both set ble_adv_on_disconnect_disabled to false and call pm_handler_disconnect_on_sec_failure()? Or after either change alone?

    <warning> app: BLE_GAP_EVT_CONNECTED
    <info> peer_manager_handler: Connection security failed: role: Peripheral, conn_handle: 0x0, procedure: Encryption, error: 4102
    <warning> peer_manager_handler: Disconnecting conn_handle 0.
    <warning> app: BLE_GAP_EVT_DISCONNECTED

    I expect the security failure given that it isn't paired/bonded. 

    Question 3 - If I delete the bonds on the peripheral, start advertising (undirected), and want this new central to connect, it seems like it stops scanning and fails to connect.
    I am trying to determine if this is a bug in my central or the peripheral. Still gathering data. Open to suggestions. 

    Refer to this past DevZone case: "peer_manager_handler: Connection security failed: role: Peripheral, conn_handle: 0x0, procedure: Encryption, error: 4102".

    The steps you describe match what is discussed in that case, where the bond information is deleted on only one device. If the other device isn't programmed to clear its bond information on this kind of failure, the failure will persist.

    Could that be the case?

    Hieu

  • Hieu, thank you. That sounds like the exact issue. I'm trying to detect this on the central, but I'm not getting a clear event to indicate it, other than the disconnect reason BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION (0x13, "Remote User Terminated Connection").

    I logged all the PM events and I don't see an auth or sec failure. Is there an event I can use to trigger deleting the bonding info?

    [00:00:55.023,498] <error> ble_scan: Paired Peripheral FOUND with RSSI: -19 max: -23
    [00:00:55.023,803] <info> ble_scan: Connecting
    [00:00:55.067,626] <info> app: Connecting to target 079258EAD4E1
    [00:00:55.067,687] <error> app: PM ERROR 1
    [00:00:55.067,871] <error> app: PM ERROR 6
    [00:00:55.068,237] <error> app: PM ERROR 2
    [00:00:55.068,237] <info> peer_manager_handler: Peer data updated in flash: peer_id: 0, data_id: Peer rank, action: Update, no change
    [00:00:55.068,298] <error> app: PM ERROR 9
    [00:00:55.068,298] <error> app: PM ERROR 0
    [00:00:55.068,908] <error> app: PM ERROR 15

  • Hi ms360,

    What is the central device? Is there no way to delete the bond information from it?

    If you try to connect to the nRF device using a new central, such as your phone, do you still get the error 4102?

    Regarding the LOG_ERROR in the default case, I recommend lowering it to the NRF_LOG_INFO or NRF_LOG_WARNING level and perhaps changing the message. It only signifies unhandled PM events, not errors.

  • The central device is another nRF52840.
    It does not see the error 4102 warning; it only sees the disconnect reason 0x13.
    The peripheral shows the error 4102 warning.

    It would be better if the central could detect this issue and delete the bonds. 

  • If you want the central to delete the bond, you could implement simple logic: if the connection fails repeatedly within a short time frame (for example, more than 3 times within 10-15 seconds), delete the bond.

    Deleting the bond on the central can be done much as it is done on the peripheral.
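
The threshold idea in this reply could be sketched in plain C roughly as follows. This is only an illustration, not code from the thread: the names (`fail_tracker_t`, `fail_tracker_on_failure`) are hypothetical, the 15-second window and the "more than 3" threshold are just one reading of the suggestion, and the caller is assumed to supply a millisecond timestamp from whatever timer the application already has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAIL_WINDOW_MS   15000u                /* assumed window: 15 seconds */
#define FAIL_THRESHOLD   3u                    /* act on MORE than 3 failures */
#define FAIL_HISTORY_LEN (FAIL_THRESHOLD + 1u) /* timestamps we must remember */

/* Ring buffer holding the timestamps of the most recent failures. */
typedef struct
{
    uint32_t stamps_ms[FAIL_HISTORY_LEN];
    uint32_t count; /* valid entries, capped at FAIL_HISTORY_LEN */
    uint32_t next;  /* slot the next timestamp is written to */
} fail_tracker_t;

/* Forget all recorded failures (for example after a successful encryption). */
static void fail_tracker_reset(fail_tracker_t * p_tracker)
{
    assert(p_tracker != NULL);
    memset(p_tracker, 0, sizeof(*p_tracker));
}

/* Record one failed connection attempt at time now_ms.
 * Returns true when more than FAIL_THRESHOLD failures fall inside the window,
 * which is the point where the caller might delete the bond. */
static bool fail_tracker_on_failure(fail_tracker_t * p_tracker, uint32_t now_ms)
{
    assert(p_tracker != NULL);

    p_tracker->stamps_ms[p_tracker->next] = now_ms;
    p_tracker->next = (p_tracker->next + 1u) % FAIL_HISTORY_LEN;
    if (p_tracker->count < FAIL_HISTORY_LEN)
    {
        p_tracker->count++;
    }

    if (p_tracker->count < FAIL_HISTORY_LEN)
    {
        return false; /* not enough failures recorded yet */
    }

    /* The buffer is full, so `next` now indexes the oldest stored failure.
     * Unsigned subtraction keeps this correct across timer wraparound. */
    uint32_t oldest_ms = p_tracker->stamps_ms[p_tracker->next];
    return (uint32_t)(now_ms - oldest_ms) <= FAIL_WINDOW_MS;
}
```

When the function returns true, the central would presumably delete the stored peer (for example with `pm_peer_delete()`) and reset the tracker. Note that a counter like this cannot tell a missing bond apart from any other burst of failed connections.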

  • So I did try this, and it did fix the central reconnecting problem. 
    However it broke the tolerance to being submerged. 

    We frequently submerge the device, which is why I used peer manager in the first place, for quick reconnection, and it worked great. 

    The second I implemented the workaround (3x within 10-15 seconds), after submerging, I have to re-pair, which is not good at all. 

    Since the central doesn't see the 4102, maybe I need to update the peripheral's advertisement data when it sees a 4102, to tell the central to clear its bonding info. Can you think of anything simpler?

  • I added this exact feature, and now the central can filter scanned devices for the "peripheral with 4102" indicator and delete bonds. 
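
The thread does not say how the indicator is encoded. One possible encoding, shown here purely as an assumption, is a single flag byte carried in a Manufacturer Specific Data AD structure (AD type 0xFF). The company ID 0xFFFF is a placeholder, and `SKETCH_FLAG_BOND_LOST`, `adv_append_flags` and `adv_has_bond_lost_flag` are hypothetical names. The sketch works on raw advertising bytes so that it is independent of any SDK helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AD_TYPE_MANUFACTURER_DATA 0xFFu   /* assigned AD type: Manufacturer Specific Data */
#define SKETCH_COMPANY_ID         0xFFFFu /* placeholder only; use your own company ID */
#define SKETCH_FLAG_BOND_LOST     0x01u   /* hypothetical bit: peripheral saw error 4102 */

/* Peripheral side: append a manufacturer-data AD structure with one flag byte.
 * Returns the new payload length, or 0 if the structure does not fit. */
static size_t adv_append_flags(uint8_t * p_buf, size_t len, size_t cap, uint8_t flags)
{
    assert(p_buf != NULL);
    if (len > cap || (cap - len) < 5u)
    {
        return 0;
    }
    p_buf[len + 0u] = 4u; /* bytes that follow: type + 2-byte company ID + flags */
    p_buf[len + 1u] = AD_TYPE_MANUFACTURER_DATA;
    p_buf[len + 2u] = (uint8_t)(SKETCH_COMPANY_ID & 0xFFu); /* little-endian ID */
    p_buf[len + 3u] = (uint8_t)(SKETCH_COMPANY_ID >> 8);
    p_buf[len + 4u] = flags;
    return len + 5u;
}

/* Central side: walk the AD structures of a scanned payload and report
 * whether the bond-lost flag is present and set. */
static bool adv_has_bond_lost_flag(uint8_t const * p_data, size_t len)
{
    assert(p_data != NULL);
    size_t i = 0;
    while (i < len)
    {
        uint8_t field_len = p_data[i];
        if (field_len == 0u || field_len > (len - i - 1u))
        {
            break; /* padding or malformed structure: stop parsing */
        }
        uint8_t const * p_field = &p_data[i + 1u];
        if (field_len == 4u
            && p_field[0] == AD_TYPE_MANUFACTURER_DATA
            && p_field[1] == (uint8_t)(SKETCH_COMPANY_ID & 0xFFu)
            && p_field[2] == (uint8_t)(SKETCH_COMPANY_ID >> 8))
        {
            return (p_field[3] & SKETCH_FLAG_BOND_LOST) != 0u;
        }
        i += (size_t)field_len + 1u;
    }
    return false;
}
```

With the SDK's Advertising Module, the same byte would more likely be supplied through the manufacturer-specific data field of the advertising data structure rather than written by hand; the raw form above only illustrates the layout the central has to parse.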

  • I tested a little bit with the ble_app_hrs and ble_app_hrs_c examples and have a few findings.

    First, I found that the disconnection in those examples is due to a pm_handler_disconnect_on_sec_failure() call in the peer manager event handler. A failure in the security procedure doesn't immediately lead to a disconnection.

    So I commented out pm_handler_disconnect_on_sec_failure() and retried the case where only the peripheral forgets the bond.
    Both the central and the peripheral then receive error 4102, but with different sources.
    On the central, Peer Manager reports error 4102 with source BLE_GAP_SEC_STATUS_SOURCE_REMOTE.
    On the peripheral, Peer Manager reports error 4102 with source BLE_GAP_SEC_STATUS_SOURCE_LOCAL.

    If your application is also using pm_handler_disconnect_on_sec_failure(), you can try delaying that call until after error checking, detect the error, and raise some flags before disconnecting, for example.

    On a related note, if I also remove the auto disconnect and try the case where only the central forgets the bond, both devices see Peer Manager reporting error 133, BLE_GAP_SEC_STATUS_PAIRING_NOT_SUPP. That is arguably the wrong error for this situation, but it is the current behavior. If you wish, you can add similar handling on the peripheral side to forget bonds.

    While working on that, I realized something strange. How come the submerged peripheral device forgets the bonds? Does it get bonded with a lot of centrals?
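
The error and source combinations reported in this reply could be turned into a small decision helper along these lines. It is a sketch under stated assumptions: the `SKETCH_` constants are local copies of the numeric values as understood from the SDK headers (4102 is 0x1006, 133 is 0x85) and should be checked against your SDK, and the type and function names are hypothetical. In a real application the two inputs would presumably come from the error and error source carried by the Peer Manager security-failure event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of SDK values, for illustration only; verify against your headers. */
#define SKETCH_PM_ERR_PIN_OR_KEY_MISSING   0x1006u /* 4102 in the logs above */
#define SKETCH_SEC_STATUS_PAIRING_NOT_SUPP 0x85u   /* 133 */
#define SKETCH_SRC_LOCAL                   0x00u   /* BLE_GAP_SEC_STATUS_SOURCE_LOCAL */
#define SKETCH_SRC_REMOTE                  0x01u   /* BLE_GAP_SEC_STATUS_SOURCE_REMOTE */

typedef enum
{
    SEC_FAIL_OTHER,             /* anything not covered by the findings above */
    SEC_FAIL_PEER_LOST_BOND,    /* 4102 from the remote: the peer has no key for us */
    SEC_FAIL_LOCAL_KEY_MISSING, /* 4102 raised locally: this device has no key */
    SEC_FAIL_PAIRING_NOT_SUPP   /* 133: seen on both sides when only the central forgot */
} sec_fail_kind_t;

/* Map a security failure (error code + source) onto the cases from the reply. */
static sec_fail_kind_t sec_fail_classify(uint16_t error, uint8_t error_src)
{
    assert(error_src == SKETCH_SRC_LOCAL || error_src == SKETCH_SRC_REMOTE);

    if (error == SKETCH_PM_ERR_PIN_OR_KEY_MISSING)
    {
        return (error_src == SKETCH_SRC_REMOTE) ? SEC_FAIL_PEER_LOST_BOND
                                                : SEC_FAIL_LOCAL_KEY_MISSING;
    }
    if (error == SKETCH_SEC_STATUS_PAIRING_NOT_SUPP)
    {
        return SEC_FAIL_PAIRING_NOT_SUPP;
    }
    return SEC_FAIL_OTHER;
}

/* Central-side policy: drop the stored bond only when the peripheral
 * has reported that it no longer holds the key. */
static bool central_should_delete_bond(uint16_t error, uint8_t error_src)
{
    return sec_fail_classify(error, error_src) == SEC_FAIL_PEER_LOST_BOND;
}
```

Following the suggestion above, a check like this would run in the Peer Manager event handler before pm_handler_disconnect_on_sec_failure() is called, so that a flag can be raised before the link is dropped.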

  • The submerged peripheral was triggering the disconnect count / time threshold I previously tried. It happens pretty quickly with 1 foot of water separating the devices. It looked like the prescribed fix for the sec_failure from last week. They both have a very similar failure mode. 

    I did get it working with the peripheral setting a flag in its advertisement data; the central sees the flag, gives up after N tries, and deletes the bonds. It helped a lot. The solution you're providing would simplify things quite a bit, though.

    I think either is viable. Yours won't require updating BLE documentation.
