nRF52840 pairing error on one specific device

Dear support team and community

After working perfectly for some days, our nRF52840 suddenly stopped connecting to a Motorola G4 Plus running Android 7.1.1.

The same firmware running on a different nRF52840 can pair, bond, and connect well to the same smartphone.

We are running SDK v2.3.0 with Zephyr OS and are using the default configuration. The bonding information is saved in flash memory (CONFIG_BT_SETTINGS=y), and this works perfectly on multiple other devices. For your information, the BT_LL_SOFTDEVICE flag is set in Kconfig (I did not set it manually).

The logs at info level are:

[00:05:36.819,091] <inf> BLE: Connected via connection (0) at 7C:6B:E3:81:40:AE (random)
[00:05:36.819,335] <inf> BLE: Connection (0) - Initial Tx Power = 3
[00:05:36.819,335] <inf> BLE: Connection parameters: interval 48.75 ms, latency 0 intervals, timeout 20000 ms
[00:05:37.676,055] <inf> BLE: Connection parameters updated: interval 7.50 ms, latency 0 intervals, timeout 20000 ms
[00:05:38.369,720] <inf> BLE: Connection parameters updated: interval 48.75 ms, latency 0 intervals, timeout 20000 ms
[00:05:39.040,924] <inf> BLE: Connection parameters updated: interval 7.50 ms, latency 0 intervals, timeout 20000 ms
[00:05:39.147,399] <inf> BLE: Security changed: 7C:6B:E3:81:40:AE (random) level 2
[00:05:39.222,015] <wrn> bt_smp: SMP does not allow a pairing failure at this point. Known issue. Disconnecting instead.
[00:05:39.222,656] <wrn> bt_att: Not connected
[00:05:39.222,686] <err> bt_conn: not connected!
[00:05:39.222,778] <inf> BLE: Connection parameters updated: interval 48.75 ms, latency 0 intervals, timeout 20000 ms
[00:05:39.284,881] <inf> BLE: Disconnected (reason 22)

In the source code where the bt_smp warning is logged, I can see:

	/* By spec, SMP "pairing process" completes successfully when the last
	 * key to distribute is acknowledged at link-layer.
	 */
	remote_already_completed = (atomic_test_bit(smp->flags, SMP_FLAG_KEYS_DISTR) &&
				    !smp->local_dist && !smp->remote_dist);

	if (atomic_test_bit(smp->flags, SMP_FLAG_PAIRING) ||
	    atomic_test_bit(smp->flags, SMP_FLAG_ENC_PENDING) ||
	    atomic_test_bit(smp->flags, SMP_FLAG_SEC_REQ)) {
		/* reset context and report */
		smp_pairing_complete(smp, reason);
	}

	if (remote_already_completed) {
		LOG_WRN("SMP does not allow a pairing failure at this point. Known issue. "
			"Disconnecting instead.");
		/* We are probably here because we are, as a peripheral, rejecting a pairing based
		 * on the central's identity address information, but that was the last key to
		 * be transmitted. In that case, the pairing process is already completed.
		 * The SMP protocol states that the pairing process is completed the moment the
		 * peripheral link-layer confirmed the reception of the PDU with the last key.
		 */
		bt_conn_disconnect(smp->chan.chan.conn, BT_HCI_ERR_AUTH_FAIL);
		return 0;
	}

I don't understand what this issue means. Can you shed some light on it? Would deleting the bonding information for this specific device from the nRF52 flash memory fix it? If so, how should I proceed?

  • Hi

    It sounds to me like the bonding information has been deleted on either the nRF52 or the Android device, so the other device still thinks it is bonded. This can be resolved by making sure the bonding information is deleted on both devices. On the Android device, go to the Bluetooth settings and "forget" the device there. On the nRF52, the surefire way to erase bond info is to erase the entire flash with nrfjprog --eraseall from the nRF Command Line Tools.

    It's also a good idea to have a way to erase bond info on the nRF52840 without having to reprogram it. This is done with an erase_bonds() call. In some of our samples this happens by default, for example upon a button press on the Development Kit, or whenever the nRF52 is reset or powered down.

    Best regards,

    Simon

  • Dear Simon,

    Thank you for your reply. I tried removing the nRF52 from the phone's known Bluetooth devices, but it does not fix the issue.


    I saw in the following tutorial that I need to use bt_unpair in my case (after dealing with the peer manager for a moment... I still don't fully understand the differences between the Zephyr stack and the other ones).

    I would like to delete the pairing information for the problematic device only and keep the other devices' information in flash memory. This snippet should do the trick (not tested yet):

    /// @brief Register a callback structure for connection events.
    BT_CONN_CB_DEFINE(conn_callbacks) = {
        .connected = onConnectedCb,
        .disconnected = onDisconnectedCb,
        .security_changed = onSecurityChangedCb,
        .le_param_updated = onLeParamUpdatedCb,
    };
    
    static void onDisconnectedCb(struct bt_conn *conn, uint8_t reason) {
        [...]
        
        // How to filter to call the two following lines only when the issue occurs ?
        const bt_addr_le_t *destAddr = bt_conn_get_dst(conn);
        bt_unpair(BT_ID_DEFAULT, destAddr);
    }

    I added a comment showing what I'm looking for. Which condition should I add so that the pairing information is deleted only in the BT_HCI_ERR_AUTH_FAIL case? The reason passed to the disconnection callback (22 in my case) covers too many use cases to serve as a filter.

    Best regards,

    Aurélien

  • Hi

    If you're using the nRF Connect SDK (Zephyr RTOS based SDK), you should not look at the Infocenter, as that only documents our products and the nRF5 SDK (bare metal SDK).

    The documentation for the nRF Connect SDK can be found here: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/index.html 

    I think you can add a check for the disconnect reason, or for this AUTH_FAIL, and unpair from the device on your end when that check triggers.

    Best regards,

    Simon

  • Dear Simon,

    Thank you for the link and the answer. From my point of view, checking the disconnect reason is not sufficient: I always get reason 22 (BT_HCI_ERR_LOCALHOST_TERM_CONN), and several other scenarios could lead to the same disconnect reason without warranting an erasure of the bonding information.

    I cannot find a way to get the AUTH_FAIL error in the disconnection callback. If the security_changed callback were called with this error, I would have access to the bt_security_err value, but this is not the case here.

    Do you know how I could get this AUTH_FAIL status in the disconnect callback? Or access any other flag that could help me address this issue specifically? I have the feeling this is not possible.

    Best regards,

    Aurélien
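
For reference, the log prints the disconnect reason in decimal, so 22 is HCI error 0x16. In the quoted bt_smp source, BT_HCI_ERR_AUTH_FAIL (0x05) is the reason passed to bt_conn_disconnect(), which is the reason sent to the peer; the side that initiates the disconnection normally sees 0x16 in its own disconnected callback. The sketch below decodes a few common codes from the Bluetooth Core Specification error code list. The function names are hypothetical helpers, not Zephyr APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Map a small subset of HCI error codes to their specification names.
 * Hypothetical helper for log decoding; not part of Zephyr.
 */
const char *hci_reason_name(uint8_t reason)
{
	switch (reason) {
	case 0x05: return "Authentication Failure";
	case 0x06: return "PIN or Key Missing";
	case 0x08: return "Connection Timeout";
	case 0x13: return "Remote User Terminated Connection";
	case 0x16: return "Connection Terminated By Local Host";
	case 0x3d: return "Connection Terminated Due To MIC Failure";
	case 0x3e: return "Connection Failed To Be Established";
	default:   return "Unknown";
	}
}

/* True when the reason only says that our own host closed the link,
 * which by itself does not reveal why the host decided to do so.
 */
bool is_local_termination(uint8_t reason)
{
	return reason == 0x16;
}
```

Here hci_reason_name(22) yields "Connection Terminated By Local Host", which is why the reason alone cannot single out the authentication failure.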

  • Hi

    I don't have a specific code sample doing this, I'm afraid, but BT_HCI_ERR_AUTH_FAIL seems to map to the BT_SECURITY_ERR_AUTH_FAIL security error, which you can watch for in your main.c file and use to trigger a bond erase.

    If we want to get to the bottom of what exactly is causing this, I think we would need a sniffer trace to see what the difference over the air is between the nRF52840 that is able to pair and connect and the one that is not.

    You can use an nRF52840 DK and the nRF Sniffer firmware with Wireshark, or a dedicated Bluetooth Sniffer like the Ellisys Vanguard if you have one available. We can help you review the sniffer traces.

    Best regards,

    Simon
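
One way to act on the suggestion above without relying on the disconnect reason alone is to remember, per connection, that a pairing failure was reported, and to check that flag in the disconnected callback. The sketch below is plain C so that it compiles on its own; MAX_CONN, note_pairing_failed() and should_unpair_on_disconnect() are hypothetical names, not Zephyr APIs. It assumes the stack reports the failure before the link drops (for example through the pairing_failed callback of struct bt_conn_auth_info_cb, or security_changed with a non-zero err), which has not been verified for this exact code path on SDK v2.3.0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capacity; in a Zephyr build this would typically be
 * CONFIG_BT_MAX_CONN.
 */
#define MAX_CONN 4

/* HCI reason seen locally when our own host closes the link (decimal 22). */
#define REASON_LOCALHOST_TERM_CONN 0x16

/* One flag per connection slot, set when a pairing failure was reported. */
static bool pairing_failed_seen[MAX_CONN];

/* Record a pairing failure for the connection slot idx.
 * Assumption: called from the failure callback with bt_conn_index(conn).
 */
void note_pairing_failed(uint8_t idx)
{
	if (idx < MAX_CONN) {
		pairing_failed_seen[idx] = true;
	}
}

/* Decide whether the bond for this connection should be removed.
 * Returns true only if a pairing failure was recorded for the slot and
 * the link was then closed by the local host. The flag is always
 * cleared so that the slot can be reused by the next connection.
 */
bool should_unpair_on_disconnect(uint8_t idx, uint8_t reason)
{
	bool seen;

	if (idx >= MAX_CONN) {
		return false;
	}

	seen = pairing_failed_seen[idx];
	pairing_failed_seen[idx] = false;

	return seen && reason == REASON_LOCALHOST_TERM_CONN;
}
```

In the application, note_pairing_failed(bt_conn_index(conn)) would be called from the failure callback, and onDisconnectedCb would call bt_unpair(BT_ID_DEFAULT, bt_conn_get_dst(conn)) only when should_unpair_on_disconnect(bt_conn_index(conn), reason) returns true. Ordinary local disconnections, which also report reason 22, leave the other bonds untouched.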
