BLE pairing with Linux - zero distribution flags

Hi!

We are having problems pairing an nRF52840 with Linux (Ubuntu 25.04). Pairing fails instantly, before any passkey is shown on the host, and returns error code 9.

Short log is here:

```
<dbg> bt_smp: bt_smp_recv: Received SMP code 0x01 len 6
<dbg> bt_smp: smp_pairing_req: req: io_capability 0x04, oob_flag 0x00, auth_req 0x2D, max_key_size 0x10, init_key_dist 0x0D, resp_key_dist 0x0F
<dbg> bt_smp: smp_init: prnd 849156b688e37e12ad063cb072baf74f
<dbg> bt_smp: smp_pairing_req: rsp: io_capability 0x02, oob_flag 0x00, auth_req 0x0D, max_key_size 0x10, init_key_dist 0x00, resp_key_dist 0x00
<dbg> bt_smp: bt_smp_recv: Received SMP code 0x0c len 64
<dbg> bt_smp: smp_public_key:
<inf> Bt: Received passkey pairing inquiry.
<inf> Bt: Type `uhk passkey xxxxxx` to pair, or `uhk passkey -1` to reject
<wrn> bt_conn: conn 0x20010498: not connected
<dbg> bt_smp: bt_smp_disconnected: chan 0x20010b9c cid 0x0006
<dbg> bt_smp: smp_pairing_complete: got status 0x8
<dbg> bt_smp: bt_smp_encrypt_change: chan 0x20010b9c conn 0x20010498 handle 1 encrypt 0x00 hci status 0x1f
<wrn> Bt: Bt security failed: n/a (n/a, 98:5f:41:d2:92:3a), level 1, err 9, disconnecting
<wrn> Bt: The connection (n/a (n/a, 98:5f:41:d2:92:3a)) isn't even connected! Ignoring.
<wrn> Bt: Pairing of auth conn failed because of 9
<wrn> Bt: Pairing failed: n/a (n/a, 98:5f:41:d2:92:3a), reason 9
```
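
For readers following the log, here is a minimal sketch that names the SMP command codes and failure reasons appearing in it. The values are the ones listed in the Bluetooth Core Specification (Vol 3, Part H); the function names are made up for this example and are not Zephyr APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* SMP command codes (Bluetooth Core Specification, Vol 3, Part H).
 * Helper names are illustrative only, not part of Zephyr. */
static const char *smp_code_str(uint8_t code)
{
	switch (code) {
	case 0x01: return "Pairing Request";
	case 0x02: return "Pairing Response";
	case 0x03: return "Pairing Confirm";
	case 0x04: return "Pairing Random";
	case 0x05: return "Pairing Failed";
	case 0x06: return "Encryption Information";
	case 0x07: return "Central Identification";
	case 0x08: return "Identity Information";
	case 0x09: return "Identity Address Information";
	case 0x0a: return "Signing Information";
	case 0x0b: return "Security Request";
	case 0x0c: return "Pairing Public Key";
	case 0x0d: return "Pairing DHKey Check";
	case 0x0e: return "Pairing Keypress Notification";
	default:   return "Unknown";
	}
}

/* SMP "Pairing Failed" reason codes from the same part of the specification. */
static const char *smp_reason_str(uint8_t reason)
{
	switch (reason) {
	case 0x01: return "Passkey Entry Failed";
	case 0x02: return "OOB Not Available";
	case 0x03: return "Authentication Requirements";
	case 0x04: return "Confirm Value Failed";
	case 0x05: return "Pairing Not Supported";
	case 0x06: return "Encryption Key Size";
	case 0x07: return "Command Not Supported";
	case 0x08: return "Unspecified Reason";
	case 0x09: return "Repeated Attempts";
	case 0x0a: return "Invalid Parameters";
	case 0x0b: return "DHKey Check Failed";
	case 0x0c: return "Numeric Comparison Failed";
	case 0x0d: return "BR/EDR Pairing In Progress";
	case 0x0e: return "Cross-Transport Key Derivation/Generation Not Allowed";
	default:   return "Unknown";
	}
}

/* The three codes that appear in the short log above. */
static void example_from_log(void)
{
	assert(strcmp(smp_code_str(0x01), "Pairing Request") == 0);
	assert(strcmp(smp_code_str(0x0c), "Pairing Public Key") == 0);
	assert(strcmp(smp_reason_str(0x08), "Unspecified Reason") == 0);
}
```

Read this way, the log shows a Pairing Request (6 bytes) followed by the peer's Pairing Public Key (64 bytes, the X and Y coordinates), and then the link drops. Status 0x8 is the generic "Unspecified Reason", and HCI status 0x1f is likewise the generic "Unspecified Error", so the codes seem to report the disconnect rather than a specific SMP rejection.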


Going back and forth with an AI assistant suggests that the host PC cancels pairing because it receives all-zero key distribution flags in the pairing response:
```
<dbg> bt_smp: smp_pairing_req: rsp: io_capability 0x02, oob_flag 0x00, auth_req 0x0D, max_key_size 0x10, init_key_dist 0x00, resp_key_dist 0x00
```
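
To make the flag bytes easier to read, the following is a minimal sketch that decodes the key distribution fields. The bit positions follow the Bluetooth Core Specification (Vol 3, Part H); the macro and function names are assumptions made for this example and do not come from Zephyr.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Key distribution bits of the SMP Pairing Request/Response.
 * Local, illustrative names (not Zephyr's). */
#define KEY_DIST_ENC_KEY  0x01u /* LTK; only distributed with legacy pairing */
#define KEY_DIST_ID_KEY   0x02u /* IRK plus identity address */
#define KEY_DIST_SIGN_KEY 0x04u /* CSRK */
#define KEY_DIST_LINK_KEY 0x08u /* BR/EDR link key derivation */

/* Write the names of the set bits into out; out must hold at least 32 bytes. */
static const char *key_dist_to_str(uint8_t flags, char *out, size_t out_len)
{
	assert(out != NULL && out_len >= 32);
	out[0] = '\0';
	if (flags & KEY_DIST_ENC_KEY)  { strcat(out, "LTK "); }
	if (flags & KEY_DIST_ID_KEY)   { strcat(out, "IRK "); }
	if (flags & KEY_DIST_SIGN_KEY) { strcat(out, "CSRK "); }
	if (flags & KEY_DIST_LINK_KEY) { strcat(out, "LinkKey "); }

	size_t len = strlen(out);
	if (len == 0) {
		strcpy(out, "none");
	} else {
		out[len - 1] = '\0'; /* drop the trailing space */
	}
	return out;
}

/* Keys actually exchanged in one direction: only bits offered in the
 * request and kept in the response remain set. */
static uint8_t key_dist_effective(uint8_t requested, uint8_t responded)
{
	return (uint8_t)(requested & responded);
}

/* Values taken from the req/rsp lines of the log. */
static void example_from_log(void)
{
	char buf[32];

	assert(strcmp(key_dist_to_str(0x0D, buf, sizeof(buf)), "LTK CSRK LinkKey") == 0);
	assert(strcmp(key_dist_to_str(0x0F, buf, sizeof(buf)), "LTK IRK CSRK LinkKey") == 0);
	assert(key_dist_effective(0x0D, 0x00) == 0x00);
	assert(key_dist_effective(0x0F, 0x00) == 0x00);
}
```

With the logged values, the request offers LTK, CSRK and Link Key from the initiator and asks the responder for all four, while the response of 0x00/0x00 means no keys are exchanged in the key distribution phase. As far as the specification describes it, LE Secure Connections derives the LTK on both sides instead of distributing it, so an all-zero response is not necessarily invalid by itself.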

Detailed logs (attaching file here doesn't work, so linking externally):
- zephyr with hci_core and smp logging: http://ktweb.cz/upload/logs2/right.log
- the same, but filtered: http://ktweb.cz/upload/logs2/right_filtered.log
- btmon log: http://ktweb.cz/upload/logs2/btmon.log

Any ideas what the problem might be and where to look further?

EDIT: here are Wireshark logs (paired from the GUI this time):

- fails on GATT instead: http://ktweb.cz/upload/logs2/wireshark_linux_pairing.pcapng (This is probably our problem, since GATT is handled by an external service.)
- this one actually contains a pairing attempt similar to the key distribution problem: http://ktweb.cz/upload/logs2/wireshark_linux_pairing2.pcapng

EDIT2: my updated conclusions and hypotheses:
- I think that the distribution flags are correct and that (one of the) troubles is that Zephyr doesn't reply with its own keys within the next 400 ms or so.
- It looks like key generation takes too long on the Zephyr side. Before it finishes, the connection gets terminated.
- The reason for the termination seems to be related to GATT and the LL buffers. Right before the connection is detected as disconnected, I see this check fail in conn.c. Increasing the number doesn't fix the issue, but results in more calls to bt_conn_tx_processor before the failure (which is where the disconnect is detected):

```
static bool should_stop_tx(struct bt_conn *conn)
{
	...
	if (atomic_get(&conn->in_ll) < 3) {
		...
		return false;
	}
	return true;
}
```

- My guess is that bt_conn_tx_processor is triggered by receiving the GATT requests.
- successful pairing against Android: http://ktweb.cz/upload/logs2/wireshark_android_pairing.pcapng
- There is a minor difference between the Linux and Android flows. I think it is not important, but it should be noted that we are trying to achieve security level 4:

Initiator Key Distribution: 0x0d, Link Key, Signature Key (CSRK), Encryption Key (LTK)
    0000 .... = Reserved: 0x0
    .... 1... = Link Key: True
    .... .1.. = Signature Key (CSRK): True
    .... ..0. = Id Key (IRK): False         // This is false from linux, but true from android
    .... ...1 = Encryption Key (LTK): True

- failed pairing with the number from the above excerpt increased from 3 to 12, along with the buffer count: http://ktweb.cz/upload/logs2/wireshark_linux_pairing3.pcapng (flood of GATT packets from central to peripheral)
- This makes me think that either Linux simply overwhelms Zephyr with GATT requests, or there is some problem with our GATT provider (which is a custom implementation from c2usb). I would still be grateful for any thoughts on the subject.
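
Regarding the security level 4 remark above, here is a minimal sketch that decodes the `auth_req` bytes from the log and checks the basic prerequisites for authenticated LE Secure Connections with a 128-bit key. The bit layout follows the Bluetooth Core Specification (Vol 3, Part H); the struct, macros and `l4_prerequisites_met` are hypothetical names for this example, and the check deliberately ignores the IO capability mapping that decides whether pairing ends up authenticated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AuthReq bit layout of the SMP Pairing Request/Response.
 * Local, illustrative names (not Zephyr's). */
#define AUTH_BONDING_MASK 0x03u
#define AUTH_BONDING      0x01u
#define AUTH_MITM         0x04u
#define AUTH_SC           0x08u
#define AUTH_KEYPRESS     0x10u
#define AUTH_CT2          0x20u

struct auth_req {
	bool bonding;
	bool mitm;
	bool sc;
	bool keypress;
	bool ct2;
};

static struct auth_req auth_req_decode(uint8_t v)
{
	struct auth_req a = {
		.bonding  = (v & AUTH_BONDING_MASK) == AUTH_BONDING,
		.mitm     = (v & AUTH_MITM) != 0,
		.sc       = (v & AUTH_SC) != 0,
		.keypress = (v & AUTH_KEYPRESS) != 0,
		.ct2      = (v & AUTH_CT2) != 0,
	};
	return a;
}

/* Simplified check: both sides support Secure Connections, at least one
 * side asks for MITM protection, and the negotiated key size is 16 bytes.
 * IO capabilities are NOT considered here. */
static bool l4_prerequisites_met(uint8_t req_auth, uint8_t rsp_auth,
				 uint8_t req_key_size, uint8_t rsp_key_size)
{
	bool both_sc = (req_auth & rsp_auth & AUTH_SC) != 0;
	bool any_mitm = ((req_auth | rsp_auth) & AUTH_MITM) != 0;
	uint8_t key_size = req_key_size < rsp_key_size ? req_key_size : rsp_key_size;

	return both_sc && any_mitm && key_size == 16;
}

/* auth_req and max_key_size values from the req/rsp lines of the log. */
static void example_from_log(void)
{
	struct auth_req req = auth_req_decode(0x2D);
	struct auth_req rsp = auth_req_decode(0x0D);

	assert(req.bonding && req.mitm && req.sc && !req.keypress && req.ct2);
	assert(rsp.bonding && rsp.mitm && rsp.sc && !rsp.keypress && !rsp.ct2);
	assert(l4_prerequisites_met(0x2D, 0x0D, 0x10, 0x10));
}
```

For the logged exchange both sides set bonding, MITM and Secure Connections and agree on a 16-byte key, so the request and response themselves look compatible with level 4. The IO capabilities in the log (0x04 KeyboardDisplay on the host, 0x02 KeyboardOnly on the device) correspond to passkey entry, which matches the passkey prompt printed by the firmware.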

  • As for more logs, I have the hci_core and smp logs that I posted in the pack (right.log).

    Should I try to gather more?

    (Generally, capturing logs at debug level is difficult, because there are too many logs and so the logging system keeps dropping messages.)

  • kat829 said:

    i don't think it's a hardware issue. if the connection between the halves is stable, it should also be stable between the right half and any other peer. we're using a proven 3rd party wireless module, and the module have been integrated into the uhk 80 according to the datasheet

    I would take it with a grain of salt though. 

    Does that mean that you tested the same connection interval on the same device?

    Just in case the issue is also present on your HFXO, can you please try to use the internal RC Oscillator?

    Use this in prj.conf:

    CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y

    Also, you can experiment with all of the sources, and with minimizing the accuracy of the LFCLK by setting:

    CONFIG_CLOCK_CONTROL_NRF_K32SRC_500PPM=y

    And see if that changes anything.

    Best regards,

    Edvin
