High Failure Rate in START ENCRYPTION Sequence: 2 Failures per 10 Trials (Baseband or Link Layer Issue)

Hello Nordic Semiconductor tech support team,

HW : nRF52840 ( Build code : nRF52840-CKAA-F )

SW : ncs 2.6.0

Sample Application :  zephyr/samples/bluetooth/hci_spi sample with SoftDevice Controller


While verifying our software design, we frequently encounter connection errors. Could you help us identify the possible cause of this issue and suggest a resolution? Right after the enhanced connection completes, we expect an Encryption Change Event from the SoftDevice Controller. Instead, we receive a Disconnect Complete (0x05) event with the reason ‘Connection Terminated due to MIC Failure (0x3d)’.

Device A : Android Mobile Phone (central), (iOS case is also reported)

Device B: nRF52840 (Peripheral) 

Frame 17605: 7 bytes on wire (56 bits), 7 bytes captured (56 bits) on interface Fake IF, Import from Hex Dump, id 0 (inbound)
Bluetooth
Bluetooth HCI H4
Bluetooth HCI Event - Disconnect Complete
Event Code: Disconnect Complete (0x05)
Parameter Total Length: 4
Status: Success (0x00)
Connection Handle: 0x00cd
Reason: Connection Terminated due to MIC Failure (0x3d)

(highlighted marked HCI events)
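
For anyone scanning a raw HCI dump for this failure, here is a minimal sketch of how the 7-byte event above breaks down. The byte string in the example is reconstructed from the decoded fields in the frame (H4 type, event code, length, status, handle, reason), not copied from the original capture, and the function and constant names are illustrative only.

```python
import struct

H4_EVENT = 0x04                     # H4 packet indicator for HCI events
EVT_DISCONNECTION_COMPLETE = 0x05   # HCI Disconnection Complete event code
REASON_MIC_FAILURE = 0x3D           # Connection Terminated due to MIC Failure


def parse_disconnection_complete(packet: bytes) -> dict:
    """Decode an H4-framed HCI Disconnection Complete event (7 bytes)."""
    if len(packet) != 7:
        raise ValueError("expected 7 bytes (H4 type + 2-byte header + 4 params)")
    # Little-endian: type, event code, parameter length, status, handle, reason
    h4_type, code, plen, status, handle, reason = struct.unpack("<BBBBHB", packet)
    if h4_type != H4_EVENT or code != EVT_DISCONNECTION_COMPLETE or plen != 4:
        raise ValueError("not a Disconnection Complete event")
    return {
        "status": status,
        "handle": handle & 0x0FFF,  # connection handle is a 12-bit value
        "reason": reason,
    }


# Bytes reconstructed from the decoded frame above (assumption, see lead-in).
event = parse_disconnection_complete(bytes.fromhex("04050400cd003d"))
print(hex(event["handle"]), event["reason"] == REASON_MIC_FAILURE)
```

Filtering a log on `reason == 0x3D` is a quick way to count how many reconnection attempts ended with this specific failure.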

  • Hello,

    I get the slight impression that the timing is marginal here, or that the wrong keys are provided.

    Do you see the same behavior if you relax the SPI clock speed, and specifically if you relax the timing between slave select and the first clock pulse (e.g. >10 us for the test)?

    Kenneth

  • Hello Kenneth, 

    Thank you for your prompt response and update. Based on the HCI log from device B (nRF52840 SoftDevice controller), I have ruled out one possible cause: the wrong LTK case. During this test, we consistently used the same LTK.

    Could you please investigate this case from another perspective? Specifically, could you identify the scenario in which the SoftDevice Controller sends out the (0x3d) reason during a Disconnection Complete event, in collaboration with your BLE Core development team?

    According to our logs, the error occurred when the connection began encrypting immediately after it was created, rather than during heavy traffic to/from the SPI bus.

    Thanks,

    Charles

  • Hello again,

    1. Your understanding is correct: CONFIG_BT_LOG_SNIFFER_INFO applies only to the Zephyr host, so you will need to share the LTK some other way.

    2. We didn't find any similar issues mentioned before.

    Please share a sniffer log when you are able to replicate the issue.

    Kenneth

  • Hello Kenneth,

    Thank you for your patience. I received the sniffer log from the test team. This test case involves around 50 reconnection tests. I observed that 13 of them failed with the same pattern.

    	15,645		2	0xbc85f063	0x000a	LL_START_ENC_REQ		43	 00:00:00.097501125	8/16/2024 8:25:22.270591865 AM	
    	18,487		2	0xda25e476	0x000c	LL_START_ENC_REQ		43	 00:00:00.195464125	8/16/2024 8:26:24.134999740 AM	
    	25,526		2	0x30b62793	0x0010	LL_START_ENC_REQ		43	 00:00:00.097501000	8/16/2024 8:29:22.513598365 AM	
    	35,393		2	0x4b0f9672	0x0010	LL_START_ENC_REQ		43	 00:00:00.097500250	8/16/2024 8:33:28.167792615 AM	
    	38,324		2	0x545d9ae1	0x0012	LL_START_ENC_REQ		43	 00:00:00.097503000	8/16/2024 8:34:39.343057990 AM	
    	44,843		2	0x11572d49	0x0014	LL_START_ENC_REQ		43	 00:00:00.146252500	8/16/2024 8:37:23.194317365 AM	
    	47,776		2	0x5f123729	0x0010	LL_START_ENC_REQ		43	 00:00:00.146251625	8/16/2024 8:38:44.314594615 AM	
    	54,035		2	0x7b38cd16	0x000f	LL_START_ENC_REQ		43	 00:00:00.195000750	8/16/2024 8:41:23.046934865 AM	
    	67,019		2	0x4a9b76df	0x000f	LL_START_ENC_REQ		43	 00:00:00.097502000	8/16/2024 8:46:33.977749740 AM	
    	151,397		2	0x69835adb	0x0010	LL_START_ENC_REQ		43	 00:00:00.195001625	8/16/2024 9:22:41.194609865 AM	
    	154,291		2	0xcfa5441f	0x000e	LL_START_ENC_REQ		43	 00:00:00.146251375	8/16/2024 9:23:47.153576490 AM	
    	183,910		2	0x9825d1c4	0x000f	LL_START_ENC_REQ		43	 00:00:00.097501125	8/16/2024 9:35:43.885001240 AM	
    	190,851		2	0x987d7cd5	0x0009	LL_START_ENC_REQ		43	 00:00:00.097501250	8/16/2024 9:38:40.265265615 AM	

    According to the sniffer logs, the nRF52840 somehow does not accept the LL_START_ENC_RSP. For example, in the first failure, the LL of Device A (Mediatek, Inc., subversion: 0x0000) keeps sending LL_START_ENC_RSP, but the nRF52840 never sends an LL_START_ENC_RSP back. Most likely, the nRF52840 determined that the Message Integrity Check (MIC) failed on a received LL_START_ENC_RSP packet.

    Could you verify whether the encrypted data is valid, and share how we can check whether the encrypted MIC 0x04ec92c6 is correct?

    LTK  : A1 B8 AB 72 D9 4E D7 AD 32 39 23 CF 68 B9 B5 03 (0x03B5B968CF233932ADD74ED972ABB8A1)

    To read the attached sniffer logs, I had to install the wps4.00_24.6.34658.34822.exe version. Please find the Pump Reconnection-withLTK.zip file. The unzip password is: devzone114123.

    Pump Reconnection-withLTK.zip

    Thanks,

    Charles
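
On the question of checking the MIC by hand: in LE link-layer encryption the session key is derived as AES-128(LTK, SKD), and each PDU is protected with AES-CCM using a 4-byte MIC, a 13-byte nonce, and one byte of additional authenticated data (the first header octet with NESN, SN and MD masked out). The sketch below only assembles the nonce and AAD, which is where byte-order mistakes tend to happen; the AES-CCM step itself would need a crypto library and is not shown. The IV used in the example is a placeholder, not a value from this capture: the real IV has to be taken from the LL_ENC_REQ and LL_ENC_RSP PDUs (IVm followed by IVs). Function names are illustrative.

```python
DIRECTION_CENTRAL_TO_PERIPHERAL = 1
DIRECTION_PERIPHERAL_TO_CENTRAL = 0
AAD_HEADER_MASK = 0xE3  # keeps LLID and upper bits, clears NESN, SN, MD


def build_ccm_nonce(packet_counter: int, direction: int, iv: bytes) -> bytes:
    """Build the 13-byte CCM nonce: 39-bit counter + direction bit + 8-byte IV."""
    if not 0 <= packet_counter < (1 << 39):
        raise ValueError("packet counter must fit in 39 bits")
    if direction not in (0, 1):
        raise ValueError("direction must be 0 or 1")
    if len(iv) != 8:
        raise ValueError("IV must be 8 bytes (IVm followed by IVs)")
    counter = bytearray(packet_counter.to_bytes(5, "little"))
    counter[4] |= direction << 7  # direction bit is the MSB of nonce octet 4
    return bytes(counter) + iv


def build_aad(first_header_octet: int) -> bytes:
    """Additional authenticated data: header octet with NESN/SN/MD cleared."""
    return bytes([first_header_octet & AAD_HEADER_MASK])


# Placeholder IV for illustration only; read the real one from the capture.
iv = bytes(range(8))
# LL_START_ENC_RSP is the first encrypted PDU from the central: counter 0.
nonce = build_ccm_nonce(0, DIRECTION_CENTRAL_TO_PERIPHERAL, iv)
print(nonce.hex())             # 00000000800001020304050607
print(build_aad(0x0F).hex())   # 03
```

With the session key, this nonce and this AAD, running AES-CCM over the encrypted payload should reproduce the MIC seen on air if the peer encrypted the packet correctly. A mismatch points at the sender; a match points at the receiver's check.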

  • Hi Charles,

    I have forwarded the details internally. Will let you know when I learn more or they need further details.

    Kenneth

  • Hello again,

    The team has taken a look, and this does indeed look like an issue in the SoftDevice Controller. The team will start working on a fix and is targeting v2.8.0, which is scheduled for the end of next month.

    Sorry for this issue.

    Kenneth

  • Hello,

    To check if the issue is fixed, look for DRGN-23204 in the CHANGELOG for the v2.8.0 release (when it's released that is):
    https://github.com/nrfconnect/sdk-nrfxlib/blob/main/softdevice_controller/CHANGELOG.rst

    Kenneth

  • Hello Kenneth,

    If the DRGN-23204 information is published, please update this case. Since we are in the verification stage, may I apply only the softdevice_controller change (for example softdevice_controller/lib/cortex-m4/soft-float/libsoftdevice_controller_multirole.a) on top of version 2.6.0 instead of moving to the full NCS v2.8.0? If so, how can I apply only that change?

    Thanks, Charles

  • Subject: Request for Detailed Information on DRGN-23204

    Hi Kenneth,

    According to our local AE, NCS v2.8 was frozen on October 3rd. However, I am still unable to find DRGN-23204 in the changelog of the main branch. When the information for DRGN-23204 is published, could we receive more detailed information?

    We are observing this issue with iOS 14 as well. If the SoftDevice controller includes a fix, I would like to have more details to close our open quality tickets.

    Thank you for your assistance.

    Best regards,

    Charles
