Custom board nrf5340 configured as central peripheral HR combo disconnects from HR devices after a 'successful' connection param update.

Hello,

We are developing a custom board based on the nRF5340, which will act as a central for HR devices and as a peripheral for data transfers. While developing the HR side we ran into an issue on our custom board that does not occur on the development kit (DK). Every time an HR device connects to our central and requests a connection parameter update, our central accepts the new parameters but does not use them, which causes the connection with the peripheral to time out.

This behaviour is visible in the image below:

The Wireshark sniffer caught the following traffic:

The central does not seem to actually use the new connection parameters: the gap between consecutive empty PDUs (discounting the retransmits) is still the original ~50 ms interval. Our peripheral does not accept this interval and quietly disconnects, going back into advertising mode (which is temporarily NONCONN).
The only visible hint as to what could be wrong here is in the control opcode message, which has a red error message in the protocol window, as seen in the image below:

At first we assumed this to be a software issue, where our device isn't correctly setting the new connection parameters, so we went back to the Nordic samples.

The above images were all taken from the Central And Peripheral HR sample from Nordic, with the authentication, security elements and DK associations completely stripped from the sample. The only addition is a le_param_updated callback, which prints out the new connection parameters. We created two build configurations, one for our DK and one for our own hardware (the dts of which is solely based on that of the DK with DK elements removed). Flashing this sample to our DK we observed the expected behaviour, as seen in the image below:
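
For reference, a minimal sketch of what such a le_param_updated callback could look like. The callback signature and BT_CONN_CB_DEFINE come from Zephyr's connection API; the helper name format_conn_params, the callback name on_le_param_updated and the log wording are illustrative assumptions, not the code from the sample. The Zephyr-specific part is guarded so that the formatting helper can also be compiled off-target.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: renders the raw values passed to le_param_updated()
 * in physical units. Interval is in 1.25 ms units, timeout in 10 ms units. */
static int format_conn_params(char *buf, size_t len, uint16_t interval,
                              uint16_t latency, uint16_t timeout)
{
    uint32_t interval_us = (uint32_t)interval * 1250U;

    return snprintf(buf, len, "interval %u.%02u ms, latency %u, timeout %u ms",
                    (unsigned)(interval_us / 1000U),
                    (unsigned)((interval_us % 1000U) / 10U),
                    (unsigned)latency, (unsigned)timeout * 10U);
}

#ifdef __ZEPHYR__
#include <zephyr/bluetooth/conn.h>
#include <zephyr/sys/printk.h>

/* Called by the stack once the new parameters are in effect. */
static void on_le_param_updated(struct bt_conn *conn, uint16_t interval,
                                uint16_t latency, uint16_t timeout)
{
    char line[64];

    (void)conn;
    format_conn_params(line, sizeof(line), interval, latency, timeout);
    printk("Conn params updated: %s\n", line);
}

BT_CONN_CB_DEFINE(param_log_callbacks) = {
    .le_param_updated = on_le_param_updated,
};
#endif
```

An interval of 264 units, for example, corresponds to 264 x 1.25 ms = 330 ms.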

Although it takes a few packets for the connection parameters to actually update, we can see that from packet 327 onward the connection interval has changed to ~330 ms and the peripheral keeps notifying the device of new heart rate intervals. Curiously, the above behaviour was captured after we flashed the DK with the build configuration that was actually meant for our custom hardware, yet it still works. As far as we are aware, this means the firmware on both devices is exactly the same, yet it fails on our custom hardware, which points to a hardware issue (although we find this a strange symptom for a hardware fault).

Our hardware configuration is based on configuration 3 of the datasheet (9.3.3 Circuit configuration no. 3 for QKAA aQFN94, engineering sample D (QKAAD0)), except that we have added an antenna connector (CONUFL001) so that we can attach an external antenna.

We have tried a lot of different troubleshooting options including but not limited to:
- Manually accepting the new parameters using the le_param_update_req callback
- Flashing the build configuration meant for the DK to our custom hardware
- Changing the antenna
- Disabling the peripheral side
- Using a different HR peripheral
- Using a different SN custom board
- Changing from SDK/Toolchain 2.9.0 to 3.0.0

What does work is using the le_param_update_req callback to simply decline the connection parameter update request, which keeps the communication alive as normal on the 50 ms interval. However, considering the low-power requirements of the peripherals we connect to, this isn't a viable long-term solution.
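
A minimal sketch of this workaround. In Zephyr the request callback is the le_param_req member of struct bt_conn_cb (assumed here to be what le_param_update_req refers to); returning false from it rejects the request. The threshold MAX_TRUSTED_INTERVAL and the function name on_le_param_req are hypothetical, and the stand-in types exist only so the policy can be compiled off-target.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __ZEPHYR__
#include <zephyr/bluetooth/bluetooth.h>
#include <zephyr/bluetooth/conn.h>
#else
/* Off-target stand-ins mirroring the Zephyr types. */
struct bt_conn;
struct bt_le_conn_param {
    uint16_t interval_min; /* 1.25 ms units */
    uint16_t interval_max; /* 1.25 ms units */
    uint16_t latency;      /* connection events */
    uint16_t timeout;      /* 10 ms units */
};
#endif

/* Hypothetical policy knob: longest interval currently known to work,
 * in 1.25 ms units (40 * 1.25 ms = 50 ms). */
#define MAX_TRUSTED_INTERVAL 40

/* Returning false rejects the peer's request, so the link stays on its
 * current parameters. */
static bool on_le_param_req(struct bt_conn *conn, struct bt_le_conn_param *param)
{
    (void)conn;
    return param->interval_max <= MAX_TRUSTED_INTERVAL;
}

#ifdef __ZEPHYR__
BT_CONN_CB_DEFINE(param_req_callbacks) = {
    .le_param_req = on_le_param_req,
};
#endif
```

With this policy a request for a 330 ms interval (264 units) is declined, while a request that stays at or below 50 ms is accepted.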

We are currently out of ideas to further troubleshoot this issue, so we require advice on how to continue. The Control Opcode error in the third image might be a hint as to what is going wrong.

  • I have been testing a bit more and found some strange results.

    For testing we are using a watch and a heart rate band; the watch has a preferred connection interval of 330 ms and the band 500 ms. Setting these intervals in the Zephyr example causes no issues. The connection interval is updated by our central and kept alive just fine. The Bluetooth sniffer, however, stops receiving after the event instant for the connection param update has passed (so the connection update specifies instant 211, and as soon as event 211 occurs, the sniffer stops receiving, while our RTT is still logging a valid connection).

    When we disable the zephyr example and instead connect the band or the watch, the problem returns. The problem is still not visible on our DK board, which can connect with both the band and the watch.

    When we set the connection interval back to 800 (1000 ms), the Zephyr example also times out after a connection parameter update request.
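
    The raw values used here map to time as follows: connection intervals are in units of 1.25 ms and supervision timeouts in units of 10 ms, and the Bluetooth Core Specification requires the timeout to be larger than (1 + latency) x interval x 2. A small sketch of that arithmetic follows; the helper names are hypothetical, and the timeout values in the note afterwards are only example numbers, not values from our setup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Connection interval: 1.25 ms per unit. */
static uint32_t conn_interval_us(uint16_t units)
{
    return (uint32_t)units * 1250U;
}

/* Supervision timeout: 10 ms per unit. */
static uint32_t supervision_timeout_us(uint16_t units)
{
    return (uint32_t)units * 10000U;
}

/* Checks the rule: timeout > (1 + latency) * interval * 2. */
static bool timeout_covers_interval(uint16_t interval, uint16_t latency,
                                    uint16_t timeout)
{
    uint64_t min_us = (uint64_t)(1U + latency) * conn_interval_us(interval) * 2U;

    return supervision_timeout_us(timeout) > min_us;
}
```

    An interval of 800 units is 800 x 1.25 ms = 1000 ms. With zero latency, an example timeout of 400 units (4 s) satisfies the rule, while 200 units (2 s) does not, because the timeout must be strictly larger than 2 s.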

    @Edvin do you have any further advice?

  • Relitech-Jeroen said:
    Our central meanwhile also disconnects because of a timeout, which is odd because what I can see from the wireshark it was our central who was responsible for keeping the connection alive.

    That is correct. It is always the central sending the first packet in a connection, to which the peripheral replies.

    Does this happen on one specific device, or multiple custom boards? Do you have multiple custom boards to test on? Just to see if this is an issue that follows one specific chip, or if it follows the HW+SW.

    It is strange. The fact that setting the initial connection parameters to match the peripheral's preferred parameters works just fine suggests that there is no timing issue. So the device is perfectly capable of maintaining a connection with these parameters. It is interesting that the sniffer falls out, though.

    Relitech-Jeroen said:
    RTT is still logging a valid connection

    Is that both on the central and the peripheral, or just the central?

    BR,

    Edvin

  • Hi Edvin,

    I'm replying on behalf of my colleague, as he's not in today.

    Yes, we've tried the same software with different custom boards, and these give the same behaviour.

    We also tested responding with different connection parameters, which sometimes makes the connection "more stable", in the sense that the timeout is deferred to a later moment. This would suggest that our board has problems keeping an accurate reference clock. However, to verify this, we logged information at regular intervals on our custom board, which did not reveal any significant drift or wander of the clock.

    As mentioned before, the strange thing is that the BLE sniffer does not report any traffic after the connection parameters are changed on our custom board, while the RTT logs on both the central and the HR peripheral still show ongoing communication, until a timeout event is triggered and the connection is dropped. This event is also logged by the BLE sniffer.

    It is really baffling what is happening here, and we're not sure what options we have to investigate this further.

    Regards,

      Jan Willem

  • Hi Edvin,

    Edvin said:
    That is correct. It is always the central sending the first packet in a connection, to which the peripheral replies.

    So, to summarize: the central is responsible for keeping the connection alive, yet it stops trying to do so and then receives a timeout. Why would it stop trying to reach the peripheral? Or is this a bug in the Wireshark sniffer?

    Edvin said:
    Does this happen on one specific device, or multiple custom boards? Do you have multiple custom boards to test on? Just to see if this is an issue that follows one specific chip, or if it follows the HW+SW.

    We have different custom boards of the same hardware design, but from a different batch and a different revision. They always fail at keeping the connection alive after updating the connection parameters past a certain threshold. When using a Nordic DK board as a peripheral, the connection interval becomes unstable at around 600ms. Meanwhile the 3rd party watch and 3rd party heart rate band both fail at their respective intervals of 332.5ms and 500ms, while the Nordic DK board as peripheral still works at these intervals.

    Any further advice?

    Regards,

    Jeroen

  • It baffles me as well. Particularly the part where you say that it is able to keep the connection, but the sniffer is not picking it up. 

    Relitech-Jeroen said:
    For testing we are using a watch and a heart rate band; the watch has a preferred connection interval of 330 ms and the band 500 ms. Setting these intervals in the Zephyr example causes no issues. The connection interval is updated by our central and kept alive just fine. The Bluetooth sniffer, however, stops receiving after the event instant for the connection param update has passed (so the connection update specifies instant 211, and as soon as event 211 occurs, the sniffer stops receiving, while our RTT is still logging a valid connection).

    What Nordic DK is the peripheral in this case?

    On your central device, what is the LFCLK accuracy?

    Can you try setting this in prj.conf:
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_500PPM=y

    If that doesn't make a difference, can you also try adding:

    CONFIG_CLOCK_CONTROL_NRF_K32SRC_SYNTH=y

    And let me know if any of those changes anything?
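
    To illustrate why the LFCLK accuracy could matter more after the parameter update: the receive window a peripheral opens around each expected anchor point grows with the combined sleep clock accuracy of both devices and with the time since the last anchor point. The sketch below evaluates that relation from the Link Layer specification, ignoring fixed jitter allowances. The 500 ppm figures are only assumed example values, and window_widening_us is a hypothetical helper name.

```c
#include <stdint.h>

/* Window widening in microseconds:
 * (centralSCA + peripheralSCA) / 1e6 * time since last anchor point.
 * With no missed events, the time since the last anchor point equals
 * the connection interval. */
static uint32_t window_widening_us(uint32_t central_ppm, uint32_t peripheral_ppm,
                                   uint32_t interval_us)
{
    uint64_t drift = (uint64_t)(central_ppm + peripheral_ppm) * interval_us;

    return (uint32_t)(drift / 1000000U);
}
```

    With 500 ppm assumed on both sides, this gives 50 µs at a 50 ms interval, 330 µs at 330 ms and 1000 µs at 1000 ms. Any mismatch between the accuracy the central declares and what its LFCLK actually delivers therefore grows in absolute terms at the longer intervals, which may be why the problem only shows up after the update.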

    BR,

    Edvin
