[Thingy91] BLE central using HCI crashing nRF52

Hi,

We are currently developing a solution where a Thingy91 acts as a BLE central. We have had a lot of issues with BLE when using the nRF52 as an HCI device controlled by the nRF91. We are using NCS v2.5.1.

In the beginning there were issues with high data rates causing crashes, but that was remedied by switching from lpuart to regular UART with flow control. We also switched to the LL_SW_SPLIT configuration option.

Now we have an issue with our application crashing after being idle for too long. The nRF91 side keeps working, but the nRF52 halts, which breaks the BLE functionality. This is observed by just leaving the device scanning for Bluetooth devices: after about 7 minutes it becomes unresponsive. We believe it works better when using LL_SOFTDEVICE, but with that config option the BLE stack may end up unresponsive when receiving a lot of data, without any way for the application to know.

So the question:

Is there any way to fix this? Are there any steps we can take towards increasing the BLE stack stability? This is critical to our application as the device needs to function without manual intervention.

Best regards,
Emil

  • Hello,

    The things I would have looked into:

    - The most common problem with UART is a mismatch in baud rate. To ensure a stable baud rate you need to explicitly start the high-frequency clock (HFXO) before sending and receiving UART data (this must be done on both sides).

    - Another problem with UART is that errors will sometimes occur (for instance a framing error). In that case the UART will typically stop receiving data, so the application needs to handle the error by starting UART reception again. If you fix the baud rate (as suggested above) I expect the number of UART errors to drop significantly, but it's still a good idea to restart UART RX on errors.

    - Another idea to consider is to start scanning or connecting with a timeout. For example, if you start scanning with a 60-second timeout, you will get a timeout event after 60 seconds (which indirectly shows that the device is still working), and you can then restart scanning if you want to.
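
    The timeout idea in the last point can be sketched as a small supervisor that remembers when the current scan window started and flags the controller as stuck if the expected timeout event never shows up. This is only an illustration in plain C: `scan_supervisor`, `scan_action` and the three functions are hypothetical names, not part of NCS or Zephyr, and the sketch assumes your application has a millisecond uptime counter and gets some kind of callback when a scan window times out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not an NCS or Zephyr API. */

enum scan_action {
    SCAN_WAIT,    /* current scan window is still within its deadline */
    SCAN_RECOVER, /* the expected timeout event never arrived */
};

struct scan_supervisor {
    uint32_t window_ms;     /* scan timeout given to the stack, e.g. 60000 */
    uint32_t grace_ms;      /* slack allowed on top of the window */
    uint32_t started_at_ms; /* uptime when the current window began */
};

/* Call right after scanning has been started successfully. */
void scan_supervisor_start(struct scan_supervisor *s, uint32_t now_ms)
{
    assert(s != NULL);
    s->started_at_ms = now_ms;
}

/* Call from the scan-timeout handler, after scanning has been restarted.
 * Receiving the event shows the controller is alive, so a new window begins.
 */
void scan_supervisor_on_timeout(struct scan_supervisor *s, uint32_t now_ms)
{
    assert(s != NULL);
    s->started_at_ms = now_ms;
}

/* Call periodically from a timer or work item on the host (nRF91) side. */
enum scan_action scan_supervisor_poll(const struct scan_supervisor *s,
                                      uint32_t now_ms)
{
    assert(s != NULL);

    /* Unsigned subtraction stays correct when the uptime counter wraps. */
    uint32_t elapsed_ms = now_ms - s->started_at_ms;

    if (elapsed_ms > s->window_ms + s->grace_ms) {
        return SCAN_RECOVER;
    }
    return SCAN_WAIT;
}
```

    With a 60-second window and a few seconds of grace, `scan_supervisor_poll()` returning `SCAN_RECOVER` means no timeout event has arrived for longer than it should have taken. What to do then is up to the application, for example resetting the nRF52 and re-initialising HCI if the hardware allows it.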

    Hope that helps,
    Kenneth
