Reasons not to enable CONFIG_BT_HCI_ACL_FLOW_CONTROL

For a very small percentage of our devices, we sometimes see watchdog reset events that we cannot explain. The only pattern we have seen so far is that the watchdog reset happens a few minutes after BLE activity with the device (nRF52840 in the peripheral role, using the BT NUS service).

We are using NCS 3.1.0 now, but we also observed the resets on NCS 3.0.x and 2.9.x. We now suspect that they may be caused by one of the known BLE issues below:

  • NCSDK-31528: Deadlock on system workqueue with tx_notify in host
  • NCSDK-30959: The Bluetooth subsystem might deadlock when CONFIG_BT_HCI_ACL_FLOW_CONTROL is disabled

More info here: docs.nordicsemi.com/.../known_issues.html

A workaround for both issues seems to be to enable CONFIG_BT_HCI_ACL_FLOW_CONTROL. Currently this is indeed disabled in our firmware, so we will test if the resets are gone once we enable it.

We have 3 questions though:

  • Are those known issues relevant for the nRF52840, or can these deadlocks only occur on multicore SoCs?
  • Why is CONFIG_BT_HCI_ACL_FLOW_CONTROL not enabled by default if it fixes the known issues?
  • Are there any downsides of enabling it?

Thanks in advance!

  • Hello,

    Enabling CONFIG_BT_HCI_ACL_FLOW_CONTROL may increase RAM usage, because the buffer requirements (CMD, ACL, and EVT buffers) are calculated differently when this option is enabled. You should get a build assert if your buffers need to be adjusted.

    Suggestions:

    - Enable CONFIG_BT_HCI_ACL_FLOW_CONTROL and adjust buffer settings according to the build error if an assert is raised.

    - Increase CONFIG_BT_BUF_EVT_RX_COUNT and ensure it is set to a value at least '1' more than the TX buffer count (CONFIG_BT_BUF_EVT_RX_COUNT > CONFIG_BT_BUF_ACL_TX_COUNT + 1).
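
    As a rough sketch of how the two suggestions might look in a prj.conf, the fragment below uses illustrative values only. The numbers 3 and 5 are assumptions chosen to satisfy the inequality above; they are not recommended settings, and the build assert remains the authority on what your configuration needs.

    ```
    # Sketch only: the numeric values are illustrative assumptions.
    CONFIG_BT_HCI_ACL_FLOW_CONTROL=y

    # Hypothetical ACL TX buffer count; keep whatever your application needs.
    CONFIG_BT_BUF_ACL_TX_COUNT=3

    # Chosen so that CONFIG_BT_BUF_EVT_RX_COUNT > CONFIG_BT_BUF_ACL_TX_COUNT + 1
    # holds (5 > 3 + 1).
    CONFIG_BT_BUF_EVT_RX_COUNT=5
    ```

    If the build still raises an assert after a change like this, adjust the buffer counts as the assert message indicates.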

    This will hopefully resolve the problem you are experiencing. We are actively working on mitigating the risk of deadlocks in the stack.

    Best regards,

    Vidar

  • Hi Vidar, thanks for your useful response!

    So it is indeed possible for this deadlock to occur on a single-core nRF52840?

    Either way, we will implement your suggestions and do some testing.

  • "So it is indeed possible that this deadlock occurs on a single core nrf52840?"

    The developers confirmed that it can occur in certain scenarios, but I have not experienced it myself, so I would consider it rare. Whether it can happen depends on the application and the configuration.

    Please let me know how the testing goes.

    Best regards,

    Vidar

  • Okay clear! We have thousands of devices and only see the reset for a couple of them (and only sometimes), so it is really rare in our situation as well. Let's hope it is indeed this deadlock and fixed by the workaround. It is always nice if there are no unexpected behaviors anymore. ;)

    CONFIG_BT_BUF_EVT_TX_COUNT does not exist by the way. Do you mean CONFIG_BT_BUF_CMD_TX_COUNT?

    We now use this config:

    CONFIG_BT_HCI_ACL_FLOW_CONTROL=y
    CONFIG_BT_BUF_CMD_TX_COUNT=10
    CONFIG_BT_BUF_EVT_RX_COUNT=11

    Let us know if this is indeed correct.

  • Sorry, it was supposed to be CONFIG_BT_BUF_ACL_TX_COUNT, not CONFIG_BT_BUF_EVT_TX_COUNT. I've edited my first post.
