BLE Advertisement Issue in nRF Connect SDK v2.7.0 (DRGN-23231) – Any Cherry-Pick or Temporary Fix Available?

Hello Nordic team,

I'm currently developing on the nRF5340 and recently encountered an issue related to BLE advertisements. After thorough debugging, I was able to confirm that the root cause lies within the nRF Connect SDK, not within my application code.

While investigating further, I found that the issue is documented in the Known Issues section of the nRF Connect SDK v2.7.0 release notes under the ID DRGN-23231.
Reference: Known Issues - nRF Connect SDK

I also noticed that this issue appears to have been resolved in nRF Connect SDK v2.8.0.

Because our product is already in production, a full migration to v2.8.0 will require additional time for integration and validation. In the meantime, I’m looking for a temporary resolution.

Could you please let me know:

  • Is there a cherry-pickable commit or patch from v2.8.0 that can be applied to nRF Connect SDK v2.7.0 to temporarily resolve this issue?

  • Or is there any recommended workaround that can be used until we complete the migration?

Any support or guidance would be highly appreciated, as we are aiming to resolve this issue as soon as possible in our production environment.

Thanks in advance!

Regards,

Prashant Humbre

  • Hello Prashant,

    Unfortunately, the fix for this issue was implemented in the SoftDevice Controller library, so it can't be patched directly. Does the issue you are observing occur after the device has been in a connection, or had it only been advertising at the time?

    Thanks,

    Vidar

  • Hello Vidar,

    Thank you for your quick response and for clarifying that the fix is part of the SoftDevice Controller library.

    To answer your question, the issue I am observing occurs after a connection has been established.

    In my particular scenario, this mainly occurs when the mobile phone goes out of range from the device during the DFU process while data is being transferred. After this happens, the device fails to resume advertising properly, and the only recovery option we’ve found so far is to power cycle the device, which is not ideal for a product that is already in production.

    Additionally, after reviewing the Nordic debug logs, I observed that the Disconnect event is not getting triggered when the mobile phone goes out of range. This might be contributing to the issue, as the device seems to remain in an inconsistent state after the connection is lost abruptly.

    If there are any mitigation steps, configuration changes, or temporary workarounds that I can apply while still on nRF Connect SDK v2.7.0, it would be very helpful until we can complete a full migration to v2.8.0.

    Thanks again for your support!

    Best regards,
    Prashant

  • Hello Prashant,

    Thanks for the update. What you describe seems to be the same as the issue discussed in this thread: nRF5340 Missing Disconnect Event (Supervision Timeout). The workaround proposed in my last comment there was to ensure that the CONFIG_BT_BUF_CMD_TX_COUNT setting on the controller is at least one more than the corresponding RX and TX ACL buffer counts on the host side.

    For example if you want the host to have 10 TX/RX ACL buffers:

    sysbuild/ipc_radio.conf
    add:
    CONFIG_BT_BUF_CMD_TX_COUNT=11
    # CONFIG_BT_BUF_CMD_TX_COUNT >= CONFIG_BT_BUF_ACL_RX_COUNT + 1
    prj.conf
    change:
    CONFIG_BT_BUF_ACL_TX_COUNT=10
    CONFIG_BT_BUF_ACL_RX_COUNT=10
    # Mismatched buffer counts between the app core and the radio core don't make sense.

    You wrote: "In my particular scenario, this mainly occurs when the mobile phone goes out of range from the device during the DFU process while data is being transferred. After this happens, the device fails to resume advertising properly, and the only recovery option we’ve found so far is to power cycle the device, which is not ideal for a product that is already in production."

    You may also consider using a watchdog to recover from this (as a last resort).
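A minimal sketch of the watchdog idea, under stated assumptions: the application sets a hypothetical `transfer_active` flag while a DFU transfer is in progress and updates `last_rx_ms` whenever data arrives, and the watchdog is fed only while the transfer is not stalled. The struct, the function names, and both timeout values are illustrative and not taken from the SDK. The Zephyr part uses the standard watchdog driver calls (`wdt_install_timeout`, `wdt_setup`, `wdt_feed`) and assumes the board defines a `watchdog0` devicetree alias.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: how long a transfer may stay silent before we give up. */
#define LINK_STALL_TIMEOUT_MS 10000U

/* Hypothetical state maintained by the application. */
struct link_health {
    bool transfer_active; /* true while a DFU transfer is in progress */
    uint32_t last_rx_ms;  /* uptime (ms) when data was last received */
};

/* Returns false only when an active transfer has been silent for too long.
 * Unsigned subtraction keeps the comparison valid across uptime wraparound. */
bool link_is_healthy(const struct link_health *h, uint32_t now_ms)
{
    if (!h->transfer_active) {
        return true; /* nothing to supervise */
    }
    return (uint32_t)(now_ms - h->last_rx_ms) < LINK_STALL_TIMEOUT_MS;
}

#ifdef __ZEPHYR__
#include <zephyr/kernel.h>
#include <zephyr/device.h>
#include <zephyr/drivers/watchdog.h>

/* Runs forever in its own thread; the SoC resets if feeding stops for 5 s.
 * Note: *h is written by other threads, so real code needs suitable
 * synchronization (for example atomic variables). */
void link_watchdog_run(const struct link_health *h)
{
    const struct device *wdt = DEVICE_DT_GET(DT_ALIAS(watchdog0));
    struct wdt_timeout_cfg cfg = {
        .window = { .min = 0U, .max = 5000U },
        .callback = NULL,
        .flags = WDT_FLAG_RESET_SOC,
    };
    int channel;

    if (!device_is_ready(wdt)) {
        return;
    }
    channel = wdt_install_timeout(wdt, &cfg);
    if (channel < 0 || wdt_setup(wdt, WDT_OPT_PAUSE_HALTED_BY_DBG) < 0) {
        return;
    }
    for (;;) {
        if (link_is_healthy(h, k_uptime_get_32())) {
            wdt_feed(wdt, channel);
        }
        k_sleep(K_MSEC(1000));
    }
}
#endif /* __ZEPHYR__ */
```

The health check is kept separate from the driver calls so the policy can be tested on a host machine. The approach only helps if the application can reliably tell when a transfer is expected to be active.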

    Best regards,

    Vidar

  • Hello Vidar,

    Thanks again for the suggestion and detailed explanation regarding the buffer configuration.

    I tried updating the configurations as per your recommendation.

    Unfortunately, the issue still persists. The device does not resume advertising or trigger the Disconnect event when the mobile phone goes out of range during the DFU process — and recovery is still only possible via a power cycle.

    However, I did make an important observation during testing:

    • On the first attempt, when the phone goes out of range, the Disconnect event does get triggered as expected.

    • But on subsequent attempts, the Disconnect event is not triggered at all, and the device seems to stay in a non-recoverable state until rebooted.

    This behavior is observed even without the buffer configuration changes — so the workaround unfortunately doesn’t seem to help in this case.

    Could this indicate an issue with the controller not properly resetting its internal state after the first disconnect?
    Is there any way to manually force a cleanup or timeout handling from the application or controller side?

    Any further insights or suggestions would be greatly appreciated.

    Best regards,
    Prashant

  • Hello Prashant,

    Sorry it didn’t work. Did you make sure to set CONFIG_BT_BUF_CMD_TX_COUNT to a larger value than CONFIG_BT_BUF_ACL_RX_COUNT in the ipc_radio project? I realize that this requirement (i.e., CONFIG_BT_BUF_CMD_TX_COUNT >= CONFIG_BT_BUF_ACL_RX_COUNT + 1) was not made very clear in my previous response. Here is the text from the known issue description:

    (commit: https://github.com/nrfconnect/sdk-nrf/pull/19869/commits/6c02a2b307297638bb92c18ccea514ea4326e71b)

    Best regards,

    Vidar

  • Update: in addition to what I suggested above, it may be necessary to increase the CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT setting in the IPC radio firmware to match this formula:

    (max_conn - 1) * MaxAttPacket + 1

    where max_conn is the number of connections to support and MaxAttPacket is the maximum number of ATT packets your application will queue per connection.

    The application must also ensure it does not queue more than "MaxAttPacket" packets per connection, especially when link quality drops, as in cases where a device is moving out of range. If the application is using GATT APIs with callbacks, such as bt_gatt_notify_cb, it can use the callbacks to monitor how many ATT packets are queued for each link.
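One way to do this bookkeeping is a small per-link counter, sketched below under assumptions. The names `att_slot_acquire`, `att_slot_release`, and `att_slot_reset` are hypothetical, and `MAX_CONN` and `MAX_ATT_PACKETS` are illustrative values. In a Zephyr application the link index could come from `bt_conn_index()`, and the release call would go in the completion callback (the `func` member of `struct bt_gatt_notify_params`).

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONN        3 /* assumed number of simultaneous links */
#define MAX_ATT_PACKETS 3 /* assumed per-link limit ("MaxAttPacket") */

/* Packets queued but not yet completed, one counter per link.
 * Not thread-safe as written; on Zephyr, atomic_t with atomic_inc() and
 * atomic_dec() would be a safer choice because callbacks run in other threads. */
unsigned int in_flight[MAX_CONN];

/* Call before queuing a notification or write. Returns false when the link
 * already has MAX_ATT_PACKETS outstanding; the caller should then skip the
 * packet or retry later instead of queuing it. */
bool att_slot_acquire(size_t link)
{
    if (link >= MAX_CONN || in_flight[link] >= MAX_ATT_PACKETS) {
        return false;
    }
    in_flight[link]++;
    return true;
}

/* Call from the completion callback once the packet has left the queue. */
void att_slot_release(size_t link)
{
    if (link < MAX_CONN && in_flight[link] > 0) {
        in_flight[link]--;
    }
}

/* Call when the link goes away so stale counts do not block a new connection. */
void att_slot_reset(size_t link)
{
    if (link < MAX_CONN) {
        in_flight[link] = 0;
    }
}
```

If `att_slot_acquire()` returns true but the subsequent GATT call fails, the caller should call `att_slot_release()` itself, since no completion callback will arrive for that packet.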

    For example, let's say your application needs to support 3 peripheral or central role connections, and each link should be allowed to queue up to 3 ATT packets:

    (3 - 1) * 3 + 1 = 7

    So in this case, CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT should be set to 7. In the application, you must also ensure that no more than 3 write or notify packets are queued for a given connection at any time.
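The arithmetic above can also be expressed as a compile-time check, sketched here under assumptions. The macro name `SDC_TX_PACKET_COUNT_MIN` and the constant `APP_MAX_ATT_PACKETS` are hypothetical. The guarded assertion only takes effect when the file is built as part of the IPC radio image, where `CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT` and `CONFIG_BT_MAX_CONN` are defined.

```c
/* Minimum TX packet count from the formula in this post:
 * (max_conn - 1) * MaxAttPacket + 1 */
#define SDC_TX_PACKET_COUNT_MIN(max_conn, max_att_packets) \
    (((max_conn) - 1) * (max_att_packets) + 1)

/* Assumed per-link queue limit enforced by the application. */
#define APP_MAX_ATT_PACKETS 3

/* The worked example: 3 connections, 3 packets per link. */
_Static_assert(SDC_TX_PACKET_COUNT_MIN(3, 3) == 7,
               "worked example should give 7");

#ifdef CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT
/* Fails the build if the controller setting is smaller than the formula. */
_Static_assert(CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT >=
                   SDC_TX_PACKET_COUNT_MIN(CONFIG_BT_MAX_CONN,
                                           APP_MAX_ATT_PACKETS),
               "CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT is too small");
#endif
```

Because the per-link limit lives in the application while the setting lives in the radio image, the two values still have to be kept in sync by hand.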
