Beware that this post is related to an SDK in maintenance mode
More Info: Consider nRF Connect SDK for new designs

Beacon output suddenly stops

Hello.

In the BLE software being developed, the device uses nRF52840, s140 v7.2.0 or v7.3.0, and I am building a system that periodically sends a beacon (ADV_NONCONN_IND) in the peripheral role at the same time as scanning in the central role.

In the software currently under development, after setting a beacon (ADV_NONCONN_IND) and executing ble_advertising_start, the advertisements at the set interval suddenly stop after a while.

The time it takes for the problem to occur varies, ranging from a few hours to several tens of hours, and other functions appear to be working fine even after the problem occurs.

The operation of this device is to continuously perform a sequence of transmitting beacons, connecting in a central role based on advertisements from external devices that capture the beacons, and then disconnecting after communication.
For this reason, the central role frequently starts and stops scanning and connects and disconnects, but the application never stops the peripheral role's beacon output once advertising has started.

I believe that the output of ADV_NONCONN_IND started with ble_advertising_start will not stop unless sd_ble_gap_adv_stop is executed, but is it possible that external factors such as SCAN or connection of the central role can stop advertising of peripheral roles?

Also, as a failsafe, we are considering implementing code that monitors the advertisement output and restarts it when the advertisement stops, but is there a way to detect that the ADV_NONCONN_IND has been output to RADIO?
I looked into Radio Notification, but since the event callback can only receive the radio_active state, I determined that it is not possible to capture only the advertisement output in a multi-role operating environment.

Please give me some advice.

Best regards.

  • Hi

    I'm sorry you're being sent in a loop here, but now Hieu is out of office. I haven't had time to get into this case, but I have pinged the devs looking at this internally, and will let you know if they have any updates.

    Einar who initially had this ticket will return on Monday July 29th and likely take back this case then. So sorry about all the delays.

    Thus far I'm afraid there are no further updates from our side. You say that the full log can't be uploaded. Would it be possible to upload it with Google Drive or similar so we can download it on our end from there? Alternatively you can send it by mail to me, but for the mail address I'll reach out to you in direct messages if that's what you prefer.

    And just a shot in the dark from me: how many devices has the beacon connected to when this issue occurs? Could it be that the maximum number of connections is reached, and that's why the advertiser won't start again, since it seems to happen after a connection event, if I'm not mistaken?

    One option you could try is to add a watchdog timer that will reset the device if no BLE events are triggered within X amount of time. Use the WDT example for reference here.

    Best regards,

    Simon

  • Hi,

    I have inherited this case back from Simon. For now I do not have anything to add to what he wrote, other than that if the problem with uploading logs is the file type, then I suggest putting the log in a zip file and uploading that. You can upload large files here on DevZone as long as the file type is allowed (and .zip is).

    Br,

    Einar

  • Hello Einar,

    > This was fixed in SoftDevice version v7.3.0 though

    Currently the SoftDevice version that is having the issue is v7.3.0.

    > It is regardless important to process SoftDevice events in a timely manner.

    The most time-consuming BLE event is the process between receiving NRF_BLE_SCAN_EVT_NOT_FOUND(SCAN_RSP) and calling sd_ble_gap_connect (a maximum of about 100 ms).
    Specifically, what should be the upper limit for this event processing time?

    > Do you know more about when this issue occurs?

    This issue always occurred right after sd_ble_gap_connect in the NRF_BLE_SCAN_EVT_NOT_FOUND event.

    > Could it be that the application have not pulled events in a long time due to other activity?

    The logs at the time the problem occurred show no evidence of any events other than BLE being active.

    In fact, no BLE events (<debug> nrf_sdh_ble: BLE event: xx) have been issued from the SoftDevice (SD) since sd_ble_gap_connect at the time of the problem.

    > And if you use the SoftDevice handler with app scheduler,

    No, SoftDevice events are set to use interrupts.

    #define NRF_SDH_DISPATCH_MODEL 0
    // <i> NRF_SDH_DISPATCH_MODEL_INTERRUPT: SoftDevice events are passed to the application from the interrupt context.

    FYI, this issue seems to be the same.

    Best regards.

  • Hi,

    I see. I understand that this issue is difficult to reproduce, but are you able to share a project that can reproduce this on a DK (given enough time)?

    Regarding the other thread, that one is using the nRF Connect SDK, which uses a different Bluetooth stack (though the controller / link layer inherits components from the SoftDevice).

  • Hello Einar,

    I have something to confirm.

    When providing this software, the internal code contains confidential security information, so I would like to ask that you not use it for anything other than bug analysis. Is that okay?
    You can answer this ticket instead of signing an NDA.

    Also, to run it, you will need a PC with a UART connection and two or more Android smartphones.

    If there are no problems with the above, I will send you the currently working set of the software, so please send the recipient's email address to my email address.

    If there are any problems, we will need to create a subset with limited functionality in order to reproduce the phenomenon, so this will take some time, including confirming that the phenomenon can be reproduced with the subset.

    Best regards.

  • Hi Iwasaki,

    hiroiwas said:
    When providing this software, the internal code contains confidential security information, so I would like to ask that you not use it for anything other than bug analysis. Is that okay?
    You can answer this ticket instead of signing an NDA.

    We handle all private support cases confidentially (note that this is a public case at the moment).

    hiroiwas said:
    Also, to run it, you will need a PC with a UART connection and two or more Android smartphones.

    Is it possible to reproduce without Android devices? I should be able to obtain one or two for quick testing, though.

    hiroiwas said:
    If there are no problems with the above, I will send you the currently working set of the software, so please send the recipient's email address to my email address.

    Please make a private support ticket where you refer to this thread. There you can upload your code as a zip file.

    If this issue is difficult to reproduce consistently and fairly quickly, it would be preferable if it could be reproduced using a simpler setup. A simple setup is also beneficial for narrowing down the root cause of the issue. (We can of course also inspect the code and see if we spot something suspicious without having been able to reproduce it on our end.)

    Br,

    Einar

  • Hello Einar,

    A private ticket has been issued.
    Thank you for your continued support until the issue is resolved.

    Best regards.

  • Hi,

    That sounds good. Let us continue the discussion in the private ticket for now.

  • Hello,


    A tentative conclusion has been reached on the private ticket side, so I will post it here.

    The issue occurred in the following scenario:

    1. An error occurred in the initiator when connecting.
    ☞ Examples: the peripheral address changed after sd_ble_gap_connect(), the peripheral was shut down, radio conditions deteriorated, etc.

    2. In SoftDevice, the initiator where the error occurred will not be able to accept the next ADV_IND.

    3. Because the scan timeout is set to 0 (no timeout), the initiator error state continues.

    4. Being unable to escape from the initiator causes the scheduler (time slot) inside the SoftDevice to lock, causing the SD to become unresponsive from the app's perspective.
    ☞ The SD becomes unresponsive, but the app continues to run.

    5. If the time slot stops working, the next beacon cannot be sent.
    ☞ This was the first issue that was encountered.

    6. Since the RADIO time slot processing will no longer work, subsequent Radio notifications will also stop.
    ☞ As a fail-safe measure, the device can reset itself by monitoring how long no radio notification has been received.
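
The fail-safe in step 6 could be sketched roughly as follows. This is a minimal host-buildable sketch, not the code used in the private ticket: `radio_monitor_t`, the `radio_monitor_*` functions and `RADIO_SILENCE_LIMIT_MS` are hypothetical names, and the millisecond tick source is assumed to come from the application. Note that it only detects total radio silence (scan, connection and advertising all stopped), which matches the lock-up described above but cannot single out missing advertisements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: how long radio silence is tolerated before recovery. */
#define RADIO_SILENCE_LIMIT_MS 5000u

typedef struct {
    uint32_t last_activity_ms; /* time of the most recent radio notification */
} radio_monitor_t;

/* Call when monitoring starts, e.g. right after advertising has been started. */
static void radio_monitor_init(radio_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);
    m->last_activity_ms = now_ms;
}

/* Call from the radio notification callback (assumed to fire on any radio activity). */
static void radio_monitor_on_notification(radio_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);
    m->last_activity_ms = now_ms;
}

/* Call periodically; returns true when the radio has been silent for too long. */
static bool radio_monitor_is_stalled(const radio_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);
    /* Unsigned subtraction stays correct across a 32-bit tick wrap-around. */
    return (uint32_t)(now_ms - m->last_activity_ms) > RADIO_SILENCE_LIMIT_MS;
}
```

On target, the periodic check would typically run from an application timer, and a `true` result would trigger whatever recovery the product allows, such as a system reset. The limit has to be longer than the longest legitimate gap between radio events.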


    This issue occurs because the peripheral address update in 1 causes 3, and also because 2 causes 3.

    The solution is to set ble_gap_scan_params_t::timeout (passed to sd_ble_gap_connect) for the initiator, where the malfunction occurs, to a meaningful non-zero value, and to resume scanning when a BLE_GAP_EVT_TIMEOUT event with source BLE_GAP_TIMEOUT_SRC_CONN occurs.

    Regarding a SoftDevice fix for 2., we have been told that, given the state of the life cycle, it is unfortunately highly unlikely that this issue will be fixed (see the nRF Connect SDK and nRF5 SDK statement given in 2021).
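
The control flow of this workaround can be sketched as a small state machine. This is only a host-buildable illustration: the `demo_*` and `central_*` types, functions and constants are hypothetical stand-ins, not SoftDevice definitions. On target, the timeout would be written to ble_gap_scan_params_t::timeout before calling sd_ble_gap_connect, and the timeout branch would correspond to BLE_GAP_EVT_TIMEOUT with source BLE_GAP_TIMEOUT_SRC_CONN. The 10 ms unit is an assumption that should be checked against ble_gap.h for the SoftDevice version in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds on a host; the values are placeholders. */
typedef enum { DEMO_EVT_CONNECTED, DEMO_EVT_TIMEOUT, DEMO_EVT_OTHER } demo_evt_id_t;
typedef enum { DEMO_TIMEOUT_SRC_SCAN, DEMO_TIMEOUT_SRC_CONN } demo_timeout_src_t;

typedef struct {
    demo_evt_id_t      id;
    demo_timeout_src_t timeout_src; /* only meaningful for DEMO_EVT_TIMEOUT */
} demo_evt_t;

typedef enum { CENTRAL_SCANNING, CENTRAL_CONNECTING, CENTRAL_CONNECTED } central_state_t;

typedef struct {
    central_state_t state;
    uint16_t        connect_timeout_10ms; /* mirrors ble_gap_scan_params_t::timeout */
    uint32_t        scan_restarts;        /* how often scanning had to be resumed */
} central_t;

/* Hypothetical value: 3 s, assuming 10 ms units. The point is that it is not 0. */
#define DEMO_CONNECT_TIMEOUT_10MS 300u

/* Models the step where the application initiates a connection. */
static void central_begin_connect(central_t *c)
{
    assert(c != NULL);
    c->connect_timeout_10ms = DEMO_CONNECT_TIMEOUT_10MS; /* never "no timeout" */
    c->state = CENTRAL_CONNECTING;
}

/* Models the BLE event handler: a connect timeout sends the central back to scanning. */
static void central_on_event(central_t *c, const demo_evt_t *e)
{
    assert(c != NULL && e != NULL);
    switch (e->id) {
    case DEMO_EVT_CONNECTED:
        if (c->state == CENTRAL_CONNECTING) {
            c->state = CENTRAL_CONNECTED;
        }
        break;
    case DEMO_EVT_TIMEOUT:
        if (e->timeout_src == DEMO_TIMEOUT_SRC_CONN &&
            c->state == CENTRAL_CONNECTING) {
            c->state = CENTRAL_SCANNING; /* on target: restart the scan here */
            c->scan_restarts++;
        }
        break;
    default:
        break;
    }
}
```

With a non-zero timeout the initiator always ends in either a connection or a timeout event, so the central role cannot stay stuck in the connecting state indefinitely.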

    That's all.
