Beware that this post is related to an SDK in maintenance mode
More Info: Consider nRF Connect SDK for new designs

Beacon output suddenly stops

Hello.

In the BLE software being developed, the device uses nRF52840, s140 v7.2.0 or v7.3.0, and I am building a system that periodically sends a beacon (ADV_NONCONN_IND) in the peripheral role at the same time as scanning in the central role.

In the software currently under development, after setting a beacon (ADV_NONCONN_IND) and executing ble_advertising_start, the advertisements at the set interval suddenly stops after a while.

The time it takes for the problem to occur varies, ranging from a few hours to several tens of hours, and other functions appear to be working fine even after the problem occurs.

The operation of this device is to continuously perform a sequence of transmitting beacons, connecting in a central role based on advertisements from external devices that capture the beacons, and then disconnecting after communication.
For this reason, the central role frequently start and stop scanning, connects and disconnects, but the peripheral role's beacon output has never stopped since the start advertising.

I believe that the output of ADV_NONCONN_IND started with ble_advertising_start will not stop unless sd_ble_gap_adv_stop is executed, but is it possible that external factors such as SCAN or connection of the central role can stop advertising of peripheral roles?

Also, as a failsafe, we are considering implementing code that monitors the advertisement output and restarts it when the advertisement stops, but is there a way to detect that the ADV_NONCONN_IND has been output to RADIO?
I looked into Radio Notification, but since the event callback can only receive the radio_active state, I determined that it is not possible to capture only the advertisement output in a multi-role operating environment.

Please give me some advice.

best regard.

Parents
  • Hi,

    Advertising should not just stop by itself. Could it be that a connection was established but in some special cases you do not re-start advertising, for instance? Do you have logging on the device that can shead some information on this?

    As for a workaround, you are right about radio notifications. That will only tell you when the radio is active, but not for what. I would of course suggest to find the root cause and fix that, but as a workaround perhaps you could regularily stop and re-start advertising?

  • Hello,  Einar.

    Sorry for the slight delay in responding due to low reproducibility and another task.

    Subsequent investigation revealed that when this issue occurs, the signal does not reach Radio Notification, so not only advertising but also RADIO stops.

    Attached are excerpt logs, a HEX dump, and a RAM dump from when the problem occurred.
    The heap is initialized with a 1KB incrementing fill pattern, and the stack is initialized with 0xCC by INITIALIZE_STACK.

    Log of the incident.zip

    SD is s140 v7.3.0, the SDK is 17.1.0_ddde560, and the build environment is SES V5.44.

    Best regards.

  • Hi,

    I see. Have you confirmed that there is no radio activity?

    Can you check if you always handle SoftDevice asserts in particular and also all SoftDevice events? I ask because critical events should never be missed, so it would be interesting to check if it is possiblet that you loose some events in your application in special cases.

  • Hello,  Einar.

    I confirmed that there is no radio activity by checking that the radio notification event callback is not received for a certain period of time (30sec).

    I believe that asserts that occur in SoftDevices can be trapped with interrupts such as app_error_handler or HardFault, just like in the SDK. Is that correct?

    The main loop continues to run after the Radio Notification has stopped, and still responds on the UART, so it doesn't appear to be asserted.

    What should I monitor to determine why Radio Notification has stopped?

    How can I tell if the SoftDevice is working properly?

    Can you get an idea of ​​the operating status of the SoftDevice from the RAM dump I sent?

    To be honest, I'm at a loss as to what to do, so I'd appreciate it if you could tell me how to identify the cause.

    Best regards.

  • Hi,

    hiroiwas said:
    I believe that asserts that occur in SoftDevices can be trapped with interrupts such as app_error_handler or HardFault, just like in the SDK. Is that correct?

    Yes, that is correct. If you are using the default error handler from the SDK, this would also detect SoftDevice asserts (and reset for release builds).

    hiroiwas said:
    The main loop continues to run after the Radio Notification has stopped, and still responds on the UART, so it doesn't appear to be asserted.

    I see.

    hiroiwas said:
    What should I monitor to determine why Radio Notification has stopped?

    To monitor, perhaps you could use an app_timer instance and reset it every time you get a radio notification? If it does not come in time, the app_timer would time out and the timeout handler run, so that you can handle it.

    hiroiwas said:
    How can I tell if the SoftDevice is working properly?

    Generally, the SoftDevice shoud not behave like this. You can moitor things like I sugested, and reset if you detect an issue (you could also use a watchdog instead of a n app timer, and feed the watchdog every time you get a radio notification. And if not, the watchdog would reset).

    hiroiwas said:
    Can you get an idea of ​​the operating status of the SoftDevice from the RAM dump I sent?

    No, unfortuantely not.

    hiroiwas said:

    To be honest, I'm at a loss as to what to do, so I'd appreciate it if you could tell me how to identify the cause.

    This is not an expected issue. To be honest, the SoftDevice is very well tested I do not recall seeing such an issue before. I would be interested in a verification that your application does not stop advertising, or alternatively, if there could be an issue with advertising not being restarted after a timeout or a connection (attempt). Could that be the case? Have you done anything to verify this?

  • Hello,  Einar.

    > To monitor, perhaps you could use an app_timer instance and reset it every time you get a radio notification?

    I have already done that, and the log I attached last time shows that no events occurred in radio notification for 30 seconds.

    > if there could be an issue with advertising not being restarted after a timeout or a connection (attempt). Could that be the case?

    There are places where advertising is stopped for switching, but when I check the log, it appears that it is not passing through there.

    Just to be sure, I will set the advertising output of the beacon (ADV_NONCONN_IND) and then check whether the issue occurs with software that does not stop it thereafter.

    The SoftDevice is designed to stop scanning on connection, but resume scanning on disconnection.

    Is there anything else I can do besides monitoring the radio notification to get a clue as to what might be going on?

    Best regards.

Reply
  • Hello,  Einar.

    > To monitor, perhaps you could use an app_timer instance and reset it every time you get a radio notification?

    I have already done that, and the log I attached last time shows that no events occurred in radio notification for 30 seconds.

    > if there could be an issue with advertising not being restarted after a timeout or a connection (attempt). Could that be the case?

    There are places where advertising is stopped for switching, but when I check the log, it appears that it is not passing through there.

    Just to be sure, I will set the advertising output of the beacon (ADV_NONCONN_IND) and then check whether the issue occurs with software that does not stop it thereafter.

    The SoftDevice is designed to stop scanning on connection, but resume scanning on disconnection.

    Is there anything else I can do besides monitoring the radio notification to get a clue as to what might be going on?

    Best regards.

Children
  • hiroiwas said:
    I have already done that, and the log I attached last time shows that no events occurred in radio notification for 30 seconds.

    I see. Then that could work as a recovery mechanism: if you have not had an radio activity for x amount of time, reset to recover. But this will not address the root cause.

    hiroiwas said:
    Just to be sure, I will set the advertising output of the beacon (ADV_NONCONN_IND) and then check whether the issue occurs with software that does not stop it thereafter.

    Sond good. Becase advertising really should not stop by itself (unless there is a timeout, and if so you get an event). And I have not found any other reports of such an issue. So I thin it is hightly likely that the application disables or does not restart advertising when it should.

    hiroiwas said:
    Is there anything else I can do besides monitoring the radio notification to get a clue as to what might be going on?

    I would add extensive loggin everywhere you stop or start advertising, and also log all BLE events (you oculd just log the event ID). Then you could go back in the log when advertising unexpectedly stopped and check if there are any events you did not process or if you did stop (or not start) advertising undexpectedly.

  • Hello,  Einar.

    I was able to capture the log when the phenomenon occurred.


    In conclusion, the point at which advertising stopped and switched is unrelated, and the problem appears to be occurring due to the overlapping timing of sd_ble_gap_scan_start during nrf_ble_scan_on_adv_report and the stop of scanning in the SoftDevice after sd_ble_gap_connect.

    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> 
    00> ==SCAN_RSP== AdType:FF
    00>  4C 00 01 00 00 00 00 0B  55 55 00 53 00 45 00 43  4F 02 4D
    00> 
    00> > Addr:59:D7:5E:C8:C4:3B (Resolvable) RSSI:-36dBm(+0) raw:-36dBm
    00> 
    00> @@@ scan_connect_judgement(661)->sd_ble_gap_connect
    00> +++ Connecting +++
    00> 
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> 
    00> 01:12:37.9172  > 1BACC0DE-50273-09551615/59:D7:5E:C8:C4:3B (Resolvable) was removed from the rssi moving average buffering list, calc end: elapsed 1000ms.
    00> 
    00> *** No radio notification for a while(30000ms).
    

    So here are my questions.

    In SoftDevice, the scanner stops after sd_ble_gap_connect, but at what point does it stop?
    After CONNECT_IND? After that delay?

    The scanner also stops at BLE_GAP_EVT_ADV_REPORT when an advertisement is received, but at what point does it stop?
    After receiving ADV_IND? Before the ADV_REPORT event?

    From the log, nrf_ble_scan_on_adv_report appears only once after sd_ble_gap_connect, but when is scan_start accepted here?

    For your reference, we have attached below the logs showing successful connection and disconnection.

    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> 
    00> ==SCAN_RSP== AdType:FF
    00> 
    00>  4C 00 01 00 00 00 00 0B  55 55 00 53 00 45 00 43  4F 02 4D
    00> 01:12:28.2542  > Continue moving average calculation with RssiList[0]
    00> 
    00> > Addr:59:D7:5E:C8:C4:3B (Resolvable) RSSI:-37dBm(+0) raw:-37dBm
    00> 
    00> @@@ scan_connect_judgement(661)->sd_ble_gap_connect
    00> +++ Connecting +++
    00> 
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ on_connected_evt(160)->sd_ble_gattc_exchange_mtu_request
    00> @@@ data_length_update(89)->sd_ble_gap_data_length_update
    00> 01:12:28.4926  NRF_BLE_SCAN_EVT_CONNECTED(6)
    00> > conn_handle=0, conn_count=1
    00> 
    00> @@@ request_process(359)->sd_ble_gattc_primary_services_discover
    00> @@@ queue_process(215)->sd_ble_gattc_primary_services_discover
    00> ble_db_discovery_start: err_code=0
    00> 
    00> @@@ ble_evt_handler(1024)->sd_ble_gap_rssi_start
    00> 
    00> @@@ queue_process(215)->sd_ble_gattc_primary_services_discover
    00> 01:12:28.5242  BLE_GATTC_EVT_EXCHANGE_MTU_RSP(58)
    00> 01:12:28.5392  BLE_GAP_EVT_DATA_LENGTH_UPDATE(36)
    00> @@@ request_process(366)->sd_ble_gattc_characteristics_discover
    00> 01:12:28.5692  BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP(48)
    00> @@@ request_process(366)->sd_ble_gattc_characteristics_discover
    00> @@@ request_process(372)->sd_ble_gattc_descriptors_discover
    00> @@@ request_process(372)->sd_ble_gattc_descriptors_discover
    00> 01:12:28.7040  BLE_DB_DISCOVERY_COMPLETE(0)
    00> 01:12:28.7043  BLE_NUS_C_EVT_DISCOVERY_COMPLETE(0)
    00> 
    00> @@@ request_process(353)->sd_ble_gattc_write
    00> 01:12:28.7052  BLE_DB_DISCOVERY_AVAILABLE(3)
    00> [APP] CONNECTED
    00> 
    
        === Part omitted ===
    
    00> @@@ MdlBLE_en_Disconnect(1711)->sd_ble_gap_disconnect
    00> 01:12:29.8440  BLE_GAP_EVT_DISCONNECTED(17)
    00> > conn_handle=0, conn_count=0, reason=0x16
    00> 
    00> @@@ ble_evt_handler(1036)->sd_ble_gap_rssi_stop
    00> 
    00> @@@ MdlBLE_vd_ConnectStartReq(1614)->sd_ble_gap_tx_power_set
    00> @@@ nrf_ble_scan_stop(1041)->sd_ble_gap_scan_stop
    00> *** SCAN START ***
    00> 
    00> @@@ nrf_ble_scan_stop(1041)->sd_ble_gap_scan_stop
    00> 
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start
    00> @@@ nrf_ble_scan_on_adv_report(949)->sd_ble_gap_scan_start

    Does this information help you determine the cause and solution?

    Incidentally, the beacon (ADV_NONCONN_IND) has not been stopped since the parameters were first set, so it was not possible to determine from the log when it stopped.

    Best regards.

  • Additional information.

    After that, I added logs and tried to reproduce the issue several times, and all of the logs at the time the problem occurred recorded that there were no subsequent BLE events after sd_ble_gap_scan_start after sd_ble_gap_connect.

    Also, in the radio notification event received at the same time, the argument radio_active was notified as 0 twice, and radio_active did not become 1 after that.

    I thought that in Radio Notification, radio_active would alternate between 0 and 1, but is that wrong?

    I have attached a chart capturing radio_active just before the radio stopped using data sampling.

    This means that the ad will likely continue playing until the radio is turned off.

    Judging from these logs, I think the reason why radio stopped is due to a conflict within SoftDevice. Is this correct?

    Best regards.

  • Hello,

    Sorry for the late reply. Einar is out of of office, so I will try to look into it. 

    I am sorry, but I am a bit confused (and there is a lot of information here). So your beacon is also scanning, and connecting to other devices from time to time, is that correct?

    Will another central also connect to the beacon's advertisement at some point? 

    How do you set up your advertising? Does it restart advertising after a connection? 

    hiroiwas said:
    This means that the ad will likely continue playing until the radio is turned off.

    What do you mean by this? The advertisements have stopped at the end of this graph, right?

    hiroiwas said:
    Judging from these logs, I think the reason why radio stopped is due to a conflict within SoftDevice. Is this correct?

    It may happen that if there are too many events coming from the softdevice, that an event is not correctly forwarded to the application. In some cases, that can be e.g. the disconnected event.

    When this occurs, and you detect it, is it still possible to start advertising? What happens if you try to do so?

    Best regards,

    Edvin

  • Hello,  Edvin.

    > So your beacon is also scanning, and connecting to other devices from time to time, is that correct?

    I have drawn a simple sequence chart of the system.

    • The device emits a beacon (ADV_NONCONN_IND) at a 100 ms cycle and opens a 20 ms scan window at 20 ms intervals.
    • The iPhone catches the device's beacon and initiates ADV_IND.
    • The device uses the ADV_IND and SCAN_RSP information to make a connection request.
    • After connecting and completing the series of communications, it disconnects from the device, resumes scanning, and waits for the next ADV_IND.

    Beacons in the peripheral role are essentially configured when the device is first powered on and then continue to transmit signals continuously.

    In the central role, it receives ADV_IND and SCAN_RSP and makes decisions such as connection.
    Only one device can be connected at a time to the central, and a scanner that is stopped by CONNECT_IND will be restarted after disconnection.

    > The advertisements have stopped at the end of this graph, right?

    Yes, the radio notification stops after the graph ends and resets after 30 seconds.

    > It may happen that if there are too many events coming from the softdevice, that an event is not correctly forwarded to the application.

    This is important information.
    Does this mean that the SoftDevice is dropping events?

    If so, this could be the reason why BLE_GAP_EVT_CONNECTED is not raised and SCAN remains stopped even though SCAN is stopped after sd_ble_gap_connect.

    However, why does the peripheral role beacon also stop working at the same time?

    Since it's not something the app can fix, I think this may be a problem with the SoftDevice.

    Are there any parameters to increase the SoftDevice's event queue, or options to limit the types of events that are fired?

    > When this occurs, and you detect it, is it still possible to start advertising? What happens if you try to do so?

    Since we believe the issue is with the SoftDevice, we've set it to auto-reset if radio notifications stop for 30 seconds as a safety measure.

    In addition to this, if radio notifications stop for 1 second, we'll try to resume scanning and advertising if that's possible to recover.

    It will take some time to reproduce, but I will report the results once they are known.

    Best regards.

Related