
Softdevice S140 temporarily stops advertising BLE

We are using SoftDevice S140 6.1.1 with an nRF52840 for a BLE peripheral. Intermittently, mobile devices are unable to discover this peripheral for periods ranging from a few minutes to hours. In one exceptional case, a physical power reset was required to recover. We use fast BLE advertising with an interval of 150 ms and a timeout of 0 (infinite). We also handle GAP connection and disconnection events to switch the advertisement to non-connectable when someone connects and back to connectable when the user disconnects. We don't see any failures in the firmware logs that would indicate a problem with the BLE configuration. These issues were reported for production peripheral devices, and we have not been able to reproduce them locally. I need help with the following queries:

  1. Can the SoftDevice stop advertising for any reason if the duration is set to infinite? We don't see any logs for BLE_GAP_EVT_ADV_SET_TERMINATED, which could indicate such a thing.
  2. Can we land in a situation where the peripheral is not able to prioritize processing of GAP events such as connected, disconnected, and terminated, so that we are not able to make the SoftDevice resume advertising after these events?
  3. Can there be an issue with the peripheral's radio that causes it to stop advertising for some reason? We are thinking of logging radio notifications to concretely capture any such issues. Is it advisable to enable radio notifications on production devices, and are there any best practices for using this feature?
  • Hi,

    Can a softdevice stop advertising for any reason if the duration is set to infinite? We don't see any logs for BLE_GAP_EVT_ADV_SET_TERMINATED which can possibly indicate such a thing.

    No, I do not see how that could happen (unless it is caused by an unknown bug).

    Can we land in a situation where peripheral is not able to prioritize processing of GAP events like connected, disconnected, terminated etc. and we are not able to make softdevice resume advertisement after these events?

    I want to say no here as well. However, it is possible to imagine a situation where SoftDevice events are not processed in a timely manner and this causes issues. I do not know anything about your code, but if you look at the SoftDevice handler implementation in SDK 17.0.2 (similar in older versions) as an example, you will see that if the dispatch model is the scheduler (NRF_SDH_DISPATCH_MODEL_APPSH), every event from the SoftDevice is put in a queue. If the queue is full because events are not processed fast enough, the event would be lost. With the SDK implementation that would be detected by an APP_ERROR_CHECK, so it should not fail silently. It could be different in your application, though. Could this be relevant?
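
    To illustrate the failure mode described above, here is a minimal, self-contained sketch of a bounded event queue that counts overflows instead of dropping events silently. It is not the SDK's scheduler code; `evt_queue_t`, `EVT_QUEUE_SIZE` and the function names are hypothetical and only show the idea of making a lost event visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue depth; a real application would size this for its load. */
#define EVT_QUEUE_SIZE 4u

/* Illustrative ring buffer of event IDs. Not an SDK type. */
typedef struct {
    uint16_t evt_ids[EVT_QUEUE_SIZE];
    size_t   head;     /* index of the oldest queued event */
    size_t   count;    /* number of events currently queued */
    uint32_t dropped;  /* events rejected because the queue was full */
} evt_queue_t;

void evt_queue_init(evt_queue_t *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->dropped = 0;
}

/* Queue an event. Returns false and increments the drop counter when full,
 * so the caller can log or assert instead of losing the event unnoticed. */
bool evt_queue_put(evt_queue_t *q, uint16_t evt_id)
{
    assert(q != NULL);
    if (q->count == EVT_QUEUE_SIZE) {
        q->dropped++;
        return false;
    }
    q->evt_ids[(q->head + q->count) % EVT_QUEUE_SIZE] = evt_id;
    q->count++;
    return true;
}

/* Take the oldest event. Returns false when the queue is empty. */
bool evt_queue_get(evt_queue_t *q, uint16_t *evt_id)
{
    assert(q != NULL && evt_id != NULL);
    if (q->count == 0) {
        return false;
    }
    *evt_id = q->evt_ids[q->head];
    q->head = (q->head + 1) % EVT_QUEUE_SIZE;
    q->count--;
    return true;
}
```

    If `dropped` ever becomes non-zero in the field, that would suggest events are produced faster than the application consumes them.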

    Can there be an issue with peripheral radio which can cause it to stop advertisement for some reason?

    Same here, I do not see how this could happen unless there is an up to now unknown bug.

    We are thinking of logging radio notifications to concretely capture any such issues. Is it advisable to enable radio notifications on production devices, and are there any best practices for using this feature?

    Radio notification is a generic feature that allows you to get events when the radio is in use. It is a commonly used feature and I do not see any problems with using radio notifications in production. You can refer to the SoftDevice specification for details.
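
    One way radio notifications could be used to detect an advertising blackout is a simple silence monitor: record a timestamp on every radio notification and flag the device if none has arrived for much longer than the advertising interval. The sketch below is self-contained and only shows the bookkeeping. `radio_watch_t` and its functions are hypothetical names, and the assumption is that the application calls `radio_watch_mark()` from its radio notification handler and supplies its own millisecond tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitor state. Not part of the SoftDevice or SDK. */
typedef struct {
    uint32_t last_activity_ms;  /* tick of the most recent radio notification */
    uint32_t max_silence_ms;    /* longest gap considered normal */
    uint32_t activity_count;    /* total notifications seen, useful for logs */
} radio_watch_t;

void radio_watch_init(radio_watch_t *w, uint32_t now_ms, uint32_t max_silence_ms)
{
    assert(w != NULL);
    w->last_activity_ms = now_ms;
    w->max_silence_ms = max_silence_ms;
    w->activity_count = 0;
}

/* Assumed to be called from the application's radio notification handler. */
void radio_watch_mark(radio_watch_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    w->last_activity_ms = now_ms;
    w->activity_count++;
}

/* True when the radio has been idle for longer than max_silence_ms.
 * Unsigned subtraction keeps the result correct across tick wraparound. */
bool radio_watch_is_silent(const radio_watch_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    uint32_t elapsed = (uint32_t)(now_ms - w->last_activity_ms);
    return elapsed > w->max_silence_ms;
}
```

    With a 150 ms advertising interval, a threshold of a few seconds would leave a wide margin; the exact value is a judgment call for the application.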

    Regarding the issue in general, I wonder if you have been able to confirm that the issue is on the nRF side? As you have only seen this in the field so far and have not been able to reproduce it locally, I assume you have not been able to get a sniffer trace when this occurs. If you did, that would show whether the nRF peripheral advertises and whether the central attempts to connect. Which central devices do you see this issue with? Is it limited to a few specific phone models or similar?

    Einar

  • Thanks Einar for your inputs here.

    I checked, and in our case we are using NRF_SDH_DISPATCH_MODEL_POLLING as the dispatch model. If we are not able to pull events fast enough, will the SoftDevice drop them, and can that be detected?

    >> Regarding the issue in general I wonder if you have been able to confirm that the issue on the nRF side? 

    We are seeing issues where user mobile apps (the central device in our case) are not able to discover our peripherals, and the issue is temporary, i.e. a peripheral will not be discoverable during one set of interactions (which can last 10-15 minutes) and then becomes discoverable again later in the day or the next day. We have ruled out central-specific issues because the central is able to discover other nearby BLE devices during that time.

    In some geographies we do see such issues concentrated on specific makes of phones, like the Caterpillar S48c and S41 in NA, but we are not sure what phone-specific implementation details could be causing it. Our peripherals are placed in public places, so there is definitely interference from a large number of public BLE devices in the vicinity. Can that contribute to such blackouts?

  • Hi Ram,

    Do you have any new information on this? Any new findings?

    We have just discovered that it is in fact possible to lose BLE_GAP_EVT_DISCONNECTED events in some very rare cases. I cannot say if this explains what you have been seeing, but my earlier dismissal of this turns out to be incorrect. We plan to release new SoftDevices (version 7.3.0) which will incorporate a fix for this issue.

  • Hi Einar,

    Yes, we were able to run frequency tests on the production boards where we were seeing a high incidence of discovery issues. Please find attached the document with test results for three boards.

    FieldBoard_Logs.docx

    >> We have just discovered that it is in fact possible to lose BLE_GAP_EVT_DISCONNECTED events in some very rare cases.

    Could you please elaborate a bit more on how to reproduce this? I would want a way to test if this is happening in our production devices.

    Thanks

    Ram

  • Hi Ram,

    The issue is tricky to reproduce. I have asked the R&D SoftDevice team if and how it is possible to detect that this has happened. I will let you know as soon as I get some information on that.

    Regarding the frequency accuracy for the three boards in FieldBoard_Logs.docx, it is well within the limits. Did you also measure at cold and hot temperatures?

    Einar

  • Hi Ram,

    If the application stops pulling events from the SoftDevice, and a certain corner case plus a disconnect occurs during that time, the SoftDevice may fail to give the BLE_GAP_EVT_DISCONNECTED event to the application. To work around this issue, the application must continuously pull events from the SoftDevice.
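
    As a rough illustration of "continuously pull events", the sketch below drains every pending event on each pass of the main loop instead of handling only one. It is self-contained and uses a fake event source; `drain_events`, `fake_stack_t` and the callbacks are hypothetical stand-ins. In an SDK application using the polling dispatch model, the equivalent step would be calling `nrf_sdh_evts_poll()` regularly from the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "get next event" callback: returns true and fills *evt_id
 * while events are pending, false once the source is empty. */
typedef bool (*evt_get_fn)(uint16_t *evt_id, void *ctx);
typedef void (*evt_handler_fn)(uint16_t evt_id, void *ctx);

/* Pull and dispatch events until none are left. Returns how many were handled. */
size_t drain_events(evt_get_fn get, evt_handler_fn handle, void *ctx)
{
    assert(get != NULL && handle != NULL);
    size_t handled = 0;
    uint16_t evt_id;
    while (get(&evt_id, ctx)) {
        handle(evt_id, ctx);
        handled++;
    }
    return handled;
}

/* Fake event source used only to exercise the loop. Event IDs are arbitrary. */
typedef struct {
    uint16_t pending[8];
    size_t   n_pending;
    size_t   next;
    uint16_t last_handled;
} fake_stack_t;

bool fake_evt_get(uint16_t *evt_id, void *ctx)
{
    fake_stack_t *s = ctx;
    if (s->next >= s->n_pending) {
        return false;
    }
    *evt_id = s->pending[s->next++];
    return true;
}

void fake_evt_handle(uint16_t evt_id, void *ctx)
{
    fake_stack_t *s = ctx;
    s->last_handled = evt_id;
}
```

    The point of the loop is that the main thread never sleeps or blocks for long while events are still waiting to be pulled.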

    Einar

  • Thanks for reviewing the doc. No, we didn't measure at cold and hot temperatures. Do you recommend running temperature tests as well? I seem to remember you mentioning that it could be estimated based on the buffer available in the frequency spectrum.
