
Softdevice event handler continuously called with NRF_EVT_RADIO_BLOCKED

I've got another question concerning a timeslot application.

In our timeslot application, which is based on the nRF Mesh SDK timeslot.c implementation, I noticed that if a BLE connection is active with a connection interval of 7.5ms, the softdevice did not grant any timeslots. The cause was that the timeslot length requested at the end of a timeslot was 14000us and therefore too long to fit into the 7.5ms connection interval. This resulted in the timeslot being canceled via invocation of the softdevice event handler with NRF_EVT_RADIO_CANCELED.

The handling of this event was a call to sd_radio_request() with the same parameters that had just been canceled. This resulted in the softdevice event handler being called repeatedly with NRF_EVT_RADIO_CANCELED until either the BLE connection parameters were changed or the central disconnected.

To avert this behaviour, I changed the request parameters to the values that are used in the case of NRF_EVT_RADIO_BLOCKED:

m_radio_request_earliest.request_type               = NRF_RADIO_REQ_TYPE_EARLIEST;
m_radio_request_earliest.params.earliest.hfclk      = NRF_RADIO_HFCLK_CFG_NO_GUARANTEE;
m_radio_request_earliest.params.earliest.priority   = NRF_RADIO_PRIORITY_NORMAL;
m_radio_request_earliest.params.earliest.length_us  = 3800;
m_radio_request_earliest.params.earliest.timeout_us = 15000;

This resulted in the timeslot being granted again after NRF_EVT_RADIO_CANCELED occurred.
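For context, here is a minimal sketch of that handling (assuming the nrf_sdh SOC observer mechanism; treating BLOCKED and CANCELED identically is our own choice, not necessarily the mesh SDK's behaviour):

#include "nrf_sdh_soc.h"

/* Sketch: on BLOCKED/CANCELED, re-request with the shorter "earliest"
   parameters instead of repeating the request that was just rejected. */
static void soc_evt_handler(uint32_t evt_id, void * p_context)
{
    switch (evt_id)
    {
        case NRF_EVT_RADIO_BLOCKED:
        case NRF_EVT_RADIO_CANCELED:
            (void) sd_radio_request(&m_radio_request_earliest);
            break;

        default:
            break;
    }
}

NRF_SDH_SOC_OBSERVER(m_soc_observer, 0, soc_evt_handler, NULL);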

I then realised that after changing the connection interval to 50ms with a slave latency of 2, timeslots get repeatedly blocked for a duration of 5ms (during the connection event) and only then are they granted again. See the following chart:

This behaviour looks odd and inefficient to me. Ideally, I would expect the softdevice not to raise the BLOCKED event at all, since the timeout - from my understanding - should only expire after 15000us. When invoking sd_radio_request() in the BLOCKED handler with a length_us of 3800 and a timeout_us of 14000, I definitely don't expect the event handler to be invoked every ~50us!

So I tried playing around with the timeout_us parameter of the nrf_radio_request_t structure, but did not find a viable solution; in fact, I only made things worse. When increasing timeout_us to e.g. 50000, I noticed that the described behaviour occurs for a while and then disappears, but the timeslot only starts after a total downtime of about 50ms! This drastically reduces the availability of our proprietary radio protocol.

  • Is there a clean way to avert these continuous BLOCKED events while still maximising timeslot duration?
  • Can you elaborate on the difference between the BLOCKED and CANCELED events?
  • What implications does/should changing the timeout of a radio request have?

By the way, this is all on an nRF52832 running S132 7.2.0.

  • Hi Michael, 
    If you have a look at the documentation of the softdevice (SDS) you can find these scheduling properties: 

    According to this, if you request the timeslot at NORMAL priority it will have lower priority than the connection activity, meaning that if the timeslot doesn't fit entirely inside a period with no connection event (calculated from the reserved event length), it will get cancelled. 

    Please try to test again with the priority set to HIGH. 
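    For example, keeping the rest of your request unchanged and only raising the priority (just a sketch):

    /* HIGH priority lets the timeslot compete with connection activity
       in the softdevice scheduler. */
    m_radio_request_earliest.params.earliest.priority = NRF_RADIO_PRIORITY_HIGH;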

    However, I can't explain why you get the event every ~50us; we may need to look into your code to see why. 

    You can have a look at my example code here

    I attached the logic trace from my test here: the device first advertised, then entered a connection with interval = 7.5ms; after 5 seconds the interval changed to 750ms. You can see that the timeslot got cancelled multiple times, but it still gets some slots in between. 

    1 MHz, 20 M Samples [9].logicdata

  • In the meantime, I was able to reproduce the issue with the ble_app_proximity example from the nRF SDK for Mesh 4.2.0.

    I've attached the diff to the nRF SDK for Mesh 4.2.0 folder. The modifications I made were:

    • Modified the nRF5 SDK related paths in the project file to point to a folder nRF5_SDK at the same level as the mesh SDK folder.
    • Enabled and configured timeslot debug pins (default configuration for timeslot debugging in mesh/core; a minimal instrumentation sketch follows this list):
      • P0.03: high while in timeslot
      • P0.04: high while in signal handler
      • P0.28: high while trying to extend timeslot
      • P0.29: high when serving TIMER0 IRQ
      • P0.30: high while in softdevice event handler (the culprit here!)
      • P0.31: high when timeslot end is reached
      • P0.24: high if a high-priority timeslot was requested
      • P0.25: high if extension succeeded
    • Set APP_TIMER_CONFIG_RTC_FREQUENCY to 0 (the only way the example runs)
    • Set NRF_SDH_BLE_GAP_EVENT_LENGTH to 320 --> this triggers the issue!
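    The pin instrumentation boils down to something like this (a sketch using plain nrf_gpio calls; the mesh SDK's own debug macros differ in name):

    #include "nrf_gpio.h"

    #define DEBUG_PIN_SD_EVT 30   /* P0.30: high while in softdevice event handler */

    static void debug_pins_init(void)
    {
        nrf_gpio_cfg_output(DEBUG_PIN_SD_EVT);
    }

    /* Wrapping the handler body makes its duration visible on a logic analyzer. */
    static void soc_evt_handler(uint32_t evt_id, void * p_context)
    {
        nrf_gpio_pin_set(DEBUG_PIN_SD_EVT);
        /* ... actual event handling ... */
        nrf_gpio_pin_clear(DEBUG_PIN_SD_EVT);
    }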

    timeslot-sd-event-loop.patch

    So I've touched on two issues here:

    1. The weird behaviour (described in the initial post) that occurs during a BLE connection, where the SD event handler is invoked repeatedly. This happens in the supplied example when setting NRF_SDH_BLE_GAP_EVENT_LENGTH to 320. We set this to a high value to be able to achieve higher BLE throughput when needed.

      Do you have any explanation as to why this behaviour occurs and how it can be avoided?

    2. When setting the hfclk parameter to NRF_RADIO_HFCLK_CFG_NO_GUARANTEE, no timeslots can be served if the connection interval is only 7.5ms. The idea was to configure the hfclk that way because we can keep it running continuously, as we do not have any battery constraints (see the sketch below). Additionally, there seems to be a connection issue that can be avoided when NRF_RADIO_HFCLK_CFG_NO_GUARANTEE is chosen.

      So why can no timeslots be served with a 7.5ms connection interval and NRF_RADIO_HFCLK_CFG_NO_GUARANTEE even though the HFCLK is running?
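      For reference, our hfclk setup is roughly this (a sketch; the assumption is that a permanently running HFXO makes NO_GUARANTEE safe for the radio):

      /* No battery constraints, so we keep the HF crystal running permanently
         and tell the timeslot request not to wait for it. */
      uint32_t err_code = sd_clock_hfclk_request();   /* starts the HFXO and keeps it on */
      APP_ERROR_CHECK(err_code);

      m_radio_request_earliest.params.earliest.hfclk = NRF_RADIO_HFCLK_CFG_NO_GUARANTEE;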

    Looking forward to your responses!

    Thank you & best regards,

    -mike

  • Hi Mike, 
    I assume that when you mentioned "large event_length", it's just larger than 6 but still not larger than the connection interval?
    As far as I know, when event_length is larger than the connection interval, it will not take effect and will be capped at the connection interval as the maximum value.

    When you mentioned "potential max", did you mean the event_length?
    The softdevice always assumes that the connection event will last as long as event_length.

    Our softdevice scheduler is not designed to re-schedule if the connection event ends earlier than expected (i.e. earlier than event_length). That would require a more dynamic scheduling algorithm and complicate things a bit. 

    If you want to avoid such downtime, I would suggest using a shorter connection interval. To save power on the peripheral and to avoid the timeslot being blocked by the short connection interval, you can use slave latency. When the slave skips connection events due to slave latency, the timeslots can take place in those slots. 
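    For illustration, a short interval combined with slave latency could look like this (a sketch; the values and the conn_handle variable are illustrative only):

    /* Short interval keeps blocked windows short; slave latency lets the
       peripheral skip events, freeing those slots for timeslots. */
    ble_gap_conn_params_t conn_params =
    {
        .min_conn_interval = MSEC_TO_UNITS(7.5, UNIT_1_25_MS),
        .max_conn_interval = MSEC_TO_UNITS(15,  UNIT_1_25_MS),
        .slave_latency     = 4,                /* peripheral may skip 4 events */
        .conn_sup_timeout  = MSEC_TO_UNITS(4000, UNIT_10_MS),
    };
    uint32_t err_code = sd_ble_gap_conn_param_update(conn_handle, &conn_params);
    APP_ERROR_CHECK(err_code);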

    In your case, where you want as much timeslot time as possible, power consumption may not be a big problem. Then you can afford the approach of requesting continuously; I think that would be a good workaround for the "non-dynamic" scheduler. 

  • I assume that when you mentioned "large event_length", it's just larger than 6 but still not larger than the connection interval?

    By this I actually only meant that the softdevice is configured with a value for NRF_SDH_BLE_GAP_EVENT_LENGTH that is larger than 6.


    As far as I know, when event_length is larger than the connection interval, it will not take effect and will be capped at the connection interval as the maximum value.

    Okay, that makes sense - I did not take this limit into account.

    When you mentioned "potential max", did you mean the event_length?
    The softdevice always assumes that the connection event will last as long as event_length.

    If by "event_length" you mean NRF_SDH_BLE_GAP_EVENT_LENGTH, then yes, that is what I meant.

    To save power on the peripheral and to avoid the timeslot being blocked by the short connection interval, you can use slave latency.

    Yes, we already try to use slave latency if possible. Since the connection interval is essentially given by our peer (running as peripheral only), I don't want to rely on the connection interval too much.


    In your case, where you want as much timeslot time as possible, power consumption may not be a big problem. Then you can afford the approach of requesting continuously; I think that would be a good workaround for the "non-dynamic" scheduler. 

    I think that's the way to go here. All measurements and our discussion here point to this being the ideal solution for our application.

    Thanks for your help!

    Regards,

    -mike

  • I'm happy to help :) it's an interesting topic to look into :) 

  • If (dynamically) increasing m_radio_request_earliest.params.earliest.length_us is what you recommend when getting NRF_EVT_RADIO_BLOCKED in the Mesh code (https://github.com/NordicSemiconductor/nRF5-SDK-for-Mesh/blob/master/mesh/core/src/timeslot.c#L419), then shouldn't this be implemented?

    Everybody who sets up connections will run into this. And not always in a pleasant way: notably, sd_radio_request() uses the app scheduler, which can run out of space.
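    If that queue overflows, one mitigation is to enlarge it at init (a sketch; the sizes are illustrative):

    #include "app_scheduler.h"
    #include "app_timer.h"

    static void scheduler_init(void)
    {
        /* Illustrative sizes: a deeper queue absorbs bursts of BLOCKED events. */
        APP_SCHED_INIT(APP_TIMER_SCHED_EVENT_DATA_SIZE, 32);
    }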

  • Hi Anne,

    I don't think Nordic employees get notified of responses to questions with verified answers.

    But just note that we were talking about increasing the timeout, not the length. Reducing the length is the right thing to do. Increasing the timeout will lead to reduced timeslot availability - as is evident from the discussion above. So if you want to optimise for timeslot uptime, you have to live with the scheduler behaviour described above.
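    In code terms, the length-reducing approach could look like this (a sketch; the halving strategy and the 3800us floor are assumptions, not the mesh SDK's exact policy):

    /* Called from the softdevice event handler on NRF_EVT_RADIO_BLOCKED /
       NRF_EVT_RADIO_CANCELED: shrink the slot until it fits, instead of
       raising timeout_us and waiting longer. */
    static void handle_blocked(void)
    {
        if (m_radio_request_earliest.params.earliest.length_us > 3800)
        {
            m_radio_request_earliest.params.earliest.length_us /= 2;
        }
        (void) sd_radio_request(&m_radio_request_earliest);
    }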

    Regards
