
Using the app scheduler does not prevent BLE stack starvation

Hello,

In our application we heavily use the application scheduler. This was done for 2 main reasons:

  1. serialize access to system resources
  2. enable the SoftDevice to handle higher-priority tasks

It appears that using the application scheduler isn't sufficient to accomplish the second objective. You also need to inject timer delays between the scheduled events; otherwise BLE communication is starved.

Is this how the application scheduler should work? I was under the impression that application-scheduled events were the last thing to be serviced.

We are using SDK14 with the nRF52832 processor.

Thanks

  • Here is what we are doing and how it is implemented.

    I am erasing 64KB of eeprom connected to the SPI bus. Each time the erase function is called it erases 128 bytes from the eeprom and then reschedules itself. The eeprom SPI interface is *not* interrupt driven. Each erase operation takes about 20ms.

    I also investigated the interrupt priorities. However, we never do anything in an interrupt context. So, repeatedly scheduling 20ms non-interrupt operations with no idle time will kill BLE. Adding a timer delay fixes the issue.

    BTW: as another test I changed the SD scheduler configuration to use the application event queue and increased the allocated queue size to 50. If anything, this seemed worse.
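For readers following along, the chunked erase described above can be sketched as a small state machine. This is a host-side sketch under stated assumptions, not the poster's code: `erase_job_t`, `erase_step()` and `eeprom_erase_chunk()` are hypothetical names, and the blocking SPI erase is replaced by an empty stub. On target, the caller would re-arm after each step, either immediately with `app_sched_event_put()` (the back-to-back variant) or from a single-shot `app_timer` timeout (the delayed variant that reportedly fixes the problem).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_SIZE_BYTES (64u * 1024u) /* assumes 64 KB means 65536 bytes */
#define ERASE_CHUNK_BYTES 128u          /* bytes erased per scheduled call */

/* Progress of one erase job (hypothetical type, for illustration only). */
typedef struct {
    uint32_t next_offset; /* next byte offset to erase */
    uint32_t chunks_done; /* number of chunks erased so far */
} erase_job_t;

/* Stand-in for the blocking, non-interrupt-driven SPI erase (about 20 ms
 * per call according to the post). Does nothing on the host. */
static void eeprom_erase_chunk(uint32_t offset, uint32_t len)
{
    (void)offset;
    (void)len;
}

/* Erase one chunk. Returns true while more chunks remain, which is the
 * caller's cue to re-arm (scheduler event or timer, see the text above). */
static bool erase_step(erase_job_t *job)
{
    assert(job != NULL);

    if (job->next_offset >= EEPROM_SIZE_BYTES) {
        return false; /* job already complete, nothing erased */
    }

    eeprom_erase_chunk(job->next_offset, ERASE_CHUNK_BYTES);
    job->next_offset += ERASE_CHUNK_BYTES;
    job->chunks_done++;

    return job->next_offset < EEPROM_SIZE_BYTES;
}
```

Keeping the step function separate from the re-arm decision makes it easy to switch between the two variants discussed in this thread without touching the erase logic.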

  • Interesting. So, if I understand you correctly, you may have several 20 ms periods with no BLE event handling in between. I will run this by the SoftDevice team.

    You write: "Usually the connection terminates with a 0x28 error code. Then we get 0x16 after that."

    I assume you mean that you get disconnect reason 0x28, but I am not sure what you mean by 0x16. Is this an error code from a specific SoftDevice call, or are you getting two disconnects?

  • Yes, disconnect reason 0x28. 0x16 is also a BLE error code. The last one is probably due to a software error in our code. I could probably get a log file if you are interested.
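As a reference for the two codes discussed here, the sketch below maps a few disconnect reasons to their names from the Bluetooth Core Specification error code list. The helper name `hci_reason_str()` is hypothetical; in the nRF5 SDK the same values are available as constants in `ble_hci.h` (for example `BLE_HCI_INSTANT_PASSED` for 0x28 and `BLE_HCI_LOCAL_HOST_TERMINATED_CONNECTION` for 0x16). Only a handful of common codes are listed.

```c
#include <stdint.h>

/* Map a BLE disconnect reason (HCI error code) to a readable name.
 * Hypothetical helper for log output; the list is intentionally partial. */
static const char *hci_reason_str(uint8_t reason)
{
    switch (reason) {
    case 0x08: return "Connection Timeout";
    case 0x13: return "Remote User Terminated Connection";
    case 0x16: return "Connection Terminated by Local Host";
    case 0x22: return "LL Response Timeout";
    case 0x28: return "Instant Passed";
    case 0x3E: return "Connection Failed to be Established";
    default:   return "Unknown";
    }
}
```

"Instant Passed" relates to link-layer procedures that take effect at a specific connection event, such as connection parameter, channel map, or PHY updates.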

  • Do you think it is possible to get an on-air sniffer log? Or possibly just dump the BLE evt_id to UART, so we can try to find which specific event may be the problem here.

    The 0x16 error is "connection terminated by local host". Do you initiate or receive a connection update, PHY update, or channel map update on the device?

    Also, does this happen on a peripheral or a central?

    A quick calculation from your description suggests you may be blocking BLE events for 64 kB / 128 B × 20 ms ≈ 10 seconds. That is a long time, but it should work, so I would like to understand which specific event may be time critical.
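The estimate above can be reproduced with a few lines of C. This is only a sketch of the arithmetic; the function names are hypothetical, and it assumes 64 KB means 65536 bytes, which gives 512 chunks and about 10.2 seconds in total. Note that the total is spread over 512 separate scheduler events of roughly 20 ms each, which is the distinction the discussion turns on.

```c
#include <assert.h>
#include <stdint.h>

/* Number of chunks needed to cover total_bytes, rounding up. */
static uint32_t erase_chunk_count(uint32_t total_bytes, uint32_t chunk_bytes)
{
    assert(chunk_bytes != 0u);
    return (total_bytes + chunk_bytes - 1u) / chunk_bytes;
}

/* Total erase time in milliseconds if every chunk takes ms_per_chunk. */
static uint32_t erase_total_ms(uint32_t total_bytes, uint32_t chunk_bytes,
                               uint32_t ms_per_chunk)
{
    return erase_chunk_count(total_bytes, chunk_bytes) * ms_per_chunk;
}
```

For example, `erase_total_ms(64u * 1024u, 128u, 20u)` covers the 64 KB erase in 128-byte chunks at 20 ms per chunk.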

  • The whole premise of this issue report is that we should not be blocking BLE for 10 seconds, since we are using app_sched_event_put() to perform the operation in 128-byte chunks.

    We are using a bonded communication scheme so an external sniffer does not work. (we tried that)
