Race condition in nrf_sdh_freertos

There appears to be a design flaw in the nrf_sdh_freertos code that produces a race condition.  The race condition can delay the processing of a SoftDevice event or, depending on the application, cause a deadlock.

The race condition arises from the use of vTaskSuspend / xTaskResumeFromISR as the synchronization method.  In the following code in softdevice_task():

while (true)
{
    nrf_sdh_evts_poll(); // Let the handlers run first, in case the EVENT occurred before creating this task.
    vTaskSuspend(NULL);
}

if a SoftDevice interrupt happens immediately after the call to nrf_sdh_evts_poll(), but before the call to vTaskSuspend(), the call to xTaskResumeFromISR() from within the ISR will be treated as a no-op (because at that moment the task is not suspended).  Thus, once the interrupt completes, the task will proceed to suspend itself with the event still unprocessed, only to be woken if/when another interrupt occurs.
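
To make the interleaving concrete, here is a minimal host-side sketch in C that replays it step by step.  It is only a model: model_t and the model_* functions are hypothetical names invented for illustration, not FreeRTOS or nRF5 SDK code, and it assumes (as described above) that resuming a task that is not suspended does nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical host-side model; none of these names exist in FreeRTOS or the SDK. */
typedef struct {
    bool suspended;      /* task is in the Suspended state */
    bool event_pending;  /* a SoftDevice event is waiting to be polled */
} model_t;

/* Stands in for nrf_sdh_evts_poll(): handles the pending event, if any. */
static void model_poll(model_t *m)
{
    m->event_pending = false;
}

/* Stands in for the SoftDevice ISR: queue an event, then try to resume the task. */
static void model_isr(model_t *m)
{
    m->event_pending = true;
    /* Like xTaskResumeFromISR(): no effect unless the task is already suspended. */
    if (m->suspended) {
        m->suspended = false;
    }
}

/* Stands in for vTaskSuspend(NULL). */
static void model_suspend(model_t *m)
{
    m->suspended = true;
}

/* Replays the problematic ordering; returns true if an event is left stranded. */
static bool lost_wakeup_demo(void)
{
    model_t m = { false, false };

    model_poll(&m);        /* task: nothing to handle yet */
    model_isr(&m);         /* interrupt lands between poll and suspend */
    assert(!m.suspended);  /* the resume was a no-op: the task was not suspended */
    model_suspend(&m);     /* task: suspends with the event still pending */

    return m.suspended && m.event_pending;
}
```

In this model lost_wakeup_demo() returns true: the task ends up suspended while an event is still waiting, and it stays that way until some later interrupt resumes it.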

The use of task suspend / resume in this manner is an abuse of the FreeRTOS API that is specifically called out in the documentation for xTaskResumeFromISR:

xTaskResumeFromISR() is generally considered a dangerous function because its actions are not latched. For this reason it should definitely not be used to synchronise a task with an interrupt if there is a chance that the interrupt could arrive prior to the task being suspended, and therefore the interrupt being lost. Use of a semaphore, or preferable a direct to task notification, would avoid this eventuality. 

Another solution would be to use a FreeRTOS Event Group.
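
As a rough illustration of why a latched signal avoids the problem, the sketch below replays the same interleaving against a notification counter.  It is a host-side model, not FreeRTOS code: latched_model_t and the latched_* functions are hypothetical names.  The comments note the real FreeRTOS calls each step is meant to stand in for, vTaskNotifyGiveFromISR() in the ISR and ulTaskNotifyTake(pdTRUE, portMAX_DELAY) in the task; how those would be wired into softdevice_task() is left as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model of a latched signal; not FreeRTOS code. */
typedef struct {
    uint32_t notify_count;    /* latched: stays set until the task takes it */
    bool     event_pending;   /* a SoftDevice event is waiting to be polled */
    unsigned events_handled;  /* how many events the task has processed */
} latched_model_t;

/* Stands in for nrf_sdh_evts_poll(). */
static void latched_poll(latched_model_t *m)
{
    if (m->event_pending) {
        m->event_pending = false;
        m->events_handled++;
    }
}

/* ISR side. Stands in for: vTaskNotifyGiveFromISR(task_handle, &woken); */
static void latched_isr(latched_model_t *m)
{
    m->event_pending = true;
    m->notify_count++;  /* recorded even though the task is not waiting yet */
}

/* Task side. Stands in for: ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
 * Returns true if the task continues at once, false if it would block. */
static bool latched_take(latched_model_t *m)
{
    if (m->notify_count > 0) {
        m->notify_count = 0;  /* pdTRUE behaviour: clear the count on exit */
        return true;
    }
    return false;
}

/* Replays the same ordering as before; returns the number of events handled. */
static unsigned latched_demo(void)
{
    latched_model_t m = { 0 };

    latched_poll(&m);             /* task: nothing to handle yet */
    latched_isr(&m);              /* interrupt lands before the task waits */
    assert(m.notify_count == 1);  /* the signal was latched, not dropped */
    if (latched_take(&m)) {       /* returns immediately instead of blocking */
        latched_poll(&m);         /* next loop iteration handles the event */
    }
    return m.events_handled;
}
```

Here latched_demo() returns 1: because the notification is remembered, the wait returns immediately and the event is handled on the next pass through the loop rather than waiting for another interrupt.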

--Jay
