FreeRTOS Race Condition with Tickless Idle and Suspending Tasks

We've recently added FreeRTOS to the firmware of an existing product (primarily to pre-empt big chunks of 3rd-party code over which we have no control and execute them in a well-timed fashion). The goal is to achieve almost the same low-power properties as before adding FreeRTOS.
We're using SDK 14.0.0 and SoftDevice S132 v5 (I know, it's an old project).
The firmware is not really FreeRTOS-aware and communicates from the interrupts to the main context by setting flag variables.
To achieve the low power mode in FreeRTOS, we did the following:
1 suspended (vTaskSuspend) the main task instead of calling
2 set configUSE_TICKLESS_IDLE to 1 and configUSE_IDLE_HOOK to 0
3 defined configPOST_SLEEP_PROCESSING to be a function with the following body:

if (NVIC->ISPR[0] | NVIC->ISPR[1] | NVIC->ISPR[2])
{
    vTaskResume(m_main_task_handle);
}

This seems to work: the system remains responsive and the measured power consumption is as low as we want it to be.
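
A minimal sketch of how steps 2 and 3 might look in FreeRTOSConfig.h, assuming the function holding the NVIC->ISPR check is called post_sleep_processing (a hypothetical name, not taken from the project). FreeRTOS defines configPOST_SLEEP_PROCESSING as a macro with one parameter; the sketch assumes the port passes the expected idle time there and simply ignores it.

```c
/* FreeRTOSConfig.h (excerpt, illustrative only) */
#define configUSE_TICKLESS_IDLE  1
#define configUSE_IDLE_HOOK      0

/* Hypothetical function containing the NVIC->ISPR check and vTaskResume. */
void post_sleep_processing(void);

/* The macro argument (assumed to be the expected idle time) is unused. */
#define configPOST_SLEEP_PROCESSING(x)  post_sleep_processing()
```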

However, I was thinking about the theoretical correctness of the code, that is, whether the system neither sleeps too little nor too much. This does not seem to be easy to achieve.
My thoughts are:
The first call to sd_app_evt_wait after switching to the idle task from the main task always returns immediately, because software interrupts are used in task switching and sd_app_evt_wait does not sleep if an interrupt has happened since it was last called.
Hence, if I remove the check for pended interrupts in the configPOST_SLEEP_PROCESSING handler, the system basically never sleeps.
Interrupts that happen during sleep are pended, since the sd_app_evt_wait inside port_cmsis_systick.c is called in a critical section. This allows subsequent inspection of the NVIC->ISPR registers.
With this configuration, my code catches every reason to unsuspend the main task that arises during the second call to sd_app_evt_wait. But if an interrupt happens before the first call to sd_app_evt_wait, for example, we cannot know, since that call returns immediately and clears the event register.

In other words, the race condition that is not present in ssd_app_evt_wait (or in the sequence WFE, SEV, WFE) is present in my way of calling vTaskSuspend and vTaskResume. The system will not sleep too little, but it might sleep too much.
Is there some other condition that I can check and base resumption of the main task on?

Thanks

0 A Knecht over 4 years ago

Hi

Thanks for the tip concerning SDK 17.2. I'll copy over the new code.
Concerning configPOST_SLEEP_PROCESSING, I'm not calling it manually at all. It is called from vPortSuppressTicksAndSleep, defined in port_cmsis_systick.c. The only things I'm calling are vTaskSuspend for the main task (enabling the idle task to run) and vTaskResume from within configPOST_SLEEP_PROCESSING, as in the snippet in the original post.

Thanks
0 Susheel Nuguru over 4 years ago in reply to A Knecht

Thanks for clearing that up. Can you please run the same test with FreeRTOS from SDK 17.2 and check whether you see the same issue? If so, I will have to spend more time on this.
0 A Knecht over 4 years ago in reply to Susheel Nuguru

It's not about an observable behavior, but about theoretical correctness.

- sd_app_evt_wait gets called inside a critical region. Hence interrupts that wake the CPU and resume code execution after sd_app_evt_wait will be pended and their presence can be checked via the NVIC->ISPR flags.
- sd_app_evt_wait will not sleep if an interrupt has occurred since it was last called, according to the note in the doxygen comment:

"If an application interrupt has happened since the last time sd_app_evt_wait was called this function will return immediately and not go to sleep. This is to avoid race conditions that can occur when a flag is updated in the interrupt handler and processed in the main loop."

- Due to the way FreeRTOS is written, however, reaching this code after running some other task implies that an interrupt has occurred because task switching is implemented using software interrupts. Hence, sd_app_evt_wait will not sleep.
- If no task is executable after running vPortSuppressTicksAndSleep, it will get called again from the loop in prvIdleTask. This time it will sleep if no interrupt has occurred.

Hence, if I do not check the NVIC->ISPR flags in my configPOST_SLEEP_PROCESSING handler and simply resume the main task, the system will never sleep since every call to sd_app_evt_wait is preceded by an interrupt (the one switching to the idle task).
If I do check the NVIC->ISPR flags in my configPOST_SLEEP_PROCESSING I might sleep too much due to a very similar race condition that the non-FreeRTOS use of sd_app_evt_wait is free of.

Say I have an interrupt that sets some atomic variable that leads the main loop to do something the next time it runs and given the following sequence switching away from the main task:

1 vTaskSuspend(main task)
2 switch over to idle task via software interrupt
3 prvIdleTask calls portSUPPRESS_TICKS_AND_SLEEP
4 the critical region is entered
5 sd_app_evt_wait is called, returns immediately
6 configPOST_SLEEP_PROCESSING is called
7 configPOST_SLEEP_PROCESSING checks the NVIC->ISPR flags and decides the main task must not yet be unsuspended
8 the critical region is exited
9 prvIdleTask calls portSUPPRESS_TICKS_AND_SLEEP
10 the critical region is entered
11 sd_app_evt_wait is called, this time it sleeps
12 ...

If the interrupt that sets the atomic variable happens anytime after 1 and before 4, the main task may not get unsuspended until much later, by another interrupt. It cannot be detected via the NVIC->ISPR flags because it happened before the critical region was entered, and it will not cause the second sd_app_evt_wait to return immediately (it would do so for the first sd_app_evt_wait, but that one returns immediately in all cases because of the software-interrupt-based task switching).

Are these correct assumptions?
Let me know if this helps or if I should attempt some minimal code example outlining my concern.
0 Susheel Nuguru over 4 years ago in reply to A Knecht
A Knecht said:
Say I have an interrupt that sets some atomic variable that leads the main loop to do something the next time it runs and given the following sequence switching away from the main task:

Can't you include the atomic variable in your if condition? For example:

Before you suspend your main task, clear the atomic variable and then call vTaskSuspend(main_task).

In your interrupt service handler, set the atomic variable.

In configPOST_SLEEP_PROCESSING:
if ((NVIC->ISPR[0] | NVIC->ISPR[1] | NVIC->ISPR[2]) || (atomic_variable == 1))
{
    vTaskResume(m_main_task_handle);
    atomic_variable = 0;
}

I think the only case in which the atomic_variable term of that if statement makes a difference is the corner case you mentioned.

Do you agree?