race condition in app_timer

I've been looking at the app_timer source after an odd lockup I had. I think it has a race condition in it.

The code that adds events to the pending events list is usually called from user code. It mutates the list of events and then calls timer_list_handle_sched(), which pends the SWI0 interrupt. In most cases that interrupt fires immediately and adds the new events. SWI0 and RTC1 have the same priority, so neither can preempt the other and they are kept from mutating the internal timer list at the same time.

However, timer_timeouts_check() also calls timer_list_handle_sched(), and timer_timeouts_check() is called from RTC1_IRQHandler(), amongst others.

The race condition would therefore appear to go something like this. A user-mode call to, say, app_timer_start() calls timer_start_op_schedule(), which gets an operation and the last index, fills in the new operation over several lines of code, and finally adds it to the queue with op_user_enqueue() using that last index.

However, if the RTC1 timer interrupt occurs during this code, it will eventually call timer_list_handle_sched(). As soon as the interrupt finishes, the pended SWI0 interrupt runs timer_list_handler(), which mutates the same list that the app_timer_start() code was in the middle of mutating, removing events from it.

When execution returns to timer_start_op_schedule(), it adds the operation where the end of the list used to be, but the list has since changed.
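To make the sequence concrete, here is a minimal sketch of the hazard as I understand it. It is a simplified model and not the app_timer source: the types and functions (op_queue_t, handler_drain(), schedule_op(), race_demo()) are made up for illustration, and the "interrupt" is just a function call placed between reading the index and publishing it. Whether the real queue fails in exactly this way depends on how the SDK implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 4

/* Hypothetical model of a pending-operation list; NOT the app_timer source. */
typedef struct {
    int    ops[QUEUE_SIZE];
    size_t count;      /* number of pending entries                  */
    size_t processed;  /* total entries the handler has consumed     */
} op_queue_t;

/* Stand-in for the handler that runs at interrupt level: it consumes
   every pending operation and leaves the list empty. */
void handler_drain(op_queue_t *q)
{
    q->processed += q->count;
    q->count = 0;
}

/* Stand-in for the user-level scheduling path. When preempt is true, the
   handler runs between reading the index and publishing it, which models
   the interrupt arriving in the middle of the update. */
void schedule_op(op_queue_t *q, int op, bool preempt)
{
    size_t last = q->count;      /* 1. snapshot the end of the list   */
    assert(last < QUEUE_SIZE);
    q->ops[last] = op;           /* 2. fill in the new operation      */

    if (preempt) {
        handler_drain(q);        /* "interrupt" mutates the same list */
    }

    q->count = last + 1;         /* 3. publish using the stale index  */
}

/* Schedules one operation on top of two pending ones, lets the handler
   run once more, and returns how many entries the handler consumed. */
size_t race_demo(bool preempt)
{
    op_queue_t q = { .ops = { 10, 20 }, .count = 2 };
    schedule_op(&q, 30, preempt);
    handler_drain(&q);
    return q.processed;
}
```

In this model race_demo(false) returns 3, one per operation, while race_demo(true) returns 5: the two entries the handler had already consumed are brought back by the stale index and handled a second time.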

It seems that allowing any way for the SWI0 interrupt to be triggered from the RTC interrupt can lead to cases where the RTC interrupt fires while the event queue is being mutated in user code. When the user code continues, the list is no longer the same and can get corrupted.
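One possible way to close a window like this, sketched under the same assumptions, is to hold off the interrupt for the whole read-index, fill, publish sequence. Everything below is a hypothetical model (guarded_queue_t, irq_fire(), critical_enter(), critical_exit(), guarded_demo() are my own names); on real hardware something like CMSIS __disable_irq()/__enable_irq() or an equivalent critical-region primitive would play the role of the mask flag. I have not checked this against the SDK, so treat it as an idea rather than a patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 4

/* Hypothetical model; none of these names come from the app_timer source. */
typedef struct {
    int    ops[QUEUE_SIZE];
    size_t count;        /* number of pending entries                  */
    size_t processed;    /* total entries the handler has consumed     */
    bool   irq_masked;   /* true while the simulated interrupt is held */
    bool   irq_pending;  /* interrupt arrived while it was masked      */
} guarded_queue_t;

/* Stand-in for the interrupt-level handler: consumes all pending entries. */
void guarded_drain(guarded_queue_t *q)
{
    q->processed += q->count;
    q->count = 0;
}

/* Simulated interrupt: runs the handler now, or defers it if masked. */
void irq_fire(guarded_queue_t *q)
{
    if (q->irq_masked) {
        q->irq_pending = true;
    } else {
        guarded_drain(q);
    }
}

void critical_enter(guarded_queue_t *q)
{
    q->irq_masked = true;
}

void critical_exit(guarded_queue_t *q)
{
    q->irq_masked = false;
    if (q->irq_pending) {        /* deliver the deferred interrupt */
        q->irq_pending = false;
        guarded_drain(q);
    }
}

/* Same scenario as the race: the interrupt arrives mid-update, but it is
   deferred until the list is consistent again. Returns how many entries
   the handler consumed. */
size_t guarded_demo(void)
{
    guarded_queue_t q = { .ops = { 10, 20 }, .count = 2 };

    critical_enter(&q);
    size_t last = q.count;       /* snapshot the end of the list       */
    assert(last < QUEUE_SIZE);
    q.ops[last] = 30;            /* fill in the new operation          */
    irq_fire(&q);                /* interrupt arrives, gets deferred   */
    q.count = last + 1;          /* index is still valid when published */
    critical_exit(&q);           /* handler now sees all three entries */

    return q.processed;
}
```

Here guarded_demo() returns 3, so each operation is consumed exactly once even though the interrupt arrived mid-update. The cost is a short period with the interrupt held off, which may or may not be acceptable depending on what else shares that priority level.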

Anything missed here?
