
Bug in app_timer?

I've been seeing occasional failures of my application which I now believe are due to a bug / design flaw in the app_timer library module.

The problem is a race hazard that occurs if a nested interrupt stops and restarts a timer while timer_list_handler() is executing, specifically if the interrupt occurs between processing the timer list deletions and the insertions.

The issue is that there is no protection around the timer operation queue manipulation in timer_list_handler.

If a higher-priority ISR runs after timer_list_handler has performed the timer deletions but before it performs the insertions from the op list, the stop operation queued by that ISR is not processed before its start. When the start request is then processed in list_insertions_handler, the timer has not been removed from the active timer list, yet its isRunning flag is false. The test of this flag at app_timer.c:584 therefore fails, continue is not executed, and the code proceeds to add the timer to the list a second time.

This leads to a timer node having its next pointer point to itself, which subsequently causes an infinite loop the next time the timer list is traversed.
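To make the failure mode concrete, here is a minimal stand-alone model of it. This is only a sketch, not the SDK code: model_node_t, model_list_insert(), model_list_remove() and model_race() are hypothetical names, and the list is a simplified sorted singly linked list. The only behaviour taken from the description above is that a node can still be linked while its running flag has already been cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified timer node; NOT the SDK's real node type. */
typedef struct model_node {
    uint32_t           ticks;       /* expiry time, used as the sort key */
    bool               is_running;  /* guard flag checked before insertion */
    struct model_node *next;
} model_node_t;

static model_node_t *m_head = NULL;

/* Sorted insert that trusts is_running to mean "already in the list". */
static void model_list_insert(model_node_t *t)
{
    if (t->is_running) {
        return;                     /* believed to be linked already */
    }
    t->is_running = true;

    if (m_head == NULL || t->ticks < m_head->ticks) {
        t->next = m_head;
        m_head  = t;
        return;
    }

    model_node_t *prev = m_head;
    while (prev->next != NULL && prev->next->ticks <= t->ticks) {
        prev = prev->next;
    }
    t->next    = prev->next;
    prev->next = t;                 /* if prev == t, t now points to itself */
}

/* Unlink the node (if present) and clear its flag. */
static void model_list_remove(model_node_t *t)
{
    model_node_t **pp = &m_head;
    while (*pp != NULL && *pp != t) {
        pp = &(*pp)->next;
    }
    if (*pp == t) {
        *pp = t->next;
    }
    t->next       = NULL;
    t->is_running = false;
}

/* Returns true if timer b ends up with its next pointer pointing to itself. */
static bool model_race(bool stop_processed_before_start)
{
    static model_node_t a, b;

    m_head = NULL;
    a = (model_node_t){ .ticks = 10 };
    b = (model_node_t){ .ticks = 20 };
    model_list_insert(&a);
    model_list_insert(&b);

    if (stop_processed_before_start) {
        model_list_remove(&b);      /* normal order: deletion, then insertion */
    } else {
        b.is_running = false;       /* stop missed the deletion pass:
                                       flag cleared, node still linked */
    }

    model_list_insert(&b);          /* the restart */
    return b.next == &b;
}
```

In this model, model_race(true) leaves the list intact, while model_race(false) leaves b linked to itself, so any loop that walks the list until it reaches NULL never terminates.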

I have put together and attached a small IAR project that demonstrates the issue. To reproduce it, I generate a software interrupt at the point where a hardware interrupt would cause the problem. The software interrupt handler runs at a higher priority than the timer handler, so it manipulates the operation list between the deletion and insertion steps by stopping and restarting a timer. I've also added an assertion at the end of timer_list_insert() to check that the timer's next pointer does not point to itself; it is this assertion that fails.

The software interrupt is generated when timer_list_handler() runs and a global boolean flag is set. The flag is set as a result of a press of button1.

Thus to demonstrate the problem, start the test program and press the button. An assert failure will result.

NordicAppTimerBug.zip

PCA10040 SDK 12.2 SD=132 rev3 (flashed but not enabled)
