Beware that this post is related to an SDK in maintenance mode
More Info: Consider nRF Connect SDK for new designs

App timer (and atomic fifo) no memory

Hi,

I've encountered a problem with app timer functions being called from different interrupt levels. It looks like the NO_MEMORY error returned is caused by a collision in the underlying atomic FIFO library. This seems like unexpected behaviour, as the documentation gives no information about it.

Setup:
nRF5_SDK_17.1.0
gcc-arm-none-eabi-10-2020-q4-major (tested on both Windows and macOS)
gcc-arm-none-eabi-9-2019-q4-major (tested on macOS)

nRF52840_XXAA
App timer v2
no rtos

The problem as originally encountered:
From the main application thread, the SAADC was configured and started to perform a conversion. After triggering the SAADC operation, an app timer was started to generate a timeout event for the SAADC (we had a previous problem with the SAADC driver hanging and not handling its interrupt properly). In the SAADC driver event handler, app timer stop was called. The app timer interrupt level was set to 7 and the SAADC to 6. Randomly (due to the SoftDevice and an active BLE connection), calling app timer start from the main thread returned a NO_MEMORY error. A first investigation showed that app timer start, and the subsequent underlying RTC interrupt, was in turn interrupted by the SAADC callback. This interruption of the start operation's scheduling caused it to return the NO_MEMORY error.
I'm aware that a NO_MEMORY error can be returned when the timer library is under high load and the operation queue is full. One of the first attempted solutions was increasing the timer operation queue from 10 to 20 entries, but it didn't change anything; the error was returned just as frequently. Moreover, enabling the app timer profiler (APP_TIMER_WITH_PROFILER=1) completely solved the problem, as internally the queue element allocation (which was failing) is then surrounded by a critical section.
I've created simple project (zip will be attached to this ticket) that can reliably reproduce this issue.
To reproduce the problem, a slight modification was made so that the SAADC is not used and its delay (plus the SD interrupt) is simulated with something more predictable. Instead of the SAADC, a timer is used to get a constant interrupt delay. As timers are synchronous to the 16 MHz clock, the test program also has to be synchronized with it. The main thread is elevated to interrupt level 7 by an additional timer. All other interrupt levels are increased by 1 compared to the original problem: the app timer (RTC) to 6 and the timer simulating the SAADC to 5. The delay of the simulating timer is changed in a loop, and the return code from app_timer_start is monitored.
I was able to reproduce the issue with different compilers and different optimization levels.
I haven't looked deeper into it, but it looks like reentrancy or some other collision in the atomic FIFO is causing this issue. Or maybe I simply have something configured incorrectly.
Best regards
Tomasz
P.S. Regarding the mentioned SAADC driver problem, I will also create a ticket once I am able to reproduce it somewhat reliably.
  • Hi,

    Thanks for the detailed report. I will try your example and see if I can replicate this here tomorrow.

    Best regards,

    Vidar

  • Update: I was able to reproduce the problem here.

    I agree that the issue appears to stem from the atomic FIFO. However, I am not sure if it's a bug in the library itself, or if it is being used incorrectly by the app timer. The documentation does note some limitations:

    Handling interruptions

    The implementation assumes that the possible interruption of FIFO access operation works like a stack. A task has to finish the whole access before exiting. It might be interrupted by another task, but this interrupting task must also finish its access. The following is the underlying mechanism of handling interruptions:

    1. Task A allocates space in FIFO.
    2. Interrupt interrupts the access and allocates space in FIFO.
    3. Interrupt pushes a new value into the allocated space.
      • During pushing, it detects that access was interrupted before.
      • The new data is not yet available, but the space in the FIFO is reserved and the data is present there.
    4. Task A pushes data into the allocated space.
      • During pushing, it detects that another piece of data is present.
      • Two new data items are available now.

    Multithreaded preemptive operating system limitation

    This kind of atomic FIFO access works well if access can be split between the main loop and any interrupts. It will not work however, if the FIFO access is distributed between different tasks in a multithreaded preemptive operating system. The problem with a multithreaded preemptive operating system is that the operation might be interrupted by another operation, which would not be finished before going back to the previous operation. For example:

    1. Task A allocates space in FIFO.
    2. Task B interrupts and allocates space in FIFO.
    3. Task A interrupts and pushes data into the FIFO. A problem arises:
      • Task A detects that there is new data in the FIFO.
      • Task A marks two pieces of data available - the one from Task A and the one from Task B.
      • However, there is no valid data from Task B as the space was only booked and no data was set.
    4. Task B interrupts and pushes its data. Now the FIFO state is valid, but there was a moment when data was corrupted.

    I will report this as a bug internally. In the meantime, can you work around the bug by adding the critical regions in app_timer.c like APP_TIMER_WITH_PROFILER does?

    Best regards,

    Vidar

  • Hi, thanks for detailed answer.

    I'm aware of the limitations of the atomic FIFO library, but they indeed appear to be a problem only when using an operating system, which is not the case in this scenario.

    Sure, I can work around with critical regions for now. 

    I'm curious whether it's a problem with the atomic FIFO itself or with how the app timer uses it. If I mark your answer as "verified", will you still be able to update the thread with this bit of information?

  • Hi,

    I wonder if you may get into a similar situation (although not identical) if the FIFO access is preempted by a third interrupt. Another thing is that the documentation does not mention anything about the allocation failing, but describes a window where an allocated FIFO element may be read before the data has been written to it, or that's how I interpret it, at least. I hope to get some clarification from the developers.

    I'm curious whether it's a problem with the atomic FIFO itself or with how the app timer uses it. If I mark your answer as "verified", will you still be able to update the thread with this bit of information?

    Yes, I will make sure to update the thread if I hear back from the developers.

  • Hi, the developer found the problem to be in the atomic FIFO implementation and proposed the following patch (GCC only):

    diff --git a/sdk/nrf5/components/libraries/atomic_fifo/nrf_atfifo_internal.h b/sdk/nrf5/components/libraries/atomic_fifo/nrf_atfifo_internal.h
    index f81abe6218..518a8ffe92 100644
    --- a/sdk/nrf5/components/libraries/atomic_fifo/nrf_atfifo_internal.h
    +++ b/sdk/nrf5/components/libraries/atomic_fifo/nrf_atfifo_internal.h
    @@ -334,18 +334,18 @@ bool nrf_atfifo_wspace_req(nrf_atfifo_t * const p_fifo, nrf_atfifo_postag_t * co
             "   ldrh  %[temp], [%[p_fifo], %[offset_head_wr]]               \n"
             "   cmp   %[new_tail], %[temp]                                  \n"
             "   ittt  eq                                                    \n"
    -        "   clrexeq                                                     \n"
             "   moveq %[ret], %[false_val]                                  \n"
    +        "   moveq %[new_tail], %[old_tail]                              \n"
             "   beq.n 2f                                                    \n"
             "                                                               \n"
             "   pkhbt %[new_tail], %[new_tail], %[old_tail]                 \n"
             "                                                               \n"
    +        "   mov %[ret], %[true_val]                                     \n"
    +        "                                                               \n"
    +        "2:                                                             \n"
             "   strex %[temp], %[new_tail], [%[p_fifo], %[offset_tail]]     \n"
             "   cmp   %[temp], #0                                           \n"
             "   bne.n 1b                                                    \n"
    -        "                                                               \n"
    -        "   mov %[ret], %[true_val]                                     \n"
    -        "2:                                                             \n"
             : /* Output operands */
                 [ret]     "=r"(ret),
                 [temp]    "=&r"(temp),
    @@ -413,8 +413,8 @@ bool nrf_atfifo_rspace_req(nrf_atfifo_t * const p_fifo, nrf_atfifo_postag_t * co
             "   ldrh  %[temp], [%[p_fifo], %[offset_tail_rd]]           \n"
             "   cmp   %[new_head], %[temp]                              \n"
             "   ittt  eq                                                \n"
    -        "   clrexeq                                                 \n"
             "   moveq %[ret], %[false_val]                              \n"
    +        "   moveq %[new_head], %[old_head]                          \n"
             "   beq.n   2f                                              \n"
             "                                                           \n"
             "   ldrh  %[temp], [%[p_fifo], %[offset_item_size]]         \n"
    @@ -426,12 +426,11 @@ bool nrf_atfifo_rspace_req(nrf_atfifo_t * const p_fifo, nrf_atfifo_postag_t * co
             "                                                           \n"
             "   pkhbt %[new_head], %[old_head], %[new_head], lsl #16    \n"
             "                                                           \n"
    +        "   mov %[ret], %[true_val]                                 \n"
    +        "2:                                                         \n"
             "   strex %[temp], %[new_head], [%[p_fifo], %[offset_head]] \n"
             "   cmp   %[temp], #0                                       \n"
             "   bne.n 1b                                                \n"
    -        "                                                           \n"
    -        "   mov %[ret], %[true_val]                                 \n"
    -        "2:                                                         \n"
             : /* Output operands */
                 [ret]     "=r"(ret),
                 [temp]    "=&r"(temp),
