This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Zephyr uarte library missing UART_RX_BUF_RELEASED events

I run newest nRF53 chip as of writing with 1.6.0 nRF Connect SDK from nRF Connect Toolchain manager 3.6.1 using SES 5.50c.

I use a array of 5 rx-buffers (Increased from 3 to figure out what was going on) and handle UART_RX_BUF_REQUEST and UART_RX_BUF_RELEASED events in the registered callback.
Most of the time it runs fine but during a continuous test I notice that at random intervalls it may skip a UART_RX_BUF_RELEASED event which leads my application to detect a request of using a buffer that is not yet freed. If these buffers had been allocated dynamically this would have caused a memory leak. Instead I catch it and log it while actually allowing the reassigning of already assigned buffer.

Since I use the buffers round-robin and process data (transfer to other threads) during any UART_RX_RDY events I do not miss data but this behavior is worrying.
Is there a race-condition in the library?

Parents
  • Hi Kyrre,

    Sounds very much like a race condition. I need to understand more about the chain of events that happened before digging into the code. Could you please help me understand with these questions.

    1. Your application seems to be tracking which buffer is free and not instead of relying solely on these events. Is it possible that the application tracking logic might have a race condition? Could you rule that out?
    2. Could you provide me the chain of UART events that are received by your application (a log file with a way to see EVENTS/buffer mapping would help).
    3. What other application peripheral interrupts are enabled and being used when your application detected that a request arrived for a buffer that is not yet free?
Reply
  • Hi Kyrre,

    Sounds very much like a race condition. I need to understand more about the chain of events that happened before digging into the code. Could you please help me understand with these questions.

    1. Your application seems to be tracking which buffer is free and not instead of relying solely on these events. Is it possible that the application tracking logic might have a race condition? Could you rule that out?
    2. Could you provide me the chain of UART events that are received by your application (a log file with a way to see EVENTS/buffer mapping would help).
    3. What other application peripheral interrupts are enabled and being used when your application detected that a request arrived for a buffer that is not yet free?
Children
  • The logs are very long before things happen so I had to use pastebin. Bottom of the log is the relevant part but all is included.

    Log of REQ and FREE

    The app keeps track of this array separately but it prints each event as they occur in the log.
    It is clear that a free event does not happen. It does ask for two buffers in quick succession without freeing buffer 3 though, but frees 3 right after.

  • Here is the same log with RX events as well. I had to compact it a bit to fit within 512k pastebin limits.
    https://pastebin.com/15AJpMP9

  • My application checks the buffer reference and compares to it's list of buffers to identify the buffer number being freed.
    The buffers are allocated statically so these values/pointers never change.

    Buf FREE x means it has identified the referenced buffer as buffer number x from 0 to 4.

    It always responds to a buffer request with the next buffer in succession and starts the system by setting up the enable rx call using buffer 0. The Buf 1 request event has been logged inbetween something else during startup and thus was missing in the log but happened correctly.

    Since I run a total of 5 buffers to debug this issue I see it usually always frees a buffer immediately once full, however this does not appear to always happen which leads me to believe the problem is either in the hardware or in the library during the handling of the events. I am thinking that there may exist a short timeframe where a flag may be set by hardware buy is cleared by software before or after the execution of the rx events callback etc.

  • Thanks for the detailed explanation. 

    Kyrre Aalerud said:
    My application checks the buffer reference and compares to it's list of buffers to identify the buffer number being freed.

     This above explanation from you tells me that this most likely is a race condition in the uart driver where it missed this released event totally. I will create an internal ticket for the bug.

    It would be interesting info from you if you can confirm if the buffer with the missed released event had its buffer fully/partially filled? and also if the data in that buffer was relavant? Also would it be possible for you to give us a minimalistic project to reproduce this as this seems very timing related. This information will help the engineers focus on the code sections which are which cause this behavior. 

    Once again Thanks a lot for such a detail explanation and your experiments. But for now, since you are not doing dynamic allocation on request events and able to manage the round robin buffers in a good way (and also not losing data due to this bug) we hope that this bug will be fixed in the next releases.

  • for your reference, the internal ticket for the low level driver team to look at this is NCSDK-10801

Related