This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

libUARTE bug: Unexpected RX free input parameter

We are more or less successfully using libUARTE for talking to an STM32 at 1M Baud. 
It was a long and tedious road to get there, but it looks like we finally got there

The biggest struggle was getting an idle line timeout to copy partially filled data, as our protocol uses variable sized packets. 

However, every now and again, we get this error:

libUARTE_async      : Unexpected RX free input parameter. length 2 rx_free_cnt 129 rx_buf_size 128

As you can see, we're only asking the libuarte to free 2 bytes we just copied, but internally, llibUARTE adds this to rx_free_cnt and tries to free that. 

void nrf_libuarte_async_rx_free(const nrf_libuarte_async_t * const p_libuarte, uint8_t * p_data, size_t length)
{

    p_libuarte->p_ctrl_blk->rx_free_cnt += length;
    ...
}

We see this seldomly, but more than just a handful of times. 

I have an assumption, as something similar happened to us on the STM, where we wrote our own driver with DMA, double buffering and an idle line timeout. It turned out to be a race condition where two different interrupts called the same callback function and the second interrupt interrupted the first one, and changed some of the variables under our noses. 

so we had to disable the interrupts until we finished with the Interrupt service routine, or at least the part where we handled data from outside the current scope. 

This also happened seldomly, was impossible to reproduce, but caused a lot of trouble when it did occur. 

Could this also be the case for libUARTE? After all, the callbacks to our code, (where we increment the head of our buffer to mach what DMA already copied, then call the above rx_free() function), come from either the timer interrupt, (when no bytes have been transmitted for a while), or the uart/dma interrupt (when a buffer is full, and we switch to a new one), if I understood libUARTE correctly. If this is the case, could it not be that the one interrupt interrupts the other? and if so, did you safeguard the data being handled in those interrupts?

for completeness sake, here's the code where we handle the NRF_LIBUARTE_ASYNC_EVT_RX_DATA event:

 static void libuarte_event_handler(void * context, nrf_libuarte_async_evt_t * p_evt) {
	instance_t * instance = (instance_t *) context;
    switch (p_evt->type)
    {
        ...
        case NRF_LIBUARTE_ASYNC_EVT_RX_DATA:
            if (p_evt->data.rxtx.p_data != NULL) {
            uint8_t * data = p_evt->data.rxtx.p_data;
            uint32_t len = p_evt->data.rxtx.length;
            uart_rx_complete_callback(instance, len, data);
            nrf_libuarte_async_rx_free(instance->uart.uart_instance_ptr, p_evt->data.rxtx.p_data, p_evt->data.rxtx.length);
            ...
            }
        ...
    }
}

Parents Reply Children
  • yes, the issue is not gone. It's gotten harder to get, and I've noticed that it usually comes after some other uart errors. 
    The case where I most often saw it, was when waking from sleep mode, the nordic wakes the STM, both start UART, and maybe the STM was a bit faster at starting UART, so Nordic would get some errors when uart started. When this was the case, a couple of ms down the line, I'd then get the rx free error. I found a log-file which contains both of these issues:

    unleashed_core.c         : 395:  919291: change_conn_params wish 0
    unleashed_core.c         :1001:  919291: Waking up: App active
    ble_stack.c              : 728:  919291: changed connection parameters to ACTIVE
    unleashed_core.c         :1024:  919291: Waking up IMCU...
    wr1_config.c             :  93:  919292: Instance 0 speed 1000000
    wr1_hardware.c           : 119:  919292: UART DMA Init ins 0 baud 1000000
    libUARTE            : (evt) UARTE ERROR Cwr1_hardware.c           :  62:  919292: NRF_LIBUARTE_ASYNC_EVT_ERROR errorSrc xC
    libUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR ClibUARTE            : (evt) UARTE ERROR C
    unleashed_stm_control_set_ready_to_receive 1
    wr1_uart.c               : 199:  919294: Instance 0 uart init baud 1000000
    wr1_packets.c            : 385:  919294: packets init
    imcu_master.c            :  84:  919294: IMCU master initialized
    wr1_packets.c            : 263:  919499: Ins 0: pkt timeout id 00 cmd 0x6F 0x55 0x0A tries left 24
    unleashed_stm_control_get_ready_to_receive 0
    wr1_uart.c               : 162:  919509: Ins 0: RTR == 0
    wr1_frames.c             :  76:  919560: Instance 0 startbyte not first, deleting frame:               000:  80 (1 bytes)
    wr1_packets.c            : 248:  919561: Ins 0: Got 1 framing errors
    wr1_packets.c            : 263:  919710: Ins 0: pkt timeout id 01 cmd 0x6F 0x55 0x0A tries left 23
    unleashed_stm_control_get_ready_to_receive 1
    wr1_packets.c            : 209:  919722: Ins 0: Rx retry packet: cmd 0xEF, id 0x00 (retry 3), datalength 2
    wr1_packets.c            : 203:  919722: Ins 0: Received an unexpected packet: cmd 0xEF, id 0x 1, datalength 2
    wr1_packets.c            : 204:  919723: Packet data: 55 01 
    wr1_packets.c            : 203:  919725: Ins 0: Received an unexpected packet: cmd 0xEF, id 0x 2, datalength 2
    wr1_packets.c            : 204:  919725: Packet data: 55 02 
    wr1_packets.c            : 263:  919925: Ins 0: pkt timeout id 03 cmd 0x6F 0x10 0x9C tries left 24
    wr1_packets.c            : 209:  919937: Ins 0: Rx retry packet: cmd 0xEF, id 0x03 (retry 2), datalength 2
    wr1_packets.c            : 203:  919939: Ins 0: Received an unexpected packet: cmd 0xEF, id 0x 4, datalength 2
    wr1_packets.c            : 204:  919939: Packet data: 10 10 04 
    unleashed_model.c        : 390:  919940: IMCU slave enabled
    wr1_packets.c            : 186:  919943: Ins 0: Got a request frame available (cmd 0x6F, id 0x 6)
    wr1_commands_extended.c  :1054:  919943: Subsystems update: Controls 0 | Settings 0 | GPS 0
    wr1_packets.c            : 329:  919944: Ins 0: Will delete reply cmd 0xEF loc 4
    wr1_commands_extended.c  : 933:  919946: Camera controls enabled
    wr1_packets.c            : 186:  919950: Ins 0: Got a request frame available (cmd 0x6F, id 0x 8)
    wr1_packets.c            : 329:  919950: Ins 0: Will delete reply cmd 0xEF loc 6
    unleashed_core.c         :1045:  919951: Wake up complete: Going to state IDLE
    unleashed_core.c         :1138:  919952: Going to state ACTIVE
    wr1_packets.c            : 186:  919954: Ins 0: Got a request frame available (cmd 0x6F, id 0x A)
    libUARTE_async      : Unexpected RX free input parameter. length 2 rx_free_cnt 129 rx_buf_size 128


    because so much is happening at a very high speed, enabling more debug output usually helps prevent race conditions, and since I was seeing the issue so seldomly, I doubt I'll be able to catch it with more debug output.

    what debug output would help? (maybe I can selectively turn on some of it?)

  • Um .. "After all, the callbacks to our code, (where we increment the head of our buffer to mach what DMA already copied, then call the above rx_free() function)"

    Doesn't that mean _rx_free is called twice (if called in the callback)? Might be an issue since p_evt-> data and length is not guaranteed to be the latest (updated) values .. (edit: if rx_free not called in callback just ignore this)

        uart_rx_complete_callback(instance, len, data);
        nrf_libuarte_async_rx_free(instance->uart.uart_instance_ptr, p_evt->data.rxtx.p_data, p_evt->data.rxtx.length);

  • Good idea, but no - my fault though: Inconsistent naming!

    with "the callbacks to our code", I meant libuarte calling our libuarte_event_handler() 
    and NOT the callback in our code (ie. libuarte_event_handler() calling uart_rx_complete_callback() )

    the rx_free() I pasted the original question is the only place we ever call it.

  • there might be something wrong when it comes to recovering from the error. I will need to look deeper into that. Will come back to you after Easter.

Related