Simultaneous communication on 2 sync libUARTe

Hi.

We are using an nRF52833 (on a BL653 module) where UART0 and UART1 are both used for communication (one with an STM MCU and the other with a Quectel modem). The current solution uses SDK 17.0.2, a SoftDevice and FreeRTOS, with the async libuarte library handling UART communication. The main problem is that every couple of minutes we get a HardFault, but only when both UARTs are active. If we disable one of them, the code seems to work as expected.

Currently, the UARTs are defined as (using app_timer):

NRF_LIBUARTE_ASYNC_DEFINE(libuarte, 1, 4, NRF_LIBUARTE_PERIPHERAL_NOT_USED, NRF_LIBUARTE_PERIPHERAL_NOT_USED, 1024, 3);

NRF_LIBUARTE_ASYNC_DEFINE(quectel_libuarte, 0, 3, NRF_LIBUARTE_PERIPHERAL_NOT_USED, NRF_LIBUARTE_PERIPHERAL_NOT_USED, 1024, 3);

We found that using a dedicated TIMER (TIMER2) works worse. Before switching to app_timer, we used:
NRF_LIBUARTE_ASYNC_DEFINE(libuarte, 1, 4, 2, NRF_LIBUARTE_PERIPHERAL_NOT_USED, 1024, 3);
NRF_LIBUARTE_ASYNC_DEFINE(quectel_libuarte, 0, 3, NRF_LIBUARTE_PERIPHERAL_NOT_USED, NRF_LIBUARTE_PERIPHERAL_NOT_USED, 1024, 3);
So, the first question is: which TIMER is preferred for UART when a SoftDevice is present?
And while changing to app_timer made things better, the HardFault still occurs. The docs say: "If the TXD.PTR and the RXD.PTR are not pointing to the Data RAM region, an EasyDMA transfer may result in a HardFault or RAM corruption.". What is the best way to check whether this is the cause in our case?
Also, do both interfaces share any common resource (besides EasyDMA, which should work with both of them)? For example, should UART1 wait for UART0 to finish whatever it is doing (writing and/or reading)? We are aware that UART can not read and write at the same time. Are we missing something else?
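One way to test the TXD.PTR/RXD.PTR concern is to assert, before every TX call and every RX buffer hand-off, that the buffer lies entirely in Data RAM. The sketch below assumes the nRF52833 memory map (Data RAM starting at 0x20000000, 128 KB in size); `buffer_in_data_ram` is a hypothetical helper, not an SDK function. It takes the address as an integer so the same code can also be compiled and checked on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed nRF52833 memory map: 128 KB of Data RAM at 0x20000000. */
#define DATA_RAM_BASE 0x20000000u
#define DATA_RAM_SIZE (128u * 1024u)

/* Returns true only if [addr, addr + len) lies completely inside Data RAM. */
bool buffer_in_data_ram(uint32_t addr, size_t len)
{
    assert(len > 0);
    if (addr < DATA_RAM_BASE) {
        return false;                       /* flash or peripheral space */
    }
    uint32_t offset = addr - DATA_RAM_BASE;
    /* Written without addr + len so the check cannot overflow. */
    return offset < DATA_RAM_SIZE && len <= DATA_RAM_SIZE - offset;
}
```

On target this could be called as `buffer_in_data_ram((uint32_t)p_data, length)` just before `nrf_libuarte_async_tx`, and on each buffer handed back to the RX path. A `const` array or a string literal typically lives in flash and would fail the check.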
Best regards,
Vojko
  • Just thinking out loud here. The async UART lib is a bit resource hungry in the sense that it uses a TIMER, plus an RTC if the inactivity timeout is used. If you are using two libuarte instances then you are using at least two high-resolution timers and maybe two additional RTCs if you are using the inactivity timeout for the receiver. Also note that FreeRTOS needs RTC1 (the instance is configurable). So it looks like it might be very easy to get into a resource conflict using two libuarte instances on FreeRTOS. My suggestions are:

    1. Disable the receiver timeout functionality and make sure the RTC is not used in libuarte. This will avoid any RTC resource conflict that I think might be happening between the FreeRTOS RTC instance and your libuarte.
    2. Debug your HardFault first. Get the exact hardfaulting instruction. Maybe what you are seeing is just a stack overflow caused by an undersized task stack.
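To get the exact faulting instruction mentioned in point 2, the usual approach on a Cortex-M4 is to read the eight words the core stacks on exception entry. The following is a minimal sketch of decoding that frame; `fault_frame_t`, `fault_frame_decode`, `fault_frame_print` and `hardfault_c` are hypothetical names, and the handler stub in the trailing comment assumes GCC and the CMSIS handler name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Basic frame a Cortex-M core pushes on exception entry
 * (FPU state, if stacked, follows these eight words). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* Copy the eight stacked words into named fields. */
fault_frame_t fault_frame_decode(const uint32_t *stacked_sp)
{
    assert(stacked_sp != NULL);
    fault_frame_t f = {
        stacked_sp[0], stacked_sp[1], stacked_sp[2], stacked_sp[3],
        stacked_sp[4], stacked_sp[5], stacked_sp[6], stacked_sp[7],
    };
    return f;
}

/* pc is the faulting instruction, lr the return address of the code that faulted. */
void fault_frame_print(const fault_frame_t *f)
{
    assert(f != NULL);
    printf("HARD FAULT at 0x%08lX\n", (unsigned long)f->pc);
    printf("R0: 0x%08lX R1: 0x%08lX R2: 0x%08lX R3: 0x%08lX\n",
           (unsigned long)f->r0, (unsigned long)f->r1,
           (unsigned long)f->r2, (unsigned long)f->r3);
    printf("R12: 0x%08lX LR: 0x%08lX PSR: 0x%08lX\n",
           (unsigned long)f->r12, (unsigned long)f->lr, (unsigned long)f->psr);
}

/*
 * On target, a small assembly stub selects the stack that was in use:
 *   HardFault_Handler:
 *     tst   lr, #4        ; EXC_RETURN bit 2: 0 = MSP, 1 = PSP
 *     ite   eq
 *     mrseq r0, msp
 *     mrsne r0, psp       ; FreeRTOS tasks run on PSP
 *     b     hardfault_c   ; hypothetical C function taking uint32_t *stacked_sp
 */
```

For the stack overflow theory, FreeRTOS can also check task stacks itself if `configCHECK_FOR_STACK_OVERFLOW` is set to 2 and `vApplicationStackOverflowHook` is implemented; that is worth enabling before digging deeper.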
  • Hi.

    Thanks for the response.

    Just to double check if we understood you correctly.

    1. Are you suggesting that we not use the async library (nrf_libuarte_async) and work only with the low-level libuarte driver (nrf_libuarte_drv)? According to the docs, if you are using app_timer, the RTC is not used. Or is there another option to disable the receiver timeout that we have missed?

    2. If we look up the address the HardFault refers to (pc) in the .map file, it always falls inside malloc (_malloc_r) or free (_free_r). Any good suggestions for how to backtrace to the origin of the failure?

  • For example:

    HARD FAULT at 0x0004EE9A

    R0: 0x20009E60 R1: 0x2001B670 R2: 0x054E022D R3: 0x054E022D

    R12: 0x2000ED44 LR: 0x0004EE71 PSR: 0x81000000

    and from .map

    .text 0x000000000004ee54 0xc4 /usr/lib/gcc/arm-none-eabi/10.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc_nano.a(lib_a-nano-freer.o)
    0x000000000004ee54 _free_r
    .text 0x000000000004ef18 0xb4 /usr/lib/gcc/arm-none-eabi/10.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc_nano.a(lib_a-nano-mallocr.o)
    0x000000000004ef18 _malloc_r
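The lookup from the faulting address to a .map entry can be sketched as a range search over (start, size) pairs. The table below contains only the two entries quoted above; `map_symbol_t` and `map_lookup` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    start;  /* first address of the function */
    uint32_t    size;   /* size in bytes, as listed in the .map file */
} map_symbol_t;

/* The two .text entries quoted from the .map file. */
const map_symbol_t k_symbols[] = {
    { "_free_r",   0x0004EE54u, 0xC4u },
    { "_malloc_r", 0x0004EF18u, 0xB4u },
};

/* Find the symbol containing addr; optionally report the offset into it. */
const map_symbol_t *map_lookup(const map_symbol_t *table, size_t count,
                               uint32_t addr, uint32_t *offset)
{
    assert(table != NULL);
    addr &= ~1u;  /* LR values carry the Thumb bit in bit 0 */
    for (size_t i = 0; i < count; i++) {
        if (addr >= table[i].start && addr - table[i].start < table[i].size) {
            if (offset != NULL) {
                *offset = addr - table[i].start;
            }
            return &table[i];
        }
    }
    return NULL;
}
```

With the values from the dump, both PC (0x0004EE9A) and LR (0x0004EE71) fall inside `_free_r`. If the image was built with debug information, `arm-none-eabi-addr2line -e <your .out file> 0x4EE9A` does the same lookup down to a source line.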

  • Hi,

    Is this reproducible on nRF52833 DK? If yes, can you please share the project so that I can attempt to start the debugger and figure out the context of the fault. Seems like there is some stack corruption here.

  • Hi.

    Unfortunately, we have a custom board and it would probably take a lot of changes to test it on a DK. We will try to make a DK version.

    In the meantime: we do use malloc and free. Would it make sense to use pvPortMalloc from FreeRTOS (heap_3 variant) or even nrf_malloc to prevent memory corruption?

    What is the best way to check memory consumption (analytically or in real life)? The approach of printing the address of a NULL pointer variable always returns the same address, so we cannot tell from it whether memory consumption is growing. Is there any other way to detect memory leaks? Or at least to detect that a memory leak is present?

    And to check it analytically: the SoftDevice takes a part of memory, the FreeRTOS heap comes next, and malloc calls use what is left at the end, right? malloc should not intrude on the memory of the other two, right? If there is not enough space, it should return NULL.

  • Vojko Glaser said:
    What is the best way to check memory consumption (analytically or in real life)? The approach of printing the address of a NULL pointer variable always returns the same address, so we cannot tell from it whether memory consumption is growing. Is there any other way to detect memory leaks? Or at least to detect that a memory leak is present?

    This is never straightforward. But I followed this a long time ago on FreeRTOS v9 with some success. I do not remember the details, but I bookmarked the page as it was helpful.

    Vojko Glaser said:
    And to check it analytically: the SoftDevice takes a part of memory, the FreeRTOS heap comes next, and malloc calls use what is left at the end, right? malloc should not intrude on the memory of the other two, right? If there is not enough space, it should return NULL.

    Yes, if there are no stack overflows, then the memory used by the SoftDevice or the application should not spill into the reserved heap area.
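For the question of detecting whether a leak is present at all, one low-tech option is to route allocations through a thin wrapper that counts live bytes and blocks, then log the counters periodically and watch for steady growth. This is a minimal sketch with hypothetical names (`tracked_malloc`, `tracked_free` and the counter getters); it is not thread-safe as written, so on FreeRTOS the calls would need to be guarded, for example by a critical section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of each block; the union keeps user data aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} alloc_header_t;

static size_t s_live_bytes;   /* bytes currently allocated */
static size_t s_live_blocks;  /* blocks currently allocated */
static size_t s_peak_bytes;   /* high-water mark of s_live_bytes */

void *tracked_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(alloc_header_t)) {
        return NULL;
    }
    alloc_header_t *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;          /* out of heap: report it, do not touch counters */
    }
    h->size = size;
    s_live_bytes += size;
    s_live_blocks++;
    if (s_live_bytes > s_peak_bytes) {
        s_peak_bytes = s_live_bytes;
    }
    return h + 1;             /* user data starts right after the header */
}

void tracked_free(void *p)
{
    if (p == NULL) {
        return;
    }
    alloc_header_t *h = (alloc_header_t *)p - 1;
    /* A failure here suggests a double free or a corrupted header. */
    assert(s_live_blocks > 0 && s_live_bytes >= h->size);
    s_live_bytes -= h->size;
    s_live_blocks--;
    free(h);
}

size_t tracked_live_bytes(void)  { return s_live_bytes; }
size_t tracked_live_blocks(void) { return s_live_blocks; }
size_t tracked_peak_bytes(void)  { return s_peak_bytes; }
```

On the malloc versus pvPortMalloc question: newlib's malloc relies on the `__malloc_lock` and `__malloc_unlock` hooks for thread safety, and the default versions do nothing, so calling malloc and free from several tasks without implementing them is a plausible source of heap corruption. FreeRTOS heap_3 wraps the same malloc and free with the scheduler suspended, which covers task-level calls but not calls from interrupts.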
