Log messages over RTT don't show when running multiple threads

Hi all,

I'm developing a custom product based on the nrf5340 + nrf7002 that relies on BLE (and eventually wifi), and has an LCD screen (driven via LVGL), among other peripherals. All the codebase is being written in C++. We are using nRF Connect SDK v2.3.0 and zephyr-sdk-0.16.0 for the toolchain.

For debugging purposes we enabled the Log facility and the shell, both of them through RTT. Also, they are configured on their non-minimal variant.

We have a class that creates two control loop threads (think of their entry as an initialization stage plus a big while(true) that implements a PID controller and a k_yield() at the bottom). Both of them are created like this:

#define CONTROL_THREAD_PRIORITY 1
K_THREAD_STACK_DEFINE(CONTROL_THREAD_LEFT_STACK, 2048);
K_THREAD_STACK_DEFINE(CONTROL_THREAD_RIGHT_STACK, 2048);

// ...

k_tid_t id;

id = k_thread_create(
    &m_control_threads[0], CONTROL_THREAD_LEFT_STACK,
    K_THREAD_STACK_SIZEOF(CONTROL_THREAD_LEFT_STACK),
    reinterpret_cast<k_thread_entry_t>(control_thread),
    this, UINT_TO_POINTER(0), nullptr,
    K_PRIO_PREEMPT(CONTROL_THREAD_PRIORITY), 0, K_NO_WAIT);
k_thread_name_set(id, "control_left");

id = k_thread_create(
    &m_control_threads[1], CONTROL_THREAD_RIGHT_STACK,
    K_THREAD_STACK_SIZEOF(WEIGHT_SELECTION_SUBSYSTEM_CONTROL_THREAD_RIGHT_STACK),
    reinterpret_cast<k_thread_entry_t>(control_thread),
    this, UINT_TO_POINTER(1), nullptr,
    K_PRIO_PREEMPT(CONTROL_THREAD_PRIORITY), 0, K_NO_WAIT);
k_thread_name_set(id, "control_right");

// ...

void ControlSubsystem::control_thread(void* subsystem, void* p_side, void*)
{
    ControlSubsystem* const SUBSYSTEM = reinterpret_cast<ControlSubsystem*>(subsystem);
    const uint8_t SIDE = POINTER_TO_UINT(p_side);
    LOG_INF("Starting control thread %s", (SIDE == 0) ? "LEFT" : "RIGHT");
    while(true) {
        // control stuff, basically read a setpoint from SUBSYSTEM and set the duty cycle of a PWM peripheral.
        k_yield();
    }
}

So far so good... Except that, when we turn a rotary encoder for changing the setpoint, the motors won't spin. Unless we connect through RTT: in that case, all LOG messages appear, the control loop start running and everything seems to be OK.

We were thinking about a priorities issue, so we used CONFIG_LOG_THREAD_CUSTOM_PRIORITY=y and LOG_THREAD_PRIORITY=5 to set a new priority to the log facility.

If we flash the new firmware and run the device standalone, the motors start working as we turn the encoder, which is good! But if this time we connect through RTT... Well, there are not any log messages, just the shell.

We tried several config options, and ended up having this one (regarding Log/Shell and misc configs):

CONFIG_CPLUSPLUS=y
CONFIG_CPP_MAIN=y
CONFIG_LIB_CPLUSPLUS=y
CONFIG_FPU=y
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y
CONFIG_MAIN_STACK_SIZE=4096

# Debug Options. Comment them if needed
CONFIG_DEBUG_OPTIMIZATIONS=y
CONFIG_DEBUG_THREAD_INFO=y

# Enable RTT
CONFIG_USE_SEGGER_RTT=y
CONFIG_SEGGER_RTT_MODE_NO_BLOCK_SKIP=y
CONFIG_SEGGER_RTT_BUFFER_SIZE_DOWN=8192
CONFIG_SEGGER_RTT_BUFFER_SIZE_UP=8192
CONFIG_SEGGER_RTT_PRINTF_BUFFER_SIZE=8192

# Enable log on RTT
CONFIG_LOG=y
CONFIG_LOG_BACKEND_RTT=y
CONFIG_LOG_BACKEND_RTT_BUFFER=1
CONFIG_LOG_BACKEND_RTT_MODE_DROP=y
CONFIG_LOG_BACKEND_RTT_MODE_BLOCK=n
CONFIG_LOG_BACKEND_RTT_BUFFER_SIZE=8192
CONFIG_LOG_BACKEND_UART=n

# Log settings
CONFIG_LOG_PRINTK=y
CONFIG_LOG_MODE_DEFERRED=y
CONFIG_LOG_MODE_OVERFLOW=y
CONFIG_LOG_BLOCK_IN_THREAD=n
CONFIG_LOG_BUFFER_SIZE=8192
CONFIG_LOG_PROCESS_THREAD=y
CONFIG_LOG_PROCESS_THREAD_CUSTOM_PRIORITY=y
# Conflicting conf here?
CONFIG_LOG_PROCESS_THREAD_PRIORITY=5
CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=4096
CONFIG_LOG_MODE_MINIMAL=n

# Shell settings
CONFIG_SHELL_THREAD_PRIORITY_OVERRIDE=y
CONFIG_SHELL_THREAD_PRIORITY=1
CONFIG_PRINTK=y
CONFIG_SHELL=y
CONFIG_CONSOLE=y
CONFIG_RTT_CONSOLE=y
CONFIG_SHELL_BACKEND_RTT=y
CONFIG_SHELL_BACKEND_SERIAL=n
CONFIG_SHELL_MINIMAL=n
CONFIG_SHELL_STACK_SIZE=8192
CONFIG_KERNEL_SHELL=y
CONFIG_BOOT_BANNER=y
CONFIG_DEVICE_SHELL=y
CONFIG_STATS=y
CONFIG_STATS_SHELL=y
CONFIG_STDOUT_CONSOLE=y
CONFIG_UART_CONSOLE=n
CONFIG_SHELL_BACKEND_RTT_LOG_MESSAGE_QUEUE_SIZE=8192
CONFIG_SHELL_PROMPT_RTT="rtt:~$ "

However, no matter what combination we do, we end up in this kind of trade-off in which we have our system being functional but cannot read any message (which difficult further development) or we have our system with log messages, but cannot work standalone (which difficult testing for people who does not have access to a JTAG device).

Any help will be appreciated! It's the first time I'm posting on DevZone, so if I'm missing something, please let me know. Also, let me know if should I provide further details.

Thanks in advance!

Parents
  • Hi,

     

    Any help will be appreciated! It's the first time I'm posting on DevZone, so if I'm missing something, please let me know. Also, let me know if should I provide further details.

    Thank you so much for providing such detailed information. Your explanation, and the configurations that you've tried, explains the scenario thoroughly.

     

    So far so good... Except that, when we turn a rotary encoder for changing the setpoint, the motors won't spin. Unless we connect through RTT: in that case, all LOG messages appear, the control loop start running and everything seems to be OK.

    By default, the log thread will run on the lowest priority, to avoid interfering with other high priority threads that might run in your system. I also see that you have the shell subsys enabled, which will also behave similar in terms of thread priority as the log module.

    Q1: What happens if you disable the shell and only keep the LOG enabled?

    Q2: IS the motor logic in the thread itself, with no other dependencies in ISR/workqueues' or similar?

    Q3: Have you tried attaching the debugger only, ie. a debug session, to see where the firmware is stuck prior to opening a RTT session? This can be done using Segger Ozone for instance.

    It is quite strange that your control thread thread runs partially (motor not updating..) when having the highest priority, especially when you have a non-blocking option set in SEGGER RTT.

    We were thinking about a priorities issue, so we used CONFIG_LOG_THREAD_CUSTOM_PRIORITY=y and LOG_THREAD_PRIORITY=5 to set a new priority to the log facility.

    If we flash the new firmware and run the device standalone, the motors start working as we turn the encoder, which is good! But if this time we connect through RTT... Well, there are not any log messages, just the shell.

    k_yield()'s behavior comes into play here, as it will yield for threads with equal or higher priority: https://docs.zephyrproject.org/latest/kernel/services/threads/index.html#c.k_yield

    Note that any other sleep function (except for k_busy_sleep) will also yield and allow pending threads (of all priorities) to run.

     

    Kind regards,

    Håkon

  • Hi again, I have some discoveries done during a debug session using VSCode.

    Long story shot: it seems like the Segger's RTT buffers get filled, so it gets stuck waiting for them to be emtpy (which does not happen ever).

    This is the infinite loop, inside z_shell_write()

    z_shell_write

    The variable lenght should be reduced by steps of size tmp_cnt but that value is always zero.

    tmp_cnt content is updated on shell_rtt.c 's write():

    shell_rtt_write

    Since rtt_blocking = false, SEGGER_RTT_Write() is called on the else statement:

    Finally, on SEGGER_RTT_WriteNoLock() what happens is that, since we don't have available space (our buffer of size 8192 has only 7 free bytes), we return 0:

    SEGGER_RTT_WriteNoLock

    It seems like the setting CONFIG_SEGGER_RTT_MODE_NO_BLOCK_SKIP=y is not working, or it does not work like I would expect (just discard unsent bytes). Is there anything I can do in order to force flushing the buffers? I think that should do the trick.

Reply
  • Hi again, I have some discoveries done during a debug session using VSCode.

    Long story shot: it seems like the Segger's RTT buffers get filled, so it gets stuck waiting for them to be emtpy (which does not happen ever).

    This is the infinite loop, inside z_shell_write()

    z_shell_write

    The variable lenght should be reduced by steps of size tmp_cnt but that value is always zero.

    tmp_cnt content is updated on shell_rtt.c 's write():

    shell_rtt_write

    Since rtt_blocking = false, SEGGER_RTT_Write() is called on the else statement:

    Finally, on SEGGER_RTT_WriteNoLock() what happens is that, since we don't have available space (our buffer of size 8192 has only 7 free bytes), we return 0:

    SEGGER_RTT_WriteNoLock

    It seems like the setting CONFIG_SEGGER_RTT_MODE_NO_BLOCK_SKIP=y is not working, or it does not work like I would expect (just discard unsent bytes). Is there anything I can do in order to force flushing the buffers? I think that should do the trick.

Children
  • Hi,

     

    Thank you for the impressive debug work you've done!

     

    I created a hello_world with RTT log and shell, to mimic the behavior. It behaves exactly as you describe, meaning that it is stuck in the shell_write function here:

    https://github.com/nrfconnect/sdk-zephyr/blob/v3.3.99-ncs1/subsys/shell/shell_ops.c#L403-L425

     

    Here's the minimal project I used to recreate the issue, where I placed all threads on the same priority:

    hello_world_shell_rtt.zip

    I've used the gpio's P0.28/P0.29 to indicate if the main/blink threads are running, and when the RTT buffer is filled up; it completely dies due to the shell subsys running in a loop.

    Disabling shell via CONFIG_SHELL=n will cause the threads to run as expected.

    I will report this internally as a bug in the shell subsys.

     

    Kind regards,

    Håkon

  • Hi,

    Thanks for reporting this bug! Is there a place where we can keep track of the issue status?

    I guess we can handle this issue on our side having two separate configurations, like Debug and Release-ish

    • On Debug we can have everything enabled, since we're using RTT actively that problem won't arise
    • On Release-ish we can disable Shell and Log messages in order to avoid this

    Mi question is: How can we handle multiple prj.conf files in order to support multiple configurations? I'm aware of how specify overlays, but no configuration files.

    Thanks!

  • Hi,

     

    It depends a bit on what you have, and what you want.

    First option is to have a full prj-debug.conf / prj-release.conf, and the second is to use a common prj.conf file and add functionality on top, for instance, overlay-debug.conf/overlay-bootloader.conf etc.

     

    If you are using vscode, you can select the appropriate .conf file, as well as add a kconfig fragment:

     

    Alternatively, you can do this with west.

    To point to a new prj config (ie. a overwrite):

    west build ... -- -DCONF_FILE=prj-release.conf

     

    To append an already present prj.conf, you can do this to add specific kconfig entries on top:

    west build ... -- -DOVERLAY_CONFIG=overlay-debug.conf

     

    lsoria said:
    Thanks for reporting this bug! Is there a place where we can keep track of the issue status?

    I have reported it internally to the team which works in the upstream zephyr project. A issue has not yet been posted to the zephyrproject's github.

     

    Kind regards,

    Håkon

Related