Zigbee traces use the wrong log level

I am using the light_bulb sample from NCS v2.0.0 with an nRF52840 DK board.  I changed prj.conf to enable Warning traces from ZBOSS:

+# Enable traces on UART_1 (P1.02) @ 115200bps
+CONFIG_ZIGBEE_ENABLE_TRACES=y
+CONFIG_ZBOSS_TRACE_MASK=0x0003
+CONFIG_ZBOSS_TRACE_LOG_LEVEL_WRN=y
+CONFIG_ZBOSS_TRACE_BINARY_LOGGING=y
+CONFIG_ZBOSS_TRACE_LOGGER_DEVICE_NAME="UART_1"

I added test code to the app to verify that the correct loglevel setting was being applied and that the filters are working as intended:

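/* Filter predicate exported by the ZBOSS library. */
extern int zb_trace_check(zb_uint_t level, zb_uint_t mask);

/* Levels 0..4 against a subsystem bit that is inside the 0x0003 mask. */
LOG_INF("TRACE: %d: %d %d %d %d %d", g_trace_level,
        zb_trace_check(0, 1),
        zb_trace_check(1, 1),
        zb_trace_check(2, 1),
        zb_trace_check(3, 1),
        zb_trace_check(4, 1));

/* Levels 0..4 against a subsystem bit that is outside the 0x0003 mask. */
LOG_INF("TRACE: %d: %d %d %d %d %d", g_trace_level,
        zb_trace_check(0, 0x100),
        zb_trace_check(1, 0x100),
        zb_trace_check(2, 0x100),
        zb_trace_check(3, 0x100),
        zb_trace_check(4, 0x100));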

The output was:

I: TRACE: 2: 1 1 1 0 0
I: TRACE: 2: 0 0 0 0 0
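
The output above is consistent with a simple "level at or below the threshold, and mask overlaps" filter. The following is only a model of that behaviour, not the ZBOSS implementation (which is closed source); the names model_trace_level, model_trace_mask, and model_trace_check are made up for illustration, and the two constants simply mirror the prj.conf values above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ZBOSS settings; values copied from prj.conf. */
static const uint32_t model_trace_level = 2;      /* CONFIG_ZBOSS_TRACE_LOG_LEVEL_WRN */
static const uint32_t model_trace_mask  = 0x0003; /* CONFIG_ZBOSS_TRACE_MASK */

/*
 * Assumed filter rule: a message passes when its level does not exceed the
 * configured level and its subsystem bit overlaps the configured mask.
 */
static int model_trace_check(uint32_t level, uint32_t mask)
{
    return (level <= model_trace_level) && ((mask & model_trace_mask) != 0);
}

/* Checks the model against the two lines of observed output. */
static void model_self_check(void)
{
    for (uint32_t level = 0; level <= 4; level++) {
        assert(model_trace_check(level, 0x001) == (level <= 2));
        assert(model_trace_check(level, 0x100) == 0);
    }
}
```

If the real filter works like this, any message a caller tags with level 1 or 2 will pass, regardless of how routine the event is.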

Unfortunately, with this configuration ZBOSS still generates a massive amount of binary log data on UART_1.  All of these entries look to be extremely mundane operations that hardly merit the Warning loglevel:

2022-06-09 16:23:23,115 RAW [de,ad,0e,02,9a,00,3d,07,18,00,77,02,06,00,00,00]
2022-06-09 16:23:23,115 ts=009a m=0x0002 lev=1 zb_buf_begin_func:631 data=[06,00,00,00]
2022-06-09 16:23:23,115 RAW [de,ad,0e,02,9a,00,3e,07,18,00,17,05,06,00,00,00]
2022-06-09 16:23:23,115 ts=009a m=0x0002 lev=1 zb_buf_get_status_func:1303 data=[06,00,00,00]
2022-06-09 16:23:23,115 RAW [de,ad,0e,02,9a,00,3f,07,18,00,1e,05,00,00,00,00]
2022-06-09 16:23:23,115 ts=009a m=0x0002 lev=1 zb_buf_get_status_func:1310 data=[00,00,00,00]
2022-06-09 16:23:23,120 RAW [de,ad,0e,02,9a,00,40,07,18,00,a1,02,06,00,00,00]
2022-06-09 16:23:23,120 ts=009a m=0x0002 lev=1 zb_buf_len_func:673 data=[06,00,00,00]
2022-06-09 16:23:23,120 RAW [de,ad,0e,02,9a,00,41,07,18,00,a1,02,06,00,00,00]
2022-06-09 16:23:23,120 ts=009a m=0x0002 lev=1 zb_buf_len_func:673 data=[06,00,00,00]
2022-06-09 16:23:23,121 RAW [de,ad,0e,02,9a,00,42,07,18,00,77,02,06,00,00,00]
2022-06-09 16:23:23,121 ts=009a m=0x0002 lev=1 zb_buf_begin_func:631 data=[06,00,00,00]
2022-06-09 16:23:23,121 RAW [de,ad,0e,02,9a,00,43,07,18,00,17,05,06,00,00,00]
2022-06-09 16:23:23,121 ts=009a m=0x0002 lev=1 zb_buf_get_status_func:1303 data=[06,00,00,00]
2022-06-09 16:23:23,136 RAW [de,ad,0e,02,9a,00,44,07,18,00,1e,05,00,00,00,00]

For instance, zb_buf_get_tail_func() is logging at the Error level even when no errors are occurring.

I care about this because I'm trying to capture information about an actual ZBOSS failure, and the UART is emitting so much data that frames are being dropped or corrupted.
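
To put a rough number on the link capacity, here is a back-of-the-envelope calculation. It assumes 8N1 framing (10 bits on the wire per byte), which is not stated anywhere in the configuration above, and uses the 16-byte frame size seen in the RAW lines; max_frames_per_second is a made-up helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 8N1 framing: 1 start bit + 8 data bits + 1 stop bit per byte. */
#define UART_BITS_PER_BYTE 10u

/* Upper bound on back-to-back trace frames per second at a given baud rate. */
static uint32_t max_frames_per_second(uint32_t baud, uint32_t frame_bytes)
{
    assert(frame_bytes > 0);
    return baud / (UART_BITS_PER_BYTE * frame_bytes);
}
```

Under those assumptions, max_frames_per_second(115200, 16) gives 720, so anything beyond roughly 720 of these 16-byte entries per second cannot fit on the wire at 115200 bps.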

I see similar issues when using other trace masks too.
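
For reference, the RAW frames above appear to follow a fixed little-endian layout. The sketch below is inferred only from the sample lines in this post (the timestamp, line number, and data bytes match the decoded lines next to each RAW frame); it is not taken from any ZBOSS documentation. The struct, field names, and decode_trace_frame are invented for illustration, and the fields named byte3 and word8 are left uninterpreted because the samples do not establish their meaning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field layout guessed from the 16-byte sample frames only. */
struct trace_frame {
    uint8_t  length;    /* byte 2: 0x0e in every sample frame */
    uint8_t  byte3;     /* byte 3: 0x02 in every sample, meaning not established */
    uint16_t timestamp; /* bytes 4-5: matches "ts=" in the decoded lines */
    uint16_t counter;   /* bytes 6-7: increments by one per frame in the samples */
    uint16_t word8;     /* bytes 8-9: 0x0018 in every sample, not identified */
    uint16_t line;      /* bytes 10-11: matches the ":631" style line numbers */
    uint8_t  data[4];   /* bytes 12-15: matches "data=[...]" */
};

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or lacks the de,ad signature. */
static int decode_trace_frame(const uint8_t *raw, size_t len, struct trace_frame *out)
{
    assert(raw != NULL && out != NULL);

    if (len < 16 || raw[0] != 0xde || raw[1] != 0xad) {
        return -1;
    }

    out->length    = raw[2];
    out->byte3     = raw[3];
    out->timestamp = read_le16(&raw[4]);
    out->counter   = read_le16(&raw[6]);
    out->word8     = read_le16(&raw[8]);
    out->line      = read_le16(&raw[10]);
    memcpy(out->data, &raw[12], sizeof(out->data));
    return 0;
}
```

Feeding the first RAW frame through this yields timestamp 0x009a, line 631, and data 06,00,00,00, which agrees with the decoded line below it. Frames with a different payload size would need the length byte to be honoured, which this sketch does not attempt.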

  • Hello,

    Sorry for the basic questions, but what is the logging supposed to output?

    I: TRACE: 2: 1 1 1 0 0
    I: TRACE: 2: 0 0 0 0 0

    The zboss trace log is meant to log a bit, but unless they changed anything, the logs are encrypted, and they need to be sent to the company who provided the zboss stack (via us). Is there something particular you want to debug?

    Best regards,

    Edvin

  • what is the logging supposed to output?

    I'm calling the function that ZBOSS uses to figure out whether to log a given message to the trace output:

    extern int zb_trace_check(zb_uint_t level, zb_uint_t mask);

    My test shows that the log mask and log level set in Kconfig are being respected by ZBOSS.  The problem is that the callers to zb_trace_msg_port() are specifying unreasonably high log level parameters, like logging routine buffer allocation events as errors.

    The zboss trace log is meant to log a bit, but unless they changed anything, the logs are encrypted, and they need to be sent to the company who provided the zboss stack (via us).

    They are not encrypted.  The binary format is trivial to decode; I wrote a couple of simple python scripts to work with them.

    Is there something particular you want to debug?

    Yes, I'd like to reduce the trace log output volume because it's generating output faster than the serial/CDC output can handle it.  I'm troubleshooting a couple of different ZBOSS issues right now and this bug is potentially an impediment.

    Could you please ask DSR to fix the logging levels so that configuring e.g. CONFIG_ZBOSS_TRACE_LOG_LEVEL_WRN=y only outputs warnings rather than spewing debug traces into the binary trace log?

  • I see. I will forward this to our Zigbee team. I will keep you posted.

    Best regards,

    Edvin

  • Hello,

    This is the reply I got from our Zigbee team:

    ---------------------------

    I don't really know what the customer wants to achieve. They have mentioned capturing ZBOSS errors - ZBOSS asserts are used for that by the stack.
    I am guessing that this ticket was created because of the network "stability issues" they previously had.

    I can't verify whether these logs are in fact all about "extremely mundane operations that hardly merit the Warning loglevel" unless they collect these logs and provide them to us.

    To improve logging of the ZBOSS traces, they could speed up the UART by raising the baud rate to 1M and/or increase the internal ring buffer that stores the ZBOSS trace binary data.

    Also, you can inform the customer that without access to the ZBOSS stack sources they wouldn't be able to decode the ZBOSS trace messages.

    Could you please ask DSR to fix the logging levels so that configuring e.g. CONFIG_ZBOSS_TRACE_LOG_LEVEL_WRN=y only outputs warnings rather than spewing debug traces into the binary trace log?

    There are no DBG level trace logs in the trace variant of ZBOSS libraries as they are not compiled in. Only logs up to Warning log level are compiled in.

    ----------------------------

    I am not really sure what you are trying to report, and I see that you have a lot of questions/cases here on DevZone. Are they all related? If so, perhaps we can try to boil it down to one ticket. The reason I am saying this is that our Zigbee team has a limited amount of support resources, and now they are seeing a lot of unrelated questions, which makes it take even longer to get a reply. Also, it doesn't help that it is summer holidays in Norway, unfortunately. 

    Best regards,

    Edvin

  • I can't verify whether these logs are in fact all about "extremely mundane operations that hardly merit the Warning loglevel" unless they collect these logs and provide them to us.

    These were posted in text format in the original request.  I'll attach the corresponding binary trace here:

    log.bin

    As I mentioned earlier, I get these rapid-fire debug logs when I enable CONFIG_ZBOSS_TRACE_LOG_LEVEL_WRN=y.  This is quite obviously incorrect, because enabling the log option for "warnings+errors only" shouldn't be generating a flurry of log data on mundane buffer allocation events that aren't experiencing problems.  It should only be logging actual warnings.

    I would like to be able to enable warnings+errors (only) from all ZBOSS modules so I can spot problems, but because of this bug, enabling warnings generates logging overruns and massive amounts of irrelevant data.

    Also, you can inform the customer that without access to the ZBOSS stack sources they wouldn't be able to decode the ZBOSS trace messages.

    I wrote my own parser to decode the trace messages, which is able to display many of the fields without source access.  It isn't able to decode the free-form parameters at the end, although sometimes I can figure out what those represent.  I'm using this parser to describe the log entries that I have questions about, as I did in this ticket.

    our Zigbee team has a limited amount of support resources, and now they are seeing a lot of unrelated questions, which makes it take even longer to get a reply

    Unfortunately I am finding a lot of Zigbee-related problems in the course of developing this product.

    When possible I try to fix the bug myself, and I've sent you a couple of pull requests on GitHub.  But my ability to troubleshoot the closed source libraries is somewhat limited, so regrettably many of the bugs do need to be escalated to the Zigbee team.  I appreciate your support in getting to the bottom of things.
