Zigbee end device bound to another device always issues broadcasts to resolve the short network address based on target's MAC

Hi,

I searched the public questions on this topic and couldn't find anything resembling the problem I have.

I'm developing a presence sensor that uses the ZigBee protocol for communication. The device is an end device. Among other things, it implements a client on/off cluster so that it can be bound directly to appliances (smart lights, smart sockets, etc.) and send them on/off commands.

In principle it works: in a small test network (2 devices: the presence sensor and a smart socket), binding my presence sensor to the smart socket works without problems.

However, when I added the presence sensor to a much bigger Zigbee network for testing (90+ devices: 37 routers, 55 end devices), the results were highly mixed.

When bound to one of smart sockets, the binding seems to work, on/off commands are issued, received and executed in a timely manner with an acceptable minimal latency.

In another case, however, binding to a smart bulb succeeds as a binding operation, but the actual transmission of on/off commands is highly unreliable: it works sometimes, but much more often there is a really big delay.

I sniffed the traffic with Wireshark and concluded that the delays come from my presence sensor sending "ZigBee Device Profile: Network Address Request" broadcasts every time, to determine the short network address of the target smart bulb. The smart bulb responds with its own address as requested. However, shortly afterwards (say, when the 'Off' command is sent once presence is no longer detected), the sensor again sends the same set of broadcast requests to find the network address it received just half a minute earlier.

This suggests that, for whatever reason, the MAC-to-short-address mapping is not being stored in the address table. The only "workaround" I have at the moment is to join my presence sensor to the network through the smart bulb I want to control, so that the bulb is its direct parent node in the mesh. The commands are then sent immediately with a valid short address, without tons of broadcast requests.
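As an illustration of the behavior expected here (resolve once, then reuse), below is a minimal sketch in plain C of a MAC-to-short-address cache. It is not ZBOSS's implementation: `addr_map_t`, `addr_map_get_short`, `demo_resolve`, the table size and the `SHORT_ADDR_UNKNOWN` sentinel are all hypothetical names and values chosen for the example, and the over-the-air Network Address Request is replaced by a plain callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_MAP_SIZE      8        /* hypothetical capacity, not a ZBOSS value */
#define SHORT_ADDR_UNKNOWN 0xFFFFu  /* hypothetical "not resolved" sentinel */

typedef struct {
    bool     used;
    uint8_t  ieee[8];     /* 64-bit IEEE (MAC) address */
    uint16_t short_addr;  /* 16-bit network address */
} addr_entry_t;

typedef struct {
    addr_entry_t entries[ADDR_MAP_SIZE];
    unsigned     broadcasts_sent;  /* counts simulated address-request broadcasts */
} addr_map_t;

/* Stand-in for the over-the-air lookup. A real stack would broadcast a
 * Network Address Request and wait for the response instead. */
typedef uint16_t (*resolve_fn)(const uint8_t ieee[8]);

void addr_map_init(addr_map_t *map)
{
    memset(map, 0, sizeof *map);
}

/* Returns the cached short address, resolving (and caching) it on a miss. */
uint16_t addr_map_get_short(addr_map_t *map, const uint8_t ieee[8],
                            resolve_fn resolve)
{
    assert(map != NULL && ieee != NULL && resolve != NULL);

    /* Cache hit: reuse the stored mapping, no broadcast needed. */
    for (size_t i = 0; i < ADDR_MAP_SIZE; i++) {
        if (map->entries[i].used &&
            memcmp(map->entries[i].ieee, ieee, 8) == 0) {
            return map->entries[i].short_addr;
        }
    }

    /* Cache miss: this is the only path that costs a broadcast. */
    map->broadcasts_sent++;
    uint16_t short_addr = resolve(ieee);
    if (short_addr == SHORT_ADDR_UNKNOWN) {
        return SHORT_ADDR_UNKNOWN;
    }

    /* Store the result in the first free slot. If the table is full the
     * result is returned but not cached. */
    for (size_t i = 0; i < ADDR_MAP_SIZE; i++) {
        if (!map->entries[i].used) {
            map->entries[i].used = true;
            memcpy(map->entries[i].ieee, ieee, 8);
            map->entries[i].short_addr = short_addr;
            break;
        }
    }
    return short_addr;
}

/* Fake resolver for demonstration: derives a short address from the
 * last two bytes of the IEEE address. */
uint16_t demo_resolve(const uint8_t ieee[8])
{
    return (uint16_t)((ieee[6] << 8) | ieee[7]);
}
```

With a cache like this, looking up the same IEEE address twice increments `broadcasts_sent` only once. The behavior described above looks as if the first lookup never reaches the "store" step, so every command takes the miss path again.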

I've tried "turning the knobs" like:

#define ZB_CONFIG_OVERALL_NETWORK_SIZE 100 
#define ZB_CONFIG_HIGH_TRAFFIC
#define ZB_CONFIG_APPLICATION_COMPLEX

#include <zb_mem_config_common.h>

#undef ZB_CONFIG_IOBUF_POOL_SIZE
#define ZB_CONFIG_IOBUF_POOL_SIZE 128
#undef ZB_CONFIG_SCHEDULER_Q_SIZE
#define ZB_CONFIG_SCHEDULER_Q_SIZE 64
#undef ZB_CONFIG_APS_DUPS_TABLE_SIZE
#define ZB_CONFIG_APS_DUPS_TABLE_SIZE 64
#define ZB_CONFIG_NWK_DISC_TABLE_SIZE 32U

but this doesn't seem to help.

What am I missing? Is there some other setting I could tweak?

I'm using the ZigBee R23 Add-on, 1.2.1, if that matters.

Thanks!  

  • Hello,

    Can you please try to capture a ZBOSS trace from your ZED node? You can enable it by adding these Kconfigs to your prj.conf:

    CONFIG_ZBOSS_TRACE_MASK=0x80000051
    CONFIG_ZBOSS_TRACE_LOG_LEVEL_DBG=y

    Then monitor the UART output. It will look like gibberish, but save it to a file and upload it here. It needs to be decoded (since this trace is not open source).

    Best regards,

    Edvin

  • Hi

    it proved a bit challenging. With what you've provided I was unable to get anything.

    Correct me if I've got something wrong, but I had to:

    * define 'ncs,zigbee-uart = &uart00;' in my DTS overlay file (the uart instance is correct)

    * add the following to my config (in addition to the two options you suggested):

    CONFIG_ZIGBEE_ENABLE_TRACES=y
    CONFIG_ZIGBEE_UART_SUPPORTS_FLOW_CONTROL=y
    CONFIG_ZBOSS_TRACE_BINARY_NCP_TRANSPORT_LOGGING=y
    CONFIG_ZIGBEE_HAVE_ASYNC_SERIAL=y
    

    Flow control is indeed present and configured for that uart instance.

    But I'm not sure about the rest. Anyhow with that I was able to get something regularly printed to uart apparently in a binary format.

    Also a curious thing: with either CONFIG_ZIGBEE_HAVE_ASYNC_SERIAL or CONFIG_ZIGBEE_HAVE_SERIAL enabled, I somehow lose the initialization of my other uart instances (I communicate with the mmWave presence sensors via uart too).

    So I had to disable them and replace them with a dumb timer that sends the off command to bound devices every 10 seconds.

    I observed the whole thing with Wireshark and saw the already familiar abundance of broadcast address requests, while collecting the uart output in parallel.

    Nothing else was using that uart instance, it was assigned only to ncs,zigbee-uart.

    I've captured it with 'tio' providing the correct baudrate, flowcontrol, etc.

    It covers me binding the lamp to the device, and the device attempting to send the 'off' command to the bound lamp.

    During this process I saw many broadcast address requests.

    File attached: zboss_trace.bin

  • Hello,

    From our Zigbee team:

    The traces confirm the issue. Somehow the device doesn't reuse the already resolved short address, but the current traces don't show exactly why. We have created a new library with more traces and a potential fix. Can you please capture the same traces again, but using the attached library (copied into ncs-zigbee/lib/zboss/trace/lib/cortex-m33/hard-float/)?

    libzboss.ed.a

    Best regards,

    Edvin

  • Hi,

    It's great that you were able to see the issue in the traces; it means I dumped the right thing last time.

    I've linked the version of the ZBOSS lib you provided and made sure that it is the one actually used. Unfortunately the potential fix didn't seem to work, as I observed the same behavior.

    I'm attaching the new trace log. It includes binding and sending of several on/off commands.

    Let me know if you need anything else.

    zboss_trace2.bin

  • Thank you!

    Ok, next iteration. Can you please try the attached library?

    libzboss.ed.a-2.zip

    Just rename it and place it in the same location as the previous one. To make sure that it is picked up, delete the build folder and build from scratch (this was not an issue last time: you did use the correct one in the previous trace, I just thought I'd mention it).

    Best regards,

    Edvin

  • Good news!

    This version of the fix worked. I could see that my device resolved the address once and then used the resolution result to send the on/off commands.

    I've also tried it with a different target device with an on/off cluster that was failing before (also with 'ff:fe' in its EUI64), and it worked there as well.
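For anyone who wants to check their own targets for the 'ff:fe' pattern mentioned above, here is a small C sketch. It assumes the pattern sits in the two middle bytes of the EUI64 and that the address is stored in display order (most significant byte first); stacks that keep IEEE addresses in little-endian order would need the indices mirrored. The function name `eui64_has_fffe_filler` is hypothetical, and the thread does not say how, or whether, this pattern relates to the bug itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when bytes 3 and 4 (0-based, display order, most
 * significant byte first) of the EUI64 are FF:FE. */
bool eui64_has_fffe_filler(const uint8_t eui64[8])
{
    assert(eui64 != NULL);
    return eui64[3] == 0xFF && eui64[4] == 0xFE;
}
```

For example, a made-up address such as 00:0d:6f:ff:fe:12:34:56 matches, while 00:12:4b:00:1c:12:34:56 does not.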

    This trace is for the IKEA smart lamp:

    zboss_trace3.bin

    This trace is for some noname smart socket:

    zboss_trace4.bin

    Can I have the zboss lib version with a fix but without traces, please? ;-)

    It's just that with the tracing build I have to actually read the uart output or the device freezes %(

    Or can I count on the 1.3.1 bugfix release, maybe?

    If you need me to test some other version of the fix, let me know, I'm open to it.

    Thanks!

  • That is great news. See if this one works as expected.
    libzboss.ed.no-traces.zip

    I will get back to you after discussing with the Zigbee team, but I believe this will be patched in the next version of the Zigbee addon.

    But let me know whether this one works as expected or not first.

    Best regards,

    Edvin

  • I can confirm that this library version works as expected.

    Should I mark your post as the 'answer'? Or should I wait until it's published as part of the next ncs-zigbee release and make a post referencing that release?

    Unrelated question: was the bug in the generic part of ZBOSS library or Nordic-specific? In other words: would zigbee-capable MCUs from Espressif (like esp32-c6, esp32-h2) suffer from the same issue since AFAIK they are also using ZBOSS as their Zigbee stack implementation?

    Thanks!

  • theorlangur said:

    Should I mark your post as the 'answer'? Or should I wait until it's published as part of the next ncs-zigbee release and make a post referencing that release?

    That is up to you.

    I can confirm that the bug was in the generic ZBOSS library, but it has been fixed upstream, so the fix will be included in our next Zigbee release.

    Whether those other targets are affected, I don't know. It depends on which library, specifically, they are using. I'm not sure whether they all use the exact same code base, or whether ZBOSS delivers slightly different versions to different vendors.

    Best regards,

    Edvin

  • Got it.

    Would it be possible for you to also provide a router (non-ED) version of the ZBOSS library with the fix and without traces?

    Making my device an End Device was rather a forced move while troubleshooting the network issues at the beginning.

    Is there a rough ETA for the next ncs-zigbee release? Are we talking about weeks, months, or more?
