
High latency on Zigbee router

I have a Zigbee router app that is mostly based on the light_bulb sample.  I have noticed that unlike other Zigbee devices on the network (e.g. Philips Hue bulbs), there is often a noticeable lag between the time a command is sent and when the nRF52840 processes it.

Upon further investigation it seems like the Zigbee stack on this device is pretty laggy in general.  For instance, at the bottom of main() I added:

while (1) {
    LOG_INF("tick %lld", k_uptime_get());
    zb_ret_t error_code = zb_buf_get_out_delayed_func(tock_cb);
    //zb_ret_t error_code = ZB_SCHEDULE_APP_CALLBACK(tock_cb, 0);
    if (error_code != RET_OK)
        LOG_ERR("error %d", error_code);
    k_sleep(K_SECONDS(2));
}

And the function it calls is just:

static void tock_cb(zb_uint8_t bufid)
{
    LOG_INF("tock %lld", k_uptime_get());
    if (bufid)
        zb_buf_free(bufid);
}

zb_buf_get_out_delayed_func() often takes hundreds of milliseconds before running the callback.  This happens even if there is almost no Zigbee network traffic going on, and even if I use zb_mem_config_max.h.  There are no other threads in my app, just the Zigbee thread and the main thread.  Also no Bluetooth support.

I: Device is running
I: tick 44
I: Production configuration is not present or invalid (status: -1)
I: tock 53
I: Zigbee stack initialized
I: Unimplemented signal (signal: 54, status: 0)
I: Joined network successfully on reboot signal (Extended PAN ID: xxx, PAN ID: yyy)
I: tick 2046
I: tock 2112
I: tick 4047
I: tock 4132
I: tick 6048
I: tock 6159
I: tick 8050
I: tock 8187
I: tick 10051
I: tock 10185
I: tick 12052
I: tock 12218
I: tick 14054
I: tock 14247
[...]
I: tick 26062
I: tock 26262
I: tick 28064
I: tock 28444
I: tick 30065
I: tock 30157

When the device is not connected to a network, the time delta is on the order of a few milliseconds (presumably some sort of scheduling interval).  Also, if I use ZB_SCHEDULE_APP_CALLBACK instead of zb_buf_get_out_delayed_func, a very short interval is seen as well.  But when it's connected, the callback is delayed by roughly 65 to 380 ms, as the log above shows.

This app isn't using sleepy mode or any other sort of explicit power save feature.

Where is this latency coming from, and what can I do about it?

  • BTW, I did see in https://developer.nordicsemi.com/nRF_Connect_SDK/doc/zboss/3.5.2.0/zigbee_prog_principles.html 

    Except for the scheduler API, the Zigbee stack API is not thread-safe. This means that calls to the Zigbee API must be invoked through scheduler callbacks or from the same thread that runs the scheduler loop. In NCS, the scheduler API is overloaded by functions that make it interrupt- and thread-safe. These functions can be found in nrf/subsys/zigbee/osif/zb_nrf_platform.h.

    which suggests that maybe I shouldn't be trying to allocate ZB bufs from the main app thread.

    But I see zb_buf_get_out_delayed_ext() being called from e.g. non-Zigbee worker threads in light_switch's button_handler(), and even in the timer callback light_switch_button_handler(), so I assumed it was safe to do.  Is that correct?

    If I change my code to do this, the latency problem vanishes:

    static void tock_cb2(zb_uint8_t bufid)
    {
        LOG_INF("tock %lld", k_uptime_get());
        if (bufid)
            zb_buf_free(bufid);
    }

    static void tock_cb1(zb_uint8_t bufid)
    {
        zb_buf_get_out_delayed_func(tock_cb2);
    }

    /* at the end of main() */
    while (1) {
        LOG_INF("tick %lld", k_uptime_get());
        zb_ret_t error_code = ZB_SCHEDULE_APP_CALLBACK(tock_cb1, 0);
        if (error_code != RET_OK)
            LOG_ERR("error %d", error_code);
        k_sleep(K_SECONDS(2));
    }

  • Hello,

    Sorry for the late reply. I also handle the other ticket that you have ongoing. I see that it is the same application, and seeing as you sort of figured this out, is this subject still something you need us to look into?

    BR,

    Edvin

  • Are the samples wrong to be calling zb_buf_get_out_delayed_func() from outside the Zigbee thread?

  • Hello,

    I talked with our Zigbee team, and they said that you should indeed not use zb_buf_get_out_delayed_func(). Use zb_buf_get_out_delayed() instead. They said that it should reduce the delay.

    The reason for the delay in the first place is that whenever the ZBOSS stack decides that it has nothing to do, it may suspend itself for a given amount of time. Using the ZBOSS API doesn't directly resume the ZBOSS thread. If I understood them correctly, zb_buf_get_out_delayed() is still thread-safe (as the _func() variant also is), but it should also resume/wake up the ZBOSS thread, which should significantly improve the latency.

    Note that at times of high ZBOSS activity, you still may see some delay.

    They also mentioned that, because of the behavior explained above, a Zigbee application may actually run slower when there is less traffic (fewer events to wake up the ZBOSS stack).

    BR,

    Edvin
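
To make the explanation above more concrete, here is a minimal toy model in plain C of a scheduler thread that sleeps when idle. It is only a sketch under stated assumptions, not ZBOSS code: the names idle_scheduler_t, callback_run_time_ms() and callback_latency_ms() are hypothetical, and the fixed sleep slice is an assumption made for illustration (the real stack chooses its own suspend time). It shows why a request that does not wake the thread waits until the current sleep ends, while a request that wakes the thread runs right away.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical, not ZBOSS code): an idle scheduler thread that
 * sleeps in fixed slices unless something explicitly wakes it. */
typedef struct {
    int64_t sleep_start_ms; /* time at which the thread went idle */
    int64_t idle_slice_ms;  /* assumed sleep length when nobody wakes it */
} idle_scheduler_t;

/* Time at which a callback posted at post_ms actually runs. */
int64_t callback_run_time_ms(const idle_scheduler_t *s, int64_t post_ms,
                             bool wake_on_post)
{
    /* Runs immediately if the post wakes the thread, if the model has no
     * sleep slice, or if the thread has not gone idle yet. */
    if (wake_on_post || s->idle_slice_ms <= 0 ||
        post_ms <= s->sleep_start_ms) {
        return post_ms;
    }

    /* Otherwise the callback waits for the end of the current slice. */
    int64_t elapsed = post_ms - s->sleep_start_ms;
    int64_t slices = (elapsed + s->idle_slice_ms - 1) / s->idle_slice_ms;
    return s->sleep_start_ms + slices * s->idle_slice_ms;
}

/* Delay between posting the callback and running it. */
int64_t callback_latency_ms(const idle_scheduler_t *s, int64_t post_ms,
                            bool wake_on_post)
{
    return callback_run_time_ms(s, post_ms, wake_on_post) - post_ms;
}
```

With an assumed 200 ms slice starting at time 0, a callback posted at 2046 ms without a wake-up runs at 2200 ms (154 ms late), while the same post with a wake-up has zero latency in this model. Real delays will differ, since the actual suspend time is decided by the stack.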
