Zigbee ZBOSS Fatal Error After Changing ZB_DEV_REJOIN_TIMEOUT_MS

We are reproducing an issue where the ZBOSS stack emits a ZBOSS Fatal Error without any context when we change ZB_DEV_REJOIN_TIMEOUT_MS. The steps to reproduce are:

1. Build the light_bulb sample with the device configured as a sleepy end device with TC_REJOIN enabled, adding the build flag ZB_DEV_REJOIN_TIMEOUT_MS=604800000:

west build -b nrf52840dk_nrf52840 -- -DCMAKE_C_FLAGS="-DZB_DEV_REJOIN_TIMEOUT_MS=604800000"

2. Connect to a hub successfully

3. Remove power from the hub so that the end device enters rejoin steering. After 30-35 minutes of steering, we observe a ZBOSS fatal error.

Under these conditions, the failure reproduces every time on the light_bulb sample app with the nrf52840dk. Since the fatal error is emitted from within the ZBOSS stack with no surrounding context, we are unsure what the underlying cause is. If we do not modify ZB_DEV_REJOIN_TIMEOUT_MS, the fatal error is not seen. We would like to increase ZB_DEV_REJOIN_TIMEOUT_MS to make the device more resilient to loss of power at the coordinator, but this behavior prevents that.
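
For reference, here is a minimal standalone C sketch that sanity-checks the value from step 1. It does not use any ZBOSS API, and the helper names are hypothetical. It only shows that 604800000 ms is exactly 7 days and that the literal fits in a signed 32-bit integer, so the number does not overflow on its own. How the stack converts or stores the value internally is not covered by this check.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Value passed on the build command line in step 1. */
#define REJOIN_TIMEOUT_MS 604800000u

#define MS_PER_MINUTE (60u * 1000u)
#define MS_PER_DAY    (24u * 60u * MS_PER_MINUTE)

/* The literal fits in a signed 32-bit integer. This says nothing about
 * how the stack converts or stores the value internally. */
static_assert(REJOIN_TIMEOUT_MS <= INT32_MAX, "timeout exceeds INT32_MAX");

/* Hypothetical helpers, for illustration only (not ZBOSS functions). */
static uint32_t ms_to_whole_days(uint32_t ms)
{
    return ms / MS_PER_DAY;
}

static uint32_t ms_to_whole_minutes(uint32_t ms)
{
    return ms / MS_PER_MINUTE;
}
```

Here ms_to_whole_days(REJOIN_TIMEOUT_MS) evaluates to 7, while the fatal error appears after only 30-35 minutes of steering, long before the configured timeout would expire.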

We are using nRF Connect SDK v2.6.0. Attached below are the console log, the ZBOSS trace, and the built hex app that produces the issue.

/cfs-file/__key/communityserver-discussions-components-files/4/1854.merged.hex

/cfs-file/__key/communityserver-discussions-components-files/4/zboss_2D00_fatal_2D00_error.log.zip

  • Hi Kendall,

    1. Build the light_bulb sample with the device configured as a sleepy end device with TC_REJOIN enabled, adding the build flag ZB_DEV_REJOIN_TIMEOUT_MS=604800000

    I understand that you want to save power, though a week is a very high value for ZB_DEV_REJOIN_TIMEOUT_MS. I'm not seeing anything in the docs about a limit for it, but I can look into that.

    Are you seeing this for any value of ZB_DEV_REJOIN_TIMEOUT_MS higher than these 30-35 minutes?

    Regards,

    Elfving

  • Elfving,

    Yes, if we set ZB_DEV_REJOIN_TIMEOUT_MS to 2400000 we still see the same failure. Further, as a workaround, we tried leaving ZB_DEV_REJOIN_TIMEOUT_MS at its default value and instead calling user_input_indicate() regularly to restart the steering process after it times out. However, even in this case, we still see the same fatal error after 30-35 minutes.

    For our product use case, we want to be resilient to a power outage that disables the hub for some period of time, and still reconnect to the hub once it is back online without requiring user interaction with the product. As such, we feel that a 7-day rejoin timeout will likely ensure that the device eventually reconnects to the hub without being too expensive in power (since rejoin uses an exponential backoff capping at 15 minutes).

    Thanks

    Kendall
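
To put the power argument above into numbers, here is a rough standalone C sketch of an exponential backoff capped at 15 minutes. Only the 15-minute cap comes from the post. The doubling factor, the 1-second initial delay used in the usage note, and the function names are assumptions made for illustration, and this is not how ZBOSS necessarily schedules its rejoin attempts.

```c
#include <stdint.h>

/* 15-minute cap mentioned in the post, in milliseconds. */
#define BACKOFF_CAP_MS (15u * 60u * 1000u)

/* Assumed policy: double the delay each time, never exceeding the cap.
 * The comparison against cap_ms / 2 avoids overflow when doubling. */
static uint32_t next_backoff_ms(uint32_t current_ms, uint32_t cap_ms)
{
    if (current_ms > cap_ms / 2u) {
        return cap_ms;
    }
    return current_ms * 2u;
}

/* Count how many attempts fit into window_ms when the first attempt
 * happens after initial_ms and each later delay follows the policy above. */
static uint32_t count_attempts(uint64_t window_ms, uint32_t initial_ms,
                               uint32_t cap_ms)
{
    uint64_t elapsed_ms = 0;
    uint32_t delay_ms = initial_ms;
    uint32_t attempts = 0;

    if (initial_ms == 0u) {
        return 0; /* a zero delay would never advance time */
    }
    while (elapsed_ms + delay_ms <= window_ms) {
        elapsed_ms += delay_ms;
        attempts++;
        delay_ms = next_backoff_ms(delay_ms, cap_ms);
    }
    return attempts;
}
```

With the assumed 1-second initial delay, count_attempts(604800000u, 1000u, BACKOFF_CAP_MS) yields 680 attempts over 7 days, almost all of them at the 15-minute cap. A 40-minute window (2400000 ms) yields 11 attempts under the same assumptions.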
