Inquiry regarding ZBOSS bug fixes in NCS v2.9.0: KRKNWK-17472 and NCSIDB-1410

Hello,

We are currently developing a Zigbee product using nRF Connect SDK v2.8.0. While reviewing the release notes for NCS v2.9.0, we noticed two critical bug fixes related to the ZBOSS stack.

To evaluate whether we should migrate our current project to v2.9.0, we would like to understand the technical details and risks associated with these issues. Could you please provide more information on the following?

1. [KRKNWK-17472] Zigbee devices stops sending/receiving packets when jammed or high wireless traffic is present

  • Root Cause: What is the underlying cause of this issue? Is it buffer exhaustion, a MAC layer state machine deadlock, or something else?

  • Failure Mode: When the device "stops sending/receiving," does the ZBOSS stack require a hardware reset to recover, or can it recover automatically once the traffic subsides?

  • Reproducibility: Does this occur specifically under 2.4 GHz Wi-Fi interference, or is it triggered by a high density of Zigbee packets?

  • Workaround: Is there any known workaround or configuration change available for NCS v2.8.0?

2. [NCSIDB-1410] NWK_addr_req Extended Response issue fix

  • Symptom: What was the specific incorrect behavior in the Extended Response of the NWK_addr_req? (e.g., missing fields, incorrect status codes, or failure to respond to specific request types?)

  • Impact: Does this affect standard device discovery/binding processes or specific Zigbee profiles?

  • Scope: Does this issue primarily affect End Devices, Routers, or the Coordinator?

We are concerned about the stability of our current v2.8.0-based firmware in noisy environments. Any technical insights or internal ticket details you can share would be greatly appreciated.

Best regards,

  • Hi Atsushi, 

    I will ask internally if we are able to share any details on these tickets with you. I'll get back to you before Thursday. 

    Since KRKNWK-17472 was a ZBOSS issue, it is possible that there is no workaround you can do on your end, but I will ask. 

    Best regards,

    Maria

  • Hello Atsushi, 

    1. [KRKNWK-17472] Zigbee devices stops sending/receiving packets when jammed or high wireless traffic is present

    Root Cause: What is the underlying cause of this issue? Is it a buffer exhaustion, a MAC layer state machine deadlock, or something else?

    The root cause is a memory leak inside the ZBOSS stack, which eventually led to an out-of-memory (OOM) state.

    Failure Mode: When the device "stops sending/receiving," does the ZBOSS stack require a hardware reset to recover, or can it recover automatically once the traffic subsides?

    A hardware reset was required to recover. 

    Reproducibility: Does this occur specifically under 2.4GHz Wi-Fi interference, or is it triggered by a high density of Zigbee packets?

    This was reproduced with a very high density of Zigbee packets. A jammer continuously (k_sleep of 1 ms) sent 802.15.4 packets on one channel to a broadcast PAN ID and broadcast address.

    Workaround: Is there any known workaround or configuration change available for NCS v2.8.0?

    A workaround that has worked for others is to implement a watchdog inside zigbee_l2_recv(). However, the recommended way to resolve this issue is to upgrade the SDK.
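    As an illustration of the watchdog idea, here is a minimal host-side sketch of a receive-activity watchdog. It assumes the intent is to "feed" the watchdog each time zigbee_l2_recv() sees a frame, so that prolonged silence on the receive path triggers a reset. The type and function names (rx_watchdog_t, rx_watchdog_feed, and so on) are hypothetical and are not part of NCS or ZBOSS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-activity watchdog; names are illustrative only. */
typedef struct {
    uint32_t timeout_ms;   /* longest tolerated silence on the receive path */
    uint32_t last_feed_ms; /* timestamp of the most recent received frame */
} rx_watchdog_t;

static void rx_watchdog_init(rx_watchdog_t *wd, uint32_t now_ms,
                             uint32_t timeout_ms)
{
    wd->timeout_ms = timeout_ms;
    wd->last_feed_ms = now_ms;
}

/* Assumed to be called from the receive path, e.g. inside zigbee_l2_recv(). */
static void rx_watchdog_feed(rx_watchdog_t *wd, uint32_t now_ms)
{
    wd->last_feed_ms = now_ms;
}

/* Called periodically; true means no frame arrived within the timeout. */
static bool rx_watchdog_expired(const rx_watchdog_t *wd, uint32_t now_ms)
{
    /* Unsigned subtraction keeps the comparison valid across tick wraparound. */
    return (uint32_t)(now_ms - wd->last_feed_ms) >= wd->timeout_ms;
}
```

    On target, the feed step could map onto Zephyr's hardware watchdog (wdt_feed()) so that the reset happens even if the application itself is stuck. The timeout has to be longer than the longest legitimate quiet period on the network, otherwise a healthy but idle device would reset.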

    2. [NCSIDB-1410] NWK_addr_req Extended Response issue fix

    This issue is described in detail in this DevZone ticket: CSA Certification Issues for ZBOSS 3.11.4.0 End Devices

    Symptom: What was the specific incorrect behavior in the Extended Response of the NWK_addr_req? (e.g., missing fields, incorrect status codes, or failure to respond to specific request types?)

    The Extended response was the same as the Simple response.

    Impact: Does this affect standard device discovery/binding processes or specific Zigbee profiles?

    Any End Device which receives NWK_addr_req is affected. This may happen in standard device discovery and binding processes. 

    Scope: Does this issue primarily affect End Devices, Routers, or the Coordinator?

    This issue only affects End Devices. The process for Extended responses for Routers and Coordinators is defined in the specification and implemented in the stack accordingly. 
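    To make the difference between the two response types concrete, the sketch below builds the NWK_addr_rsp payload for a device with no associated children, such as an End Device. The field layout is my reading of the Zigbee ZDO specification (Status, IEEE address, NWK address, and for an Extended response a NumAssocDev field set to 0) and is not taken from this thread, so please verify it against the specification and the linked ticket. The helper name and constants are hypothetical, and the ZDP transaction sequence number that precedes the payload is omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REQ_TYPE_SINGLE    0x00u /* NWK_addr_req RequestType: single device */
#define REQ_TYPE_EXTENDED  0x01u /* NWK_addr_req RequestType: extended */
#define ZDP_STATUS_SUCCESS 0x00u

/*
 * Hypothetical helper: writes a NWK_addr_rsp payload for a device that has
 * no associated devices. Returns the number of bytes written, or 0 if the
 * buffer is too small.
 */
static size_t build_nwk_addr_rsp_no_children(uint8_t *buf, size_t cap,
                                             const uint8_t ieee_le[8],
                                             uint16_t nwk_addr,
                                             uint8_t request_type)
{
    int extended = (request_type == REQ_TYPE_EXTENDED);
    size_t need = 1u + 8u + 2u + (extended ? 1u : 0u);
    size_t n = 0;

    if (cap < need) {
        return 0;
    }

    buf[n++] = ZDP_STATUS_SUCCESS;            /* Status */
    memcpy(&buf[n], ieee_le, 8);              /* IEEEAddrRemoteDev */
    n += 8;
    buf[n++] = (uint8_t)(nwk_addr & 0xFFu);   /* NWKAddrRemoteDev, low byte */
    buf[n++] = (uint8_t)(nwk_addr >> 8);      /* NWKAddrRemoteDev, high byte */

    if (extended) {
        buf[n++] = 0u;                        /* NumAssocDev = 0, no list */
    }
    return n;
}
```

    Under these assumptions, a Simple response is 11 bytes and an Extended response from a childless device is 12 bytes, so a response that looks identical for both request types would be missing the NumAssocDev field.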

    Best regards,

    Maria

  • Hello again,

    Thank you for the detailed clarification regarding KRKNWK-17472.

    Regarding the "Failure Mode," you mentioned that a hardware reset is required to recover from the OOM state caused by the memory leak inside the ZBOSS stack. I have a follow-up question regarding the system-wide impact of this OOM condition.

    When the ZBOSS stack hits this OOM state and stops functioning, do the UART and BLE stacks remain alive and operational? We would like to understand if this memory leak is isolated to ZBOSS-specific memory pools, or if it exhausts the global system heap, thereby halting other independent peripherals and protocol stacks.

    Could you please clarify if UART and BLE communications are expected to survive this specific scenario?

    Thank you for your continued support.

    Best regards,

    Atsushi
