Zigbee SED network loss

Hello DevZone,

I'm developing a battery-powered Zigbee Sleepy End Device (SED), and it intermittently "loses its network connection". By intermittently I mean that the same device can run for two weeks in one test and then, in the next test in the "same" environment, lose the network within a day. By "loses its network connection" I mean that the ZC still sees messages from the device, but commands sent from the ZC to the ED are never received. The ZC still thinks the ED is in the network, and the ED thinks it is in the same network, but something in their communication is broken.

Hardware:

- My device is based on the HOLYIOT-18010 module (built around the nRF52840).

- I use Home Assistant with Zigbee2MQTT addon as my Zigbee network coordinator.

- ZC is the SONOFF USB dongle.

Software:

- SEGGER IDE

- nRF5 SDK for Thread and Zigbee v4.2.0

- Wireshark and ZBOSS Sniffer with TI CC2531 USB Dongle

-------------------

Test Setup:

I started the ZBOSS Sniffer capturing to a file and connected three variants of the device under test to Home Assistant.

The three devices are:

- My device, battery powered (normal configuration)

- My second device, battery powered (normal configuration)

- Nordic development board PCA10056 with the same firmware version as my device (to check whether the problem is in the HOLYIOT module)

-----------------

Test results:

After 17 hours I checked whether all devices were still working, and one of them was not. In Home Assistant I checked both reporting from the device (the device sends an attribute change) and commands from HA to the device (HA sends an attribute change to the device). Reporting was only partly working (HA received about 50% of the reports), but commands failed completely: HA sent 30 commands and the device did not respond to any of them.

In HA the device is still shown as in the network, and after I attached the debugger in SEGGER I saw that ZB_JOINED is true and the network PAN is correct.

-----------------

Logs:

In the log file, the three devices' addresses are:

- My device, battery powered - 0x6EF0 - this is the device that "lost its network connection"

- My second device, battery powered - 0xC28A

- Nordic dev board - 0x0445

I had seen DevZone issues about an SED missing a network key change or not responding to any messages, so I started by looking for those.

The Nordic dev board 0x0445 communicated with HA and received messages from it until I stopped the logs.

The second device 0xC28A communicated with HA and received messages from it until I stopped the logs.

The first device 0x6EF0 (the one that lost its network connection) stopped responding to HA messages after a network rejoin.

HA still thinks that the first device (0x6EF0) is in the network and sends route requests.

The first device still sends attribute reports and OTA requests to HA.

Both are in the same network and have the same network security key, but they don't hear each other.

 

I can't understand what causes this issue and I need to know where to search next.

Please suggest what this issue could be and how it can be fixed. This is the last major issue before the device's release.

Thanks in advance.

logs.pcap

  • Hi,

    Have you checked the logs from the device when it fails? Do they give any information about why it does not work as expected? You can increase the log levels and enable ZBOSS traces to get more details; see Debugging for more information.

    Have you tried debugging the application when this issue occurs, to see if it is stuck in some state where it will not process the events properly?

    Best regards,
    Jørgen

  • Hi Jørgen,

    I tried to gather logs from SEGGER but failed. The attached debugger holds for about 2-3 hours, then hits an error and disconnects, so with this setup I can't catch a bug that occurs once every few days.

    The logs that were captured:

    <info> app: Buttons and LEDs init. Error code: 0
    <info> app: Buttons and LEDs was initialised
    <debug> nrf_dfu_settings: Calling nrf_dfu_settings_init()...
    <debug> nrf_dfu_flash: Initializing nrf_fstorage_nvmc backend.
    <debug> nrf_dfu_settings: Using settings page.
    <debug> nrf_dfu_settings: Copying forbidden parts from backup page.
    <debug> nrf_dfu_settings: Destination settings are identical to source, write not needed. Skipping.
    <info> zboss:  DE AD 0A 02 00 00 00 00|........
    <info> zboss:  DE AD 0E 02 0D 05 18 00|........
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 0E 02 A8 09 19 00|.....	..
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 12 02 A8 09 1A 00|.....	..
    <info> zboss:  3E 00 FA 01 0A 00 00 00|>.......
    <info> zboss:  03 00 00 00            |....    
    <info> app: Network steering was not successful (status: -1)
    <info> app: ZBOSS recived BDB signal STEERING
    <info> app: Network steering restarted (started: 1)
    <info> zboss:  DE AD 1E 02 32 0A 1B 00|....2...
    <info> zboss:  3E 00 5A 01 02 00 00 00|>.Z.....
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  DE AD 0A 02 32 0A 1C 00|....2...
    <info> zboss:  3E 00 95 01            |>...    
    <info> zboss:  DE AD 0E 02 CD 0E 1D 00|........
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 0E 02 68 13 1E 00|....h...
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 12 02 68 13 1F 00|....h...
    <info> zboss:  3E 00 FA 01 0A 00 00 00|>.......
    <info> zboss:  03 00 00 00            |....    
    <info> app: Network steering was not successful (status: -1)
    <info> app: ZBOSS recived BDB signal STEERING
    <info> app: Valve closing
    <info> app: Network steering restarted (started: 1)
    <info> zboss:  DE AD 1E 02 74 14 20 00|....t. .
    <info> zboss:  3E 00 5A 01 02 00 00 00|>.Z.....
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  DE AD 0A 02 74 14 21 00|....t.!.
    <info> zboss:  3E 00 95 01            |>...    
    <info> zboss:  DE AD 0E 02 0F 19 22 00|......".
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 0E 02 AA 1D 23 00|......#.
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 12 02 AA 1D 24 00|......$.
    <info> zboss:  3E 00 FA 01 0A 00 00 00|>.......
    <info> zboss:  03 00 00 00            |....    
    <info> app: Network steering was not successful (status: -1)
    <info> app: ZBOSS recived BDB signal STEERING
    <info> app: Network steering restarted (started: 1)
    <info> zboss:  DE AD 1E 02 BA 1F 25 00|......%.
    <info> zboss:  3E 00 5A 01 02 00 00 00|>.Z.....
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  DE AD 0A 02 BA 1F 26 00|......&.
    <info> zboss:  3E 00 95 01            |>...    
    <info> zboss:  DE AD 0E 02 55 24 27 00|....U$'.
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> app: Valve closing
    <info> zboss:  DE AD 0E 02 F0 28 28 00|.....((.
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 12 02 F0 28 29 00|.....().
    <info> zboss:  3E 00 FA 01 0A 00 00 00|>.......
    <info> zboss:  03 00 00 00            |....    
    <info> app: Network steering was not successful (status: -1)
    <info> app: ZBOSS recived BDB signal STEERING
    <info> app: Network steering restarted (started: 1)
    <info> zboss:  DE AD 1E 02 09 2D 2A 00|....	-*.
    <info> zboss:  3E 00 5A 01 02 00 00 00|>.Z.....
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  00 00 00 00 00 00 00 00|........
    <info> zboss:  DE AD 0A 02 09 2D 2B 00|....	-+.
    <info> zboss:  3E 00 95 01            |>...    
    <info> zboss:  DE AD 0E 02 A4 31 2C 00|.....1,.
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> app: Network rejoin procedure requested to be stopped.
    <info> app: Network rejoin procedure stopped as scheduled.
    <info> zboss:  DE AD 0E 02 3F 36 2D 00|....?6-.
    <info> zboss:  41 00 C9 00 05 00 00 00|A.......
    <info> zboss:  DE AD 12 02 3F 36 2E 00|....?6..
    <info> zboss:  3E 00 FA 01 0A 00 00 00|>.......
    <info> zboss:  03 00 00 00            |....    
    <info> app: Network steering was not successful (status: -1)
    <info> app: ZBOSS recived BDB signal STEERING
    <info> app: Valve closing
    <info> app: Valve closing

    Do you have any other suggestions?

    Best regards.

  • Hi,

    Jørgen is out of office, so I have taken over your ticket.

    One thing I see in the sniffer log is that the device sends a Device Announcement that is not acknowledged. There is a known issue related to this, where the end device does not recover from a broken rejoin procedure.

    Can you try the workaround for the known issue and see if that makes the end device able to successfully rejoin the network?

    1. Introduce a helper variable, joining_signal_received.

    2. Extend zigbee_default_signal_handler() by completing the following steps:

      1. Set joining_signal_received to true in the following signals: ZB_BDB_SIGNAL_DEVICE_FIRST_START, ZB_BDB_SIGNAL_DEVICE_REBOOT, ZB_BDB_SIGNAL_STEERING.

      2. In the ZB_ZDO_SIGNAL_LEAVE signal, set joining_signal_received to false if leave_type is ZB_NWK_LEAVE_TYPE_REJOIN.

      3. Handle the ZB_NLME_STATUS_INDICATION signal to detect when the End Device failed to transmit a packet to its parent, which is reported by the signal's status ZB_NWK_COMMAND_STATUS_PARENT_LINK_FAILURE.

    See the following snippet for an example:

    /* Add helper variable that will be used for detecting broken rejoin procedure. */
    /* Flag indicating if joining signal has been received since restart or leave with rejoin. */
    bool joining_signal_received = false;

    /* Extend the zigbee_default_signal_handler() function. */
    case ZB_BDB_SIGNAL_DEVICE_FIRST_START:
        ...
        joining_signal_received = true;
        break;
    case ZB_BDB_SIGNAL_DEVICE_REBOOT:
        ...
        joining_signal_received = true;
        break;
    case ZB_BDB_SIGNAL_STEERING:
        ...
        joining_signal_received = true;
        break;
    case ZB_ZDO_SIGNAL_LEAVE:
        if (status == RET_OK) {
            zb_zdo_signal_leave_params_t *leave_params = ZB_ZDO_SIGNAL_GET_PARAMS(sig_hndler, zb_zdo_signal_leave_params_t);
            /* LOG_INF is the nRF Connect SDK logging macro; an nRF5 SDK project would use NRF_LOG_INFO instead. */
            LOG_INF("Network left (leave type: %d)", leave_params->leave_type);

            /* Set joining_signal_received to false so broken rejoin procedure can be detected correctly. */
            if (leave_params->leave_type == ZB_NWK_LEAVE_TYPE_REJOIN) {
                joining_signal_received = false;
            }
        }
        ...
        break;
    case ZB_NLME_STATUS_INDICATION: {
        zb_zdo_signal_nlme_status_indication_params_t *nlme_status_ind =
            ZB_ZDO_SIGNAL_GET_PARAMS(sig_hndler, zb_zdo_signal_nlme_status_indication_params_t);
        if (nlme_status_ind->nlme_status.status == ZB_NWK_COMMAND_STATUS_PARENT_LINK_FAILURE) {

            /* Check for broken rejoin procedure and restart the device to recover. */
            /* stack_initialised is assumed to be an existing flag that is set once the ZBOSS stack has started. */
            if (stack_initialised && !joining_signal_received) {
                zb_reset(0);
            }
        }
        break;
    }

    This known issue is documented for nRF Connect SDK, but for a version of that SDK that uses the same version of the ZBOSS stack as nRF5 SDK for Thread and Zigbee v4.2.0.
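
To reason about the flag logic separately from the stack, the workaround can be modelled as a small host-side state machine. This is only a sketch under assumptions: sim_signal_t, rejoin_monitor_t, and both function names are hypothetical stand-ins made up for illustration, not ZBOSS APIs, and the real handler would call zb_reset(0) at the point where the sketch returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ZBOSS signals named in the workaround. */
typedef enum {
    SIG_DEVICE_FIRST_START,   /* models ZB_BDB_SIGNAL_DEVICE_FIRST_START */
    SIG_DEVICE_REBOOT,        /* models ZB_BDB_SIGNAL_DEVICE_REBOOT */
    SIG_STEERING,             /* models ZB_BDB_SIGNAL_STEERING */
    SIG_LEAVE_WITH_REJOIN,    /* models ZB_ZDO_SIGNAL_LEAVE with leave type REJOIN */
    SIG_PARENT_LINK_FAILURE   /* models ZB_NLME_STATUS_INDICATION / PARENT_LINK_FAILURE */
} sim_signal_t;

/* State tracked by the workaround (hypothetical container type). */
typedef struct {
    bool stack_initialised;        /* assumed: set once the stack has started */
    bool joining_signal_received;  /* flag introduced by the workaround */
    unsigned reset_count;          /* how many times a reset would have been requested */
} rejoin_monitor_t;

static void rejoin_monitor_init(rejoin_monitor_t *m, bool stack_initialised)
{
    assert(m != NULL);
    m->stack_initialised = stack_initialised;
    m->joining_signal_received = false;
    m->reset_count = 0;
}

/* Feed one signal into the model.
 * Returns true when the workaround would reset the device. */
static bool rejoin_monitor_on_signal(rejoin_monitor_t *m, sim_signal_t sig)
{
    assert(m != NULL);
    switch (sig) {
    case SIG_DEVICE_FIRST_START:
    case SIG_DEVICE_REBOOT:
    case SIG_STEERING:
        /* Any joining signal marks the (re)join attempt as completed. */
        m->joining_signal_received = true;
        return false;
    case SIG_LEAVE_WITH_REJOIN:
        /* A rejoin has started; wait for a joining signal again. */
        m->joining_signal_received = false;
        return false;
    case SIG_PARENT_LINK_FAILURE:
        /* Link failure while no joining signal has arrived: broken rejoin. */
        if (m->stack_initialised && !m->joining_signal_received) {
            m->reset_count++;
            return true;
        }
        return false;
    }
    return false;
}
```

For example, feeding the model SIG_DEVICE_REBOOT, then SIG_LEAVE_WITH_REJOIN, then SIG_PARENT_LINK_FAILURE makes the last call return true, which corresponds to the broken rejoin that the workaround is meant to catch. A parent link failure that arrives after a joining signal does not request a reset.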

    Best regards,
    Marte
