
Zigbee Light Demo stops responding

I programmed an nRF52840 Dongle with the example light bulb code, having changed the channel to 14, which is the channel my coordinator is on.
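
For readers unfamiliar with how a Zigbee channel is selected: the channel is normally expressed as a 32-bit mask in which bit N selects channel N, and the 2.4 GHz band uses channels 11 to 26. The following is a minimal sketch of that arithmetic only. The function names `single_channel_mask` and `all_channels_mask` are hypothetical helpers for illustration and are not part of the Nordic SDK or the ZBOSS stack.

```c
#include <stdint.h>

/* Zigbee 2.4 GHz channels run from 11 to 26; bit N of the mask selects channel N. */
#define FIRST_CHANNEL 11u
#define LAST_CHANNEL  26u

/* Hypothetical helper: mask selecting exactly one channel, or 0 if out of range. */
uint32_t single_channel_mask(unsigned channel)
{
    if (channel < FIRST_CHANNEL || channel > LAST_CHANNEL) {
        return 0u;
    }
    return (uint32_t)1u << channel;
}

/* Hypothetical helper: mask selecting every 2.4 GHz channel (11 through 26). */
uint32_t all_channels_mask(void)
{
    uint32_t mask = 0u;
    for (unsigned ch = FIRST_CHANNEL; ch <= LAST_CHANNEL; ch++) {
        mask |= (uint32_t)1u << ch;
    }
    return mask;
}
```

Under these assumptions, channel 14 corresponds to the mask 0x00004000, and the full channel range to 0x07FFF800.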

I then programmed an nRF52840DK with the example switch code and adjusted the search to look specifically for the dongle, which has its endpoint at 10. All the other compatible devices in my Zigbee network have their endpoints set to 1.

I programmed my SmartThings hub to recognise the Nordic dongle as a simple light bulb.

I put the hub in discover mode and fired up the dongle and the DK, both of which connected.

The SmartThings hub recognises the dongle as a light and I can turn it on and off from the SmartThings app on my phone.

The nRF52840DK switch also finds the dongle and I can control the light on the dongle using the two buttons on the DK board. The SmartThings hub DOES NOT see the updates to the dongle made from the DK switch unless I manually refresh it, but that is another issue.

Everything is fine for a random amount of time, varying from a few minutes to several hours and, on one occasion, just over a day. Then the light (dongle) simply stops responding. It no longer responds to the SmartThings hub or to the DK board switch. If I restart the switch, it fails to find the light. Only when I restart the light does it work again.

Does anyone have any idea why this should be happening? Can anyone suggest how I can debug this situation?

  • Hi,

    Have you tried running the light bulb application on the DK, and do you experience the issue there as well? If you see the issue, you should enter debug mode and check whether you can see where the application is stuck. You can also connect the debug out pins of the DK to the SWD pins on the dongle, to debug the application running on the dongle as well.

    Best regards,
    Jørgen

  • Can you please share the project you used when seeing this issue? That would help us with reproducing and debugging it.

  • I had to download the Wireshark source and recompile it to get it to run on the Mac with the nRF52840 dongle. Sorry for the delay.

    I don't see where to add an attachment, so the sniffer file is on my website at www.cedartechnology.com/.../nRF52840DK.pcapng

    The nRF52840DK is running the sample light bulb application. It was started just after the trace was begun. The device has address 0x10ad. The hub (address 0x0000) is a SmartThings V2 hub and has the nRF52840 listed as a light bulb. I am able to turn the light on and off via the SmartThings application on an iPhone.

    Looking at the trace, the last Link Status broadcast from the nRF52840 is at entry number 29229, followed by a couple of Route Reply messages, which I assume are in response to the broadcast Route Request messages from the devices being replied to.

    After this the nRF52840DK became unresponsive, as can be seen starting at row 31914, where the SmartThings hub sent the Off command to turn off the light. This was not acknowledged. The hub repeated the command several times but no acknowledgements were sent by the nRF52840.

    I do not see problems when the nRF52840 is in its own little network with the sample coordinator and switch, only when it is in the full environment with the Samsung SmartThings controlled network.

    Thanks very much for looking into this. If I can supply any further information please let me know. I can be reached by phone at +1-352-281-6286

  • Unfortunately, I'm not able to read the sniffer trace as it is encrypted. Please make sure you run the sniffer before commissioning the device into the network, for the sniffer to obtain the keys. This could require you to erase the devices before starting the commissioning procedure on the devices again.

  • Sorry about that. I did not realise the trace would not have been decrypted already. To save time, here are the keys my system is using to read this data...

    Transport Key : 5A 69 67 42 65 65 41 6C 6C 69 61 6E 63 65 30 39

    Network Key : 50dcaba83f417592833406696ed2bea4

    I hope that helps.
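
    A side note for anyone setting up their own sniffer: the transport key quoted above is not specific to this network. Decoded as ASCII, those sixteen bytes spell "ZigBeeAlliance09", the widely published default trust center link key. The sketch below shows the decoding. The helper name `hex_key_to_ascii` is hypothetical and exists only for this illustration.

    ```c
    #include <ctype.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical helper: decode hex byte pairs (whitespace between pairs allowed)
     * into a NUL-terminated ASCII string.
     * Returns the number of bytes decoded, or -1 if the input is malformed,
     * contains a non-printable byte, or does not fit in the output buffer. */
    int hex_key_to_ascii(const char *hex, char *out, size_t out_size)
    {
        size_t n = 0;

        if (out_size == 0) {
            return -1;
        }
        while (*hex != '\0') {
            unsigned int byte;

            if (isspace((unsigned char)*hex)) {
                hex++;
                continue;
            }
            /* Require two hex digits per byte. */
            if (!isxdigit((unsigned char)hex[0]) || !isxdigit((unsigned char)hex[1])) {
                return -1;
            }
            if (sscanf(hex, "%2x", &byte) != 1) {
                return -1;
            }
            /* Keep one slot free for the terminating NUL. */
            if (!isprint((int)byte) || n + 1 >= out_size) {
                return -1;
            }
            out[n++] = (char)byte;
            hex += 2;
        }
        out[n] = '\0';
        return (int)n;
    }
    ```

    The network key, by contrast, is generated per network and has no such readable form, which is why it has to be captured during commissioning or supplied by hand.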

  • Did the keys work for you? I have been trying to do more checking myself. When I run the code on the nRF52840DK I can see that most of the time it stops in the zb_nrf52840_abort function sitting in an endless loop flushing the zb_osif_serial buffer, which is of course empty.

    The stack trace only goes back as far as the zb_free_buf() function (1 level up), so I cannot tell where the call came from before that. Is there anything more I can do to help find out where it is going wrong?

  • Yes, I was able to read the sniffer trace using the keys. I have forwarded the details to our developers; hopefully they can shed some more light on what is happening from the sniffer trace and stack logs.

  • Is there any news from development? Is there anything I can do to help? It seems like a buffer management issue since it invariably stops around the allocation or deallocation of a ZBoss buffer.

    I really like the Nordic environment and the nRF52840, but I am coming under a lot of pressure to look at alternative devices from other manufacturers if I can't get this to work. All I am trying to do is get a stable platform running in a commercially available mix of Zigbee products so I can develop five different devices to monitor and control an off-grid power and air conditioning system. I have some of the code working, but I can't keep the nRF52840 connected for more than an hour or so, which means it is all pretty useless.

  • Unfortunately, I have not heard back from the developers yet. They are quite busy, but I have requested an update as soon as possible. Sorry for the inconvenience.

  • I just got some feedback from our developers:

    Unfortunately, the provided zigbee_stack_trace.log does not contain any useful information. The problem is complex and hard to reproduce. However, nRF5 SDK for Thread and Zigbee 3.0.0 has been made available today. This release includes a lot of bug fixes.

    We advise the customer to do the following:

    • build the application with the new version of the SDK
      • If the problem occurs again (we hope it won't), recompile the application with zigbee library traces:
        • use trace-enabled libraries, you will find them in $(SDK_ROOT)/sdk/nrf5/external/zboss/lib/debug
        • enable the tracing by setting in application's sdk_config.h the following:
          • ZIGBEE_TRACE_LEVEL 4
          • ZIGBEE_TRACE_MASK 0xFFFF
          • NRF_LOG_DEFAULT_LEVEL 4
          • NRF_LOG_BUFSIZE 65536
          • NRF_LOG_MSGPOOL_ELEMENT_COUNT 32
          • NRF_LOG_MSGPOOL_ELEMENT_SIZE 40
          • NRF_LOG_BACKEND_RTT_TX_RETRY_CNT 100
          • SEGGER_RTT_CONFIG_BUFFER_SIZE_UP 32768
        • Please provide source code of the application (if possible, the link http://www.cedartechnology.com/Nordic/light_bulb.zip is dead), wireshark log (or other pcap compatible) and traces.
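
    Written out as preprocessor definitions, the settings listed above would look roughly like the fragment below. This is only a sketch of the values quoted in this reply. It assumes the application's sdk_config.h already defines some of these symbols, in which case the existing definitions should be edited in place rather than duplicated.

    ```c
    /* Tracing and logging settings for sdk_config.h, values as quoted above. */
    #define ZIGBEE_TRACE_LEVEL               4
    #define ZIGBEE_TRACE_MASK                0xFFFF
    #define NRF_LOG_DEFAULT_LEVEL            4
    #define NRF_LOG_BUFSIZE                  65536
    #define NRF_LOG_MSGPOOL_ELEMENT_COUNT    32
    #define NRF_LOG_MSGPOOL_ELEMENT_SIZE     40
    #define NRF_LOG_BACKEND_RTT_TX_RETRY_CNT 100
    #define SEGGER_RTT_CONFIG_BUFFER_SIZE_UP 32768
    ```
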
  • I downloaded the new SDK and tried to recompile my code. Since there appear to be many changes to the SDK, and no apparent documentation explaining what must be done to existing projects to let them use the new 3.0 SDK, I decided to simply start again with the 3.0 demo light bulb code.

    I added the necessary bits to the Segger project to allow RTT log output. I am still amazed that you ship examples without this enabled! I then added all the logging settings. I wasn't sure how to go about changing the library to the debug version, so I simply replaced the two standard files with the debug versions indicated in your last post.

    I changed the default Zigbee channel mask to ZB_TRANSCEIVER_ALL_CHANNELS_MASK and added code to reset the persistent storage if the ON button is pressed on startup. This second piece should be added to the examples with a note explaining what it does. You should also make the network connection light blink while it is waiting to connect, but I did not do this so that I could keep as close to your original code as possible.
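
    For anyone wanting to try the same thing, the reset-on-button logic described above can be sketched roughly as follows. This is not the code from my project. The button reader and the erase hook are passed in as function pointers so that the sketch compiles off-target, and every name in it is hypothetical. On the real board I would assume these map to the board support button read and the stack's erase-NVRAM-at-start setting, but check the SDK headers for the exact calls.

    ```c
    #include <stdbool.h>

    /* Hypothetical hooks: on target these would wrap the board's button read
     * and the Zigbee stack's "erase persistent storage at start" setting. */
    typedef bool (*button_read_fn)(void);
    typedef void (*nvram_erase_fn)(bool erase);

    /* Call once at startup, before the Zigbee stack is started.
     * If the button is held down, request that persistent storage is erased
     * so the device joins as new. Returns true if an erase was requested. */
    bool factory_reset_on_boot(button_read_fn read_button, nvram_erase_fn set_erase)
    {
        bool pressed = read_button();

        set_erase(pressed);
        return pressed;
    }

    /* Off-target stand-ins, for trying the logic on a PC only. */
    bool g_fake_button_pressed = false;
    bool g_fake_erase_requested = false;

    bool fake_button_read(void)
    {
        return g_fake_button_pressed;
    }

    void fake_set_erase(bool erase)
    {
        g_fake_erase_requested = erase;
    }
    ```

    The important detail is the ordering: the erase request has to be made before the stack loads its stored network data, otherwise the old network settings are restored first.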

    The program failed after only a few minutes in the same way as before.

    I did not capture a new Wireshark log, but the old one is still on my website and nothing has changed in the environment. I have placed a ZIP copy of the entire project and a text file with the output of the debug window at http://www.cedartechnology.com/Nordic.

  • I have forwarded your findings to the developers. They have started investigating and trying to reproduce the issue.
