Unwanted "Leave" command from coordinator

Hello, in late 2019 I developed a ZigBee coordinator for my customer, using nRF5 SDK for Thread and Zigbee v3.2.0.
Now my customer has this issue:
- An end node is connected via a router. If the router dies, the coordinator rejoins the end node, but a short time later it sends a "Leave" command to that end node.
Attached is the Wireshark log file of the ZigBee radio communication.
Some details:
- the end node has network address E3DF
- the router has network address D01B
- at line 14631 and following, the router dies
- at line 14680 the end node starts searching for a route
- at line 14747 and following, the rejoin procedure takes place
- at line 14850 the coordinator sends a LEAVE command to end node E3DF
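
When looking at the LEAVE frame in the capture, the options byte of the NWK Leave command says whether the sender is asking the recipient to leave (Request bit) and whether a rejoin is expected afterwards (Rejoin bit). A minimal sketch of decoding that byte, assuming the bit layout from the Zigbee specification (bits 0-4 reserved, bit 5 Rejoin, bit 6 Request, bit 7 Remove children); the type and function names are hypothetical and are not part of ZBOSS or the nRF5 SDK:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the NWK Leave command options byte.
 * Assumed layout (Zigbee specification): bits 0-4 are reserved. */
#define LEAVE_OPT_REJOIN          (1u << 5)
#define LEAVE_OPT_REQUEST         (1u << 6)
#define LEAVE_OPT_REMOVE_CHILDREN (1u << 7)

/* Hypothetical helper type, for illustration only. */
typedef struct {
    bool rejoin;          /* leaving device is expected to rejoin */
    bool request;         /* sender asks the recipient to leave */
    bool remove_children; /* children of the leaving device are removed too */
} leave_options_t;

/* Split the raw options byte into its three flags. */
static leave_options_t decode_leave_options(uint8_t options)
{
    leave_options_t out;
    out.rejoin          = (options & LEAVE_OPT_REJOIN) != 0;
    out.request         = (options & LEAVE_OPT_REQUEST) != 0;
    out.remove_children = (options & LEAVE_OPT_REMOVE_CHILDREN) != 0;
    return out;
}
```

For example, an options byte of 0x40 decodes to request set with rejoin and remove_children clear. The actual value has to be read from the frame in the capture.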

What happened? Why does the coordinator send the "Leave" command?
How can we avoid this unwanted behaviour?

Many thanks for your help!

Abele

  • Hi,

    nRF5 SDK for Thread and Zigbee v3.2.0 is quite old. Can you verify the behaviour with the latest nRF5 SDK for Thread and Zigbee v4.2?

    Kenneth

  • Hi Kenneth,
    this is very hard for me; as far as I know, the new 4.2 SDK is not compatible with the old one!
    And the coordinator FW now in production is the result of over two years of development ...
    Could you help me with the SDK I have used?

  • I believe both the nRF5 SDK and T&Z releases should have some migration information in the release notes or as a separate document. Though you may first need to look at the first major release update (e.g. 4.0 or 16.0), since that kind of information is typically there.

    Kenneth

  • Kenneth,
    as you know, the guide is only a brief summary of how to migrate the standard examples, but in my case I used the CLI example from SDK 3.2 as the base for my project and made a lot of changes to it.
    I will certainly read the guide, but I may need technical support from Nordic to understand and solve possible difficulties and issues.
    Surely the developer team has some suggestions on how to make the migration easier.
    Could I get this help?
    Thanks

    PS Is it possible to migrate only the ZBOSS stack from v3.1.0 to v3.3.0, and not the whole SDK?

    Abele

  • ... I took a look at the new SDK 4.2 ... it has no IAR EWARM project!!!
    My current project uses IAR EWARM 8.30.

  • Hi Abele,

    abe said:
    PS Is it possible to migrate only the ZBOSS stack from v3.1.0 to v3.3.0, and not the whole SDK?

    You must migrate the entire SDK if you are to migrate.

    abe said:
    it has no IAR EWARM project!!!

    Support for IAR and Keil was removed in v4.2. You can use either GCC or SES.

    I have an update from the developers:

    1. Migration from SDK v3.2 to SDK v4.X requires a factory reset.
    2. To start with, I would suggest that the customer reproduces this issue and collects ZBOSS trace logs, so that we may get an idea of why this is happening in the first place.
      Also, can the same issue be reproduced with an unmodified coordinator or CLI example?
      Could the customer also describe how their application works, as it is based on the CLI sample?
      How big is this network? How many nodes are there?
    3. I have seen that the customer's coordinator device behaves in a strange manner: Link Status is being sent 2 or even 3 times more frequently than it is supposed to be. I don't have a good explanation for this, but it's concerning.
    4. I have seen that the customer has had some issues with the device: ZBOSS asserts were triggered, which results in zb_nrf52840_abort() being called.
      https://devzone.nordicsemi.com/f/nordic-q-a/85117/zb_nrf52840_abort-function
      Have these issues been solved already?
      Also, was the device programmed to reset itself after zb_nrf52840_abort() is invoked in the customer's application?
    5. If the customer is willing to share the application's source code, we may have a quick look at it. Making sure that there are no "unholy" things done or some shenanigans may be a good start.
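
    One way to put a number on the Link Status observation in point 3 is to export the timestamps of the coordinator's Link Status frames from the capture and compare their mean spacing with the expected period. A minimal sketch, assuming the Zigbee specification default of 15 seconds for nwkLinkStatusPeriod (the stack in use may be configured differently); the function names are hypothetical:

```c
#include <stddef.h>

/* Assumption: default nwkLinkStatusPeriod from the Zigbee specification. */
#define ASSUMED_LINK_STATUS_PERIOD_S 15.0

/* Mean spacing in seconds between consecutive timestamps.
 * Timestamps must be in ascending order. Returns 0.0 if fewer than two. */
static double mean_interval_s(const double *timestamps_s, size_t count)
{
    if (count < 2) {
        return 0.0;
    }
    return (timestamps_s[count - 1] - timestamps_s[0]) / (double)(count - 1);
}

/* How many times more frequently than the assumed period the frames occur.
 * Returns 0.0 if the factor cannot be computed. */
static double link_status_rate_factor(const double *timestamps_s, size_t count)
{
    double mean = mean_interval_s(timestamps_s, count);
    if (mean <= 0.0) {
        return 0.0;
    }
    return ASSUMED_LINK_STATUS_PERIOD_S / mean;
}
```

    A factor close to 1 would match the assumed period, while a factor of 2 or 3 would correspond to the behaviour described in point 3.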

    Best regards,

    Marte

  • You must migrate the entire SDK if you are to migrate.

    So it is not possible to update only the ZBOSS stack? I have already done that from 3.0.0 to 3.1.0.

    reproduces this issue and collects ZBOSS trace logs

    The ZBOSS log needs to be enabled (how can we do it?), and then we can try to reproduce the issue and record the data. We must also physically connect the serial output, which is not provided on our boards.

    Could customer also describe how their application works as this is based on the CLI sample?

    In general, only the coordinator part of the CLI sample is used, in order to join certain kinds of end nodes to the network.
    Here is a brief summary of the functionality:
    - Start steering command to accept end nodes on the network
    - FDS memory handling to store end node data (such as cluster endpoint attribute values and so on)
    - BLE GATT used to read and write some ZigBee attributes
    - some attributes configured for automatic reporting (typically ON/OFF status for bulbs and smart plugs, and charge level for battery-powered end nodes such as door sensors, push buttons, motion sensors ...)

    How big is this network? How many nodes are there?

    The networks we are using have 5 to 10 end nodes

    Link status is being sent 2 or even 3 times more frequent than it is supposed

    This is the same as in the original CLI example of SDK 3.2

    ZBOSS asserts were triggered which results in zb_nrf52840_abort() being called.

    Yes solved

    If customer is willing to share application's source code

    I will ask the customer whether they agree to your request

    Also, can the same issue be reproduced with an unmodified coordinator or CLI example?

    This is not very simple to test ... but could you help us anyway to find the issue?

    Thanks for help

    Abele

  • Hi Abele,

    Thank you for providing the additional information!

    abe said:
    So it is not possible to update only the ZBOSS stack? I have already done that from 3.0.0 to 3.1.0.

    It is possible to do so, but the SDK might not be compatible with a different version of ZBOSS than it was made for. Additionally, platform certification points out one particular version of ZBOSS, so you will not be able to use platform certification if you use a different version of the ZBOSS stack than the SDK was certified with.

    abe said:
    The ZBOSS log needs to be enabled (how can we do it?)

    This is described in the documentation here: Debugging, but what you need to do is to:

    • Replace the libzboss libraries in your project with the files located in <InstallFolder>/external/zboss/lib/debug
    • Configure the following in sdk_config.h:
      Configuration option               Value for nRF52840   Value for nRF52833*
      NRF_LOG_DEFAULT_LEVEL              4                    4
      NRF_LOG_BUFSIZE                    65536                32768
      NRF_LOG_MSGPOOL_ELEMENT_COUNT      32                   32
      NRF_LOG_MSGPOOL_ELEMENT_SIZE       40                   40
      NRF_LOG_BACKEND_RTT_TX_RETRY_CNT   100                  100
      SEGGER_RTT_CONFIG_BUFFER_SIZE_UP   32768                512
    • Then change these in main():

      ZB_SET_TRACE_LEVEL(ZIGBEE_TRACE_LEVEL);
      ZB_SET_TRACE_MASK(ZIGBEE_TRACE_MASK);
      to
      ZB_SET_TRACE_LEVEL(4);
      ZB_SET_TRACE_MASK(TRACE_SUBSYSTEM_ZCL | TRACE_SUBSYSTEM_ZDO | TRACE_SUBSYSTEM_NWK | TRACE_SUBSYSTEM_SECUR);
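
    As a rough illustration, the nRF52840 column of the table above could look like this in sdk_config.h. This is only a sketch: it assumes these options are already present in your sdk_config.h, in which case you should change the existing values rather than add second definitions.

```c
/* sdk_config.h fragment: nRF52840 values taken from the table above.
 * Assumption: each option already exists in the project's sdk_config.h;
 * edit the existing definition instead of adding a duplicate. */
#ifndef NRF_LOG_DEFAULT_LEVEL
#define NRF_LOG_DEFAULT_LEVEL 4
#endif

#ifndef NRF_LOG_BUFSIZE
#define NRF_LOG_BUFSIZE 65536
#endif

#ifndef NRF_LOG_MSGPOOL_ELEMENT_COUNT
#define NRF_LOG_MSGPOOL_ELEMENT_COUNT 32
#endif

#ifndef NRF_LOG_MSGPOOL_ELEMENT_SIZE
#define NRF_LOG_MSGPOOL_ELEMENT_SIZE 40
#endif

#ifndef NRF_LOG_BACKEND_RTT_TX_RETRY_CNT
#define NRF_LOG_BACKEND_RTT_TX_RETRY_CNT 100
#endif

#ifndef SEGGER_RTT_CONFIG_BUFFER_SIZE_UP
#define SEGGER_RTT_CONFIG_BUFFER_SIZE_UP 32768
#endif
```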

    The UART log from the nRF device should output some character strings. Please save this log and upload it here so I can forward it to the developers for decoding.

    Best regards,

    Marte

  • Thanks Marte,
    in the next days (or next week) I will make the changes needed to get the ZBOSS log output and I will try to reproduce the issue (but it is not easy, as the issue occurs randomly).
    As soon as I have news I will send you the results.
    Abele

  • Hi Abele,

    I have an update from the developers as well:

    1. The ZBOSS libs should only be changed together with the rest of the SDK - there were some major changes between SDK v3.2 and v4.X.
      I am pretty sure that even an attempt to change the ZBOSS libs will require a factory reset of the device.
    2. ZBOSS logs can also be configured to be printed over RTT, so the additional connector for serial may not be required.
    3. The customer claims that the Link Status issue can be reproduced with the original CLI example.
      Could you ask them how it can be reproduced? I have tried a bit but wasn't able to reproduce it.

    Best regards,

    Marte

  • Hello Marte,

    The ZBOSS libs should only be changed together with the rest of the SDK - there were some major changes between SDK v3.2 and v4.X.

    Yes, of course, I see there are some functions that must be adapted or changed, but I think this can be resolved.

    I am pretty sure that even an attempt to change the ZBOSS libs will require a factory reset of the device.

    This is not an issue; all custom data and configurations stored in FDS will be restored via BLE.

    As for the issue, I did not say that it can be reproduced with the CLI sample; I said that it needs to be tested.
    What I said is that the Link Status timing is the same as in the CLI example.

    To try to reproduce the issue, you need a battery-powered end node (button or door sensor) and a bulb or other mains-powered end node that also acts as a router.
    If you look at my Wireshark log, you can see that all end nodes regularly report some attributes. In addition, the door sensor is configured to send IAS zone status info to the coordinator, and it sends this information via the router.
    If you power down the router and, for example, open or close the door sensor, the sensor tries to send via the router, which is no longer reachable. It then tries to find a route, the coordinator rejoins it and accepts the data, but after some seconds the coordinator sends "Leave" to the end node that has just rejoined.
    Unfortunately this happens "randomly", not every time you run this sequence.

    Abele

    PS We are using Heiman end nodes (bulbs, door sensor , ...)

  • Hi Marte,
    Have you tried to reproduce the issue yet (using my suggestions)?
    Do you have any further news?
    On my side, I hope to try with the ZBOSS log enabled during this week ...

    Abele
