Unwanted "Leave" command from coordinator

Hello, in late 2019 I developed a ZigBee coordinator for my customer, using nRF5 SDK for Thread and Zigbee v3.2.0.
Now my customer has this issue:
- An end node is connected via a router. If the router dies, the coordinator lets the end node rejoin, but after a short time it sends a "Leave" command to the end node.
A Wireshark log of the ZigBee radio traffic is attached.
Some details:
- the end node has network address E3DF
- the router has network address D01B
- at line 14631 and following, the router dies
- at line 14680, the end node starts searching for a route
- at line 14747 and following, the rejoin procedure takes place
- at line 14850, the coordinator sends a LEAVE command to end node E3DF

What happened? Why does the coordinator send the "Leave" command?
How can we avoid this unwanted behaviour?

Many thanks for your help!

Abele

0 Kenneth over 2 years ago

Hi,

nRF5 SDK for Thread and Zigbee v3.2.0 is quite old. Can you check whether the issue also occurs with the latest nRF5 SDK for Thread and Zigbee, v4.2?

Kenneth

0 abe over 2 years ago in reply to Kenneth

Hi Kenneth,
that is very hard for me: as far as I know, the new v4.2 SDK is not compatible with the old one!
And the coordinator firmware now in production is the result of over two years of development ...
Could you help with the SDK I have used?

0 Kenneth over 2 years ago in reply to abe

I believe both the nRF5 SDK and the Thread and Zigbee (T&Z) SDK releases have some migration information, either in the release notes or as a separate document. You may need to start with the first major release update (e.g. 4.0 or 16.0), since that kind of information is typically found there.

Kenneth

0 abe over 2 years ago in reply to Kenneth

Kenneth,
as you know, the guide is only a brief summary of how to migrate the standard examples, but in my case I used the CLI example from SDK 3.2 as the base for my project and made a lot of changes to it.
I will certainly read the guide, but I may need technical support from Nordic to understand and solve any difficulties and issues.
Surely the development team has suggestions on how to make the migration easier.
Could I have this help?
Thanks

PS Is it possible to migrate only the ZBOSS stack from v3.1.0 to v3.3.0, rather than the whole SDK?

Abele

0 abe over 2 years ago in reply to abe

... I took a look at the new SDK 4.2 ... it has no IAR EWARM project!!!
My current project is built with IAR EWARM 8.30.

0 Marte Myrvold over 2 years ago in reply to abe
Hi Abele,

abe said:
PS Is it possible to migrate only the ZBOSS stack from v3.1.0 to v3.3.0, rather than the whole SDK?

You must migrate the entire SDK if you are to migrate.

abe said:
it has no IAR EWARM project!!!

Support for IAR and Keil was removed in v4.2. You can use either GCC or SES.

I have an update from the developers:

Migration from SDK v3.2 to SDK v4.X requires a factory reset.

As a start, I would suggest that the customer reproduce the issue and collect ZBOSS trace logs, so that we can get an idea of why this is happening in the first place.
Also, can the same issue be reproduced with an unmodified coordinator or CLI example?
Could the customer also describe how their application works, given that it is based on the CLI sample?
How big is this network? How many nodes are there?

I have seen that the customer's coordinator behaves strangely: Link Status is being sent two or even three times more frequently than it is supposed to be. I don't have a good explanation for this, but it is concerning.
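
One way to put a number on the Link Status observation is to compare the intervals seen in a capture with the expected period. The following is a minimal sketch in C, assuming the Zigbee default Link Status period of 15 seconds (nwkLinkStatusPeriod) and a hypothetical array of Link Status frame timestamps, in seconds, taken from the Wireshark log for a single source address. The function and constant names are illustrative only and are not part of the SDK.

```c
#include <stddef.h>

/* Assumption: Zigbee default nwkLinkStatusPeriod of 15 s.
 * Adjust this if the stack under test is configured differently. */
#define EXPECTED_LINK_STATUS_PERIOD_S 15.0

/* Mean interval, in seconds, between consecutive timestamps.
 * Timestamps must be sorted in ascending order.
 * Returns 0.0 when fewer than two timestamps are given. */
static double mean_interval_s(const double *timestamps_s, size_t count)
{
    if (timestamps_s == NULL || count < 2) {
        return 0.0;
    }
    return (timestamps_s[count - 1] - timestamps_s[0]) / (double)(count - 1);
}

/* Ratio of expected period to observed mean interval:
 * 1.0 means "as expected", 3.0 means "three times too frequent".
 * Returns 0.0 when the ratio cannot be computed. */
static double link_status_rate_factor(const double *timestamps_s, size_t count)
{
    double mean = mean_interval_s(timestamps_s, count);
    if (mean <= 0.0) {
        return 0.0;
    }
    return EXPECTED_LINK_STATUS_PERIOD_S / mean;
}
```

Feeding this the timestamps of the coordinator's Link Status frames, and separately those of an unmodified CLI example, would show whether both really deviate from the expected period by the same factor.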

I have seen that the customer has had some issues with the device: ZBOSS asserts were triggered, which results in zb_nrf52840_abort() being called.
https://devzone.nordicsemi.com/f/nordic-q-a/85117/zb_nrf52840_abort-function
Have these issues been solved already?
Also, is the device programmed to reset itself after zb_nrf52840_abort() is invoked in the customer's application?

If the customer is willing to share the application's source code, we can have a quick look at it. Making sure that nothing "unholy" and no shenanigans are going on may be a good start.

Best regards,

Marte

0 abe over 2 years ago in reply to Marte Myrvold

Marte Myrvold said:
You must migrate the entire SDK if you are to migrate.

So it is not possible to update only the ZBOSS stack? I have already done that from 3.0.0 to 3.1.0.

Marte Myrvold said:
reproduce the issue and collect ZBOSS trace logs

The ZBOSS log needs to be enabled first (how can we do that?); then we can try to reproduce the issue and record the data. We must also physically connect the serial output, which is not provided on our boards.

Marte Myrvold said:
Could the customer also describe how their application works, given that it is based on the CLI sample?

In general, only the coordinator part of the CLI sample is used, in order to join certain kinds of end nodes to the network.
Here is a brief summary of the functionality:
- Start-steering command to accept end nodes onto the network
- FDS storage of end node data (such as cluster endpoint attribute values and so on)
- BLE GATT used to read and write some ZigBee attributes
- some attributes configured for automatic reporting (typically ON/OFF status for bulbs and smart plugs, and charge level for battery-powered end nodes such as door sensors, pushbuttons, motion sensors ...)

Marte Myrvold said:
How big is this network? How many nodes are there?

The networks we are using have 5 to 10 end nodes.

Marte Myrvold said:
Link Status is being sent two or even three times more frequently than it is supposed to be

This is the same with the original CLI example of SDK 3.2.

Marte Myrvold said:
ZBOSS asserts were triggered, which results in zb_nrf52840_abort() being called.

Yes, solved.

Marte Myrvold said:
If the customer is willing to share the application's source code

I will ask the customer whether they agree to your request.

Marte Myrvold said:
Also, can the same issue be reproduced with an unmodified coordinator or CLI example?

This is not very simple to test ... but could you help us find the issue anyway?

Thanks for your help

Abele

0 Marte Myrvold over 2 years ago in reply to abe

Hi Abele,

Thank you for providing the additional information!

abe said:
So it is not possible to update only the ZBOSS stack? I have already done that from 3.0.0 to 3.1.0.

It is possible to do so, but the SDK might not be compatible with a version of ZBOSS other than the one it was made for. Additionally, platform certification applies to one particular version of ZBOSS, so you will not be able to rely on platform certification if you use a different version of the ZBOSS stack than the one the SDK was certified with.

abe said:
The ZBOSS log needs to be enabled first (how can we do that?)

This is described in the documentation here: Debugging, but what you need to do is to:

- Replace the libzboss libraries in your project with the files located in <InstallFolder>/external/zboss/lib/debug
- Configure the following in sdk_config.h:

Configuration option              Value for nRF52840   Value for nRF52833*
NRF_LOG_DEFAULT_LEVEL             4                    4
NRF_LOG_BUFSIZE                   65536                32768
NRF_LOG_MSGPOOL_ELEMENT_COUNT     32                   32
NRF_LOG_MSGPOOL_ELEMENT_SIZE      40                   40
NRF_LOG_BACKEND_RTT_TX_RETRY_CNT  100                  100
SEGGER_RTT_CONFIG_BUFFER_SIZE_UP  32768                512
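
For reference, the nRF52840 column of the table above could look roughly like this in sdk_config.h. This is a sketch only: the #ifndef guards follow the usual sdk_config.h layout, the rest of the file is omitted, and in a real project you would edit the existing entries in place rather than add duplicates.

```c
/* sdk_config.h excerpt: nRF52840 values taken from the table above. */

#ifndef NRF_LOG_DEFAULT_LEVEL
#define NRF_LOG_DEFAULT_LEVEL 4              /* 4 = Debug */
#endif

#ifndef NRF_LOG_BUFSIZE
#define NRF_LOG_BUFSIZE 65536
#endif

#ifndef NRF_LOG_MSGPOOL_ELEMENT_COUNT
#define NRF_LOG_MSGPOOL_ELEMENT_COUNT 32
#endif

#ifndef NRF_LOG_MSGPOOL_ELEMENT_SIZE
#define NRF_LOG_MSGPOOL_ELEMENT_SIZE 40
#endif

#ifndef NRF_LOG_BACKEND_RTT_TX_RETRY_CNT
#define NRF_LOG_BACKEND_RTT_TX_RETRY_CNT 100
#endif

#ifndef SEGGER_RTT_CONFIG_BUFFER_SIZE_UP
#define SEGGER_RTT_CONFIG_BUFFER_SIZE_UP 32768
#endif
```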

Then change these in main():

ZB_SET_TRACE_LEVEL(ZIGBEE_TRACE_LEVEL);
ZB_SET_TRACE_MASK(ZIGBEE_TRACE_MASK);
to
ZB_SET_TRACE_LEVEL(4);
ZB_SET_TRACE_MASK(TRACE_SUBSYSTEM_ZCL | TRACE_SUBSYSTEM_ZDO | TRACE_SUBSYSTEM_NWK | TRACE_SUBSYSTEM_SECUR);

The UART log from the nRF should then output some character strings. Please save this log and upload it here so I can forward it to the developers for decoding.

Best regards,

Marte

0 abe over 2 years ago in reply to Marte Myrvold

Thanks Marte,
In the next few days (or next week) I will make the modifications needed to get the ZBOSS log output, and I will try to reproduce the issue (but that is not easy, as the issue occurs randomly).
As soon as I have news I will send you the results.
Abele

0 Marte Myrvold over 2 years ago in reply to abe
Hi Abele,

I have an update from the developers as well:

The ZBOSS libs should only be changed together with the rest of the SDK; there were some major changes between SDK v3.2 and v4.X.
I am pretty sure that even an attempt to change the ZBOSS libs will require a factory reset of the device.

ZBOSS logs can also be configured to be printed over RTT, so the additional serial connector may not be required.

The customer claims that the Link Status issue can be reproduced with the original CLI example.
Could you ask them how it can be reproduced? I have tried a bit but wasn't able to reproduce it.

Best regards,

Marte

0 abe over 2 years ago in reply to Marte Myrvold

Hello Marte,

Marte Myrvold said:
The ZBOSS libs should only be changed together with the rest of the SDK; there were some major changes between SDK v3.2 and v4.X.

Yes, of course. I see there are some functions that must be adapted or changed, but I think this can be resolved.

Marte Myrvold said:
I am pretty sure that even an attempt to change the ZBOSS libs will require a factory reset of the device.

This is not an issue: all custom data and configuration stored in FDS will be restored via BLE.

As for the issue, I did not say that it can be reproduced with the CLI sample; I said that it needs to be tested.
What I said is that the Link Status timing is the same as in the CLI example.

To try to reproduce the issue, you need a battery-powered end node (a button or door sensor) and a bulb or other mains-powered node that also acts as a router.
In my Wireshark log you can see that all end nodes regularly report some attributes; in addition, the door sensor is configured to send IAS Zone status information to the coordinator, and it sends this information via the router.
If you power down the router and then, for example, open or close the door sensor, the sensor tries to send via the router, which is no longer reachable. It then asks for a route, the coordinator lets it rejoin and accepts its data, but after some seconds the coordinator sends "Leave" to the end node that has just rejoined.
Unfortunately this happens "randomly", not every time you perform this sequence.
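
Because the problem does not show up on every attempt, it may help to scan long captures automatically for the pattern described above. The following is a minimal sketch in C, assuming the relevant frames have already been exported from the capture into a time-sorted list of events (timestamp, event kind, short address). The event_t type, the event kinds and the window length are all hypothetical and chosen for illustration; they are not part of the SDK or of Wireshark.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds extracted from a capture. */
typedef enum { EV_REJOIN, EV_LEAVE, EV_OTHER } event_kind_t;

typedef struct {
    double       time_s;      /* capture timestamp in seconds */
    event_kind_t kind;        /* what happened */
    uint16_t     short_addr;  /* network address of the end node concerned */
} event_t;

/* Counts Leave events that follow the most recent Rejoin of the same
 * address within window_s seconds. Events must be sorted by time. */
static size_t count_leave_after_rejoin(const event_t *ev, size_t n, double window_s)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].kind != EV_LEAVE) {
            continue;
        }
        /* Walk backwards to the previous Rejoin or Leave of this address. */
        for (size_t j = i; j-- > 0; ) {
            if (ev[j].short_addr != ev[i].short_addr || ev[j].kind == EV_OTHER) {
                continue;
            }
            if (ev[j].kind == EV_REJOIN &&
                ev[i].time_s - ev[j].time_s <= window_s) {
                hits++;
            }
            break; /* stop at the first Rejoin or Leave found */
        }
    }
    return hits;
}
```

Run over several captures, a non-zero count would point at the occurrences worth inspecting by hand, for example around the frames where the coordinator sends Leave to E3DF.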

Abele

PS We are using Heiman end nodes (bulbs, door sensors, ...)

0 abe over 2 years ago in reply to abe

Hi Marte,
Have you tried to reproduce the issue yet (using my suggestions)?
Is there any further news?
On my side, I hope to test with the ZBOSS log enabled this week ...

Abele