This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

nRF52840 NCP become leader and stop receiving from old network

Hi,

I am developing an application using the nRF52840 as NCP with a raspberry pi as the primary processor, using wpantund. I have 3 devices (3 raspberry, each with its own nRF52840) in the same Thread network, but at some point one of them stop receiving messages from the others (my application uses UDP, but it is irrelevant), and when I look at the log i see that it became leader, but without any other board outputting any message or the previous leader leaving it's status :

Device 1 (OK, original leader):

Jun 12 09:07:55 raspberrypi wpantund[416]: Starting wpantund 0.08.00d (May 14 2018 17:06:58) . . .
Jun 12 09:07:55 raspberrypi wpantund[416]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:55 raspberrypi wpantund[416]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:55 raspberrypi wpantund[416]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:55 raspberrypi wpantund[416]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:55 raspberrypi wpantund[416]: Running as root without dropping privileges!
Jun 12 09:07:55 raspberrypi wpantund[416]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "uninitialized" -> "offline"
Jun 12 09:07:55 raspberrypi wpantund[416]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:55 raspberrypi wpantund[416]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
Jun 12 09:07:55 raspberrypi wpantund[416]: Network is not joinable
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:55 raspberrypi wpantund[416]: Finished initializing NCP
Jun 12 09:07:55 raspberrypi wpantund[416]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:55 raspberrypi wpantund[416]: NCP is commissioned. Resuming...
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "offline:commissioned" -> "associating"
Jun 12 09:07:59 raspberrypi wpantund[416]: State change: "associating" -> "associated"
Jun 12 09:07:59 raspberrypi wpantund[416]: Node type change: "unknown" -> "leader"

Device 2 (Faulty):

Jun 12 09:07:57 raspberrypi wpantund[423]: Starting wpantund 0.08.00d (May 14 2018 17:06:58) . . .
Jun 12 09:07:57 raspberrypi wpantund[423]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:57 raspberrypi wpantund[423]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:57 raspberrypi wpantund[423]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:57 raspberrypi wpantund[423]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:57 raspberrypi wpantund[423]: Running as root without dropping privileges!
Jun 12 09:07:57 raspberrypi wpantund[423]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "uninitialized" -> "offline"
Jun 12 09:07:57 raspberrypi wpantund[423]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:57 raspberrypi wpantund[423]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
Jun 12 09:07:57 raspberrypi wpantund[423]: HARDWARE ADDRESS IS INVALID, MULTICAST BIT IS SET!
Jun 12 09:07:57 raspberrypi wpantund[423]: Network is not joinable
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:57 raspberrypi wpantund[423]: Finished initializing NCP
Jun 12 09:07:57 raspberrypi wpantund[423]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:57 raspberrypi wpantund[423]: NCP is commissioned. Resuming...
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "offline:commissioned" -> "associating"
Jun 12 09:08:02 raspberrypi wpantund[423]: State change: "associating" -> "associated"
Jun 12 09:08:02 raspberrypi wpantund[423]: Node type change: "unknown" -> "end-device"
Jun 12 09:08:03 raspberrypi wpantund[423]: Node type change: "end-device" -> "router"
 < 2 hours later >
Jun 12 11:07:01 raspberrypi wpantund[423]: Node type change: "router" -> "leader"

Device 3 (OK):

Jun 12 09:07:58 raspberrypi wpantund[426]: Starting wpantund 0.08.00d (May 14 2018 17:06:55) . . .
Jun 12 09:07:58 raspberrypi wpantund[426]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:58 raspberrypi wpantund[426]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:58 raspberrypi wpantund[426]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:58 raspberrypi wpantund[426]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:58 raspberrypi wpantund[426]: Running as root without dropping privileges!
Jun 12 09:07:58 raspberrypi wpantund[426]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "uninitialized" -> "offline"
Jun 12 09:07:58 raspberrypi wpantund[426]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:58 raspberrypi wpantund[426]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:55)"
Jun 12 09:07:58 raspberrypi wpantund[426]: Network is not joinable
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:58 raspberrypi wpantund[426]: Finished initializing NCP
Jun 12 09:07:58 raspberrypi wpantund[426]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:58 raspberrypi wpantund[426]: NCP is commissioned. Resuming...
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "offline:commissioned" -> "associating"
Jun 12 09:08:01 raspberrypi wpantund[426]: State change: "associating" -> "associated"
Jun 12 09:08:01 raspberrypi wpantund[426]: Node type change: "unknown" -> "end-device"
Jun 12 09:08:04 raspberrypi wpantund[426]: Node type change: "end-device" -> "router"

Looking at my application network log i can see that the last message received by it is at 11:04:37, so 2:30 before becoming leader.

The other devices receive their last message from Device 2 at 11:04:25 for both Device 1 and 3, despite Device 2 trying to send several messages after 11:04:25 (even after becoming the "new" leader).

Do you ever have encountered such a issue ? To me it looks like one board stopped receiving messages for any reason, and after a while decided to start it's own network.

After a reboot it all restart normally and the devices are in the same network and communicating fine again.

I am using the nRF52840 Hex from the "nRF5_SDK_for_Thread_and_Zigbee_v1.0.0" (example/thread/ncp/usb/hex/nrf52840_xxaa.hex) and wpantund commit a88988a (28 april 2018). It is the first and only time this bug occurred, despite several run longer than 2 hours.

My application is running on the raspberry and just using the network interface wpan0 provided by the wpantund driver, without needing/using any further knowledge.

Thank you,

Virgile

Edit: Same issue appeared again (from device 3 this time), it was not happening with only 2 boards, that's why I said it was the first time it appeared in the network (I only tested with 2 devices until now). This time I didn't kill the process immediatly, so I run a wpanctl status, and I get Device 1 and 3 both output Network:NodeType = Leader, with identical Network Name, PANID, XPANID, channel and mesh local prefix, so basically the same thread network. I also did some network capture (wpanctl pcap + wireshark) from all three devices, and there are not packets from the new fake leader visible on any of the 3 nodes, even the fake leader. The devices other than this new fake leader still communicate (device 1 and 2) correctly.

This time again the incorrect devices can't send or receive about 2 min before decided to become a new leader.

This is not a range issue, as the 3 boards are on the same table at about 5/10 cm from each other

Parents Reply Children
No Data
Related