This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

nRF52840 NCP become leader and stop receiving from old network

Hi,

I am developing an application using the nRF52840 as NCP with a raspberry pi as the primary processor, using wpantund. I have 3 devices (3 raspberry, each with its own nRF52840) in the same Thread network, but at some point one of them stop receiving messages from the others (my application uses UDP, but it is irrelevant), and when I look at the log i see that it became leader, but without any other board outputting any message or the previous leader leaving it's status :

Device 1 (OK, original leader):

Jun 12 09:07:55 raspberrypi wpantund[416]: Starting wpantund 0.08.00d (May 14 2018 17:06:58) . . .
Jun 12 09:07:55 raspberrypi wpantund[416]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:55 raspberrypi wpantund[416]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:55 raspberrypi wpantund[416]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:55 raspberrypi wpantund[416]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:55 raspberrypi wpantund[416]: Running as root without dropping privileges!
Jun 12 09:07:55 raspberrypi wpantund[416]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "uninitialized" -> "offline"
Jun 12 09:07:55 raspberrypi wpantund[416]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:55 raspberrypi wpantund[416]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
Jun 12 09:07:55 raspberrypi wpantund[416]: Network is not joinable
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:55 raspberrypi wpantund[416]: Finished initializing NCP
Jun 12 09:07:55 raspberrypi wpantund[416]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:55 raspberrypi wpantund[416]: NCP is commissioned. Resuming...
Jun 12 09:07:55 raspberrypi wpantund[416]: State change: "offline:commissioned" -> "associating"
Jun 12 09:07:59 raspberrypi wpantund[416]: State change: "associating" -> "associated"
Jun 12 09:07:59 raspberrypi wpantund[416]: Node type change: "unknown" -> "leader"

Device 2 (Faulty):

Jun 12 09:07:57 raspberrypi wpantund[423]: Starting wpantund 0.08.00d (May 14 2018 17:06:58) . . .
Jun 12 09:07:57 raspberrypi wpantund[423]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:57 raspberrypi wpantund[423]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:57 raspberrypi wpantund[423]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:57 raspberrypi wpantund[423]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:57 raspberrypi wpantund[423]: Running as root without dropping privileges!
Jun 12 09:07:57 raspberrypi wpantund[423]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "uninitialized" -> "offline"
Jun 12 09:07:57 raspberrypi wpantund[423]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:57 raspberrypi wpantund[423]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
Jun 12 09:07:57 raspberrypi wpantund[423]: HARDWARE ADDRESS IS INVALID, MULTICAST BIT IS SET!
Jun 12 09:07:57 raspberrypi wpantund[423]: Network is not joinable
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:57 raspberrypi wpantund[423]: Finished initializing NCP
Jun 12 09:07:57 raspberrypi wpantund[423]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:57 raspberrypi wpantund[423]: NCP is commissioned. Resuming...
Jun 12 09:07:57 raspberrypi wpantund[423]: State change: "offline:commissioned" -> "associating"
Jun 12 09:08:02 raspberrypi wpantund[423]: State change: "associating" -> "associated"
Jun 12 09:08:02 raspberrypi wpantund[423]: Node type change: "unknown" -> "end-device"
Jun 12 09:08:03 raspberrypi wpantund[423]: Node type change: "end-device" -> "router"
 < 2 hours later >
Jun 12 11:07:01 raspberrypi wpantund[423]: Node type change: "router" -> "leader"

Device 3 (OK):

Jun 12 09:07:58 raspberrypi wpantund[426]: Starting wpantund 0.08.00d (May 14 2018 17:06:55) . . .
Jun 12 09:07:58 raspberrypi wpantund[426]: #011SOURCE_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:58 raspberrypi wpantund[426]: #011BUILD_VERSION = 0.07.01-209-ga88988a
Jun 12 09:07:58 raspberrypi wpantund[426]: Configuration file "/etc/wpantund.conf" read.
Jun 12 09:07:58 raspberrypi wpantund[426]: Ready. Using DBUS bus ":1.3"
Jun 12 09:07:58 raspberrypi wpantund[426]: Running as root without dropping privileges!
Jun 12 09:07:58 raspberrypi wpantund[426]: [-NCP-]: NCP was reset (STATUS_RESET_EXTERNAL, 113)
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "uninitialized" -> "offline"
Jun 12 09:07:58 raspberrypi wpantund[426]: NCP is running "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
Jun 12 09:07:58 raspberrypi wpantund[426]: Driver is running "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:55)"
Jun 12 09:07:58 raspberrypi wpantund[426]: Network is not joinable
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "offline" -> "offline:commissioned"
Jun 12 09:07:58 raspberrypi wpantund[426]: Finished initializing NCP
Jun 12 09:07:58 raspberrypi wpantund[426]: AutoResume is enabled. Trying to resume.
Jun 12 09:07:58 raspberrypi wpantund[426]: NCP is commissioned. Resuming...
Jun 12 09:07:58 raspberrypi wpantund[426]: State change: "offline:commissioned" -> "associating"
Jun 12 09:08:01 raspberrypi wpantund[426]: State change: "associating" -> "associated"
Jun 12 09:08:01 raspberrypi wpantund[426]: Node type change: "unknown" -> "end-device"
Jun 12 09:08:04 raspberrypi wpantund[426]: Node type change: "end-device" -> "router"

Looking at my application network log i can see that the last message received by it is at 11:04:37, so 2:30 before becoming leader.

The other devices receive their last message from Device 2 at 11:04:25 for both Device 1 and 3, despite Device 2 trying to send several messages after 11:04:25 (even after becoming the "new" leader).

Do you ever have encountered such a issue ? To me it looks like one board stopped receiving messages for any reason, and after a while decided to start it's own network.

After a reboot it all restart normally and the devices are in the same network and communicating fine again.

I am using the nRF52840 Hex from the "nRF5_SDK_for_Thread_and_Zigbee_v1.0.0" (example/thread/ncp/usb/hex/nrf52840_xxaa.hex) and wpantund commit a88988a (28 april 2018). It is the first and only time this bug occurred, despite several run longer than 2 hours.

My application is running on the raspberry and just using the network interface wpan0 provided by the wpantund driver, without needing/using any further knowledge.

Thank you,

Virgile

Edit: Same issue appeared again (from device 3 this time), it was not happening with only 2 boards, that's why I said it was the first time it appeared in the network (I only tested with 2 devices until now). This time I didn't kill the process immediatly, so I run a wpanctl status, and I get Device 1 and 3 both output Network:NodeType = Leader, with identical Network Name, PANID, XPANID, channel and mesh local prefix, so basically the same thread network. I also did some network capture (wpanctl pcap + wireshark) from all three devices, and there are not packets from the new fake leader visible on any of the 3 nodes, even the fake leader. The devices other than this new fake leader still communicate (device 1 and 2) correctly.

This time again the incorrect devices can't send or receive about 2 min before decided to become a new leader.

This is not a range issue, as the 3 boards are on the same table at about 5/10 cm from each other

  • Hi,

    Our developers have been able to reproduce this behavior and are investigating this issue.

    Could you run the following commands on your border router(s), and post the output in this thread?

    wpanctl getprop
    wpanctl getprop OpenThread:MsgBufferCounters:AsString

    Best regards,
    Jørgen

  • Hi,

    Thank you for the quick update, unfortunately I restarted it in an attempt to have more information, now I have to wait for it to crash again, I will keep you updated as soon as it happens.

    In the mean time, it seems you think it is related to some kind of buffer being filled or overflowed, is that correct ? My application indeed sends quiet a lot of messages (around 70 packets/s) but stays under the ~200 kb/s (closer to 40-50 kb/s). These information have been measured with the pyspinel sniffer following this procedure : http://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.threadsdk.v0.11.0%2Fthread_sniffer.html, if this can be useful for the developers.

    Best,

    Virgile

    Edit: It failed again very quickly after I reduced the time between messages.

    Here it is:

    Device01 get prop

    Config:NCP:DriverName = "spinel"
    Config:TUN:InterfaceName = "wpan0"
    Daemon:AutoAssociateAfterReset = true
    Daemon:AutoDeepSleep = false
    Daemon:Enabled = true
    Daemon:ReadyForHostSleep = true
    Daemon:SetDefaultRouteForAutoAddedPrefix = false
    Daemon:TerminateOnFault = false
    Daemon:Version = "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
    IPv6:AllAddresses = [
    	"fd48:b09f:aca7:0:b51f:a9cb:fe70:67ac     prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    	"fe80::40a3:d787:a5a3:c788                prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    ]
    IPv6:LinkLocalAddress = "fe80::40a3:d787:a5a3:c788"
    IPv6:MeshLocalAddress = "fd48:b09f:aca7:0:b51f:a9cb:fe70:67ac"
    IPv6:MeshLocalPrefix = "fd48:b09f:aca7::/64"
    IPv6:MulticastAddresses = [
    	"ff02::1                                  origin:ncp"
    	"ff02::2                                  origin:ncp"
    	"ff03::1                                  origin:ncp"
    	"ff03::2                                  origin:ncp"
    	"ff03::fc                                 origin:ncp"
    	"ff03::114                                origin:user"
    	"ff32:40:fd48:b09f:aca7::1                origin:ncp"
    	"ff33:40:fd48:b09f:aca7::1                origin:ncp"
    ]
    IPv6:Routes = []
    IPv6:SetSLAACForAutoAddedPrefix = false
    Interface:Up = true
    MAC:Blacklist:Enabled = false
    MAC:Blacklist:Entries = []
    MAC:Whitelist:Enabled = false
    MAC:Whitelist:Entries = []
    NCP:CCAFailureRate = 0x06B2
    NCP:Channel = 23
    NCP:ChannelMask = 134215680
    NCP:Counter:AllMac = [
    	"TxTotal              = 2182"
    	"TxUnicast            = 342"
    	"TxBroadcast          = 1840"
    	"TxAckRequested       = 342"
    	"TxAcked              = 342"
    	"TxNoAckRequested     = 1840"
    	"TxData               = 2182"
    	"TxDataPoll           = 0"
    	"TxBeacon             = 0"
    	"TxBeaconRequest      = 0"
    	"TxOther              = 0"
    	"TxRetry              = 192"
    	"TxErrCca             = 155"
    	"TxErrAbort           = 0"
    	"TxErrBusyChannel     = 0"
    	"RxTotal              = 2874"
    	"RxUnicast            = 355"
    	"RxBroadcast          = 2338"
    	"RxData               = 2691"
    	"RxDataPoll           = 0"
    	"RxBeacon             = 0"
    	"RxBeaconRequest      = 0"
    	"RxOther              = 0"
    	"RxAddressFiltered    = 0"
    	"RxDestAddrFiltered   = 181"
    	"RxDuplicated         = 2"
    	"RxErrNoFrame         = 0"
    	"RxErrUnknownNeighbor = 0"
    	"RxErrInvalidSrcAddr  = 0"
    	"RxErrSec             = 0"
    	"RxErrFcs             = 0"
    	"RxErrOther           = 0"
    ]
    NCP:Counter:RX_IP_DROPPED = 0
    NCP:Counter:RX_IP_INSEC_TOTAL = 0
    NCP:Counter:RX_IP_SEC_TOTAL = 319
    NCP:Counter:RX_SPINEL_ERR = 0
    NCP:Counter:RX_SPINEL_TOTAL = 385
    NCP:Counter:TX_IP_DROPPED = 0
    NCP:Counter:TX_IP_INSEC_TOTAL = 0
    NCP:Counter:TX_IP_SEC_TOTAL = 278
    NCP:Counter:TX_SPINEL_TOTAL = 0
    NCP:ExtendedAddress = [42A3D787A5A3C788]
    NCP:Frequency = 2465000
    NCP:HardwareAddress = [86B68714DDA9397B]
    NCP:MACAddress = [42A3D787A5A3C788]
    NCP:RSSI = -46
    NCP:SleepyPollInterval = 235000
    NCP:State = "associated"
    NCP:TXPower = 0
    NCP:Version = "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
    Network:IsCommissioned = true
    Network:Key = [A8421C1F871DCA54DDBB942F89688B22]
    Network:KeyIndex = 1
    Network:Name = "testmesh"
    Network:NodeType = "leader"
    Network:PANID = 0x9358
    Network:PSKc = [00000000000000000000000000000000]
    Network:PartitionId = 1888992561
    Network:XPANID = 0x48B09FACA7039EEE
    Thread:ActiveDataset = [
    	"Dataset:ActiveTimestamp          =  0x0000000000000000"
    	"Dataset:Channel                  =  23"
    	"Dataset:NetworkName              =  "testmesh""
    	"Dataset:PanId                    =  0x9358"
    	"Dataset:ExtendedPanId            =  0x48B09FACA7039EEE"
    	"Dataset:MasterKey                =  [A8421C1F871DCA54DDBB942F89688B22]"
    	"Dataset:MeshLocalPrefix          =  fd48:b09f:aca7::/64"
    	"Dataset:ChannelMaskPage0         =  0x07FFF800"
    	"Dataset:PSKc                     =  [00000000000000000000000000000000]"
    	"Dataset:SecPolicy:KeyRotation    =  672"
    	"Dataset:SecPolicy:Flags          =  0xFF"
    ]
    Thread:ChildTable = []
    Thread:ChildTable:Addresses = []
    Thread:Commissioner:Enabled = false
    Thread:DeviceMode = 0x0F
    Thread:Leader:Address = "fd48:b09f:aca7::ff:fe00:cc00"
    Thread:Leader:LocalWeight = 0x40
    Thread:Leader:NetworkData = [08040B020000]
    Thread:Leader:RouterID = 0x33
    Thread:Leader:StableNetworkData = []
    Thread:Leader:Weight = 0x40
    Thread:NeighborTable = [
    	"C269CEFE5C8ED7C7, RLOC16:1000, LQIn:3, AveRssi:-45, LastRssi:-46, Age:5, LinkFC:1748959, MleFC:78862, IsChild:no, RxOnIdle:yes, FFD:yes, SecDataReq:no, FullNetData:yes"
    ]
    Thread:NeighborTable:ErrorRates = [
    	"C269CEFE5C8ED7C7, RLOC16:1000, FrameErrRate:8.09%, MsgErrorRate:0.00%, AveRssi:-45, LastRssi:-46, "
    ]
    Thread:NetworkData = []
    Thread:NetworkDataVersion = 0xBC
    Thread:OffMeshRoutes = []
    Thread:OnMeshPrefixes = []
    Thread:PendingDataset = []
    Thread:RLOC16 = 0xCC00
    Thread:RouterID = 0x33
    Thread:RouterTable = [
    	"C269CEFE5C8ED7C7, RLOC16:1000, RouterId:4, NextHop:63, PathCost:0, LQIn:3, LQOut:3, Age:5, LinkEst:yes"
    	"3A8483AD34114F18, RLOC16:9000, RouterId:36, NextHop:63, PathCost:0, LQIn:0, LQOut:0, Age:85, LinkEst:no"
    	"42A3D787A5A3C788, RLOC16:cc00, RouterId:51, NextHop:51, PathCost:0, LQIn:0, LQOut:0, Age:123, LinkEst:no"
    ]
    Thread:StableNetworkData = []
    Thread:StableNetworkDataVersion = 0x10
    TmfProxy:Enabled = false
    com.nestlabs.internal:Network:AllowingJoin = false
    com.nestlabs.internal:Network:PassthruPort = 0x1634

    Device 01 msgBuffers

    OpenThread:MsgBufferCounters:AsString = "TotalBuffers = 128, FreeBuffers = 125,
    6loSendMessages = 0  , 6loSendBuffers = 0  , 6loReassemblyMessages = 1  ,
    6loReassemblyBuffers = 3  , Ip6Messages = 0  , Ip6Buffers = 0  , MplMessages = 0  ,
    MplBuffers = 0  , MleMessages = 0  , MleBuffers = 0  , ArpMessages = 0  , ArpBuffers = 0  ,
    CoapClientMessages = 0  , CoapClientBuffers = 0  , "

    Device 02 (faulty) get prop

    Config:NCP:DriverName = "spinel"
    Config:TUN:InterfaceName = "wpan0"
    Daemon:AutoAssociateAfterReset = true
    Daemon:AutoDeepSleep = false
    Daemon:Enabled = true
    Daemon:ReadyForHostSleep = true
    Daemon:SetDefaultRouteForAutoAddedPrefix = false
    Daemon:TerminateOnFault = false
    Daemon:Version = "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:58)"
    IPv6:AllAddresses = [
    	"fd48:b09f:aca7:0:1f25:ce15:4dcc:d0fd     prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    	"fe80::3884:83ad:3411:4f18                prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    ]
    IPv6:LinkLocalAddress = "fe80::3884:83ad:3411:4f18"
    IPv6:MeshLocalAddress = "fd48:b09f:aca7:0:1f25:ce15:4dcc:d0fd"
    IPv6:MeshLocalPrefix = "fd48:b09f:aca7::/64"
    IPv6:MulticastAddresses = [
    	"ff02::1                                  origin:ncp"
    	"ff02::2                                  origin:ncp"
    	"ff03::1                                  origin:ncp"
    	"ff03::2                                  origin:ncp"
    	"ff03::fc                                 origin:ncp"
    	"ff03::114                                origin:user"
    	"ff32:40:fd48:b09f:aca7::1                origin:ncp"
    	"ff33:40:fd48:b09f:aca7::1                origin:ncp"
    ]
    IPv6:Routes = []
    IPv6:SetSLAACForAutoAddedPrefix = false
    Interface:Up = true
    MAC:Blacklist:Enabled = false
    MAC:Blacklist:Entries = []
    MAC:Whitelist:Enabled = false
    MAC:Whitelist:Entries = []
    NCP:CCAFailureRate = 0x201A
    NCP:Channel = 23
    NCP:ChannelMask = 134215680
    NCP:Counter:AllMac = [
    	"TxTotal              = 881"
    	"TxUnicast            = 190"
    	"TxBroadcast          = 691"
    	"TxAckRequested       = 190"
    	"TxAcked              = 189"
    	"TxNoAckRequested     = 691"
    	"TxData               = 881"
    	"TxDataPoll           = 0"
    	"TxBeacon             = 0"
    	"TxBeaconRequest      = 0"
    	"TxOther              = 0"
    	"TxRetry              = 148"
    	"TxErrCca             = 128"
    	"TxErrAbort           = 0"
    	"TxErrBusyChannel     = 0"
    	"RxTotal              = 4616"
    	"RxUnicast            = 195"
    	"RxBroadcast          = 1814"
    	"RxData               = 2005"
    	"RxDataPoll           = 0"
    	"RxBeacon             = 0"
    	"RxBeaconRequest      = 0"
    	"RxOther              = 0"
    	"RxAddressFiltered    = 0"
    	"RxDestAddrFiltered   = 563"
    	"RxDuplicated         = 4"
    	"RxErrNoFrame         = 1"
    	"RxErrUnknownNeighbor = 2043"
    	"RxErrInvalidSrcAddr  = 0"
    	"RxErrSec             = 0"
    	"RxErrFcs             = 0"
    	"RxErrOther           = 0"
    ]
    NCP:Counter:RX_IP_DROPPED = 0
    NCP:Counter:RX_IP_INSEC_TOTAL = 0
    NCP:Counter:RX_IP_SEC_TOTAL = 160
    NCP:Counter:RX_SPINEL_ERR = 0
    NCP:Counter:RX_SPINEL_TOTAL = 273
    NCP:Counter:TX_IP_DROPPED = 90
    NCP:Counter:TX_IP_INSEC_TOTAL = 0
    NCP:Counter:TX_IP_SEC_TOTAL = 121
    NCP:Counter:TX_SPINEL_TOTAL = 0
    NCP:ExtendedAddress = [3A8483AD34114F18]
    NCP:Frequency = 2465000
    NCP:HardwareAddress = [67FA0605A4D94A64]
    NCP:MACAddress = [3A8483AD34114F18]
    NCP:RSSI = -37
    NCP:SleepyPollInterval = 235000
    NCP:State = "associated"
    NCP:TXPower = 0
    NCP:Version = "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
    Network:IsCommissioned = true
    Network:Key = [A8421C1F871DCA54DDBB942F89688B22]
    Network:KeyIndex = 1
    Network:Name = "testmesh"
    Network:NodeType = "leader"
    Network:PANID = 0x9358
    Network:PSKc = [00000000000000000000000000000000]
    Network:PartitionId = 1476316774
    Network:XPANID = 0x48B09FACA7039EEE
    Thread:ActiveDataset = [
    	"Dataset:ActiveTimestamp          =  0x0000000000000000"
    	"Dataset:Channel                  =  23"
    	"Dataset:NetworkName              =  "testmesh""
    	"Dataset:PanId                    =  0x9358"
    	"Dataset:ExtendedPanId            =  0x48B09FACA7039EEE"
    	"Dataset:MasterKey                =  [A8421C1F871DCA54DDBB942F89688B22]"
    	"Dataset:MeshLocalPrefix          =  fd48:b09f:aca7::/64"
    	"Dataset:ChannelMaskPage0         =  0x07FFF800"
    	"Dataset:PSKc                     =  [00000000000000000000000000000000]"
    	"Dataset:SecPolicy:KeyRotation    =  672"
    	"Dataset:SecPolicy:Flags          =  0xFF"
    ]
    Thread:ChildTable = []
    Thread:ChildTable:Addresses = []
    Thread:Commissioner:Enabled = false
    Thread:DeviceMode = 0x0F
    Thread:Leader:Address = "fd48:b09f:aca7::ff:fe00:9000"
    Thread:Leader:LocalWeight = 0x40
    Thread:Leader:NetworkData = [08040B020001]
    Thread:Leader:RouterID = 0x24
    Thread:Leader:StableNetworkData = []
    Thread:Leader:Weight = 0x40
    Thread:NeighborTable = []
    Thread:NeighborTable:ErrorRates = []
    Thread:NetworkData = []
    Thread:NetworkDataVersion = 0x8E
    Thread:OffMeshRoutes = []
    Thread:OnMeshPrefixes = []
    Thread:PendingDataset = []
    Thread:RLOC16 = 0x9000
    Thread:RouterID = 0x24
    Thread:RouterTable = [
    	"3A8483AD34114F18, RLOC16:9000, RouterId:36, NextHop:36, PathCost:0, LQIn:0, LQOut:0, Age:159, LinkEst:no"
    ]
    Thread:StableNetworkData = []
    Thread:StableNetworkDataVersion = 0x0F
    TmfProxy:Enabled = false
    com.nestlabs.internal:Network:AllowingJoin = false
    com.nestlabs.internal:Network:PassthruPort = 0x1634

    Device 02 msgBuffers

    OpenThread:MsgBufferCounters:AsString = "TotalBuffers = 128, FreeBuffers = 0  ,
    6loSendMessages = 42 , 6loSendBuffers = 128, 6loReassemblyMessages = 0  ,
    6loReassemblyBuffers = 0  , Ip6Messages = 0  , Ip6Buffers = 0  , MplMessages = 0  ,
    MplBuffers = 0  , MleMessages = 0  , MleBuffers = 0  , ArpMessages = 0  , ArpBuffers = 0  ,
    CoapClientMessages = 0  , CoapClientBuffers = 0  , "

    Device 03 get prop

    Config:NCP:DriverName = "spinel"
    Config:TUN:InterfaceName = "wpan0"
    Daemon:AutoAssociateAfterReset = true
    Daemon:AutoDeepSleep = false
    Daemon:Enabled = true
    Daemon:ReadyForHostSleep = true
    Daemon:SetDefaultRouteForAutoAddedPrefix = false
    Daemon:TerminateOnFault = false
    Daemon:Version = "0.08.00d (0.07.01-209-ga88988a; May 14 2018 17:06:55)"
    IPv6:AllAddresses = [
    	"fd48:b09f:aca7:0:dd86:bf38:ffba:d60b     prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    	"fe80::c069:cefe:5c8e:d7c7                prefix_len:64   origin:ncp      valid:forever   preferred:forever"
    ]
    IPv6:LinkLocalAddress = "fe80::c069:cefe:5c8e:d7c7"
    IPv6:MeshLocalAddress = "fd48:b09f:aca7:0:dd86:bf38:ffba:d60b"
    IPv6:MeshLocalPrefix = "fd48:b09f:aca7::/64"
    IPv6:MulticastAddresses = [
    	"ff02::1                                  origin:ncp"
    	"ff02::2                                  origin:ncp"
    	"ff03::1                                  origin:ncp"
    	"ff03::2                                  origin:ncp"
    	"ff03::fc                                 origin:ncp"
    	"ff03::114                                origin:user"
    	"ff32:40:fd48:b09f:aca7::1                origin:ncp"
    	"ff33:40:fd48:b09f:aca7::1                origin:ncp"
    ]
    IPv6:Routes = []
    IPv6:SetSLAACForAutoAddedPrefix = false
    Interface:Up = true
    MAC:Blacklist:Enabled = false
    MAC:Blacklist:Entries = []
    MAC:Whitelist:Enabled = false
    MAC:Whitelist:Entries = []
    NCP:CCAFailureRate = 0x07B6
    NCP:Channel = 23
    NCP:ChannelMask = 134215680
    NCP:Counter:AllMac = [
    	"TxTotal              = 2433"
    	"TxUnicast            = 382"
    	"TxBroadcast          = 2051"
    	"TxAckRequested       = 382"
    	"TxAcked              = 381"
    	"TxNoAckRequested     = 2051"
    	"TxData               = 2433"
    	"TxDataPoll           = 0"
    	"TxBeacon             = 0"
    	"TxBeaconRequest      = 0"
    	"TxOther              = 0"
    	"TxRetry              = 192"
    	"TxErrCca             = 158"
    	"TxErrAbort           = 0"
    	"TxErrBusyChannel     = 2"
    	"RxTotal              = 3110"
    	"RxUnicast            = 370"
    	"RxBroadcast          = 2568"
    	"RxData               = 2936"
    	"RxDataPoll           = 0"
    	"RxBeacon             = 0"
    	"RxBeaconRequest      = 0"
    	"RxOther              = 0"
    	"RxAddressFiltered    = 0"
    	"RxDestAddrFiltered   = 170"
    	"RxDuplicated         = 2"
    	"RxErrNoFrame         = 2"
    	"RxErrUnknownNeighbor = 0"
    	"RxErrInvalidSrcAddr  = 0"
    	"RxErrSec             = 0"
    	"RxErrFcs             = 0"
    	"RxErrOther           = 0"
    ]
    NCP:Counter:RX_IP_DROPPED = 0
    NCP:Counter:RX_IP_INSEC_TOTAL = 0
    NCP:Counter:RX_IP_SEC_TOTAL = 347
    NCP:Counter:RX_SPINEL_ERR = 0
    NCP:Counter:RX_SPINEL_TOTAL = 369
    NCP:Counter:TX_IP_DROPPED = 0
    NCP:Counter:TX_IP_INSEC_TOTAL = 0
    NCP:Counter:TX_IP_SEC_TOTAL = 306
    NCP:Counter:TX_SPINEL_TOTAL = 0
    NCP:ExtendedAddress = [C269CEFE5C8ED7C7]
    NCP:Frequency = 2465000
    NCP:HardwareAddress = [46CB21581041267F]
    NCP:MACAddress = [C269CEFE5C8ED7C7]
    NCP:RSSI = -45
    NCP:SleepyPollInterval = 235000
    NCP:State = "associated"
    NCP:TXPower = 0
    NCP:Version = "OPENTHREAD/20170716-00532-gec59d7e-dirty; NRF52840; Apr  6 2018 23:55:28"
    Network:IsCommissioned = true
    Network:Key = [A8421C1F871DCA54DDBB942F89688B22]
    Network:KeyIndex = 1
    Network:Name = "testmesh"
    Network:NodeType = "router"
    Network:PANID = 0x9358
    Network:PSKc = [00000000000000000000000000000000]
    Network:PartitionId = 1888992561
    Network:XPANID = 0x48B09FACA7039EEE
    Thread:ActiveDataset = [
    	"Dataset:ActiveTimestamp          =  0x0000000000000000"
    	"Dataset:Channel                  =  23"
    	"Dataset:NetworkName              =  "testmesh""
    	"Dataset:PanId                    =  0x9358"
    	"Dataset:ExtendedPanId            =  0x48B09FACA7039EEE"
    	"Dataset:MasterKey                =  [A8421C1F871DCA54DDBB942F89688B22]"
    	"Dataset:MeshLocalPrefix          =  fd48:b09f:aca7::/64"
    	"Dataset:ChannelMaskPage0         =  0x07FFF800"
    	"Dataset:PSKc                     =  [00000000000000000000000000000000]"
    	"Dataset:SecPolicy:KeyRotation    =  672"
    	"Dataset:SecPolicy:Flags          =  0xFF"
    ]
    Thread:ChildTable = []
    Thread:ChildTable:Addresses = []
    Thread:Commissioner:Enabled = false
    Thread:DeviceMode = 0x0F
    Thread:Leader:Address = "fd48:b09f:aca7::ff:fe00:cc00"
    Thread:Leader:LocalWeight = 0x40
    Thread:Leader:NetworkData = [08040B020000]
    Thread:Leader:RouterID = 0x33
    Thread:Leader:StableNetworkData = []
    Thread:Leader:Weight = 0x40
    Thread:NeighborTable = [
    	"42A3D787A5A3C788, RLOC16:cc00, LQIn:3, AveRssi:-44, LastRssi:-45, Age:26, LinkFC:3516331, MleFC:157625, IsChild:no, RxOnIdle:yes, FFD:yes, SecDataReq:no, FullNetData:yes"
    ]
    Thread:NeighborTable:ErrorRates = [
    	"42A3D787A5A3C788, RLOC16:cc00, FrameErrRate:3.32%, MsgErrorRate:0.41%, AveRssi:-44, LastRssi:-45, "
    ]
    Thread:NetworkData = []
    Thread:NetworkDataVersion = 0xBC
    Thread:OffMeshRoutes = []
    Thread:OnMeshPrefixes = []
    Thread:PendingDataset = []
    Thread:RLOC16 = 0x1000
    Thread:RouterID = 0x04
    Thread:RouterTable = [
    	"C269CEFE5C8ED7C7, RLOC16:1000, RouterId:4, NextHop:63, PathCost:0, LQIn:0, LQOut:0, Age:214, LinkEst:no"
    	"42A3D787A5A3C788, RLOC16:cc00, RouterId:51, NextHop:63, PathCost:0, LQIn:3, LQOut:3, Age:26, LinkEst:yes"
    ]
    Thread:StableNetworkData = []
    Thread:StableNetworkDataVersion = 0x10
    TmfProxy:Enabled = false
    com.nestlabs.internal:Network:AllowingJoin = false
    com.nestlabs.internal:Network:PassthruPort = 0x1634

    Device 03 msgBuffers

    OpenThread:MsgBufferCounters:AsString = "TotalBuffers = 128, FreeBuffers = 128,
    6loSendMessages = 0  , 6loSendBuffers = 0  , 6loReassemblyMessages = 0  ,
    6loReassemblyBuffers = 0  , Ip6Messages = 0  , Ip6Buffers = 0  , MplMessages = 0  ,
    MplBuffers = 0  , MleMessages = 0  , MleBuffers = 0  , ArpMessages = 0  , ArpBuffers = 0  ,
    CoapClientMessages = 0  , CoapClientBuffers = 0  , "

  • You can provide the output of the commands regardless of if the issue have ocurred or not, it should give static information.

    Yes, the buffer size is of interest to consider memory leaks, etc. If you could provide sniffer logs, please attach it and I will forward it to the developers in case it could be useful

  • experiment.tar.gz

    Here is an archive of the latest experiment that failed, this time it is the device 01 that failed (it was a router). There is the output of the commands for each devices and a capture performed using pyspinel on another PC. Device01 address is 0xcc00 and its last message is at packet 319922 (15:53:52.525670). Hope it helps.

    It is the sending that fails first,  the buggy devices still receives messages until 15:53:57.677357 (that is around packet 320101), and the syslog indicate that it went from router to leader at 15:56:45 :

    Jun 13 15:56:45 raspberrypi wpantund[422]: Node type change: "router" -> "leader"
    Jun 13 16:08:47 raspberrypi wpantund[422]: SpinelNCPTaskSendCommand.cpp:322: Requirement Failed ((mRetVal) == 0)
    Jun 13 16:08:47 raspberrypi wpantund[422]: SendCommand task encountered an error: 16 (0x00000010)

  • Hello vneu,

    We are still investigate similar bug that you have found, however, based on your logs, device 2 has no free buffers and all of them are placed in 6Lo layer queue.

    I have fixed that bug already: https://github.com/openthread/openthread/commit/082481864686abebab7b0f04790200f872b92bb3

    Would it be possible for you to rebuild OpenThread libraries from the master branch and try to run your examples again?

    I guess the best way would be to build them under Linux OS, since there were several changes in the OpenThread files.

    https://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.thread_zigbee.v1.0.0%2Fthread_intro.html&cp=4_2_0_1_0_4&anchor=thread_qsg_libraries

    We are sorry for inconvinience, the new libraries with proper fixes will be included in the subsequent releases of nRF5 SDK for Thread.

Related