<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="https://devzone.nordicsemi.com/cfs-file/__key/system/syndication/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection</link><description>Hi, 
 We have one customer having two CoAP hosts and some CoAP clients in the form of wireless sensors. The sensors are paired to a single host. The pairing is actually in the app level, where the sensor discovers the network IP of the host in pairing</description><dc:language>en-US</dc:language><generator>Telligent Community 13</generator><lastBuildDate>Thu, 20 Jun 2024 01:07:55 GMT</lastBuildDate><atom:link rel="self" type="application/rss+xml" href="https://devzone.nordicsemi.com/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection" /><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/489608?ContentTypeID=1</link><pubDate>Thu, 20 Jun 2024 01:07:55 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:ad1be723-8b60-4c3e-9743-c6218b5eb4b7</guid><dc:creator>kaushalyasat</dc:creator><description>[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/489524"] Are you saying that you have two networks that each router will be part of? One small network between itself and the sensors, and one other &amp;quot;big&amp;quot; network with all the routers and the gateway?[/quote]
&lt;p&gt;Not exactly. We need a router per installation. This installation is like a home. Then the sensors in that property shall be commissioned to that router. If certain sensors are out of reach, we provision a range extender (another router) to bridge the gap. All in all, the entire property shall have a single network.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The neighboring property (say this is a multi-storey apartment building) will have a network of its own, as I mentioned before. So each property has its own small network. This way a sensor in one property cannot mesh with a router in another property, which could lead to partitioning.&lt;/p&gt;
&lt;p&gt;A partition could still happen inside a property later on due to some radio disturbance. All the property owner needs to do is:&amp;nbsp;&lt;/p&gt;
&lt;p&gt;1. Power cycle each sensor that has disconnected.&lt;/p&gt;
&lt;p&gt;2. Or, in the worst case, power cycle the router(s).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Also, as we report the partition ID of each router to a dashboard, we can remotely detect partitioning inside a property, because all the routers that belong to a property are grouped together. Only the primary router is connected to the internet, so if we cannot see the other routers, that means they are partitioned.&amp;nbsp;&lt;/p&gt;
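&lt;p&gt;A minimal sketch of such a dashboard check, assuming each router of a property reports the value it reads from otThreadGetPartitionId(). The helper name count_partitions and the plain array input are hypothetical and only for illustration.&lt;/p&gt;

```c
/* Hypothetical dashboard-side helper, not part of OpenThread.
 * ids[] holds the partition ID last reported by each router of one property
 * (OpenThread partition IDs are 32-bit, so unsigned long is wide enough).
 * Returns the number of distinct partition IDs in the group. */
unsigned int count_partitions(const unsigned long *ids, unsigned int n)
{
	unsigned int distinct = 0;
	unsigned int i, j;

	for (i = 0; i != n; i++) {
		int seen_before = 0;

		/* Check whether ids[i] already appeared earlier in the list. */
		for (j = 0; j != i; j++) {
			if (ids[j] == ids[i]) {
				seen_before = 1;
				break;
			}
		}
		if (!seen_before) {
			distinct++;
		}
	}
	return distinct;
}
```

&lt;p&gt;A result greater than 1 for a property would mean its routers have split into several partitions. Routers that stop reporting altogether would still need a separate timeout check on the dashboard side.&lt;/p&gt;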
&lt;p&gt;I have had a discussion with the OpenThread Git group, and their opinion is that the partitioning and the SED behavior are in line with the OpenThread spec. But I think this behavior is not correct, as it can lead to non-recoverable communication failures in the network. Even if we detect partitioning in the SEDs, there is nothing an application-level program can do to connect to a different parent.&lt;/p&gt;
&lt;p&gt;The only way I can think of is for the routers to detect partitioning and each do an individual network restart until a single partition is formed. If the split is still not resolved after a couple of cycles, the primary router should report this to the user and the dashboard. Human intervention is then needed to commission another router to resolve it.&lt;/p&gt;
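&lt;p&gt;A rough sketch of that restart-then-escalate policy, under the assumption that the router already has some way of telling that it is partitioned (how it detects this is left open here). All names are hypothetical and max_restarts stands for the "couple of cycles" limit.&lt;/p&gt;

```c
/* Hypothetical recovery policy for a router, not an OpenThread API. */
enum recovery_action {
	RECOVERY_NONE,     /* single partition, nothing to do */
	RECOVERY_RESTART,  /* restart the Thread interface and check again */
	RECOVERY_ESCALATE  /* give up and report to the user and the dashboard */
};

/* partitioned:   non-zero while the router still sees a split network.
 * restarts_done: number of restarts already attempted for this incident.
 * max_restarts:  assumed limit chosen by the application. */
enum recovery_action next_recovery_action(int partitioned,
					  unsigned int restarts_done,
					  unsigned int max_restarts)
{
	if (!partitioned) {
		return RECOVERY_NONE;
	}
	if (restarts_done >= max_restarts) {
		return RECOVERY_ESCALATE;
	}
	return RECOVERY_RESTART;
}
```

&lt;p&gt;Keeping the decision in one small function makes it easy to change the limit later, or to add a delay between restarts, without touching the rest of the application.&lt;/p&gt;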
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/489524?ContentTypeID=1</link><pubDate>Wed, 19 Jun 2024 12:59:20 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:c0f2e199-9948-4411-b467-700ddf80f642</guid><dc:creator>Edvin</dc:creator><description>&lt;p&gt;If you don&amp;#39;t have a stable Thread network, I am not sure whether Openthread is really the best protocol, but I guess it is a bit late to change this now?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;1: I think that sounds like a good idea, as it would make sure that your entire network is in one partition. However, it seems like this is not going to work 100%, since most of the time you get all messages as expected, and then after some months the network suddenly splits, perhaps because of added radio noise in the area. This means that the problem may not be detected during installation. But it could perhaps be used to detect the partitioning. I don&amp;#39;t know whether both partitions have internet access, and hence whether they can report that they are on a separate partition. You are more familiar with the setup, and can say whether this would work or not.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;2: Are you saying that you have two networks that each router will be part of? One small network between itself and the sensors, and one other &amp;quot;big&amp;quot; network with all the routers and the gateway? I am not sure how this would solve it if the&amp;nbsp;big&amp;nbsp;network struggles with partitioning. Then you will still not be able to reach all the sensors from the small networks.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;If the issue is that there is no 100% stable route between the sensor and the gateway, then adding more routers seems like the way to go. The key is to understand when this is the case, and your proposal #1 would perhaps help with this?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/489169?ContentTypeID=1</link><pubDate>Mon, 17 Jun 2024 23:59:02 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:e142c96e-ffcf-460c-ba2d-bd11418b0da3</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
&lt;p&gt;The issue we see here is actually not partitioning, but that the SEDs cannot proactively detach from a router and do an MLE reattach to another one. There is no guarantee that the SED will not attach to the same router again. From the Thread standpoint this may be a perfectly valid scenario, but it is a non-recoverable error for us.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Does the &amp;#39;Search for a better parent&amp;#39; feature pick up a new router?&lt;/p&gt;
&lt;p&gt;Now we are implementing two solutions.&lt;/p&gt;
&lt;p&gt;1. Make the partition ID of each router available in our dashboard. At the commissioning stage we make sure all routers form a single partition. If not, we introduce additional routers until we get a single partition.&lt;/p&gt;
&lt;p&gt;2. Instead of a single big network, we implement a separate network per main router. We have one main router in an installation, and if we need to bring in more, we commission them through the main router so that they all have the same network credentials as the main router. The SEDs are commissioned in the same way. At the point of commissioning we will have a single contiguous network per installation. This way, meshing is limited to that network.&lt;/p&gt;
&lt;p&gt;The only other issue this doesn&amp;#39;t solve is one case reported by a test user, who says he never had more than one router in his system, yet still saw his sensors disconnecting. As this is the only such case reported so far, we will park this scenario until we have more conclusive data.&lt;/p&gt;
&lt;p&gt;What do you think?&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/489127?ContentTypeID=1</link><pubDate>Mon, 17 Jun 2024 13:45:39 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:0ed1176f-b074-4d54-b4fc-822f8b5c9334</guid><dc:creator>Edvin</dc:creator><description>&lt;p&gt;Hello&amp;nbsp;Kaushalya,&lt;/p&gt;
&lt;p&gt;Interesting. If the parents split into two different networks, then there will exist two different networks with the same credentials, not able to communicate with one another. And it is not possible to connect the two networks using a SED device.&lt;/p&gt;
&lt;p&gt;The solution is &amp;quot;simple&amp;quot;. If the two partitions of the network can&amp;#39;t reach one another, you need to add another router that can reach both partitions (so that they are no longer separate partitions).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I believe this is where we started this conversation. I remember asking for state changes to see if a router became leader (which it will if part of the network splits out into its own partition).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Is it an option to add another router making sure the network doesn&amp;#39;t split into two partitions?&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/488766?ContentTypeID=1</link><pubDate>Thu, 13 Jun 2024 23:53:02 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:d14eab7d-e3c6-46bf-9f15-da5db1dc5e09</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
&lt;p&gt;Sorry about the delay, I was not well for the past couple of days and am only now back at work.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/487218"]But the supervision messages are only between the child and it&amp;#39;s parent, so this message is not routed to any other routers than the parent itself. Right?[/quote]
&lt;p&gt;No, the message is sent to another router. The parent is just routing it through.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I think we have identified a potential pitfall in OpenThread networks: partitioning. We think our failure mechanism can be explained by that.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Think of two routers and a set of sensors (SEDs). If both routers are in the same partition, there are no issues. But if, for whatever reason, one router loses its connection to the other, there will be two network partitions. The problem arises when the sensors can still reach both routers. If a sensor that is sending data to router1 suddenly loses its connection and attaches to router2, the data path is broken, but the sensor will happily stay attached until another MLE session decides to attach back to router1. The issue is that, for the MLE protocol, every router in range is a potential parent, but this may not work for application data.&lt;/p&gt;
&lt;p&gt;It looks like we cannot constrain the MLE process to limit the parent search to its own partition. We can implement a two-way handshake at the application level and initiate another MLE session from the sensor, but there is no guarantee that it will not choose a parent from a different partition.&lt;/p&gt;
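&lt;p&gt;A minimal sketch of the sensor side of such an application-level handshake, assuming the host answers every report. The struct, the function name and the threshold are hypothetical, and how the reattach itself is triggered is left to the application. As noted, the new attach may still land in the wrong partition, so this only bounds how long the sensor waits before trying again.&lt;/p&gt;

```c
/* Hypothetical application-level watchdog on the sensor (SED). */
struct report_watchdog {
	unsigned int missed;    /* consecutive reports without a reply from the host */
	unsigned int threshold; /* assumed limit before a reattach is requested */
};

/* Call once per report. replied is non-zero if the host answered.
 * Returns 1 when the application should trigger a new attach attempt. */
int report_watchdog_update(struct report_watchdog *w, int replied)
{
	if (replied) {
		w->missed = 0;
		return 0;
	}
	w->missed++;
	if (w->missed >= w->threshold) {
		w->missed = 0; /* start counting again after requesting a reattach */
		return 1;
	}
	return 0;
}
```

&lt;p&gt;With a 30 sec report interval and a threshold of 3, the sensor would request a reattach after roughly 90 sec without any reply.&lt;/p&gt;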
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/487218"]So this is the log from the router. When did it crash? (timestamp?)[/quote]
&lt;p&gt;Sorry Edvin, as this happened when we were not at the office, we don&amp;#39;t know exactly at what time it happened. So it may be very difficult to extract it from the log.&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/487218?ContentTypeID=1</link><pubDate>Tue, 04 Jun 2024 07:31:14 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:663ae82d-5dac-4823-996f-b427e4f6e4e9</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]&lt;p&gt;From what we have seen, its not the node dropping off. The nodes (SEDs) are sending their data out to the associated parent. I think the issue is the parent gets into some error state all of a sudden and stops sending these packets any further.&amp;nbsp;&lt;/p&gt;
[/quote]
&lt;p&gt;But the supervision messages are only between the child and its parent, so this message is not routed to any other routers than the parent itself. Right?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote user="kaushalyasat"]&lt;p&gt;What puzzles me is how long it took for the problematic router to be healed. Also this &amp;#39;healing&amp;#39; happened when I connect to the console, so dont know if that had any effect. As this is a rare event, very difficult to deep diagnose.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;[/quote]
&lt;p&gt;How long did it take?&lt;/p&gt;
[quote user="kaushalyasat"]&lt;p&gt;This is what I saw from the log while I was connecting to router 0x9400.&lt;/p&gt;
[/quote]
&lt;p&gt;So this is the log from the router. When did it crash? (timestamp?)&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote user="kaushalyasat"]I guess if I wait 240 sec timeout, and try thread stop/start, I might get connected to a new router. What do you think?[/quote]
&lt;p&gt;I would assume it does. But I guess the main issue here is that this is not happening at all times, right?&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/486756?ContentTypeID=1</link><pubDate>Fri, 31 May 2024 00:05:20 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:5e2f7f8e-a909-48a8-958c-493d0c1eef7c</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
&lt;p&gt;Sorry I forgot to attach the pcap log.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://devzone.nordicsemi.com/cfs-file/__key/communityserver-discussions-components-files/4/5633.May_2D00_30_2D00_prov-success-and-fail.pcapng"&gt;devzone.nordicsemi.com/.../5633.May_2D00_30_2D00_prov-success-and-fail.pcapng&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/486592"]Child supervision will not prevent nodes from dropping out of the network. It will just make it easier to detect whether it has happened or not.[/quote]
&lt;p&gt;From what we have seen, it is not the node dropping off. The nodes (SEDs) are sending their data out to the associated parent. I think the issue is that the parent gets into some error state all of a sudden and stops forwarding these packets any further.&amp;nbsp;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/486592"]I would say that this is the expected behavior (from all other than the router that suddenly disconnected), don&amp;#39;t you agree?[/quote]
&lt;p&gt;What puzzles me is how long it took for the problematic router to heal. Also, this &amp;#39;healing&amp;#39; happened when I connected to the console, so I don&amp;#39;t know if that had any effect. As this is a rare event, it is very difficult to diagnose in depth.&lt;/p&gt;
&lt;p&gt;This is what I saw from the log while I was connecting to router 0x9400.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://devzone.nordicsemi.com/cfs-file/__key/communityserver-discussions-components-files/4/7181.0x9400-log.txt"&gt;devzone.nordicsemi.com/.../7181.0x9400-log.txt&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Also, is there a way to detach from a parent and start a renegotiation to attach to a different router? I tried &amp;#39;ot detach&amp;#39;, &amp;#39;ot thread stop&amp;#39; and &amp;#39;ot thread start&amp;#39; from the SED. But I think that, since the SED is still in the child table of the first parent, all that happens is that I get a new RLOC. I guess if I wait for the 240 sec timeout and then try thread stop/start, I might get connected to a new router. What do you think?&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/486592?ContentTypeID=1</link><pubDate>Thu, 30 May 2024 08:50:52 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:e4df8ad1-b252-4b4a-bc93-43b8db18d9e7</guid><dc:creator>Edvin</dc:creator><description>&lt;p&gt;Hello Kaushalya,&lt;/p&gt;
[quote user="kaushalyasat"]Normally when it powers up it connects to the network and change it&amp;#39;s state to a router/leader. But this time it stayed on as a child![/quote]
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;As you can see in the&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href="https://openthread.io/guides/thread-primer/node-roles-and-types"&gt;OpenThread documentation&lt;/a&gt;, the network will strive to keep the number of routers below 15, if it is feasible. This is in order to be able to expand the network in any direction (physically) when needed. Therefore, if there are no devices that need to attach to your new device, and you already have a large number of routers, there is no need to promote that device to a router.&amp;nbsp;&lt;/p&gt;
[quote user="kaushalyasat"]I guess a FTD will remain a child as long as no child connects to it, am I correct?[/quote]
&lt;p&gt;Yes, as long as there is a decent coverage of routers in the area of the new child.&amp;nbsp;&lt;/p&gt;
[quote user="kaushalyasat"]&lt;p&gt;I can see that in &amp;#39;coap_send_request (...)&amp;#39; function, it is hard coded to be non-ack type (COAP_TYPE_NON_CON). I dont know if there is a way to request with confirmation using any API.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Suppose I clone this&amp;nbsp;&lt;span&gt;&amp;#39;coap_send_request (...)&amp;#39; and modify it to send ACK packet, then I would need to &lt;/span&gt;&lt;span&gt;pass a call back with it to know the ack is successful or not?&lt;/span&gt;&lt;/p&gt;[/quote]
&lt;p&gt;If there are too few routers in the network, it will promote some of the eligible end devices to routers, so that the network is ready to accept new child nodes.&lt;/p&gt;
[quote user="kaushalyasat"]This is possible. Our understanding is even it takes time to renegotiate new paths, it should be ok as we are sending temperature info in every 30 sec. Even if the rerout takes couple of minutes, it should be ok. But the issue is once the sensor&amp;#39;s (SED)&amp;nbsp; data is lost, it remains so for days.[/quote]
&lt;p&gt;Child supervision will not prevent nodes from dropping out of the network. It will just make it easier to detect whether it has happened or not.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;...&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I am surprised that it doesn&amp;#39;t take into account whether you provide a callback function or not, and that you cannot specify whether the coap_send_request() should be acked or not. But you are right, you can clone this function, and tell it to use&amp;nbsp;COAP_TYPE_CON. Then you can also provide a coap_reply_t reply_cb, which will be called when the reply arrives. I believe the node will automatically retransmit messages that are not acked, but you can use this while debugging.&lt;/p&gt;
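&lt;p&gt;For reference while debugging, the sketch below computes how long a sender would wait for an ACK after each transmission of a confirmable message, assuming the RFC 7252 defaults (ACK_TIMEOUT of 2 s, MAX_RETRANSMIT of 4, timeout doubling on every retry) and ignoring the random factor. The names are hypothetical, and the values a given stack actually uses depend on its configuration.&lt;/p&gt;

```c
/* Assumed RFC 7252 defaults; local names, not taken from any CoAP library. */
#define SKETCH_ACK_TIMEOUT_MS 2000u
#define SKETCH_MAX_RETRANSMIT 4u

/* retransmissions_sent: 0 after the initial transmission, 1 after the first
 * retransmission, and so on.
 * Returns the time in ms to wait for an ACK after that transmission,
 * or 0 once the retransmission budget is used up. */
unsigned long coap_con_wait_ms(unsigned int retransmissions_sent)
{
	unsigned long wait = SKETCH_ACK_TIMEOUT_MS;

	if (retransmissions_sent > SKETCH_MAX_RETRANSMIT) {
		return 0;
	}
	while (retransmissions_sent-- > 0) {
		wait *= 2; /* timeout doubles after every unanswered transmission */
	}
	return wait;
}
```

&lt;p&gt;Under these assumptions a message that is never acknowledged is given up on after 2 + 4 + 8 + 16 + 32 = 62 s, which is worth keeping in mind when comparing against a 30 sec report interval.&lt;/p&gt;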
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;...&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I agree that it shouldn&amp;#39;t happen. If the connection to the parent is lost (because it is powered off), then the child should try to reconnect to the network through some other node.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote user="kaushalyasat"]After a while it &amp;#39;auto-healed&amp;#39; by itself and my provisioning packets are now received by the SED.[/quote]
&lt;p&gt;I would say that this is the expected behavior (from all other than the router that suddenly disconnected), don&amp;#39;t you agree?&amp;nbsp;&lt;/p&gt;
&lt;p&gt;You don&amp;#39;t happen to have any logs from the router that suddenly greyed out?&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/486555?ContentTypeID=1</link><pubDate>Thu, 30 May 2024 05:52:42 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:10bc9e2c-91ea-4e52-b91c-af8026564b5c</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
&lt;p&gt;I don&amp;#39;t know if this is relevant for this issue, but today I saw a parent router going into some error state. I was provisioning and un-provisioning an SED from an FTD for a new fw feature, and all of a sudden I noticed that the prov reply sent by the FTD did not reach the SED. When I looked into the Wireshark log, I could see the parent router not forwarding the reply. Also, when I had a look at TTM, it showed this parent router grayed out and its ext address field was empty. After a while it &amp;#39;auto-healed&amp;#39; by itself and my provisioning packets are now received by the SED. I wonder if this could be the error state the router ends up in when we can&amp;#39;t get SED data.&lt;/p&gt;
&lt;p&gt;What do you think?&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The following is a pcap from when the provisioning was working and then stopped. The provisioning router is 0xb000, the provisioning SED is 0x9423 and the parent router is 0x9400.&lt;/p&gt;
&lt;p&gt;&lt;br /&gt;&lt;a href="https://devzone.nordicsemi.com/cfs-file/__key/communityserver-discussions-components-files/4/May_2D00_30_2D00_prov-success-and-fail.pcapng"&gt;devzone.nordicsemi.com/.../May_2D00_30_2D00_prov-success-and-fail.pcapng&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I have also created a separate thread to show the TTM findings.&amp;nbsp;&lt;a href="https://devzone.nordicsemi.com/f/nordic-q-a/111657/an-openthread-router-shown-as-grayed-out-in-ttm"&gt;An OpenThread router shown as grayed out in TTM&lt;/a&gt;&amp;nbsp;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/486537?ContentTypeID=1</link><pubDate>Thu, 30 May 2024 01:05:15 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:31d063d8-beb9-470d-9c6c-c0273a3e6c96</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485822"]Is it possible to add some logging here, to see if there is a state change? It doesn&amp;#39;t necessarily change to detached, so printing the actual state change can be helpful, if this is the case.[/quote]
&lt;p&gt;I saw something interesting a couple of days ago with an FTD on my desk. Normally when it powers up it connects to the network and changes its state to router/leader. But this time it stayed on as a child! Now, if a router that is actively routing packets to another router suddenly changes state to child, I guess that could explain the behavior we are discussing here.&lt;/p&gt;
&lt;p&gt;I guess a FTD will remain a child as long as no child connects to it, am I correct? But on the other hand I have seen many instances where a router&amp;#39;s child table is empty, but it remains as a router. What conditions determine this transition from child-router and vice versa?&amp;nbsp;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485822"]I don&amp;#39;t understand this question. And I don&amp;#39;t understand how child supervision is relevant if the packet is lost between two nodes that are not children.[/quote]
&lt;p&gt;Sorry, what I was trying to get at is that we implemented child supervision to prevent an SED from disconnecting from a destination router, but it may not be a solution here.&amp;nbsp;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485822"] Is it a message that is supposed to be acknowledged or not?[/quote]
&lt;p&gt;I can see that in the &amp;#39;coap_send_request (...)&amp;#39; function, the type is hard coded to non-confirmable (COAP_TYPE_NON_CON). I don&amp;#39;t know if there is a way to send a request with confirmation using any API.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Suppose I clone this&amp;nbsp;&amp;#39;coap_send_request (...)&amp;#39; and modify it to request an ACK, then I would need to pass a callback with it to know whether the ACK was received or not?&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485822"]Could it be that the packet path between the source and destination was through a router that was suddenly powered off?[/quote]
&lt;p&gt;This is possible. Our understanding is that even if it takes time to renegotiate new paths, it should be OK, as we are sending temperature info every 30 sec. Even if the reroute takes a couple of minutes, it should be OK. But the issue is that once the sensor&amp;#39;s (SED) data is lost, it remains so for days.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Cheers,&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Kaushalya&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/485822?ContentTypeID=1</link><pubDate>Fri, 24 May 2024 12:47:34 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:f995ab4b-0f97-4acc-b465-2aef1792aee7</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]But if we remove child sensor, we loose the data packet origination. So after that the parent and next routers wouldn&amp;#39;t get any data packets anyway. So the behavior is not same isn&amp;#39;t it, though the end result is the same? [/quote]
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Of course, but I am saying that if you were to generate dummy data at the parent and send it to the destination, in theory we could see the same behavior, because the packets are lost between the parent router and the destination router. I am just thinking of possible ways to reproduce the issue more easily.&lt;/p&gt;
[quote user="kaushalyasat"]We don&amp;#39;t do anything special here. Also in the production hosts, we dont populate this LED.&amp;nbsp;[/quote]
&lt;p&gt;Is it possible to add some logging here, to see if there is a state change? It doesn&amp;#39;t necessarily change to detached, so printing the actual state change can be helpful, if this is the case.&lt;/p&gt;
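&lt;p&gt;A small sketch of what such logging could be built on. The enum below is only a stand-in for OpenThread&amp;#39;s otDeviceRole so that the snippet compiles on its own, and the helper names are hypothetical. The idea is to remember the previous role and print every transition, flagging the case where a device that was routing stops doing so.&lt;/p&gt;

```c
/* Stand-in for OpenThread's otDeviceRole so the sketch compiles on its own;
 * in the application the OT_DEVICE_ROLE_* values would be used instead. */
enum role {
	ROLE_DISABLED,
	ROLE_DETACHED,
	ROLE_CHILD,
	ROLE_ROUTER,
	ROLE_LEADER
};

/* Human-readable role name for log output. */
const char *role_name(enum role r)
{
	switch (r) {
	case ROLE_DISABLED: return "disabled";
	case ROLE_DETACHED: return "detached";
	case ROLE_CHILD:    return "child";
	case ROLE_ROUTER:   return "router";
	case ROLE_LEADER:   return "leader";
	default:            return "unknown";
	}
}

/* Returns 1 when a device that was routing (router or leader) stops doing so,
 * which is the transition worth flagging in the log. */
int stopped_routing(enum role before, enum role after)
{
	int was_routing = (before == ROLE_ROUTER) || (before == ROLE_LEADER);
	int is_routing = (after == ROLE_ROUTER) || (after == ROLE_LEADER);

	return was_routing ? !is_routing : 0;
}
```

&lt;p&gt;In the state changed callback, one could keep the previous role in a static variable and log role_name(before) and role_name(after) on every OT_CHANGED_THREAD_ROLE event.&lt;/p&gt;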
&lt;p&gt;&lt;/p&gt;
[quote user="kaushalyasat"]Q1. If both routers support child supervision, does the supervision packet originate from the destination router or the parent router?[/quote]
&lt;p&gt;I don&amp;#39;t understand this question. And I don&amp;#39;t understand how child supervision is relevant if the packet is lost between two nodes that are not children.&lt;/p&gt;
&lt;p&gt;Q2: Depends. Is it a message that is supposed to be acknowledged or not? It wouldn&amp;#39;t know that the scenario 2 happened, but it can detect a missing acknowledgement. The routers should be able to detect it, though. Because all messages should be acknowledged in the 802.15.4 layer.&lt;/p&gt;
&lt;p&gt;Q3: Could it be that the packet path between the source and destination was through a router that was suddenly powered off? I am not 100% sure of the details, but OpenThread is not known for being very fast at determining that nodes are gone and enabling new routes.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/485517?ContentTypeID=1</link><pubDate>Thu, 23 May 2024 00:47:51 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:7a5c3d71-8d3f-4014-8882-13e36d38c7bd</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485358"]So in theory, if we take out the child sensors, and the two routers&amp;nbsp;send messages back and forth, we could see the same behavior? Do you agree?&amp;nbsp;[/quote]
&lt;p&gt;Do you mean shutting down the child sensor? If we remove the child sensor, we lose the origin of the data packets, so after that the parent and the next routers wouldn&amp;#39;t get any data packets anyway. So the behavior is not the same, is it, even though the end result is the same? Moreover, in this instance, the child sensor is successfully sending the packets.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485358"]Do you have a state changed callback in your parent&amp;#39;s application?[/quote]
&lt;p&gt;Yes, we have, as our application is derived from the same example. This is what we have.&lt;/p&gt;
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;/* State-change callback: flags holds the OT_CHANGED_* bits of what changed. */
static void on_thread_state_changed(uint32_t flags, void *context)
{
	struct openthread_context *ot_context = context;

	/* Only react to changes of the Thread device role. */
	if (flags &amp;amp; OT_CHANGED_THREAD_ROLE) {
		switch (otThreadGetDeviceRole(ot_context-&amp;gt;instance)) {
		/* Attached to a network: turn the connection LED on. */
		case OT_DEVICE_ROLE_CHILD:
		case OT_DEVICE_ROLE_ROUTER:
		case OT_DEVICE_ROLE_LEADER:
			dk_set_led_on(OT_CONNECTION_LED);
			break;

		/* Not attached: turn the LED off and stop provisioning. */
		case OT_DEVICE_ROLE_DISABLED:
		case OT_DEVICE_ROLE_DETACHED:
		default:
			dk_set_led_off(OT_CONNECTION_LED);
			deactivate_provisionig();
			break;
		}
	}
}&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;We don&amp;#39;t do anything special here. Also, in the production hosts, we don&amp;#39;t populate this LED.&amp;nbsp;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485358"]Does that trigger when the nodes fall out?[/quote]
&lt;p&gt;I am afraid we don&amp;#39;t know, as we only notice this issue after it has happened. Are you suggesting a role change from &amp;#39;router&amp;#39; to something like &amp;#39;disabled&amp;#39;? I will check the role of the host once I detect this again. I will populate the LED mentioned above as well.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I am trying to understand what the MLE does in the following scenario.&lt;/p&gt;
&lt;p&gt;1. sensor&amp;nbsp; -&amp;gt; parent router -&amp;gt; destination router - good&lt;/p&gt;
&lt;p&gt;2. sensor -&amp;gt; parent router&amp;nbsp; -x-&amp;nbsp;&lt;span&gt;destination router - bad&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Q1. If both routers support child supervision, does the supervision packet originate from the destination router or the parent router?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Q2. Sensor wouldn&amp;#39;t know scenario 2 has happened?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Q3. I guess when scenario 2 happens, the parent router would try to find another path via MLE? In our case we have about 10 routers all over the lab. One or two may be power cycled or shut down, but not all, and they are never moved from one place to another. Also, when I shut down all the other routers, the sensors all connected back to the destination router. That proves there was at least one path to the parent router. (These sensors were about&lt;/span&gt;&amp;nbsp;10-50cm away from the parent router.)&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/485358?ContentTypeID=1</link><pubDate>Wed, 22 May 2024 10:30:38 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:5624dafd-a04e-428e-9d1e-b9fd8e196f40</guid><dc:creator>Edvin</dc:creator><description>&lt;p&gt;Hello Kaushalya,&lt;/p&gt;
&lt;p&gt;So in theory, if we take out the child sensors, and the two routers&amp;nbsp;send messages back and forth, we could see the same behavior? Do you agree?&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Do you have a state changed callback in your parent&amp;#39;s application?&lt;/p&gt;
&lt;p&gt;If you look at e.g. the ncs\nrf\samples\openthread\coap_server sample, in coap_server.c, you can see the on_thread_state_changed() callback.&lt;/p&gt;
&lt;p&gt;Does that trigger when the nodes fall out?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;BR,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/485255?ContentTypeID=1</link><pubDate>Wed, 22 May 2024 02:19:09 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:58799e7c-da72-4dcf-a649-c9ae04c265d6</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485225"]Ok, so even when you do not see that the messages are being received, you can see that it prints the return value (+)47, which is the length of the packet that you are sending?[/quote]
&lt;p&gt;Yes. Yesterday we had the first sensor fall off &amp;#39;with child supervision&amp;#39;. I couldn&amp;#39;t see any messages regarding child supervision though in these sensors. But I can verify the following.&lt;/p&gt;
&lt;p&gt;1. sensors are transmitting and the data is sent to its parent successfully. After the parent, I couldn&amp;#39;t trace the packet any more as they cannot be decrypted.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;2. the intended destination doesn&amp;#39;t receive these packets at the application level&lt;/p&gt;
&lt;p&gt;So I think the issue is in losing an FTD-to-FTD (router-to-router) connection in multi-hop scenarios.&amp;nbsp;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/485225"]So this means that so far you have not confirmed that the tx function returns 47&amp;nbsp;&lt;em&gt;&lt;/em&gt;&lt;strong&gt;while&lt;/strong&gt; the issue is ongoing? Or have you confirmed this?[/quote]
&lt;p&gt;I can confirm that 47 is returned while a sensor, or more precisely the system, is in this state. As I mentioned, I don&amp;#39;t think this is related to the sensor (SED). This may well be an issue in the router-to-router hop.&lt;/p&gt;
&lt;p&gt;I am now researching how MLE works. What happens when a router loses its connection with another router? I think it will try to find another path. Now my question is why a path cannot be found to the destination router/leader when one was established earlier. The only change would be that some routers may have been power cycled or switched off. But I have verified that the sensors can connect to the destination even when all the other routers are powered down.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/485225?ContentTypeID=1</link><pubDate>Tue, 21 May 2024 20:31:25 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:a76f766a-b517-4b1f-b9c8-afd7d073a38a</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]Yeah, we always see 47 so far.[/quote]
&lt;p&gt;Ok, so even when you do not see that the messages are being received, you can see that it prints the return value (+)47, which is the length of the packet that you are sending?&lt;/p&gt;
[quote user="kaushalyasat"]Yes I can see the LOG_ERR, from both my application level and also from coap_send_request(). I tested it from console and also RTT viewer.[/quote]
&lt;p&gt;Ok, good. That means that there is not some sort of config that disables that logging instance.&lt;/p&gt;
[quote user="kaushalyasat"]Also in this section we maintain counters for failed tx and successful tx. So far we havent seen any failed, but again it might take months before that happen.[/quote]
&lt;p&gt;So this means that so far you have not confirmed that the tx function returns 47&amp;nbsp;&lt;em&gt;&lt;/em&gt;&lt;strong&gt;while&lt;/strong&gt; the issue is ongoing? Or have you confirmed this?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/483142?ContentTypeID=1</link><pubDate>Tue, 14 May 2024 01:02:50 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:6af5176e-c057-4b67-a61b-6f70435d75dc</guid><dc:creator>kaushalyasat</dc:creator><description>[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]If you see error messages printed from that file in general. You can test with adding &amp;quot;LOG_ERR(&amp;quot;Test&amp;quot;);&amp;quot;, to see if these error messages are visible in the log at all.[/quote]
&lt;p&gt;Yes I can see the LOG_ERR, from both my application level and also from coap_send_request(). I tested it from console and also RTT viewer.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]Where are those from? What trigger these?[/quote]
&lt;p&gt;This log message is printed by&amp;nbsp;send_sensor_update () in coap_client_utils.c, just before the call to&amp;nbsp;&lt;span&gt;coap_send_request (). So we know the flow is working up to that point.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]It would be more interresting to continuously see the return value from coap_send_message()[/quote]
&lt;p&gt;Agreed. Unfortunately, fw Rev 1.1.1.0, which I sent to you first, doesn&amp;#39;t have that - my bad. The latest fw shows it, and we have also implemented a noinit memory section where we keep the last value returned by&amp;nbsp;&lt;span&gt;coap_send_request(). In this section we also maintain counters for failed tx and successful tx. So far we haven&amp;#39;t seen any failures, but again it might take months before that happens.&lt;/span&gt;&lt;/p&gt;
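&lt;p&gt;A minimal sketch of what such bookkeeping could look like (an illustration under stated assumptions, not the actual firmware): struct tx_stats, TX_STATS_MAGIC, tx_stats_init() and tx_stats_record() are hypothetical names. On Zephyr the variable would carry the __noinit attribute so that it survives a warm reset; the fallback define only lets the snippet build on a host.&lt;/p&gt;
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;#include &amp;lt;stdint.h&amp;gt;

/* Zephyr provides __noinit; define it away when building on a host. */
#ifndef __noinit
#define __noinit
#endif

/* Hypothetical marker used to tell valid data from power-on garbage. */
#define TX_STATS_MAGIC 0x54585354u

struct tx_stats {
	uint32_t magic;   /* equals TX_STATS_MAGIC once initialised */
	uint32_t tx_ok;   /* send calls that returned a value &amp;gt;= 0 */
	uint32_t tx_fail; /* send calls that returned a value &amp;lt; 0 */
	int32_t last_ret; /* last value returned by the send call */
};

static struct tx_stats stats __noinit;

/* Reset the counters only when the marker is missing (first power-up). */
static void tx_stats_init(void)
{
	if (stats.magic != TX_STATS_MAGIC) {
		stats = (struct tx_stats){ .magic = TX_STATS_MAGIC };
	}
}

/* Record one return value from the send call. */
static void tx_stats_record(int ret)
{
	stats.last_ret = ret;
	if (ret &amp;lt; 0) {
		stats.tx_fail++;
	} else {
		stats.tx_ok++;
	}
}&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;In this sketch, tx_stats_init() would be called once at boot and tx_stats_record(ret) after every call to coap_send_request(), so the counters can be read back after a reset.&lt;/p&gt;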
&lt;p&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]Try adding prints of the return value of the function that doesn&amp;#39;t work (regardless of whether it is 0 or something else).&amp;nbsp;[/quote]
&lt;p&gt;It is done in the latest fw. We are waiting for any sensor to go into this mode again. Currently we get 47 as the return value, which I think is the number of bytes sent(?)&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"] Print that it returns 0 even though it is disconnected?[/quote]
&lt;p&gt;It prints whatever is returned from&amp;nbsp;&lt;span&gt;coap_send_request(), as follows. We haven&amp;#39;t seen it return 0, as there is always a network to connect to in the lab. Also, if it is not connected,&amp;nbsp;&lt;/span&gt;send_sensor_update () wouldn&amp;#39;t get called.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;int ret;
/* ... */
ret = coap_send_request(COAP_METHOD_PUT, (const struct sockaddr *)&amp;amp;unique_local_addr, sensor_option, payload, sizeof(payload), NULL);
LOG_INF (&amp;quot;ZS %d, RSSI %d, LQI %d, LQO %d, FW %04x RET: %d, RLOC: %04x&amp;quot;, the_sensor_device-&amp;gt;zoneState, RSSI, linkQalIn, linkQualOut, FWRevNum, ret, rloc);&lt;/pre&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]stop printing alltogether?[/quote]
&lt;p&gt;Do you mean that the printing suddenly stops without any reason? We haven&amp;#39;t seen anything like that.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/482993"]Print that it returns something else than 0?[/quote]
&lt;p&gt;Yeah, we always see 47 so far.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/482993?ContentTypeID=1</link><pubDate>Mon, 13 May 2024 10:55:01 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:9b3fe78a-a7f7-42a1-a5d0-6231368b4ef6</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]I was referring to the error logs as marked above. I dont see any of these errors in my case. So I assume that &amp;#39;coap_send_request ()&amp;#39; executes without any error. Am I correct?[/quote]
&lt;p&gt;Yes, provided that error messages from that file are printed in general. You can test this by adding &amp;quot;LOG_ERR(&amp;quot;Test&amp;quot;);&amp;quot;, to see if these error messages are visible in the log at all.&lt;/p&gt;
[quote user="kaushalyasat"]I dont this this is the case as I can see this log message continuously from a disconnected SED.&amp;nbsp;[/quote]
&lt;p&gt;Where are those from? What trigger these?&lt;/p&gt;
&lt;p&gt;It would be more interesting to continuously see the return value from coap_send_message(), or whatever function you use to send, at the time of the disconnection.&lt;/p&gt;
&lt;p&gt;Try adding prints of the return value of the function that doesn&amp;#39;t work (regardless of whether it is 0 or something else).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Does it:&lt;/p&gt;
&lt;p&gt;1: Print that it returns 0 even though it is disconnected?&lt;/p&gt;
&lt;p&gt;2: Stop printing altogether?&lt;/p&gt;
&lt;p&gt;3: Print that it returns something else than 0?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;BR,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/480973?ContentTypeID=1</link><pubDate>Mon, 29 Apr 2024 01:26:26 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:4257252a-950b-45dc-acc6-3c23a87422c1</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/480697"]I was thinking about these. What error logs do you refer to?[/quote]
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;int coap_send_request(enum coap_method method,
		      const struct sockaddr *addr,
		      const char *const *uri_path_options,
		      uint8_t *payload,
		      uint16_t payload_size,
		      coap_reply_t reply_cb)
{
	int ret;
	struct coap_packet request;
	uint8_t buf[MAX_COAP_MSG_LEN];

	ret = coap_init_request(method, COAP_TYPE_NON_CON, uri_path_options,
				payload, payload_size, &amp;amp;request, buf);
	if (ret &amp;lt; 0) {
		LOG_ERR (&amp;quot;CoAP init failed: %d&amp;quot;, errno); // &amp;lt;---------------- ERROR LOG
		goto end;
	}

	if (reply_cb != NULL) {
		coap_set_response_callback(&amp;amp;request, reply_cb);
	}

	ret = coap_send_message(addr, &amp;amp;request);
	if (ret &amp;lt; 0) {
		LOG_ERR(&amp;quot;Transmission failed: %d&amp;quot;, errno);  // &amp;lt;---------------- ERROR LOG
		goto end;
	}

end:
	return ret;
}&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;I was referring to the error logs marked above. I don&amp;#39;t see any of these errors in my case, so I assume that &amp;#39;coap_send_request ()&amp;#39; executes without any error. Am I correct?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/480697"]That would be if&amp;nbsp;send_sensor_update() is not called. Do you have something indicating whether or not these are called at the time when the devices become unavailable?[/quote]
&lt;p&gt;I don&amp;#39;t think this is the case, as I can see this log message continuously from a disconnected SED.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;LOG_INF (&amp;quot;ZS %d, RSSI %d, LQI %d, LQO %d, FW %04x&amp;quot;, the_sensor_device-&amp;gt;zoneState, RSSI, linkQalIn, linkQualOut, FWRevNum);&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;So it seems that my application code gets called continuously, but data is not being sent from that point onwards.&lt;/p&gt;
&lt;p&gt;When we look at the console of the host, we can&amp;#39;t see the log message for data received from these disconnected sensors. The disconnection could originate in&amp;nbsp;&lt;/p&gt;
&lt;p&gt;1. sensor thread stack&lt;/p&gt;
&lt;p&gt;2. host thread stack&lt;/p&gt;
&lt;p&gt;3. host application&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Do you see any other ways?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/480697?ContentTypeID=1</link><pubDate>Thu, 25 Apr 2024 12:58:21 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:db2ef284-e892-4d8c-8fcc-9ed7bd2161b0</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]When I look into &amp;#39;&lt;span&gt;coap_send_request (...)&lt;/span&gt;&amp;#39;, it has &amp;#39;&lt;span&gt;coap_init_request (...)&lt;/span&gt;&amp;#39; and &amp;#39;coap_send_message(...)&amp;#39; functions. Both these has error logs printed if something goes wrong[/quote]
&lt;p&gt;I was thinking about these. What error logs do you refer to?&lt;/p&gt;
[quote user="kaushalyasat"]Can you think of any ways that&amp;nbsp;&lt;span&gt;coap_send_request&amp;nbsp;(..) may not have been called?&lt;/span&gt;[/quote]
&lt;p&gt;That would be if&amp;nbsp;send_sensor_update() is not called. Do you have something indicating whether or not these are called at the time when the devices become unavailable?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I am sorry, but we are several months into this, and I am not quite sure what we are discussing anymore. You have some devices in a remote area that you do not have physical access to where you see some devices drop out from time to time, right? Perhaps you can try to replicate this in a local area where you have access to your devices? Reset the entire network, start sniffing before you start your devices so that the sniffer can capture everything from the beginning. Then it should be able to pick up and resolve all the short addresses. When you detect the issue, look into the log from that particular device. Does it say anything when trying to call&amp;nbsp;coap_send_request()? Any error messages? coap_send_request() also returns a value based on how it did. It returns 0 on success, and a negative number on failure. Try printing something in the log in the cases where this returns &amp;lt; 0. What does it return when it fails?&lt;/p&gt;
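&lt;p&gt;A minimal sketch of such a check, assuming that a negative return value means failure and that any non-negative value is treated as success. describe_send_result() is a hypothetical helper, not part of the sample, and err is assumed to be the errno value captured immediately after the send call.&lt;/p&gt;
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;#include &amp;lt;errno.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/* Hypothetical helper: format the outcome of one send call into buf. */
static int describe_send_result(int ret, int err, char *buf, size_t len)
{
	if (ret &amp;lt; 0) {
		/* Report both the return value and the captured errno. */
		return snprintf(buf, len, &amp;quot;send failed: ret=%d errno=%d (%s)&amp;quot;,
				ret, err, strerror(err));
	}

	/* Non-negative values are reported as success. */
	return snprintf(buf, len, &amp;quot;send ok: ret=%d&amp;quot;, ret);
}&lt;/pre&gt;&lt;/p&gt;
&lt;p&gt;The caller would store errno in a local variable right after the send call, before any logging, since later calls may overwrite it, and then print the resulting string on every transmission.&lt;/p&gt;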
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479563?ContentTypeID=1</link><pubDate>Thu, 18 Apr 2024 23:42:35 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:47a3a833-e3bb-49bc-9afc-b38dde03363b</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/479544"]I don&amp;#39;t think it is possible to rearrange the packets like this, no.[/quote]
&lt;p&gt;As I mentioned, one of the Nordic engineers suggested something like that in this thread -&amp;nbsp;&amp;nbsp;&lt;a href="https://devzone.nordicsemi.com/f/nordic-q-a/78423/nrf-sniffer-integration-for-802-15-4-in-a-python-scipt-pcap-file-problems"&gt;nRF Sniffer integration for 802.15.4 in a python scipt (Pcap file problems)&lt;/a&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Also, I can see that in Wireshark you can time-shift packets like this.&lt;/p&gt;
&lt;p&gt;&lt;img style="max-height:240px;max-width:320px;" src="https://devzone.nordicsemi.com/resized-image/__size/640x480/__key/communityserver-discussions-components-files/4/pastedimage1713482259674v1.png" alt=" " /&gt;&lt;/p&gt;
&lt;p&gt;I tried doing this, but couldn&amp;#39;t see any effect. I am not sure I did it correctly, so I leave this for a Wireshark guru to comment.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/479544"]What error messages did you see?[/quote]
&lt;p&gt;Do you mean error messages I saw in the RTT viewer? I didn&amp;#39;t see any.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/479544"]Are they printed from within the implementation of coap_send_message()?[/quote]
&lt;p&gt;Yes. If you look at&amp;nbsp;coap_send_request (...) function in coap_utils.c, you can see that&amp;nbsp;&lt;span&gt;&lt;span&gt;coap_init_request (...) is logged from within and&amp;nbsp;&lt;/span&gt;&lt;/span&gt;coap_send_message (...) return is logged. I didn&amp;#39;t go any deeper as I got stuck in&amp;nbsp;z_impl_zsock_sendto (...) in sockets.c.&lt;/p&gt;
[quote userid="26071" url="~/f/nordic-q-a/107621/mesh-back-and-forth-seems-to-break-connection/479544"]You also need to consider the possibility that for some reason these functions aren&amp;#39;t called at all, due to some other error.[/quote]
&lt;p&gt;&lt;pre class="ui-code" data-mode="c_cpp"&gt;static void send_sensor_update (struct k_work *item) {
    /* ... */
    
	LOG_INF (&amp;quot;ZS %d, RSSI %d, LQI %d, LQO %d, FW %04x&amp;quot;, the_sensor_device-&amp;gt;zoneState, RSSI, linkQalIn, linkQualOut, FWRevNum);

	memcpy (&amp;amp;payload[1], myExtAddr.m8, 8);
	memcpy (&amp;amp;payload[9], myEUI64.m8, 8);

	payload[17] = ((the_sensor_device-&amp;gt;temp)&amp;gt;&amp;gt;8) &amp;amp; 0xff;
	payload[18] = (the_sensor_device-&amp;gt;temp) &amp;amp; 0x00ff;

	payload[19] = ((the_sensor_device-&amp;gt;vbat)&amp;gt;&amp;gt;8) &amp;amp; 0xff;
	payload[20] = (the_sensor_device-&amp;gt;vbat) &amp;amp; 0x00ff;
	payload[21] = RSSI;
	payload[22] = linkQalIn;
	payload[23] = linkQualOut;
	payload[24] = the_sensor_device-&amp;gt;zoneState;
	payload[25] = (uint8_t)(FWRevNum &amp;gt;&amp;gt; 8);
	payload[26] = (uint8_t)(FWRevNum &amp;amp; 0x00ff);

	ARG_UNUSED(item);

	if (net_ipv6_is_addr_unspecified (&amp;amp;unique_local_addr.sin6_addr)) {
		LOG_WRN(&amp;quot;Peer address not set. Activate &amp;#39;provisioning&amp;#39; option on the server side&amp;quot;);
		return;
	}

	coap_send_request(COAP_METHOD_PUT, (const struct sockaddr *)&amp;amp;unique_local_addr, sensor_option, payload, sizeof(payload), NULL);&lt;/pre&gt;&lt;/p&gt;
&lt;div&gt;This is the code section from the log print to&amp;nbsp;coap_send_request (...) in my code. Can you think of any ways that&amp;nbsp;&lt;span&gt;coap_send_request&amp;nbsp;(..) may not have been called? If the IPv6 address is missing, I would get a LOG_WRN, which I don&amp;#39;t get.&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;span&gt;Cheers,&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;span&gt;Kaushalya&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479544?ContentTypeID=1</link><pubDate>Thu, 18 Apr 2024 20:21:48 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:c888a7eb-25ea-4cec-9feb-1c7412da83fc</guid><dc:creator>Edvin</dc:creator><description>&lt;p&gt;Hello Kaushalya,&lt;/p&gt;
&lt;p&gt;I don&amp;#39;t think it is possible to rearrange the packets like this, no. If anything, I think you would have to edit the raw .pcapng file.&amp;nbsp;&lt;/p&gt;
[quote user="kaushalyasat"]&lt;p&gt;Also as I mentioned earlier, we dont see any other error messages in my RTT viewer. When I look into &amp;#39;&lt;span&gt;coap_send_request (...)&lt;/span&gt;&amp;#39;, it has &amp;#39;&lt;span&gt;coap_init_request (...)&lt;/span&gt;&amp;#39; and &amp;#39;coap_send_message(...)&amp;#39; functions. Both these has error logs printed if something goes wrong. So can we assume that there were no errors reported during a packet transmission?&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;[/quote]
&lt;p&gt;That depends on the implementation. What error messages did you see? Are they printed from within the implementation of coap_send_message()? Or from the function that checked their return value?&lt;/p&gt;
&lt;p&gt;You also need to consider the possibility that for some reason these functions aren&amp;#39;t called at all, due to some other error.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479318?ContentTypeID=1</link><pubDate>Thu, 18 Apr 2024 01:13:49 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:b1e07249-9a6d-4d12-b278-b661395d9c42</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Hi Edvin,&lt;/p&gt;
&lt;p&gt;I am building another FW in which the return values get logged. Unfortunately we have to wait until this problem crops up again, since we don&amp;#39;t know how to recreate it.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Is there a way to rearrange Wireshark packets so that we can push an MLE packet up? Then the subsequent entries could get the extended address from it.&lt;/p&gt;
&lt;p&gt;Also as I mentioned earlier, we dont see any other error messages in my RTT viewer. When I look into &amp;#39;&lt;span&gt;coap_send_request (...)&lt;/span&gt;&amp;#39;, it has &amp;#39;&lt;span&gt;coap_init_request (...)&lt;/span&gt;&amp;#39; and &amp;#39;coap_send_message(...)&amp;#39; functions. Both these has error logs printed if something goes wrong. So can we assume that there were no errors reported during a packet transmission?&amp;nbsp;&lt;/p&gt;
&lt;p&gt;How do I reset the network? What APIs should I use?&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479124?ContentTypeID=1</link><pubDate>Wed, 17 Apr 2024 07:20:21 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:d18fc08d-480a-45f2-9c2b-9a7980415847</guid><dc:creator>Edvin</dc:creator><description>[quote user="kaushalyasat"]Can you shed some light?[/quote]
&lt;p&gt;I believe the takeaway from this thread is that if the sniffer didn&amp;#39;t pick up the packets from when the nodes joined the network (for the first time), where each node uses its extended address and is assigned a short address, the sniffer doesn&amp;#39;t know how to map the short RLOC16 addresses to the extended addresses, and hence, it can&amp;#39;t decrypt the packets to/from these devices.&amp;nbsp;&lt;/p&gt;
[quote user="kaushalyasat"]Here I have not handled the return value from&amp;nbsp;coap_send_request (),[/quote]
&lt;p&gt;So is it possible to check these return values? If you don&amp;#39;t see the packets, it may mean that they are never sent. And if that is the case, then a clue is probably found in the return value from this function.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
[quote user="kaushalyasat"]How can we further debug this?[/quote]
&lt;p&gt;Check the return value for coap_send_request() when the packets aren&amp;#39;t sent correctly.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Then, try to reset the entire network, or at least factory reset the sensor devices, so that they do a new provisioning sequence when you turn them on. Then you need to enable the sniffer before you provision the devices, so that the sniffer can pick up the extended addresses being used before they are assigned an RLOC16 (short) address. You can experiment with this on a small scale in your office. Set up a small network with two devices, and try starting the sniffer before the provisioning process, and after, and compare the results.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Best regards,&lt;/p&gt;
&lt;p&gt;Edvin&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479094?ContentTypeID=1</link><pubDate>Wed, 17 Apr 2024 04:36:45 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:6f874e84-17dd-4df5-b2d3-b61a63562060</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Also from the log I think I can see that the sensors (SEDs) which has disappeared from the network actually is connected to the network from the sensors point of view. I can see my log message just before&amp;nbsp; calling the&amp;nbsp;&lt;span&gt;coap_send_request (). Following is the code section.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;pre class="ui-code" data-mode="text"&gt;...
LOG_INF (&amp;quot;ZS %d, RSSI %d, LQI %d, LQO %d, FW %04x&amp;quot;, the_sensor_device-&amp;gt;zoneState, RSSI, linkQalIn, linkQualOut, FWRevNum);
...
coap_send_request(COAP_METHOD_PUT, (const struct sockaddr *)&amp;amp;unique_local_addr, sensor_option, payload, sizeof(payload), NULL);
...
&lt;/pre&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Here I have not handled the return value from&amp;nbsp;coap_send_request (), which is my bad. But I don&amp;#39;t get any error logs from it, and I can&amp;#39;t see the packet being transmitted in my Wireshark logs. So I have the feeling that this is related to either the CoAP stack or the OpenThread stack.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;It is possible that I don&amp;#39;t see the actual packet in the Wireshark log because I don&amp;#39;t have sufficient data to filter for it. As I said, I don&amp;#39;t know the RLOC of these sensors, I only know their MAC. But I can&amp;#39;t easily filter based on the MAC, as I can&amp;#39;t see it in any frames.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;How can we further debug this?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Cheers,&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Kaushalya&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;&lt;/div&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Mesh back and forth seems to break connection</title><link>https://devzone.nordicsemi.com/thread/479092?ContentTypeID=1</link><pubDate>Wed, 17 Apr 2024 04:27:12 GMT</pubDate><guid isPermaLink="false">137ad170-7792-4731-bb38-c0d22fbe4515:f28f020c-2cd3-4932-8ccf-4f92ec56c406</guid><dc:creator>kaushalyasat</dc:creator><description>&lt;p&gt;Also I came across this thread from an old devzone ticket.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;a href="https://devzone.nordicsemi.com/f/nordic-q-a/78423/nrf-sniffer-integration-for-802-15-4-in-a-python-scipt-pcap-file-problems"&gt;nRF Sniffer integration for 802.15.4 in a python scipt (Pcap file problems)&lt;/a&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Here a Nordic engineer mentions that a packet containing the extended address is required for decryption, and that this can be &amp;#39;fixed&amp;#39; by moving packets with the extended address to the top. I tried doing this with the attached log, but I am not 100% sure how to do it.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Can you shed some light?&lt;/p&gt;
&lt;p&gt;Thanks,&lt;/p&gt;
&lt;p&gt;Kaushalya&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item></channel></rss>