Mesh back and forth seems to break connection

Hi,

We have a customer with two CoAP hosts and several CoAP clients in the form of wireless sensors. Each sensor is paired to a single host. The pairing is done at the application level: the sensor discovers the network IP of the host it pairs with. All the devices have the same PAN ID and network key.

Recently we have seen a scenario where some sensors seemingly stopped communicating with their paired host. Looking at the RSSI graphs, we wondered whether this is caused by a sensor constantly swinging back and forth between the two hosts (one host acting as a router). We don't have access to the CLI of the hosts, as this is a remote site. We can see the RSSI that each sensor reports back, which is its RSSI with the router/leader it is directly connected to at the time.

Any ideas?

Cheers,

Kaushalya

  • Hi,

    We have managed to see this issue in a test system. Fortunately, we can now run a sniffer. We are losing some customers over this issue, so it is paramount that we find the root cause.

    What we have seen so far is SEDs disconnecting from the Leader. When we check the child table, we can see that some SEDs have dropped off it. We don't yet know what caused this. It is certainly not a straightforward bug, as these sensors have been working for 1-2 months.

    The poll period on our SEDs is set to 100; I think this means a poll every 10 sec. The RSSI measurements seem OK as well. The question is why a SED disconnects from the Leader for 240 sec.
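As a rough sanity check on these numbers, the sketch below works out how many consecutive polls would fit into the 240 sec window. It assumes the poll really is every 10 sec and that the 240 sec corresponds to a timeout on the parent side; neither is confirmed in this thread, and `polls_before_timeout` is a hypothetical helper, not part of any Thread stack.

```python
import math


def polls_before_timeout(poll_period_s, child_timeout_s):
    """Number of whole poll periods that fit inside one timeout window."""
    return math.floor(child_timeout_s / poll_period_s)


# Assumed values: 10 s poll period, 240 s timeout (both taken from the post).
missed = polls_before_timeout(poll_period_s=10.0, child_timeout_s=240.0)
print(missed)  # 24
```

Under those assumptions, a timeout-based drop would only happen after a long run of polls went unheard, not after a single lost poll.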

    Another interesting observation is that when this disconnection happens, we lose the data from all sensors at pretty much the same time. Also, when we power cycle the host, all the SEDs start sending data again. This is why we think this is a host-side issue.

    We are organizing a Wireshark session for this. I can post a more detailed picture, hopefully over the weekend.

    This is a critical issue for us now, as we have lost some clients because of it. We need any help you can provide ASAP.

    Thanks,

    Kaushalya

  • Under what conditions does a Leader/Router drop a child from its child table? I thought this happens only when no poll arrives from a child within the polling interval, or when the child attaches via another router. But in this instance the router table is empty as well.
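To make the first condition concrete, here is a minimal sketch of a parent that ages children out of its table when it has not heard from them for longer than a timeout. This is a simplified model for discussion, not the actual stack implementation: `Child`, `ChildTable`, `heard_from` and `expire` are hypothetical names, and the 240 s default is simply the figure mentioned in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    name: str
    timeout_s: float     # how long the parent tolerates silence from this child
    last_heard_s: float  # time the parent last heard from this child


@dataclass
class ChildTable:
    children: dict = field(default_factory=dict)

    def heard_from(self, name, now_s, timeout_s=240.0):
        """Record a poll (or any frame) from a child, adding it if new."""
        self.children[name] = Child(name, timeout_s, now_s)

    def expire(self, now_s):
        """Drop every child that has been silent longer than its timeout."""
        dropped = [c.name for c in self.children.values()
                   if now_s - c.last_heard_s > c.timeout_s]
        for name in dropped:
            del self.children[name]
        return dropped


table = ChildTable()
table.heard_from("sed-1", now_s=0.0)
table.heard_from("sed-2", now_s=0.0)
table.heard_from("sed-1", now_s=200.0)  # sed-1 keeps polling, sed-2 goes quiet
dropped = table.expire(now_s=250.0)
print(dropped)  # ['sed-2']
```

In this model, a parent that stops receiving anything at all (for example because its radio or receive path has stalled) would expire every child at roughly the same time, which would look like all sensors disappearing together.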

  • Can you see in the sniffer trace when this happens? (I can't.) Also, do the logs say anything, either from the devices dropping out or from their parents?

    BR,
    Edvin

  • Hi Edvin,

    Unfortunately this log was captured after we saw that the sensors were no longer sending data. The sensors had worked fine for about 3-4 months, so this is an extremely random and rare case. We have now set up the debug console on the host radio and the Wireshark sniffer permanently in the same place to see if we can capture the moment it happens.

    Under what conditions does the Leader/Router drop a child from its child table?

    Thanks,

    Kaushalya
