This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Losing connection with Thread CoAP Google Cloud example

Dear Nordic Support Team,

I haven't seen any forum posts directly applicable to the issue I'm seeing so I hope you can point me in the right direction.

The basic Thread CoAP Google Cloud example has been working fine, but I noticed a couple of peculiarities. I'm a still using the Thread SDK v4.0.0 as I didn't see there were specific items in the 4.1.0 SDK release to address any of these issues. 

1) The first issue is that when I leave the network on for some period of time, it seems to not be able to send DTLS application data packets anymore. For example, I would press the buttons to try and increment or decrement the count, but I only see MAC level exchanges. I can see from the sniffer that there are no "DTLS Application Data" exchanges. 

In order to reconnect I have to power cycle the Thread FTD. It then re-establishes the DTLS connection, then everything is back to normal. How I can I avoid having to do this periodically?

2) The second issue is that the network does not seem to recover from a power cycle of the border router. This happens when I have the network unplugged for quite some time. My expectation is that the network should resume from where it left off, but that has not been my experience. I have to re-form the border router using the OTBR web interface, then do a re-flash of the FTD to let it join again. At this point, the network is back and works as per normal.

How can I solve these two issues.

Thanks for your help in advance.

Parents
  • Hi,

    When you say Thread FTD, are you referring to the NCP device connected to the border router, the CoAP Google Cloud device, or some other device in the network?

    Can you check the wpantund status ('sudo wpanctl status') on the border router when both these issues occur? Can you also post the output of 'ipaddr' and 'router table' commands on a CLI node in the network (CLI should also be enabled in the CoAP Google Cloud example), and the 'ifconfig' output from the border router? (you can remove/hide any global IPv6 addresses for privacy reasons)

    You can also check the syslog of the border router (step 3 in this section, remove the '| grep ot-cli-ftd' part of the command), to see if anything can give you a clue to what is causing the issues.

    Best regards,
    Jørgen

  • border router ifconfig outputborder router wpanctl statusUbiqua sniffer outputFTD ipaddr and router table output

    Here are the various output that you requested. I could not attach the syslog as a file to the web portal so here it is: 

    Jun 23 06:25:02 raspberrypi liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="336" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Jun 23 06:32:44 raspberrypi systemd[1]: Starting Daily apt upgrade and clean activities...
    Jun 23 06:32:45 raspberrypi systemd[1]: Started Daily apt upgrade and clean activities.
    Jun 23 06:32:45 raspberrypi systemd[1]: apt-daily-upgrade.timer: Adding 28min 37.530978s random time.
    Jun 23 06:32:45 raspberrypi systemd[1]: apt-daily-upgrade.timer: Adding 18min 51.275962s random time.
    Jun 23 06:39:03 raspberrypi dhclient[1612]: DHCPREQUEST of 10.200.242.104 on eth0 to 10.200.242.135 port 67
    Jun 23 06:39:03 raspberrypi dhclient[1612]: DHCPACK of 10.200.242.104 from 10.200.242.135
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7534] dhcp4 (eth0): address 10.200.242.104
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7536] dhcp4 (eth0): plen 24 (255.255.255.0)
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7537] dhcp4 (eth0): gateway 10.200.242.135
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7538] dhcp4 (eth0): server identifier 10.200.242.135
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7538] dhcp4 (eth0): lease time 86400
    Jun 23 06:39:03 raspberrypi dbus[363]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7539] dhcp4 (eth0): hostname 'raspberrypi'
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7540] dhcp4 (eth0): nameserver '10.200.242.135'
    Jun 23 06:39:03 raspberrypi NetworkManager[382]: <info> [1592894343.7541] dhcp4 (eth0): state changed bound -> bound
    Jun 23 06:39:03 raspberrypi systemd[1]: Starting Network Manager Script Dispatcher Service...
    Jun 23 06:39:03 raspberrypi dhclient[1612]: bound to 10.200.242.104 -- renewal in 32596 seconds.
    Jun 23 06:39:03 raspberrypi dbus[363]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
    Jun 23 06:39:03 raspberrypi systemd[1]: Started Network Manager Script Dispatcher Service.
    Jun 23 06:39:03 raspberrypi nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (3 scripts)
    Jun 23 06:39:03 raspberrypi nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts...
    Jun 23 06:39:03 raspberrypi ap-helper[19465]: Unsupported action: 'dhcp4-change'
    Jun 23 06:39:03 raspberrypi dhcpv6-helper[19467]: Unsupported action: 'dhcp4-change'
    Jun 23 07:17:01 raspberrypi CRON[19489]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 08:17:01 raspberrypi CRON[19527]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 09:17:01 raspberrypi CRON[19567]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 10:17:01 raspberrypi CRON[19605]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 11:17:01 raspberrypi CRON[19645]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 050630.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 052132.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 053634.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 055136.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 060638.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 062140.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 063642.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 065144.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 070646.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 072148.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 073650.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 075152.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 080654.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 082156.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 083658.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 085200.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 090702.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 092204.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 093706.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 095208.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 100710.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 102212.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 103714.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 105216.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 110718.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 112220.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 11:52:24 raspberrypi MQTT-SNGateway[1523]: 20200623 113722.000 #033[0m#033[0;33mADVERTISE #033[0m#033[0;37m---> #033[0m#033[0;32mClients #033[0m#033[0;34m 05 00 01 03 84#033[0m#033[0;37m
    Jun 23 12:17:01 raspberrypi CRON[19685]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 13:17:01 raspberrypi CRON[19724]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 13:49:58 raspberrypi systemd[1]: Starting Daily apt download activities...
    Jun 23 13:50:00 raspberrypi systemd[1]: Started Daily apt download activities.
    Jun 23 13:50:00 raspberrypi systemd[1]: apt-daily.timer: Adding 5h 30min 59.960197s random time.
    Jun 23 13:50:00 raspberrypi systemd[1]: apt-daily.timer: Adding 6h 28min 40.323674s random time.
    Jun 23 14:17:01 raspberrypi CRON[19808]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 15:17:01 raspberrypi CRON[19847]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 15:42:19 raspberrypi dhclient[1612]: DHCPREQUEST of 10.200.242.104 on eth0 to 10.200.242.135 port 67
    Jun 23 15:42:19 raspberrypi dhclient[1612]: DHCPACK of 10.200.242.104 from 10.200.242.135
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2025] dhcp4 (eth0): address 10.200.242.104
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2027] dhcp4 (eth0): plen 24 (255.255.255.0)
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2028] dhcp4 (eth0): gateway 10.200.242.135
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2029] dhcp4 (eth0): server identifier 10.200.242.135
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2030] dhcp4 (eth0): lease time 86400
    Jun 23 15:42:19 raspberrypi dbus[363]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2031] dhcp4 (eth0): hostname 'raspberrypi'
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2032] dhcp4 (eth0): nameserver '10.200.242.135'
    Jun 23 15:42:19 raspberrypi NetworkManager[382]: <info> [1592926939.2033] dhcp4 (eth0): state changed bound -> bound
    Jun 23 15:42:19 raspberrypi systemd[1]: Starting Network Manager Script Dispatcher Service...
    Jun 23 15:42:19 raspberrypi dhclient[1612]: bound to 10.200.242.104 -- renewal in 37260 seconds.
    Jun 23 15:42:19 raspberrypi dbus[363]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
    Jun 23 15:42:19 raspberrypi systemd[1]: Started Network Manager Script Dispatcher Service.
    Jun 23 15:42:19 raspberrypi nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (3 scripts)
    Jun 23 15:42:19 raspberrypi nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts...
    Jun 23 15:42:19 raspberrypi ap-helper[19876]: Unsupported action: 'dhcp4-change'
    Jun 23 15:42:19 raspberrypi dhcpv6-helper[19878]: Unsupported action: 'dhcp4-change'
    Jun 23 16:17:01 raspberrypi CRON[19899]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Jun 23 16:31:47 raspberrypi systemd[1]: Created slice User Slice of pi.
    Jun 23 16:31:47 raspberrypi systemd[1]: Starting User Manager for UID 1000...
    Jun 23 16:31:47 raspberrypi systemd[1]: Started Session c1 of user pi.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Listening on GnuPG cryptographic agent (access for web browsers).
    Jun 23 16:31:47 raspberrypi systemd[19922]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
    Jun 23 16:31:47 raspberrypi systemd[19922]: Reached target Paths.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
    Jun 23 16:31:47 raspberrypi systemd[19922]: Reached target Timers.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Listening on GnuPG cryptographic agent and passphrase cache.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Reached target Sockets.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Reached target Basic System.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Reached target Default.
    Jun 23 16:31:47 raspberrypi systemd[19922]: Startup finished in 54ms.
    Jun 23 16:31:47 raspberrypi systemd[1]: Started User Manager for UID 1000.
    Jun 23 17:12:08 raspberrypi systemd[1]: Started Session c2 of user pi.

  • Hi,

    Our Thread developers have reproduced this behavior. After about 30 seconds of inactivity/no data sent from the device to the cloud, it seems that the cloud does not respond to the data anymore. We suspect that this is related to some sort of timeout in the Google Cloud CoAP proxy, which closes the connection without generating a DTLS Close message. We have not yet found the exact config that causes this behavior, but the CoAP proxy code and config files are available here

    I will try to look more into this today and tomorrow, and get back to you if I find some configuration that prevents this from happening.

    Best regards,
    Jørgen

  • Hi,

    Sorry for the slow progress on this issue. I had some troubles with setup and reproduction due to some firewall problem. I now have it setup and can consistently reproduce the behavior you are describing. I still have not found the root cause of the dropped DTLS session, but I have confirmed that this is the problem from the Google Cloud Logs:

    I will get back to you when I have more details.

    Best regards,
    Jørgen

  • Hi Jørgen, 

    No worries! Thanks for the last couple of updates and for working to reproduce the problem consistently. So one thing that I did do is decrease the timeout interval in the sample application for temperature measurement to 1 second. I thought that this would keep the connection open but you still see the same behavior. 

    Thanks in advance for continuing to poke at this.

  • I seem to be having more luck with using this setting for the memory block size in the sdk_config.h: 

    #ifndef MEMORY_MANAGER_XLARGE_BLOCK_SIZE
    #define MEMORY_MANAGER_XLARGE_BLOCK_SIZE 3100
    #endif

    A slight bump over the original number.

  • Darn. A few hours later after writing that it stopped working again. 

Reply Children
  • Hi Jorgen, since this is a demo proxy we've tried to pursue standing up our own CoAP proxy instance using californium but ran into a couple of snags there. What's the play here? ;)

  • Hi,

    I did run a test today with sending a "keep-alive message" every 5 seconds (the temperature value at the default interval). This has been running fine on my setup now for ~4 hours. I see you did a similar test before, but I'm not sure if you changed the example to send the temperature to the cloud if it did not change? By default, a CoAP message will only be sent when the temperature changes:

    if (m_temp != temp)
    {
        m_temp = temp;
        coap_publish();
    }

    If you remove the if, the message will be sent every interval.

    Regarding the CoAP proxy demo, I checked internally with our developers if anyone has experience with Californium and/or the Google proxy demo, but I could not find anyone. I believe that working together with Google support for getting help with this is your best approach. They are the ones that have made the demo and should be able to help you with any issues with setup and configuration. They may also help provide some details on why the cloud/proxy does not inform the end-node about closing the DTLS session. If they require some changes on our end to resolve the issue, we are of course here to help out!

    Best regards,
    Jørgen

  • Thanks for your latest reply and all the assistance. I think I will close this ticket for now since we have tried to used some other methods instead of the google proxy demo since that one seems like a dead end. 

Related