Losing connection with Thread CoAP Google Cloud example

Dear Nordic Support Team,

I haven't seen any forum posts directly applicable to the issue I'm seeing so I hope you can point me in the right direction.

The basic Thread CoAP Google Cloud example has been working fine, but I noticed a couple of peculiarities. I'm still using Thread SDK v4.0.0, as I didn't see any specific items in the 4.1.0 SDK release that address these issues.

1) The first issue is that when I leave the network on for some period of time, it seems to not be able to send DTLS application data packets anymore. For example, I would press the buttons to try and increment or decrement the count, but I only see MAC level exchanges. I can see from the sniffer that there are no "DTLS Application Data" exchanges. 

In order to reconnect I have to power cycle the Thread FTD. It then re-establishes the DTLS connection, and everything is back to normal. How can I avoid having to do this periodically?

2) The second issue is that the network does not seem to recover from a power cycle of the border router. This happens when I have the network unplugged for quite some time. My expectation is that the network should resume from where it left off, but that has not been my experience. I have to re-form the border router using the OTBR web interface, then do a re-flash of the FTD to let it join again. At this point, the network is back and works as per normal.

How can I solve these two issues?

Thanks for your help in advance.

  • 30s inactivity ... no active session

    Check the proxy's logs to see whether the device's endpoint (IP address + port) changes. I guess it does. This is not a new phenomenon: networks are full of NATs, and many of them use a UDP timeout of 30s.

    To overcome that:

    1. force a full DTLS handshake (renegotiation will not work)
    2. or a resumption handshake
    3. or use DTLS 1.2 Connection ID

    Option 1 has the downside that the handshake itself takes time and bandwidth. Option 2 improves on that (for PSK only a little), and option 3 works brilliantly but requires an implementation that supports it. I'm not sure whether Google still uses an outdated Californium (2.0.0-M11, 2 years old); the current Californium release, 2.3.0, offers a DTLS 1.2 Connection ID implementation. In any case, that would also require support in the client implementation.

    (A fourth option would be a "v-coaps-border router", but that would even require a specification :-) )
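The idea behind options 1 and 2 can be sketched as a check made before each send: if the link has been idle for longer than the NAT's UDP timeout, assume the mapping is gone and run a handshake first. This is only a minimal sketch, not code from the SDK or from Californium. The names `NAT_UDP_TIMEOUT_MS`, `send_action_t` and `choose_send_action` are hypothetical, and the 30 s value is simply the figure mentioned above; the real timeout depends on the NAT in the path.

```c
#include <stdint.h>

/* Assumed NAT UDP mapping timeout (30 s, as mentioned above).
 * The real value depends on the NAT and may differ. */
#define NAT_UDP_TIMEOUT_MS 30000u

/* Hypothetical result type: what the client should do before sending. */
typedef enum
{
    SEND_DIRECT,          /* mapping is probably still alive */
    SEND_AFTER_HANDSHAKE  /* idle too long: run a full or resumption handshake first */
} send_action_t;

/* Decide how to send, given the current time and the time of the last
 * transmitted packet, both in milliseconds from a free-running counter. */
static send_action_t choose_send_action(uint32_t now_ms, uint32_t last_tx_ms)
{
    /* Unsigned subtraction stays correct across a single counter wrap-around. */
    uint32_t idle_ms = now_ms - last_tx_ms;

    return (idle_ms >= NAT_UDP_TIMEOUT_MS) ? SEND_AFTER_HANDSHAKE : SEND_DIRECT;
}
```

A client would call `choose_send_action()` right before publishing and update its last-transmit timestamp after every packet it sends. With DTLS 1.2 Connection ID (option 3) this check would not be needed, since the server can match the session even when the endpoint changes.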

  • Hi Jorgen, since this is a demo proxy we've tried standing up our own CoAP proxy instance using Californium, but ran into a couple of snags there. What's the play here? ;)

  • Hi,

    I ran a test today that sends a "keep-alive message" every 5 seconds (the temperature value at the default interval). This has now been running fine on my setup for ~4 hours. I see you did a similar test before, but I'm not sure whether you changed the example to send the temperature to the cloud even when it has not changed. By default, a CoAP message is only sent when the temperature changes:

    if (m_temp != temp)
    {
        m_temp = temp;
        coap_publish();
    }

    If you remove the if, the message will be sent every interval.
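As a rough illustration of that change, the sketch below keeps the original comparison but adds a flag that forces a publish on every interval. It is not the SDK's actual code: `temperature_update`, `publish_always` and `m_publish_count` are hypothetical names, and `coap_publish` is stubbed out so the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the example's state. */
static int32_t  m_temp;           /* last published temperature */
static uint32_t m_publish_count;  /* counts publishes, for illustration only */

/* Stub: the real example would send the CoAP request over DTLS here. */
static void coap_publish(void)
{
    m_publish_count++;
}

/* Called once per measurement interval with the latest reading.
 * With publish_always set, every interval produces a message, which
 * doubles as a keep-alive; otherwise only changes are published. */
static void temperature_update(int32_t temp, bool publish_always)
{
    if (publish_always || (m_temp != temp))
    {
        m_temp = temp;
        coap_publish();
    }
}
```

Calling `temperature_update(temp, true)` from the periodic timer handler behaves like the example with the `if` removed, while `false` keeps the default behaviour.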

    Regarding the CoAP proxy demo, I checked internally with our developers if anyone has experience with Californium and/or the Google proxy demo, but I could not find anyone. I believe that working together with Google support for getting help with this is your best approach. They are the ones that have made the demo and should be able to help you with any issues with setup and configuration. They may also help provide some details on why the cloud/proxy does not inform the end-node about closing the DTLS session. If they require some changes on our end to resolve the issue, we are of course here to help out!

    Best regards,
    Jørgen

  • Thanks for your latest reply and all the assistance. I think I will close this ticket for now, since we have tried some other methods instead of the Google proxy demo, which seems like a dead end.
