DTLS causes re-registration on LwM2M using mobile network

Hello Everyone,

Summary

Chip:

nRF52840

OS:

nRF Connect / Zephyr

Problem:

Mobile network connections cause LwM2M (with DTLS) to perform re-registration if the update interval is longer than ~3 minutes. 

Details

We're using LwM2M (with DTLS) to monitor and control our nRF52840 MCU, which is connected via an OpenThread (OT) network.

Working Condition

When the OT border router is connected via a fixed connection (within a building), we can set an LwM2M update interval of 5 minutes with no problems. Registration occurs once, and only updates occur after that point.

Error Condition:

When the OT border router is connected via a mobile connection (i.e. a SIM), we can't set the LwM2M update interval to more than ~2-3 minutes. If we do set a longer interval, all LwM2M update requests time out.

This causes the device to perform re-registrations, which has the following effects:

  • Increased data usage
  • The device drops in and out on the LwM2M server, because the effective connection interval ends up longer than the expected lifetime.

Additionally, if I disable DTLS encryption, LwM2M works with longer update intervals.

Assumption of the issue

I'm assuming the issue is that the mobile operator's network is closing / deleting the NAT entry after 2-3 minutes of no use, which means the LwM2M server can no longer identify the client via its IP+port, forcing the device to re-register and renegotiate the DTLS encryption.
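
To make this assumption concrete, here is a minimal Python sketch of a toy NAT table that forgets a mapping after an idle period. It is only an illustration: the 150 s idle timeout, the addresses, the port numbers and all names (`NatTable`, `ports_seen_by_server`) are hypothetical, and a real operator NAT may behave differently.

```python
from dataclasses import dataclass, field


@dataclass
class NatTable:
    """Toy NAT: maps an internal (ip, port) to an external port.

    A mapping is dropped once it has been idle for longer than idle_timeout_s.
    """
    idle_timeout_s: float
    next_port: int = 40000  # hypothetical first external port
    entries: dict = field(default_factory=dict)  # addr -> (ext_port, last_used_s)

    def send(self, internal_addr, now_s):
        """Return the external port the server sees for a packet sent at now_s."""
        entry = self.entries.get(internal_addr)
        if entry is not None and now_s - entry[1] <= self.idle_timeout_s:
            port = entry[0]  # mapping still alive: same external port
        else:
            port = self.next_port  # mapping expired or never existed: new port
            self.next_port += 1
        self.entries[internal_addr] = (port, now_s)
        return port


def ports_seen_by_server(update_interval_s, idle_timeout_s, updates=4):
    """External source ports of consecutive LwM2M updates from one device."""
    nat = NatTable(idle_timeout_s=idle_timeout_s)
    device = ("10.0.0.2", 5684)  # hypothetical internal address
    return [nat.send(device, i * update_interval_s) for i in range(updates)]


print(ports_seen_by_server(120, 150))  # interval shorter than the idle timeout
print(ports_seen_by_server(300, 150))  # interval longer than the idle timeout
```

With the short interval every update arrives from the same external port. With the long interval each update arrives from a new port, so a server that keys its DTLS session on IP+port would not recognise the sender.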

From what I've read, the following solutions are plausible:

  • Replace DTLS encryption with OSCORE.
    • Zephyr doesn't seem to support OSCORE yet. There is a module for it, but it's not in the LwM2M stack at least.
  • Use DTLS 1.2 on the device and server. This allows the connection to be identified by the connection ID (CID).
    • I'm not sure which version of DTLS Zephyr uses.
  • Send empty requests every 2 minutes to keep the port open.
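
For the keep-alive option, the timing can be worked out with a small Python sketch. It assumes the NAT timeout is known (here a hypothetical 150 s) and keeps a safety margin below it. The function name `keepalive_offsets` and the default margin are illustrative, not part of any LwM2M stack.

```python
import math


def keepalive_offsets(update_interval_s, nat_timeout_s, margin_s=30):
    """Offsets (seconds after an update) at which to send a keep-alive.

    The offsets split one update interval into equal gaps, none longer
    than nat_timeout_s - margin_s. An empty list means the regular
    updates are already frequent enough.
    """
    max_gap = nat_timeout_s - margin_s
    if max_gap <= 0:
        raise ValueError("margin must be smaller than the NAT timeout")
    gaps = math.ceil(update_interval_s / max_gap)  # equal gaps per interval
    step = update_interval_s / gaps
    return [round(step * i, 3) for i in range(1, gaps)]


print(keepalive_offsets(300, 150))  # 5 minute update interval
print(keepalive_offsets(100, 150))  # interval already below the timeout
```

Under these assumptions a 5 minute update interval needs two keep-alives per interval (at 100 s and 200 s), so the extra traffic has to be weighed against the cost of a re-registration.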

Any help or advice on this issue would be great.

Thanks for your time!

  • DTLS 1.2 is RFC 6347, which does not by itself include DTLS 1.2 CID (RFC 9146).

    Currently I know of three DTLS 1.2 CID implementations: Eclipse/Californium (Java, server/client), Eclipse/tinydtls (C, feature branch, client only), and mbedTLS (C, client/server, released at the beginning of this year). AFAIK, Zephyr uses mbedTLS, but I'm not sure which version and whether CID is enabled. I set up a demo with tinydtls, zephyr-coaps-client (nRF9160, CoAP only, not LwM2M), which works pretty well.

    Unfortunately it requires more than just using DTLS 1.2 CID, because some upper-layer code also uses the IP address to identify the other peer, and that must be adapted as well.

  • Cheers for the above. NRF v2.3.0 seems to use mbedtls v3.1.0, which was released on Dec 17, 2021. However, the release notes do state:

    The identifier of the CID TLS extension can be configured by defining MBEDTLS_TLS_EXT_CID at compile time.

    But it doesn't seem like Zephyr has enabled it yet:

    --

    Reading what you said, does this mean CID may improve the situation but isn't guaranteed to fix it?

    If so, would using OSCORE instead of DTLS be the only real fix?

  • I've had the weekend to let your comments sink in. We're using a commercial SaaS LwM2M server, so I'm unsure of the exact implementation.

    But the problem could be to do with the LwM2M server. If the server identifies clients by IP/port, then this issue will appear.

    However, if the server uses, say, the DTLS session ID to identify the client, this could get around the issue of the changing IP/port. In that case DTLS could help identify devices when the IP/port changes.

    So even if Zephyr did support DTLS CIDs, the DTLS implementation would have to take advantage of it.

    If DTLS provides a session ID, what do we gain from CIDs?

    How far am I off the mark?

  • > We're using a commercial SaaS LwM2M server,

    I guess they will know how it works.

    The DTLS session ID is only used by the client in its "ClientHello" during a handshake. Without DTLS 1.2 CID you will need frequent resumption handshakes. That was more or less the situation 5 years ago. At that time I introduced something called the "auto resumption timeout" into Californium. Anyway, even a resumption handshake is a handshake and therefore means more overhead. And you need to ensure that both sides support it. It's more common that servers which are aware of the IP address change use the dtls-principal. That works even if the resumption handshake falls back to a full handshake. And it also works for DTLS 1.2 CID.

    Unlike the DTLS session ID, the DTLS 1.2 CID is sent in every encrypted message, so it works instantly and doesn't require new (resumption) handshakes.
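
The difference can be sketched in Python as two toy server-side session tables, one keyed by the source address and one keyed by the CID. This is only an illustration under simplified assumptions: the class names, the 2-byte CID allocation and the addresses are hypothetical and do not reflect how Californium, mbedTLS or any particular LwM2M server is implemented.

```python
class AddressKeyedServer:
    """Finds the DTLS session only by the packet's source (ip, port)."""

    def __init__(self):
        self.sessions = {}

    def handshake(self, addr, identity):
        self.sessions[addr] = identity
        return None  # no connection ID negotiated

    def lookup(self, addr, cid=None):
        return self.sessions.get(addr)


class CidKeyedServer:
    """Finds the DTLS session by the connection ID carried in each record."""

    def __init__(self):
        self.sessions = {}

    def handshake(self, addr, identity):
        cid = len(self.sessions).to_bytes(2, "big")  # toy CID allocation
        self.sessions[cid] = identity
        return cid

    def lookup(self, addr, cid=None):
        return self.sessions.get(cid)


def reaches_session_after_nat_rebind(server):
    """Handshake from one address, then send a record from a new port."""
    old_addr = ("203.0.113.7", 40000)  # documentation-range address
    new_addr = ("203.0.113.7", 40001)  # same device after the NAT entry expired
    cid = server.handshake(old_addr, "device-1")
    return server.lookup(new_addr, cid) == "device-1"


print(reaches_session_after_nat_rebind(AddressKeyedServer()))
print(reaches_session_after_nat_rebind(CidKeyedServer()))
```

The address-keyed table loses the session as soon as the source port changes, while the CID-keyed table still finds it without any new handshake.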

  • I will ask them.

    Can I ask what you mean by the "dtls-principal"? I can't find a reference for that term in the RFCs or online.

    When using PSK mode for DTLS, is the PSK identity sent in every request? If so, and the identity is unique to the device, would that be another sure way of identifying devices?

  • > the "dtls-principal"

    That's the representation of the identity of the other peer. "Principal" is the term used for that in Java. E.g. the "PSK identity" would be a candidate for the "principal", or the "subject" in X.509, assuming that both are unique per client (I seem to remember that, at least for the PSK identity, this is required for LwM2M).

    > is the PSK-Identity sent in every request? 

    No. It's only sent in the "ClientKeyExchange" during a full handshake.

    The DTLS records are defined in RFC 6347, Section 4.3.1. An application data record contains the "DTLSCiphertext" with the encrypted application data (that is mostly a nonce + encrypted application data + MAC) in the "fragment".
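
As an illustration of that record layout, the following Python sketch parses the header of a DTLS 1.2 record as laid out in RFC 6347, and optionally the RFC 9146 variant that carries a CID. The helper names (`parse_record`, `build_record`) and the sample bytes are made up for this example. The CID length is not encoded in the record, so the parser takes it as a parameter, and the fragment is treated as opaque ciphertext.

```python
import struct

APPLICATION_DATA = 23  # RFC 6347 content type for application data
TLS12_CID = 25         # RFC 9146 content type for records carrying a CID


def parse_record(data, cid_len=0):
    """Parse one DTLS 1.2 record header and return its fields."""
    content_type, major, minor, epoch = struct.unpack_from("!BBBH", data, 0)
    sequence = int.from_bytes(data[5:11], "big")  # 48-bit sequence number
    offset = 11
    cid = b""
    if content_type == TLS12_CID:
        cid = data[offset:offset + cid_len]  # CID sits before the length field
        offset += cid_len
    (length,) = struct.unpack_from("!H", data, offset)
    fragment = data[offset + 2:offset + 2 + length]
    return {
        "type": content_type,
        "version": (major, minor),
        "epoch": epoch,
        "sequence": sequence,
        "cid": cid,
        "fragment": fragment,
    }


def build_record(content_type, epoch, sequence, fragment, cid=b""):
    """Build a sample record (version bytes 254, 253 = DTLS 1.2)."""
    return (bytes([content_type, 254, 253])
            + epoch.to_bytes(2, "big")
            + sequence.to_bytes(6, "big")
            + cid
            + len(fragment).to_bytes(2, "big")
            + fragment)


plain = parse_record(build_record(APPLICATION_DATA, 1, 5, b"\xde\xad\xbe\xef"))
with_cid = parse_record(
    build_record(TLS12_CID, 1, 6, b"\x01\x02\x03", cid=b"\xaa\xbb"), cid_len=2)
print(plain)
print(with_cid)
```

In the plain record the only cleartext fields are type, version, epoch, sequence number and length, none of which names the peer. Only the CID variant carries a cleartext field the server can use to find the session.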

  • So without DTLS CIDs there's no way of identifying the connection/device other than ip/port.
