nRF9160 - sent UDP messages no longer reach destination after 4 weeks

After running a Thingy:91 for 4 weeks from battery with one message exchange per hour, it stopped working (NB-IoT mode).

The client (Thingy:91) uses "sendto" and that function returns success.

The client's log:

[18:01:38.586,822] <inf> COAP_CLIENT: CoAP request prepared, token 0xfbe72850, 139 bytes
[18:01:38.594,146] <inf> COAP_CLIENT: send_to_peer 175
[18:01:38.618,927] <inf> COAP_CLIENT: LTE modem wakes up
[18:01:40.208,496] <inf> COAP_CLIENT: RRC mode: Connected
[18:01:40.594,757] <inf> COAP_CLIENT: 1/1/729-1621 ms: connected => sent
[18:01:44.595,489] <inf> COAP_CLIENT: CoAP request resend, timeout 6
[18:01:44.596,832] <inf> COAP_CLIENT: resent_to_peer 175
[18:01:51.597,991] <inf> COAP_CLIENT: CoAP request resend, timeout 12
[18:01:51.607,635] <inf> COAP_CLIENT: resent_to_peer 175
[18:02:04.609,924] <inf> COAP_CLIENT: CoAP request resend, timeout 24
[18:02:04.619,537] <inf> COAP_CLIENT: resent_to_peer 175
[18:02:17.364,807] <inf> COAP_CLIENT: RRC mode: Idle after 37156 ms (12745 ms inactivity)
[18:02:25.377,655] <inf> COAP_CLIENT: LTE modem sleeps
[18:02:32.625,030] <inf> COAP_CLIENT: 729/-1ms/-1ms: failure

It starts with passing the message to "sendto", which returns success with 175.

Then the modem wakes up and reports RRC connected. Since no response is received, the message is resent, also with "sendto", which again returns success with 175.
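The timeout values printed in the log double with each resend (6, 12, 24), which is consistent with the exponential back-off CoAP uses for confirmable messages. A minimal sketch of such a doubling schedule follows; the function name `coap_backoff_schedule` and its parameters are hypothetical and not taken from the client's code, and the actual timer handling of the client may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill `out` with `count` timeouts in seconds,
 * starting at `initial_s` and doubling after each entry.
 * Returns the sum of all timeouts written to `out`.
 */
uint32_t coap_backoff_schedule(uint32_t initial_s, uint32_t *out, size_t count)
{
    uint32_t total = 0;
    uint32_t timeout = initial_s;

    for (size_t i = 0; i < count; i++) {
        out[i] = timeout;   /* timeout used for this (re)transmission */
        total += timeout;   /* accumulated waiting time so far */
        timeout *= 2;       /* double for the next retransmission */
    }
    return total;
}
```

Called with an initial value of 6 and a count of 3, the sketch yields the same sequence of values that the log prints.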

On the server side, the IP capture shows that no UDP traffic from that modem arrives, while traffic from other clients still works.

The status page of the SIM card provider shows "PDP Context deleted" as the last event.

To me this looks like the modem has trouble with the "PDP Context deleted" event but doesn't report that in the return code of "sendto".

Any suggestions on how to overcome that?
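If "sendto" keeps returning success in this state, the only signal left to the application is the missing response. One possible application-level workaround, sketched here under assumptions, is to count consecutive exchanges that fail after all retransmissions and to trigger a recovery action once a threshold is reached. All names (`link_monitor`, `link_monitor_report`, `MAX_FAILED_EXCHANGES`) and the threshold of 3 are hypothetical; which recovery action actually helps with the modem is not something this sketch can confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: consecutive failed exchanges before recovery. */
#define MAX_FAILED_EXCHANGES 3

struct link_monitor {
    uint32_t failed_exchanges; /* consecutive exchanges without a response */
};

void link_monitor_init(struct link_monitor *m)
{
    m->failed_exchanges = 0;
}

/*
 * Report the outcome of one complete exchange (request plus all
 * retransmissions). Returns true when the caller should start its
 * recovery action, which is left to the application.
 */
bool link_monitor_report(struct link_monitor *m, bool response_received)
{
    if (response_received) {
        m->failed_exchanges = 0;     /* link works, start over */
        return false;
    }

    m->failed_exchanges++;
    if (m->failed_exchanges >= MAX_FAILED_EXCHANGES) {
        m->failed_exchanges = 0;     /* count again after recovery */
        return true;
    }
    return false;
}
```

With one exchange per hour and the assumed threshold of 3, recovery would start about three hours after the last successful exchange; a lower threshold reacts faster at the cost of more false alarms during temporary network outages.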

(And no, I'm not able to provide a capture of a 4-week run ;-)! But I hope that the Nordic development team also does long-term tests and so already has experience with that.)

  • Hi Achim,

    We are looking into this and will get back to you.

  • Hi Achim, sorry for the delay.

    Something to point out is that both sendto and send return success once the data has been transferred to the modem. As UDP does not guarantee delivery, a successful sendto does not guarantee that the data will reach or has reached the server.

    Could this be a data quota issue?

    What happens if you reset the device?

  • > As UDP does not have guaranteed delivery, a successful sendto will not guarantee that the data will reach or have reached the server

    You see the retransmissions in the log, don't you? And you read that the server received the messages from other devices? And that the SIM card provider shows "PDP Context deleted" as the last event?

    > Could this be a data quota issue?

    No.

    > What happens if you reset the device?

    It works again.

    > As UDP ... not guarantee

    In my experience (more than 5 years with UDP, CoAP/DTLS!), UDP with retransmission, as in CoAP CON, is much more reliable than TCP!

    The only culprits in all these years have been bugs in software developed by people who then claim that it may be caused by "unreliable UDP". OK, sometimes it's a firewall that blocks all UDP traffic, but that has changed a lot in recent years.

    Apart from that, UDP works brilliantly!

    Let me ask:

    Does Nordic have any example that could be tested running for 4 weeks?

    From battery? Without watchdog reset?

    Did you run that test yourselves? Successfully?

    I already proposed that Nordic demonstrate the "best practice", but besides videos, papers, and a lot of words, I miss a real-world demonstration of your best practice.

    Best practices for cellular iot

  • Not sure, but it seems to happen again and again:

    I try to provide precise information, and I would welcome it if a modem expert at Nordic spent some time on this. But it's no problem at all if that is not possible. I'm only evaluating.

  • Achim Kraus said:
    Not sure, but it seems to happen again and again

    Does this only happen after 4 weeks, or could it also happen after a shorter time interval?
