CoAP request fails after nRF9151 is idle for a few minutes (LTE-M)

Hello,

Looking for some input on this LTE-M connectivity issue we are seeing.

Hardware: nRF9151, custom board, same result seen on nRF9151DK.
nRF SDK Version: NCS v3.1.0-preview4 (using latest to pull in some useful changes to Zephyr CoAP client library)
Modem FW: mfw_nrf91x1_2.0.2

We are developing an application that uses CoAP + DTLS over LTE-M, communicating with a privately hosted CoAP server.
We upload files (33kB) from the device (client) to the server using CoAP POST requests (CoAP block-wise if that is relevant).
We have proven this to work, uploading a 33kB file in 17 x 2kB chunks.

The issue arises when the device is left to sit idle for a few minutes after completing an upload.
Next time we trigger a file upload, it fails. The request is not received by the server.
Multiple re-transmissions are triggered by the CoAP lib, but none are successful.

I have captured a modem trace showing this sequence of events:
1. Application boots up and connects to the LTE-M network (lines 1-42).
2. Connect to the CoAP server; the DTLS handshake occurs (lines 43-52).
3. The first file upload is triggered. 17 Confirmable CoAP POST requests are sent, and an ACK is received for each (lines 53-184).
4. The nRF9151 is left idle for 5-10 minutes (it is a co-processor to another main processor). The application does not make any explicit changes to the modem (no AT commands are sent, etc.).
5. Another CoAP upload is triggered (line 208). No response is received. Re-transmission is attempted multiple times, then the CoAP retry limit is reached and the upload is cancelled.

I studied the modem trace in Wireshark to try to understand what changes were being made at the RRC and NAS_EPS level between the first successful upload and the next failed upload. I could not see anything that looked problematic. The modem seems to be able to get back into the RRC_CONNECTED state, but quickly returns to RRC_IDLE. I cannot understand why.

I would appreciate if you could look at the attached modem trace and see if there are any pieces of information that could explain the failed upload attempts.

I am located in Ireland, using Three Ireland network.
  • The modem trace shows that you don't enable DTLS 1.2 CID.

    Unless you use a "private network" (e.g. an IPsec tunnel or VPN from your mobile provider to your server), your IP traffic goes through something similar to a NAT, which removes the public address mapping after a quiet phase. The next messages are then mapped to a new source address, so the server can no longer match them to the existing DTLS session and fails to decrypt them. DTLS 1.2 CID overcomes that.
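
    The effect described above can be modelled on a host PC. The following is only a minimal sketch under stated assumptions: `nat_t`, `server_session_t`, `upload_succeeds`, the 60 s mapping timeout, and the port and CID values are all hypothetical, and real carrier NAT timeouts vary. It only shows why a session looked up by source port is lost after an idle period, while a session looked up by connection ID is still found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a carrier NAT that drops a UDP mapping after idle time. */
typedef struct {
    uint16_t public_port;   /* source port the server currently sees */
    uint32_t last_used_s;   /* time of the last packet through the mapping */
    uint32_t timeout_s;     /* idle time after which the mapping is dropped */
} nat_t;

/* Hypothetical model of the server's view of one DTLS session. */
typedef struct {
    uint16_t peer_port;     /* port recorded during the handshake */
    uint32_t cid;           /* connection ID agreed during the handshake */
    bool     cid_enabled;
} server_session_t;

/* Pass a packet through the NAT at time now_s; returns the public source port. */
uint16_t nat_send(nat_t *nat, uint32_t now_s)
{
    assert(now_s >= nat->last_used_s);   /* time must not go backwards */
    if (now_s - nat->last_used_s > nat->timeout_s) {
        nat->public_port++;              /* mapping expired: new port assigned */
    }
    nat->last_used_s = now_s;
    return nat->public_port;
}

/* Can the server associate an incoming record with the existing session? */
bool server_accepts(const server_session_t *s, uint16_t src_port,
                    uint32_t record_cid)
{
    if (s->cid_enabled) {
        return record_cid == s->cid;     /* lookup by CID, port is irrelevant */
    }
    return src_port == s->peer_port;     /* lookup by source port only */
}

/* Handshake at t = 0, then one record at t = send_time_s. */
bool upload_succeeds(bool cid_enabled, uint32_t send_time_s)
{
    nat_t nat = { .public_port = 40000, .last_used_s = 0, .timeout_s = 60 };
    server_session_t s = {
        .peer_port = nat_send(&nat, 0),
        .cid = 0x1234,
        .cid_enabled = cid_enabled,
    };
    uint16_t port = nat_send(&nat, send_time_s);
    return server_accepts(&s, port, s.cid);
}
```

    In this model, a record sent 30 s after the handshake is accepted either way, while a record sent after 300 s of silence is accepted only when `cid_enabled` is true.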

    Check whether your server supports CID, perhaps simply by enabling it in the modem with

    int cid = NRF_SO_SEC_DTLS_CID_SUPPORTED;
    err = setsockopt(sock, SOL_TLS, TLS_DTLS_CID, &cid, sizeof(cid));

    (I'm not sure whether this is still correct; it was with NCS 2.4.0.)

    and then test whether that already works with your server.

    Otherwise, you could close your socket after the transfer and reopen it for the next transfer. Since you are sending a lot of data anyway, the additional handshake doesn't make things much worse.

    "a 33kB file in 17 x 2kB chunks"

    CoAP block-wise (RFC 7959) doesn't support 2 kB blocks (the maximum is 1024 bytes), and UDP usually wouldn't carry a datagram that large either without IP fragmentation, which is considered not very reliable.
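
    The block size limit comes from how RFC 7959 encodes the size: the Block1/Block2 option carries a 3-bit SZX field, the block size is 2^(SZX + 4), and SZX 7 is reserved, so only 16 to 1024 bytes can be expressed. A minimal sketch of that encoding follows; `coap_block_szx` and `coap_block_value` are hypothetical helper names for illustration, not functions from the Zephyr CoAP library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Map a block size in bytes to the RFC 7959 SZX exponent (size = 2^(SZX + 4)).
 * Returns -1 when the size is not one of 16, 32, ..., 1024. */
int coap_block_szx(uint32_t block_size)
{
    for (int szx = 0; szx <= 6; szx++) {          /* SZX 7 is reserved */
        if (block_size == (1u << (szx + 4))) {
            return szx;
        }
    }
    return -1;
}

/* Build the Block option value: NUM in the upper bits, then the
 * "more blocks" flag M, then SZX in the lowest three bits. */
uint32_t coap_block_value(uint32_t num, bool more, int szx)
{
    assert(szx >= 0 && szx <= 6);
    return (num << 4) | ((more ? 1u : 0u) << 3) | (uint32_t)szx;
}
```

    With these helpers, 512-byte blocks map to SZX 5 and 1024-byte blocks to SZX 6, while a 2048-byte block has no valid encoding and is rejected with -1.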

  • Hello Achim,

    Many thanks for your response. Apologies for the delay in responding - I was on vacation.

    I have confirmed that adding DTLS CID has fixed the issue! We are using Californium as our server, by the way, and it worked without any server changes.

    I didn't explain the upload size very well. I meant that we are sending a 33kB file in 17 individual CoAP transactions. Each transaction sends 2kB of data, using block-wise chunks of 512 bytes, so each transaction contains 4 blocks. We re-assemble the file on the server manually. This was chosen to limit the amount of RAM we need to store the data before uploading (it is sent from the main processor to the nRF9151 in 2kB chunks).
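
    The arithmetic behind those numbers can be sketched as follows. This assumes "33kB" means 33 * 1024 = 33792 bytes and "2kB" means 2048 bytes, which the thread does not state exactly; `div_ceil` and `chunk_len` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Ceiling division: how many pieces of size d are needed to cover n bytes. */
size_t div_ceil(size_t n, size_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Length of transaction `index` (0-based) when a file is split into
 * fixed-size chunks; only the last chunk may be shorter. */
size_t chunk_len(size_t file_len, size_t chunk_size, size_t index)
{
    size_t offset = index * chunk_size;
    assert(offset < file_len);
    size_t remaining = file_len - offset;
    return remaining < chunk_size ? remaining : chunk_size;
}
```

    Under those assumptions, `div_ceil(33792, 2048)` gives the 17 transactions, `div_ceil(2048, 512)` gives the 4 blocks per full transaction, and the final transaction carries the remaining 1024 bytes in 2 blocks.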

    Thanks a lot for your input and expertise - saved me a lot of debugging.

    Jason
