nRF Cloud — CoAP transport (device shadow)
Date of incident: 2026-06-22, ~13:04 UTC — still unresolved at 20:42 UTC (7h38m+)
Device: nRF9151, modem firmware mfw_nrf91x1_2.0.4, NCS v3.2.3
SIM: Soracom roaming on One NZ, PLMN 53001, Band 28, LTE-M
Summary
Our device experienced a sustained 7+ hour period where every CoAP shadow GET to coap.nrfcloud.com timed out with no response. The DTLS Connection ID session resumed correctly in ~12ms every cycle, confirming the DTLS gateway was alive. The failure seems to be at the CoAP application layer — the shadow service (or internal routing to it) was not responding.
Observed behaviour (235 consecutive cycles over 7h):
→ DTLS CID resume: 12ms ✓ (gateway healthy)
→ CoAP GET /state/delta: sent
→ 1st retransmit: +3s (no ACK from server)
→ 2nd retransmit: +6s
→ 3rd retransmit: +12s
→ 4th retransmit: +21s
→ Timeout, no more retries: +47s
→ nrf_cloud_coap error: -116 (ETIMEDOUT)
→ Disconnect → wait 30s → reconnect → repeat
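For comparison with the retransmit figures above, here is a rough sketch of the retransmission windows that follow from the RFC 7252 default transmission parameters (ACK_TIMEOUT = 2 s, ACK_RANDOM_FACTOR = 1.5, MAX_RETRANSMIT = 4). Whether the CoAP client in our NCS build actually runs with these defaults is an assumption, and `retransmit_windows` is just an illustrative helper name, not a library function.

```python
# RFC 7252 section 4.8 default transmission parameters.
# Assumption: the device's CoAP client uses these defaults unchanged.
ACK_TIMEOUT = 2.0          # seconds
ACK_RANDOM_FACTOR = 1.5
MAX_RETRANSMIT = 4


def retransmit_windows(ack_timeout=ACK_TIMEOUT,
                       random_factor=ACK_RANDOM_FACTOR,
                       max_retransmit=MAX_RETRANSMIT):
    """Return (earliest, latest) cumulative times in seconds since the first send.

    The first `max_retransmit` entries are the retransmissions; the last
    entry is the point where the client gives up waiting.
    """
    windows = []
    lo_total = hi_total = 0.0
    lo_wait = ack_timeout                   # shortest possible initial timeout
    hi_wait = ack_timeout * random_factor   # longest possible initial timeout
    for _ in range(max_retransmit + 1):
        lo_total += lo_wait
        hi_total += hi_wait
        windows.append((lo_total, hi_total))
        lo_wait *= 2                        # timeout doubles after every expiry
        hi_wait *= 2
    return windows


for i, (lo, hi) in enumerate(retransmit_windows(), start=1):
    label = f"retransmit {i}" if i <= MAX_RETRANSMIT else "give up"
    print(f"{label}: {lo:.0f}s to {hi:.0f}s after first send")
```

With the defaults the client gives up between 62 s and 93 s after the first transmission, so a total on the order of a minute or more per failed request is what the protocol itself would produce when the server never ACKs.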
Log excerpt (one representative cycle):
[08:31:28] cloud_connection: Connected to nRF Cloud
[08:31:31] net_coap: Timeout, retrying send
[08:31:36] net_coap: Timeout, retrying send
[08:31:46] net_coap: Timeout, retrying send
[08:32:04] net_coap: Timeout, retrying send
[08:32:47] net_coap: Timeout, no more retries left
[08:32:47] nrf_cloud_coap: Shadow response processing error: -116
[08:32:47] shadow_support_coap: Failed to request shadow delta: -116
[08:32:47] cloud_connection: Communication error detected.
[08:32:47] cloud_connection: Disconnected from nRF Cloud
[08:32:47] cloud_connection: Retrying in 30 seconds...
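To make the timing in this excerpt easier to read, the following sketch parses the timestamps and prints the offset of each CoAP timeout from the "Connected" line, plus a back-of-envelope total for 235 cycles. It assumes every cycle looks like this representative one and ignores the time spent reconnecting, so the total is only a sanity check. The helper name `offsets_from_first` is made up for illustration.

```python
from datetime import datetime

# Lines copied from the log excerpt above.
LOG = """\
[08:31:28] cloud_connection: Connected to nRF Cloud
[08:31:31] net_coap: Timeout, retrying send
[08:31:36] net_coap: Timeout, retrying send
[08:31:46] net_coap: Timeout, retrying send
[08:32:04] net_coap: Timeout, retrying send
[08:32:47] net_coap: Timeout, no more retries left
"""


def offsets_from_first(log_text):
    """Seconds elapsed between the first log line and each later line."""
    stamps = [datetime.strptime(line[1:9], "%H:%M:%S")
              for line in log_text.splitlines() if line.startswith("[")]
    return [int((t - stamps[0]).total_seconds()) for t in stamps[1:]]


offsets = offsets_from_first(LOG)
# Gap between consecutive log lines (first gap is measured from "Connected").
gaps = [b - a for a, b in zip([0] + offsets, offsets)]

RETRY_WAIT_S = 30   # "Retrying in 30 seconds..." from the log
CYCLES = 235        # cycle count reported in this post
cycle_s = offsets[-1] + RETRY_WAIT_S   # assumption: reconnect time is negligible
total_s = CYCLES * cycle_s

print("offsets from connect:", offsets)
print("gaps between lines:  ", gaps)
print(f"approx. cycle length: {cycle_s}s, {CYCLES} cycles = {total_s / 3600:.1f}h")
```

Under those assumptions one cycle takes roughly 109 s, and 235 cycles add up to a little over 7 hours, which matches the duration reported above.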
What this rules out:
- Device firmware bug: 0 crashes, 0 reboots across 235 cycles; DISCONNECT_ON_FAILED_REQUEST=y working correctly
- Network/SIM issue: DTLS CID resume at 12ms every cycle; RSRP -87dBm, CE level 0
- DTLS gateway issue: gateway is responding to CID resumes; failure is at CoAP payload layer
Impact:
- 235 GNSS location fixes discarded (CoAP POST fails while shadow GET fails)
- No cloud shadow updates for entire duration
Configuration:
CONFIG_NRF_CLOUD_COAP=y
CONFIG_NRF_CLOUD_COAP_DTLS_CID=y
CONFIG_NRF_CLOUD_COAP_DISCONNECT_ON_FAILED_REQUEST=y
CONFIG_LTE_RAI_REQ=y
Question
- Is the shadow GET the first request that requires a backend round trip after DTLS resume, or should a healthy backend always respond within the default CoAP timeout?