nRFCloud CoAP messages not received

Hello DevZone


I am currently facing an issue with devices already installed at my customer's site.
Devices are based on nRF9160, firmware based on SDK 2.8.0 and modem 1.3.6.
Once devices are installed, they do not move.

Device works like this:
Sleep 15 min
Wakeup
Take a measurement
If modem was switched off (no PSM):
    -> Reconnect to nRFCloud: 

nrf_cloud_coap_disconnect();	// Close any previous session, just to be sure
nrf_cloud_coap_init();
err = nrf_cloud_coap_connect(FIRMWARE_VERSION);

Send sensor measurement to nRFCloud via CoAP
If PSM is available:
    -> Go IDLE / PSM
If not:
    -> Switch off modem to disconnect from network


I have ~4000 units deployed with similar HW and FW, and everything is working as expected.
However, I have two devices facing the same problem: the messages sent by the devices are only partially received on nRFCloud. 
(There might be more affected devices that have not been detected yet.)


The two reported cases have one thing in common: the MNO does not offer PSM with timings acceptable for the device operation.
Therefore, the device disconnects from the network every time after sending.


One device is located in Martinique (French Antilles) and has access to two LTE-M networks: Orange and SFR. Both have good coverage (ConEval and RSRP are good), and neither provides PSM.
If the device connects to SFR, everything works fine. However, if it connects to Orange, many messages are lost. The loss also seems to get worse over time.
I initially thought that it was a "bad network", a "bad antenna", or some restriction between my MVNO and the local provider.
Then the second case occurred, in Italy.
The device connects to Vodafone in LTE-M, no PSM.
Looking at the device log, all messages are sent properly.
On the SIM card side, the amount of data consumed and the connect/disconnect events also show normal behavior.
However, not all messages are received on nRFCloud.
This time, I have several other devices connected to the same network (but not the same cell), so I am quite confident that there is no MVNO/local operator issue.



CoAP messages are sent using

nrf_cloud_coap_json_message_send(msg, false, false)

which always returns 0. The payload is small: msg is a string of approximately 20 characters.
I understand that using the confirmable feature would be more reliable. This has not been done so far, as the firmware was previously based on SDK 2.5.0 and 2.6.0, where this feature was not available.

What could be causing the issue?
I understand that without confirmable, some messages might be lost. However, we are talking about more than 50% of messages lost, even in good network conditions.
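
For reference, a confirmable send with a bounded application-level retry could be sketched as below. This is only a minimal sketch: send_confirmable_with_retry, json_send_fn and demo_send are hypothetical names, not part of the SDK, and it assumes that the third argument of nrf_cloud_coap_json_message_send() is the confirmable flag, as in the call above. The CoAP layer normally retransmits a confirmable message by itself, so this loop would only try again once the library has given up and returned an error.

```c
#include <stdbool.h>

/* Function type matching the call used in this post:
 * nrf_cloud_coap_json_message_send(msg, topic_ack, confirmable).
 * Hypothetical typedef, introduced only for this sketch. */
typedef int (*json_send_fn)(const char *json_msg, bool topic_ack, bool confirmable);

/* Hypothetical helper: send msg as confirmable, retrying up to
 * max_attempts times. Returns 0 on success, otherwise the last error. */
int send_confirmable_with_retry(json_send_fn send, const char *msg, int max_attempts)
{
    int err = -1; /* returned unchanged if max_attempts <= 0 */

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        err = send(msg, false, true); /* confirmable = true */
        if (err == 0) {
            return 0; /* the send function reported success */
        }
    }
    return err;
}

/* Stand-in sender for trying the helper on a PC (assumption, not SDK code):
 * it fails demo_failures_left times, then succeeds. */
int demo_failures_left;

int demo_send(const char *json_msg, bool topic_ack, bool confirmable)
{
    (void)json_msg;
    (void)topic_ack;

    if (!confirmable) {
        return -1; /* this sketch only expects confirmable sends */
    }
    if (demo_failures_left > 0) {
        demo_failures_left--;
        return -1;
    }
    return 0;
}
```

On the device, the call would be send_confirmable_with_retry(nrf_cloud_coap_json_message_send, msg, 3); demo_send only exists so that the sketch can be exercised without hardware. Each extra attempt keeps the modem on longer, so the attempt limit is a trade-off against power consumption.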

In the attached image, the expected pattern is a continuous flow of 4 messages per hour (and slightly more during the night).


My first idea is that the modem gets disconnected before the message is fully sent. However, this is not consistent with the data consumed by the SIM card (at least, not obviously).
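
To check this idea, one option would be to tag every message with a sequence counter, so that gaps on the nRFCloud side show exactly which transmissions were lost (for example, whether it is always the last message before the modem is switched off). A minimal sketch, assuming a JSON payload; the helper build_tagged_message and the field names seq and val are hypothetical and are not the format used by the firmware:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write {"seq":<seq>,"val":<value>} into buf.
 * Returns the string length, or -1 if buf is too small or formatting fails. */
int build_tagged_message(char *buf, size_t len, uint32_t seq, int value)
{
    int n = snprintf(buf, len, "{\"seq\":%lu,\"val\":%d}",
                     (unsigned long)seq, value);

    if (n < 0 || (size_t)n >= len) {
        return -1; /* encoding error or truncated output */
    }
    return n;
}
```

The counter would be incremented once per measurement and kept across sleep cycles, so that a missing value on the cloud side maps to one specific wake-up in the device log.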

Is there any explanation for this? Are there known "bad network cells" that could explain this behavior?

Thanks for your help.

  • Hi,

    Thanks for the update, and that power concern is completely valid. Using confirmable=true is a good next step. It can slightly increase modem on-time (waiting for the CoAP ACK and any retries), so checking the power impact on one test device is the right approach.

    A modem trace captures low-level modem/network communication, which helps us see where the failure happens. A practical way is to build a trace-enabled firmware using snippets and capture with Cellular Monitor or nrfutil trace lte. If the affected unit is deployed and physical/debug access is not possible, that is understandable. In that case, could you try getting the device back, reproducing the issue in the lab on the same unit with the same operator, and capturing a trace there?

    Optional longer-term path: if you already use Memfault (or plan to), it can be used for remote observability and modem trace related workflows in production.

    Best Regards,
    Syed Maysum
