Receiving MQTT Shadow Data before NRF_CLOUD_EVT_READY

Hello,

I'm currently debugging an issue we are having with MQTT device shadow interaction on the nrf9160. The issue presents when we push a large amount of config JSON to the device shadow. This typically happens when we are setting up a new device for the first time and need to set all the parameters in the "config" object via the REST API. The issue is when this large delta occurs the device will try and sync it via MQTT and it gets stuck in a connect/disconnect loop. 

I'm currently looking at once specific case that I can get to repeat multiple times. Where the following events occur in sequence:

1. NRF_CLOUD_EVT_TRANSPORT_CONNECTED occurs 

2. NRF_CLOUD_EVT_RX_DATA_SHADOW occurs with the large delta (in my case about 1.3Kb)

3.  NRF_CLOUD_EVT_USER_ASSOCIATED occurs

4.  NRF_CLOUD_EVT_TRANSPORT_DISCONNECTED occurs for reason  -> NRF_CLOUD_DISCONNECT_MISC

5. The LTE connection is reset (not a reboot) and it attempts to connect again and goes through the same sequence.

Now the device never gets a chance to send the shadow delta back up to the cloud because our firmware waits until NRF_CLOUD_EVT_READY before we send any data back to nrfCloud. And because we don't acknowledge the shadow delta it gets sent again the next time we connect, resulting in an endless loop.  

The question that comes to mind is why we are getting shadow data before receiving, NRF_CLOUD_EVT_READY? The documentation here seems to imply this shouldn't happen?

Any insight on how to debug this further would be greatly appreciated.

Thanks!

  • Hi Eric,

    This issue seems to happen before you push a large amount of config JSON to the device shadow.

    Are you able to repeat the issue with other official nRF Cloud samples like Cellular: nRF Cloud multi-service?

    Best regards,

    Charlie

  • Charlie,

    Our code is based on the nRF Cloud MQTT multi-service sample (from SDK v2.4.2) and has basically the same core connection management. I also attempted loading the unmodified sample onto the same hardware in question above and got basically the same results see below logs:

     - - - - - - - - - - - - - - - - TARGET RESET - - - - - - - - - - - - - - - - -
    *** Booting Zephyr OS build v3.3.99-ncs1-1 ***
    [00:00:00.250,946] \033[0m<dbg> led_control: start_led_pattern: LED Pattern Requested\033[0m
    [00:00:00.250,976] \033[0m<inf> connection: Setting up modem...\033[0m
    [00:00:10.185,699] \033[0m<inf> main: nRF Cloud MQTT multi-service sample has started.\033[0m
    [00:00:10.185,729] \033[0m<dbg> connection: await_lte_connection: Awaiting LTE Connection\033[0m
    [00:00:10.185,791] \033[0m<dbg> led_control: led_animation_thread_fn: LED Management Started\033[0m
    [00:00:10.185,791] \033[0m<dbg> led_control: led_animation_thread_fn: LED Pattern Started\033[0m
    [00:00:10.185,882] \033[0m<dbg> connection: consume_device_message: Consuming an enqueued device message\033[0m
    [00:00:10.199,737] \033[0m<dbg> connection: setup_modem: Modem library initialized.\033[0m
    [00:00:10.199,737] \033[0m<inf> connection: Setting up nRF Cloud library...\033[0m
    [00:00:10.207,855] \033[0m<inf> connection: Setting up LTE...\033[0m
    [00:00:10.208,557] \033[0m<inf> connection: Starting connection to LTE network...\033[0m
    [00:00:10.257,537] \033[0m<inf> connection: Connecting to LTE network. This may take several minutes...\033[0m
    [00:00:10.257,537] \033[0m<inf> connection: Waiting for connection to LTE network...\033[0m
    [00:00:10.257,568] \033[0m<dbg> led_control: start_led_pattern: LED Pattern Requested\033[0m
    [00:00:10.257,598] \033[0m<dbg> connection: await_lte_connection: Awaiting LTE Connection\033[0m
    [00:00:11.645,507] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: Network registration status 2, Searching\033[0m
    [00:00:11.645,568] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: LTE cell changed: Cell ID: 118021391, Tracking area: 24838\033[0m
    [00:00:11.645,629] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: Active LTE mode changed: LTE-M\033[0m
    [00:00:16.202,575] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: RRC mode: Connected\033[0m
    [00:00:18.461,151] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: Modem domain event, type: Search done\033[0m
    [00:00:18.466,369] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: Network registration status 5, Registered Roaming\033[0m
    [00:00:18.466,705] \033[0m<inf> connection: Connected to LTE network\033[0m
    [00:00:18.466,735] \033[0m<inf> connection: Connecting to nRF Cloud...\033[0m
    [00:00:18.466,735] \033[0m<inf> connection: Next connection retry in 30 seconds\033[0m
    [00:00:18.466,766] \033[0m<dbg> connection: await_cloud_connected: Awaiting Cloud Connection\033[0m
    [00:00:18.466,827] \033[0m<dbg> connection: await_cloud_ready: Awaiting Cloud Ready\033[0m
    [00:00:18.466,857] \033[0m<dbg> connection: cloud_event_handler: NRF_CLOUD_EVT_TRANSPORT_CONNECTING\033[0m
    [00:00:18.567,382] \033[0m<dbg> connection: lte_event_handler: LTE_EVENT: PSM parameter update: TAU: 3240, Active time: -1\033[0m
    [00:00:18.570,678] \033[0m<dbg> connection: await_cloud_ready: Awaiting Cloud Ready\033[0m
    [00:00:23.246,063] \033[1;31m<err> nrf_cloud_transport: MQTT input error: -128\033[0m
    [00:00:23.246,154] \033[1;31m<err> nrf_cloud_transport: Error disconnecting from cloud: -128\033[0m
    [00:00:23.246,185] \033[0m<dbg> connection: cloud_event_handler: NRF_CLOUD_EVT_TRANSPORT_DISCONNECTED\033[0m
    [00:00:23.246,215] \033[0m<inf> connection: nRF Cloud failed to become ready. Resetting connection.\033[0m
    [00:00:23.246,246] \033[0m<inf> connection: Disconnecting from nRF Cloud\033[0m
    [

  • But this log shows your device is not connected with the nRF Cloud at all.

    Have you correctly provisioned your device to nRF Cloud?  There is guidance about nRF Cloud Access Provisioning - Software and Protocols - Cellular IoT guides - Nordic DevZone (nordicsemi.com). You need to follow the flowchart "Own CA" branch for your product.

    Best regards,

    Charlie

  • Charlie,

    This device is provisioned. It was connecting to nrfCloud and functioning fine prior to us pushing the update to the shadow via MQTT. As I mentioned this i a very repeatable issue and I have done it on multiple devices that have been provisioned and working correctly prior to pushing the config update to REST. 

    Additionally, on our custom firmware as mentioned in the first post we are getting the shadow JSON that we pushed to the device we just never get the NRF_CLOUD_EVT_READY event and therefor don't process the shadow JSON. I imagine it wouldn't be possible to get the shadow for the device if it wasn't provisioned correctly on the cloud.  

    Eric

  • Charlie,

    Here is a log from the exact same device running our custom firmware (I loaded it again after running the MQTT sample mentioned above), note I have not changed the state of nrfCloud (shadow) or the device certs between firmware loads.

    You can can see that we get shadow data before receiving the NRF_CLOUD_EVT_READY event. We simply get NRF_CLOUD_EVT_TRANSPORT_DISCONNECTED occurs for reason - NRF_CLOUD_DISCONNECT_MISC (3)

Related