I haven't been able to run lte_ble_gateway for more than 2 days

Hi, 

I have programmed my Thingy 91 device with the lte_ble_gateway program to collect data from 5 Bluetooth devices and send it to the cloud. The program runs fine, but it does not keep running for more than 2 days; after that it just goes silent. No output, no error message, nothing.

Can someone point me in the right direction please?

  • Hi,

     

    What are the traces coming from the thingy:91? You say it does not output anything, but where did it stop?

    What makes the device function again, a reset of the nRF9160 or a power cycle of the full design?

     

    Kind regards,

    Håkon

  • Hi

    I don't know what you mean by traces, but I get no output on the serial line and no output in nRF Connect for Cloud. It simply stops doing what it is supposed to do.

    So far to get it to function again, I switch it off and on again manually. 

    I hope this explains the problem. 

    Regards 

    Marshed 

  • Here are some more logs 

    # SEGGER J-Link RTT Viewer V6.82g Terminal Log File
    # Compiled: 17:00:55 on Aug 28 2020
    # Logging started @ 02 Sep 2020 15:59:59
    00> --- 803 messages dropped ---
    00> [00:02:07.727,355] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 743 messages dropped ---
    00> [00:03:07.738,403] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 734 messages dropped ---
    00> [00:06:04.770,751] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 727 messages dropped ---
    00> [00:07:03.223,541] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 736 messages dropped ---
    00> [00:10:03.314,453] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 755 messages dropped ---
    00> [00:12:03.250,976] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 733 messages dropped ---
    00> [00:14:03.358,398] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 733 messages dropped ---
    00> [00:15:03.369,415] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 366 messages dropped ---
    00> [00:16:04.880,615] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 68 messages dropped ---
    00> [00:16:07.381,134] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 1099 messages dropped ---
    00> [00:20:05.924,743] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 684 messages dropped ---
    00> [00:20:57.934,326] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 422 messages dropped ---
    00> [00:22:03.446,258] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 734 messages dropped ---
    00> [00:24:03.468,292] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 702 messages dropped ---
    00> [00:26:04.990,478] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 787 messages dropped ---
    00> [00:28:03.512,237] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 734 messages dropped ---
    00> [00:30:04.034,240] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 364 messages dropped ---
    00> [00:31:01.044,708] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 435 messages dropped ---
    00> [00:33:06.067,565] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 732 messages dropped ---
    00> [00:35:58.099,060] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 1107 messages dropped ---
    00> [00:37:03.388,336] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 736 messages dropped ---
    00> [00:40:03.644,042] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 698 messages dropped ---
    00> [00:40:58.154,052] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 791 messages dropped ---
    00> [00:43:04.177,062] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 732 messages dropped ---
    00> [00:45:58.208,923] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 365 messages dropped ---
    00> [00:46:05.437,927] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 434 messages dropped ---
    00> [00:47:06.221,435] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 734 messages dropped ---
    00> [00:49:06.454,467] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 684 messages dropped ---
    00> [00:50:58.263,916] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 787 messages dropped ---
    00> [00:53:03.476,165] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 365 messages dropped ---
    00> [00:55:03.308,715] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 701 messages dropped ---
    00> [00:56:04.820,007] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 421 messages dropped ---
    00> [00:57:03.330,749] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 734 messages dropped ---
    00> [01:00:03.363,647] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 700 messages dropped ---
    00> [01:00:57.873,687] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 60 messages dropped ---
    00> [01:01:04.874,938] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 419 messages dropped ---
    00> [01:02:05.889,739] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 734 messages dropped ---
    00> [01:04:06.411,743] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 684 messages dropped ---
    00> [01:05:57.932,220] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 422 messages dropped ---
    00> [01:07:03.553,070] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 732 messages dropped ---
    00> [01:09:03.466,186] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 702 messages dropped ---
    00> [01:10:58.487,182] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 420 messages dropped ---
    00> [01:13:03.510,070] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 733 messages dropped ---
    00> [01:15:03.532,043] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 701 messages dropped ---
    00> [01:15:58.542,114] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 73 messages dropped ---
    00> [01:17:06.054,504] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 760 messages dropped ---
    00> [01:19:06.076,477] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 733 messages dropped ---
    00> [01:20:06.087,493] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 376 messages dropped ---
    00> [01:22:03.608,947] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 1097 messages dropped ---
    00> [01:24:03.630,981] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 1094 messages dropped ---
    00> [01:27:02.663,787] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 245 messages dropped ---
    00> [01:29:02.185,607] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 493 messages dropped ---
    00> [01:30:57.706,848] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 255 messages dropped ---
    00> [01:31:00.707,427] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 56 messages dropped ---
    00> [01:32:04.719,116] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 490 messages dropped ---
    00> [01:33:04.730,133] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 494 messages dropped ---
    00> [01:35:57.761,749] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 260 messages dropped ---
    00> [01:36:04.763,092] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 492 messages dropped ---
    00> [01:39:02.295,471] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 721 messages dropped ---
    00> [01:40:57.816,741] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 288 messages dropped ---
    00> [01:43:02.339,416] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 492 messages dropped ---
    00> [01:45:02.361,389] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 473 messages dropped ---
    00> [01:45:57.371,490] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 330 messages dropped ---
    00> [01:47:05.383,880] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 245 messages dropped ---
    00> [01:49:04.905,883] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 493 messages dropped ---
    00> [01:50:57.426,391] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 261 messages dropped ---
    00> [01:51:04.927,886] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 492 messages dropped ---
    00> [01:54:02.460,266] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 492 messages dropped ---
    00> [01:55:02.471,282] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> --- 270 messages dropped ---
    00> [01:57:01.993,225] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> --- 487 messages dropped ---
    00> [01:58:02.004,241] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:02.504,180] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:02.504,241] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:03.004,364] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:03.004,425] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:03.504,364] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:03.504,425] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:04.004,547] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:04.004,608] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:04.504,547] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:04.504,608] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:05.004,730] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:05.004,791] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:05.504,730] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:05.504,791] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:06.004,913] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:06.004,974] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:06.504,913] <dbg> net_mqtt.mqtt_input: (0x20024ea8): state:0x00000006
    00> [01:58:06.504,974] <dbg> net_mqtt_rx.mqtt_read_message_chunk: (0x20024ea8): [CID 0x20023b0c]: Transport read error: -11
    00> [01:58:06.505,065] <dbg> net_mqtt.client_write: (0x20024ea8): [0x20023b0c]: Transport writing 2 bytes.
    00> [02:00:59.352,874] <dbg> net_mqtt.mqtt_publish: (0x20020d00): [CID 0x20023b0c]:[State 0x06]: >> Topic size 0x00000045, Data size 0x00000032
    
    # Logging stopped @  3 Sep 2020  9: 1:39
    

    Regards

    Marshed

  • Hi,

     

    The code receives a recv() error of -11, meaning EAGAIN:

    https://github.com/eblot/newlib/blob/master/newlib/libc/include/sys/errno.h#L41
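
    As a host-side illustration of why -11 on its own is not a fault, the sketch below sorts the outcome of one receive attempt into four cases. It assumes plain POSIX recv() semantics (return value plus errno); `classify_recv` and `enum rx_result` are hypothetical names for this example and are not part of the Zephyr MQTT library or bsdlib.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical result categories for a single receive attempt. */
enum rx_result {
	RX_DATA,        /* n > 0: payload bytes were read */
	RX_PEER_CLOSED, /* n == 0: the peer closed the connection */
	RX_NO_DATA,     /* EAGAIN/EWOULDBLOCK: nothing to read right now */
	RX_ERROR        /* any other errno: treat the link as broken */
};

/* n is the value returned by recv(); err is errno captured right after. */
enum rx_result classify_recv(ssize_t n, int err)
{
	assert(n >= -1); /* recv() never returns less than -1 */

	if (n > 0) {
		return RX_DATA;
	}
	if (n == 0) {
		return RX_PEER_CLOSED;
	}
	if (err == EAGAIN || err == EWOULDBLOCK) {
		return RX_NO_DATA; /* expected when polling an idle socket */
	}
	return RX_ERROR;
}
```

    Under this reading, the repeated "Transport read error: -11" lines would fall into the RX_NO_DATA case, which is consistent with a polling loop on an idle connection rather than a crash.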

     

    The device does not seem to be faulting, but it does seem to call a receive function in a while-loop. I am not sure where this is called from, as the logs drop several lines in between that might show more information. However, it is clear that the problem is somehow related to the IP communication.

     

    The interesting thing is that both your logs end in a client_write, which is a send() operation. I assume it completely hangs here?

    If that is the case, could you try your application on the master branch, just to see if the same thing happens there? We believe that there's been a bugfix in bsdlib that might help this specific scenario, which hasn't been tagged out to a release yet.

     

    An alternative to trying on the master branch (in case of any conflicts in your application/tree) is to manually copy bsdlib v0.7.9 (currently in master: https://github.com/nrfconnect/sdk-nrfxlib/tree/master/bsdlib/lib/cortex-m33/hard-float) over the one you already have in path/to/ncs/nrfxlib/bsdlib/lib/cortex-m33/hard-float/, and then test it.

     

    Kind regards,

    Håkon
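
    If the hang really is a send() that never returns, one generic way to give it a way out is a send timeout on the socket. The sketch below shows that idea with standard POSIX calls on a host system. `set_send_timeout` and `send_all_bounded` are hypothetical helper names, and whether the nRF9160 socket layer in a given NCS release honours SO_SNDTIMEO is an assumption that this thread does not confirm and that would need checking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Give send() on fd an upper bound of timeout_ms. Returns 0 or -errno. */
int set_send_timeout(int fd, int timeout_ms)
{
	struct timeval tv;

	assert(timeout_ms > 0);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0) {
		return -errno;
	}
	return 0;
}

/* Send all len bytes or give up. Returns 0 on success or -errno. */
int send_all_bounded(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = send(fd, p, len, 0);

		if (n < 0) {
			if (errno == EINTR) {
				continue; /* interrupted by a signal: retry */
			}
			return -errno; /* -EAGAIN means the timeout expired */
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

    With a timeout in place, a stalled write would come back as an error that the caller can act on (for example by closing the socket and reconnecting) instead of blocking the main loop forever.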

  • Hi, 

    the problem comes from here 

    while (true) {
    	nrf_cloud_process();
    	send_aggregated_data();
    	k_sleep(K_MSEC(10));
    	k_cpu_idle();
    }
    More specifically, the problem is in
    nrf_cloud_process();
    This function is supposed to call
    mqtt_input(&nct.client);
    mqtt_live(&nct.client);

    However, with

    #define CONFIG_MQTT_KEEPALIVE 60
    and an nRF Cloud keep-alive time of 60 s, a slight delay in
    err_code = mqtt_ping(client);
    means the ping is sent on an already disconnected MQTT connection, and the program gets stuck there. Reducing
    CONFIG_MQTT_KEEPALIVE
    works around this problem, but it does not fix the underlying issue: if a ping is sent on a disconnected MQTT connection, the program gets stuck somewhere.

    Can you take a look at this function and see how we can modify it so that if a ping is sent to a disconnected MQTT, the program has a way out?

    Regards

    Marshed
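
    The keep-alive race described above can be sketched as a small client-side state machine: send the ping once the chosen interval has elapsed, and if no PINGRESP arrives within a response timeout, report that the link should be torn down and reconnected instead of waiting forever. This is a host-side sketch only. All names (`ka_state`, `ka_poll`, `ka_on_pingresp`, `ka_on_tx`) are hypothetical and are not the Zephyr MQTT or nRF Cloud API, and the timestamps are assumed to come from a monotonic millisecond clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What the caller should do after polling the keep-alive tracker. */
enum ka_action {
	KA_WAIT,      /* nothing to do yet */
	KA_SEND_PING, /* interval elapsed: send a PINGREQ now */
	KA_RECONNECT  /* PINGRESP overdue: assume the link is dead */
};

struct ka_state {
	int64_t last_tx_ms;    /* time the last packet was sent */
	int64_t ping_sent_ms;  /* time the outstanding PINGREQ was sent */
	bool ping_outstanding; /* true while waiting for a PINGRESP */
};

/* Decide the next keep-alive step at time now_ms. */
enum ka_action ka_poll(struct ka_state *s, int64_t now_ms,
		       int64_t keepalive_ms, int64_t resp_timeout_ms)
{
	assert(s != NULL && keepalive_ms > 0 && resp_timeout_ms > 0);

	if (s->ping_outstanding) {
		if (now_ms - s->ping_sent_ms >= resp_timeout_ms) {
			return KA_RECONNECT;
		}
		return KA_WAIT;
	}
	if (now_ms - s->last_tx_ms >= keepalive_ms) {
		s->ping_outstanding = true;
		s->ping_sent_ms = now_ms;
		s->last_tx_ms = now_ms;
		return KA_SEND_PING;
	}
	return KA_WAIT;
}

/* Call when a PINGRESP is received. */
void ka_on_pingresp(struct ka_state *s)
{
	s->ping_outstanding = false;
}

/* Call after any other packet is sent, since it also counts as activity. */
void ka_on_tx(struct ka_state *s, int64_t now_ms)
{
	s->last_tx_ms = now_ms;
}
```

    Passing a keepalive_ms somewhat below the value negotiated with the broker leaves a margin for the "slight delay" case, and KA_RECONNECT gives the main loop an explicit exit path when the broker has already dropped the connection.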

  • Dear Håkon, 

    bsdlib v0.7.9 does not solve this problem; instead it sends the program into a constant start-up loop. Can you please take a look at the answer I posted 3 weeks ago and find a way forward? Reducing the keep-alive time is not a permanent solution.

    Regards

    Marshed 

  • Hi Marshed,

     

    Marshed said:
    bsdlib v0.7.9 does not solve this problem but instead it send the program into a constant starting up loop.

    My apologies, I hadn't checked which ncs version you're running on. A straight swap isn't always supported.

    Did you rebase your application onto master, or do a straight copy/replace of the library? Which version of ncs are you currently running? If it's an older one, that points to an incompatibility with your current ncs version.

    What is your normal KEEPALIVE configured to?

     

    Kind regards,

    Håkon

  • Hi Håkon, 

    I am using v1.2.0. I did a straight copy/replace of the library.

    I have now configured my KEEPALIVE to 30 s, and it is working fine. However, if for any reason my MQTT connection gets disconnected before the 30 s are up and I send

    err_code = mqtt_ping(client);
    my program will hang. This is what I think needs to be solved.

    In other words, when you call

    mqtt_live(&nct.client);
    and you are already disconnected from MQTT, the program gets stuck.

    Shouldn't the program try to reconnect, or at least give an error message and come out of this loop?

    Regards 

    Marshed

  • Dear Håkon, 

    I have tested the system with a longer KEEPALIVE time of 360 seconds, and the ping does not return any error, although the device has been disconnected from the cloud by that time. I am still waiting for a solution to this, as my network keeps disconnecting with no flag raised and hence no way to reconnect.

    Please assist. 

    Regards

    Marshed 
