data_publish failure in simple_mqtt example based application

Hi,
I'm using an application based on the mqtt_simple example to send BLE tracking data over LTE on custom boards. UART is used for the connection between the MCUs. Power is always on, and the environment the devices are used in has zones with no LTE coverage. Most devices work fine: they reconnect and resume normal operation after errors and loss of LTE signal. But on several boards I get a strange error. Code execution sometimes does not return from the data_publish() function to main(), and transmission does not recover. Debugging showed that the function that actually does not return is mqtt_client_tcp_write_msg.
After this error the UART ISR handler keeps being called, but no LTE transmission happens because program execution does not return to main().
So two questions:
1) How can I work around the described "not returning to main()" error? We had some problems with HW (a soldering problem) and maybe these issues are connected, but anyway I hope some SW workaround is possible.
2) I'm thinking about improving the reconnection mechanism (in terms of power consumption): check for network registration and (maybe) check the signal quality (AT+CESQ) before trying client_init(), with a period on the order of seconds. Does it make sense? Or maybe there is a better way to do it?

Any help is highly appreciated.

Here is my main() code with some debugging garbage cleaned out:

void main(void)
{
    uart_init();

    uart_irq_callback_set(uart1, uart_cb);
    uart_irq_rx_enable(uart1);

    int err;

    printk("The MQTT simple sample started\n");

    modem_configure();

    while (1) {
        printk("client_init\n");

        client_init(&client);

        printk("mqtt_connect\n");
        err = mqtt_connect(&client);
        if (err != 0) {
            printk("ERROR: mqtt_connect %d\n", err);
            continue;
        }

        err = fds_init(&client);
        if (err != 0) {
            printk("ERROR: fds_init %d\n", err);
            err = mqtt_disconnect(&client);
            if (err) {
                printk("Could not disconnect MQTT client. Error: %d\n", err);
            }
            cntr_reconnect++;
            continue;
        }

        while (1) {
            err = poll(&fds, 1, 1);
            if (err < 0) {
                printk("ERROR: poll %d\n", errno);
                f_break_reason = 1;
                break;
            }

            if (mqtt_keepalive_time_left(&client) < 10) {
                err = mqtt_live(&client);
                if ((err != 0) && (err != -EAGAIN)) {
                    printk("ERROR: mqtt_live %d\n", err);
                    f_break_reason = 2;
                    break;
                }
            }

            if ((fds.revents & POLLIN) == POLLIN) {
                err = mqtt_input(&client);
                if (err != 0) {
                    printk("ERROR: mqtt_input %d\n", err);
                    f_break_reason = 3;
                    break;
                }
            }

            if (fPostDataReady && connected) {
                formPostData(&track_data, &post_data);
                packID++;
                /* Alternatives: MQTT_QOS_1_AT_LEAST_ONCE, MQTT_QOS_2_EXACTLY_ONCE */
                err = data_publish(&client, MQTT_QOS_0_AT_MOST_ONCE,
                                   post_data, strlen(post_data));
                if (err != 0) {
                    printk("ERROR: data_publish %d\n", err);
                    f_break_reason = 4;
                    break;
                }
                fPostDataReady = 0;
            }

            if ((fds.revents & POLLERR) == POLLERR) {
                printk("POLLERR\n");
                f_break_reason = 5;
                break;
            }

            if ((fds.revents & POLLNVAL) == POLLNVAL) {
                printk("POLLNVAL\n");
                f_break_reason = 6;
                break;
            }

            if ((fds.revents & POLLHUP) == POLLHUP) {
                printk("Socket error: POLLHUP\n");
                f_break_reason = 7;
                break;
            }
        }

        printk("Disconnecting MQTT client...\n");
        err = mqtt_disconnect(&client);
        if (err) {
            printk("Could not disconnect MQTT client. Error: %d\n", err);
        }
        cntr_reconnect++;
    }
}

  • Hi, and sorry for the late reply.

     

    1) How can I work around the described "not returning to main()" error? We had some problems with HW (a soldering problem) and maybe these issues are connected, but anyway I hope some SW workaround is possible.

     Could you take a modem trace, so we can see what happens on the modem side?

    What NCS and modem versions are you using?

    To work around it, you could try using non-blocking sockets or a send timeout (requires NCS v1.4.0). However, there may be a problem in the modem that requires more advanced handling than just sending in a non-blocking fashion and hoping for the best, which is why I requested a modem trace.
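
As a rough illustration of the send-timeout option, the sketch below uses the standard BSD socket option SO_SNDTIMEO so that a blocking send() returns an error instead of waiting indefinitely. The helper names set_send_timeout_ms() and send_all() are hypothetical, and the example is written against plain POSIX sockets. In the mqtt_simple application the descriptor would presumably be client.transport.tcp.sock after mqtt_connect() succeeds (the same one fds_init() polls); that is an assumption about the setup, not something tested on the nRF9160.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Limit how long a blocking send() on this socket may wait.
 * Returns 0 on success or a negative errno value on failure.
 */
static int set_send_timeout_ms(int sock, int timeout_ms)
{
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };

    if (setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0) {
        return -errno;
    }
    return 0;
}

/* Send the whole buffer. Returns 0 on success or a negative errno value;
 * once the send timeout expires the error is -EAGAIN (or -EWOULDBLOCK),
 * so the caller gets control back and can tear down the connection.
 */
static int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);

        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, retry */
            }
            return -errno;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With a timeout configured, a stalled publish should surface as an error return from data_publish(), so the existing break, mqtt_disconnect() and reconnect path in main() gets a chance to run.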

     

    2) I'm thinking about improving the reconnection mechanism (in terms of power consumption): check for network registration and (maybe) check the signal quality (AT+CESQ) before trying client_init(), with a period on the order of seconds. Does it make sense? Or maybe there is a better way to do it?

    Checking that you are registered to the network might be a good idea. One way to do that is to register an event handler with the lte_lc (link control) library. If you then also use the lte_lc library to connect to the network, you should be notified if you lose the connection, and when it is re-established.
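
If polling is preferred over events, the registration check from the question could also be sketched in plain C by parsing the response to the AT+CEREG? read command. This is only an illustration: cereg_is_registered() is a hypothetical helper, how the response string is obtained from the modem is left out, and the stat values 1 (registered, home network) and 5 (registered, roaming) are the ones defined in 3GPP TS 27.007.

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true when a "+CEREG: <n>,<stat>[,...]" read response reports that
 * the modem is registered (stat 1 = home network, stat 5 = roaming).
 * Unsolicited "+CEREG: <stat>" notifications use a different layout and are
 * not handled here.
 */
static bool cereg_is_registered(const char *response)
{
    int n;
    int stat;

    if (response == NULL) {
        return false;
    }
    /* The leading space in the format skips any "\r\n" before the prefix. */
    if (sscanf(response, " +CEREG: %d,%d", &n, &stat) != 2) {
        return false;
    }
    return stat == 1 || stat == 5;
}
```

The outer loop in main() could then skip client_init() and mqtt_connect() while this returns false, instead of retrying immediately.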

    The signal quality is not as useful. If you are not connected to the network, you will not get any information about the signal quality, and if you are connected, you should be able to send data. It can still help you judge how much energy sending data will cost, but I am not sure how easy it is to implement that logic properly based on signal quality alone.
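
For reference, a minimal sketch of reading the RSRP field out of an AT+CESQ response, assuming the standard 3GPP TS 27.007 layout "+CESQ: <rxlev>,<ber>,<rscp>,<ecno>,<rsrq>,<rsrp>", where 255 means "not known or not detectable". The helper name cesq_rsrp_index() is hypothetical, and the value returned is the raw index (a higher index means a stronger signal), not a dBm figure.

```c
#include <stdbool.h>
#include <stdio.h>

#define CESQ_RSRP_UNKNOWN 255 /* "not known or not detectable" */

/* Extracts the raw RSRP index from a "+CESQ: ..." response.
 * Returns false if the line cannot be parsed or the modem reports no
 * measurement, which is what to expect when there is no network connection.
 */
static bool cesq_rsrp_index(const char *response, int *rsrp_idx)
{
    int rxlev, ber, rscp, ecno, rsrq, rsrp;

    if (response == NULL || rsrp_idx == NULL) {
        return false;
    }
    if (sscanf(response, " +CESQ: %d,%d,%d,%d,%d,%d",
               &rxlev, &ber, &rscp, &ecno, &rsrq, &rsrp) != 6) {
        return false;
    }
    if (rsrp == CESQ_RSRP_UNKNOWN) {
        return false;
    }
    *rsrp_idx = rsrp;
    return true;
}
```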

    Best regards,

    Didrik
