Sporadic Reboots During Idle Mode on nRF9160

I am using a Mikroe development board with the nRF9160 to establish an LTE connection and send MQTT messages to a broker. The application successfully connects to the network, establishes the MQTT connection, and publishes a message every 30 seconds (CONFIG_MQTT_PUBLISH_PERIOD_S=30, as the log timestamps below show). Between messages, the RRC (Radio Resource Control) mode drops to Idle.

However, I am experiencing sporadic reboots when the modem is in Idle mode. These reboots happen irregularly, sometimes after a few minutes or even hours. Here’s a sample log:

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
[00:00:00.354,156] <inf> lte: lteInit ..
[00:00:03.477,233] <inf> lte: RRC mode: Connected
[00:00:05.015,319] <inf> lte: Network registration status: Connected - home network
[00:00:05.015,441] <inf> lte: Connected to LTE network
[00:00:05.115,509] <inf> mqtt: mqttInit ..
[00:00:05.385,498] <inf> mqtt: IPv4 Address found 91.121.93.94
[00:00:05.385,925] <inf> mqtt: client_id = nrf-358******
[00:00:05.385,986] <inf> mqtt: Connection to broker using mqtt_connect
[00:00:05.971,221] <inf> mqtt: MQTT client connected
[00:00:05.971,252] <inf> mqtt: Subscribing on "stlab/down/cmd"
[00:00:06.271,270] <inf> mqtt: SUBACK packet id: 1234
[00:00:18.187,927] <inf> lte: RRC mode: Idle
[00:00:30.254,638] <inf> mqtt: Publishing "on" on "stlab/up/status"
...
[00:35:00.510,650] <inf> mqtt: Publishing "on" on "stlab/up/status"
[00:35:00.603,118] <inf> lte: RRC mode: Connected
[00:35:12.477,630] <inf> lte: RRC mode: Idle
[00:35:30.520,050] <inf> mqtt: Publishing "on" on "stlab/up/status"
[00:35:30.614,532] <inf> lte: RRC mode: Connected
[00:35:42.397,613] <inf> lte: RRC mode: Idle
*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
[00:00:00.354,156] <inf> lte: lteInit ..
[00:00:03.477,233] <inf> lte: RRC mode: Connected
[00:00:05.015,319] <inf> lte: Network registration status: Connected - home network
[00:00:05.015,441] <inf> lte: Connected to LTE network
[00:00:05.115,509] <inf> mqtt: mqttInit ..
[00:00:05.385,498] <inf> mqtt: IPv4 Address found 91.121.93.94
...


As you can see, the application reboots some time after the modem has entered Idle mode. This occurs intermittently, and I haven’t identified a clear pattern.

My questions are:

  1. What could be causing these reboots, particularly when the modem is in Idle mode?
  2. Is it normal for the application to restart if there's an issue with the modem?
  3. How can I debug this issue and pinpoint the cause of the restarts? Are there specific steps or diagnostics I should be focusing on?

Any insights or advice on how to address this would be greatly appreciated!

My prj.conf is:

#
# Copyright (c) 2020 Nordic Semiconductor ASA
#
# SPDX-License-Identifier: LicenseRef-Nordic-5-Clause
#

# Logging
CONFIG_LOG=y

# Button and LED support
CONFIG_DK_LIBRARY=y

# Newlib
CONFIG_NEWLIB_LIBC=y

# Networking
CONFIG_NETWORKING=y
CONFIG_NET_NATIVE=n
CONFIG_NET_SOCKETS_OFFLOAD=y
CONFIG_NET_SOCKETS=y
CONFIG_NET_SOCKETS_POSIX_NAMES=y

# Memory
CONFIG_MAIN_STACK_SIZE=4096
CONFIG_HEAP_MEM_POOL_SIZE=4096

# Modem library
#WWCONFIG_SOC_SERIES_NRF91X=y
CONFIG_TRUSTED_EXECUTION_NONSECURE=y
CONFIG_NRF_MODEM_LIB=y

# LTE link control
CONFIG_LTE_LINK_CONTROL=y
CONFIG_LTE_NETWORK_MODE_LTE_M_NBIOT=y

# MQTT
# STEP 2.1 - Enable and configure the MQTT library
CONFIG_MQTT_LIB=y
CONFIG_MQTT_CLEAN_SESSION=y

# Application
# STEP 2.2 - Configure the broker name, TCP port, topic names, and message
CONFIG_MQTT_PUB_TOPIC="stlab/up/status"
CONFIG_MQTT_SUB_TOPIC="stlab/down/cmd"
CONFIG_BUTTON_EVENT_PUBLISH_MSG="Hi from the nRF91 Series device"
CONFIG_MQTT_BROKER_HOSTNAME="test.mosquitto.org"
CONFIG_MQTT_BROKER_PORT=1883
CONFIG_MQTT_PUBLISH_PERIOD_S=30


My lte source is:
#include "lte.h"
#include <stdio.h>
#include <ncs_version.h>
#include <zephyr/kernel.h>
#include <zephyr/net/socket.h>
#include <zephyr/logging/log.h>
#include <modem/nrf_modem_lib.h>
#include <modem/lte_lc.h>

/* Semaphore for LTE connection */
static K_SEM_DEFINE(lte_connected, 0, 1);

LOG_MODULE_REGISTER(lte, LOG_LEVEL_INF);

/**
 * @brief LTE event handler
 *
 * This function is called by the modem on LTE events.
 *
 * @param evt LTE event
 */
static void lte_handler(const struct lte_lc_evt *const evt) {
	switch (evt->type) {
	case LTE_LC_EVT_NW_REG_STATUS:
		if ((evt->nw_reg_status != LTE_LC_NW_REG_REGISTERED_HOME) &&
		    (evt->nw_reg_status != LTE_LC_NW_REG_REGISTERED_ROAMING)) {
			break;
		}
		LOG_INF("Network registration status: %s",
				evt->nw_reg_status == LTE_LC_NW_REG_REGISTERED_HOME ?
				"Connected - home network" : "Connected - roaming");
		k_sem_give(&lte_connected);
		break;
	case LTE_LC_EVT_RRC_UPDATE:
		LOG_INF("RRC mode: %s", evt->rrc_mode == LTE_LC_RRC_MODE_CONNECTED ?
				"Connected" : "Idle");
		break;
	default:
		break;
	}
}

/**
 * @brief Configure the modem and connect to the LTE network.
 *
 * This function initializes the modem library (and, before NCS v2.6.0, the LTE
 * link control library), then connects to the LTE network using
 * lte_lc_connect_async() and blocks on a semaphore until registration completes.
 *
 * @note The function returns nothing; on error it logs the failure and returns early.
 */
void lteInit(void) {
    LOG_INF("lteInit ..");

	int err;

	err = nrf_modem_lib_init();
	if (err) {
		LOG_ERR("Failed to initialize the modem library, error: %d", err);
		return;
	}

	/* lte_lc_init deprecated in >= v2.6.0 */
	#if NCS_VERSION_NUMBER < 0x20600
	err = lte_lc_init();
	if (err) {
		LOG_ERR("Failed to initialize LTE link control library, error: %d", err);
		return;
	}
	#endif

	err = lte_lc_connect_async(lte_handler);
	if (err) {
		LOG_ERR("Error in lte_lc_connect_async, error: %d", err);
		return;
	}

	k_sem_take(&lte_connected, K_FOREVER);
	LOG_INF("Connected to LTE network");

	return;
}


and my mqtt source is:
#include "mqtt.h"

/* Buffers for MQTT client. */
static uint8_t rx_buffer[CONFIG_MQTT_MESSAGE_BUFFER_SIZE];
static uint8_t tx_buffer[CONFIG_MQTT_MESSAGE_BUFFER_SIZE];
static uint8_t payload_buf[CONFIG_MQTT_PAYLOAD_BUFFER_SIZE];

/* MQTT client instance. */
static struct mqtt_client client;

/* MQTT Broker details. */
static struct sockaddr_storage broker;

/* File descriptor structure used by poll. */
static struct pollfd fds;

/* Stack size and priority for the MQTT connection thread */
#define MQTT_CONNECTION_THREAD_STACK_SIZE 2048
#define MQTT_CONNECTION_THREAD_PRIORITY 6

K_THREAD_STACK_DEFINE(mqttConnection_Stack, MQTT_CONNECTION_THREAD_STACK_SIZE);
struct k_thread mqttConnection_Thread;

/* Stack size and priority for the MQTT publisher thread */
#define MQTT_PUBLISHER_THREAD_STACK_SIZE 1024
#define MQTT_PUBLISHER_THREAD_PRIORITY 7

/* Thread definition */
static void mqttPublishThread(void);

/* Create the thread */
K_THREAD_DEFINE(mqttPublish_Thread, MQTT_PUBLISHER_THREAD_STACK_SIZE,
                mqttPublishThread, NULL, NULL, NULL,
                MQTT_PUBLISHER_THREAD_PRIORITY, 0, 0);

LOG_MODULE_REGISTER(mqtt, LOG_LEVEL_INF);

static bool mqtt_connected = false;
K_MUTEX_DEFINE(mqtt_mutex);

/**
 * @brief Reads the received payload from the MQTT server.
 *
 * @param c      MQTT client instance.
 * @param length Length of the payload to read.
 *
 * @return 0 on success, negative errno on error.
 *
 * @details If the payload is larger than the payload buffer, the excess is
 *          read and discarded in chunks so that new messages can be received,
 *          and only the part that fits is kept in the buffer. The function
 *          returns -EMSGSIZE if the payload is larger than the buffer, or
 *          -EIO if there is an error reading the payload.
 */
static int mqttGetReceivedPayload(struct mqtt_client *c, size_t length) {
	int ret;
	int err = 0;

	/* Return an error if the payload is larger than the payload buffer.
	 * Note: To allow new messages, we have to read the payload before returning.
	 */
	if (length > sizeof(payload_buf)) {
		err = -EMSGSIZE;
	}

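	/* Drain the excess until the remainder fits in the payload buffer.
	 * Never request more than the buffer can hold in a single read,
	 * otherwise a payload larger than twice the buffer would overflow it.
	 */
	while (length > sizeof(payload_buf)) {
		ret = mqtt_read_publish_payload_blocking(
				c, payload_buf,
				MIN(length - sizeof(payload_buf), sizeof(payload_buf)));
		if (ret == 0) {
			return -EIO;
		} else if (ret < 0) {
			return ret;
		}

		length -= ret;
	}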

	ret = mqtt_readall_publish_payload(c, payload_buf, length);
	if (ret) {
		return ret;
	}

	return err;
}

/**
 * @brief Subscribe to a topic.
 *
 * @param c The MQTT client instance.
 *
 * @return 0 on success, negative error code on failure.
 *
 * @details This function subscribes to the topic specified in the
 *          CONFIG_MQTT_SUB_TOPIC setting. 
 *
 *          The function will return 0 on success, or a negative error
 *          code on failure.
 */
static int mqttSubscribe(struct mqtt_client *const c) {
	struct mqtt_topic subscribe_topic = {
		.topic = {
			.utf8 = CONFIG_MQTT_SUB_TOPIC,
			.size = strlen(CONFIG_MQTT_SUB_TOPIC)
		},
		.qos = MQTT_QOS_1_AT_LEAST_ONCE
	};

	const struct mqtt_subscription_list subscription_list = {
		.list = &subscribe_topic,
		.list_count = 1,
		.message_id = 1234
	};

	LOG_INF("Subscribing on \"%s\"", CONFIG_MQTT_SUB_TOPIC);

	return mqtt_subscribe(c, &subscription_list);
}

/**
 * @brief Print a buffer to the log as a string.
 *
 * @param prefix A string to print before the buffer.
 * @param data   The buffer to print.
 * @param len    The length of the buffer.
 * @param topic  The topic name to print after the buffer.
 *
 * @details The buffer is copied into a null-terminated stack array of
 *          len + 1 bytes and the resulting string is printed using LOG_INF.
 */
static void mqttDataPrint(uint8_t *prefix, uint8_t *data, size_t len, uint8_t* topic) {
	char buf[len + 1];

	memcpy(buf, data, len);
	buf[len] = 0;
	LOG_INF("%s\"%s\" on \"%s\"", (char *)prefix, (char *)buf, topic);
}

/**
 * @brief Publish a message to an MQTT topic.
 *
 * @param c      MQTT client instance.
 * @param qos    QOS level of the message.
 * @param data   Buffer containing the payload.
 * @param len    Length of the payload buffer.
 *
 * @return 0 on success, negative error code on failure.
 *
 * @details Publish a message to the topic specified in the
 *          CONFIG_MQTT_PUB_TOPIC setting. The QOS level of the message
 *          is set to @p qos. The payload is taken from the buffer
 *          @p data, with length @p len.
 *
 *          A random message ID is generated for each publish.
 */
int mqttDataPublish(struct mqtt_client *c, enum mqtt_qos qos,
	uint8_t *data, size_t len) {
	struct mqtt_publish_param param;

	param.message.topic.qos = qos;
	param.message.topic.topic.utf8 = CONFIG_MQTT_PUB_TOPIC;
	param.message.topic.topic.size = strlen(CONFIG_MQTT_PUB_TOPIC);
	param.message.payload.data = data;
	param.message.payload.len = len;
	param.message_id = sys_rand32_get();
	param.dup_flag = 0;
	param.retain_flag = 0;

	mqttDataPrint("Publishing ", data, len, CONFIG_MQTT_PUB_TOPIC);

	return mqtt_publish(c, &param);
}

/**
 * @brief MQTT event handler.
 *
 * This function is called by the MQTT client API when an event occurs.
 *
 * @param c MQTT client instance.
 * @param evt MQTT event structure.
 */
void mqttEvtHandler(struct mqtt_client *const c, const struct mqtt_evt *evt) {
	int err;

	switch (evt->type) {
		case MQTT_EVT_CONNACK:
			if (evt->result != 0) {
				LOG_ERR("MQTT connect failed: %d", evt->result);
				break;
			}

			LOG_INF("MQTT client connected");

			k_mutex_lock(&mqtt_mutex, K_FOREVER);
			mqtt_connected = true;
			k_mutex_unlock(&mqtt_mutex);

			mqttSubscribe(c);

			break;

		case MQTT_EVT_DISCONNECT:
			LOG_INF("MQTT client disconnected: %d", evt->result);

			k_mutex_lock(&mqtt_mutex, K_FOREVER);
			mqtt_connected = false;
			k_mutex_unlock(&mqtt_mutex);

			break;

		case MQTT_EVT_PUBLISH: {
			const struct mqtt_publish_param *p = &evt->param.publish;
			LOG_INF("MQTT PUBLISH result=%d",
				evt->result);

			err = mqttGetReceivedPayload(c, p->message.payload.len);
			
			//Send acknowledgment to the broker on receiving QoS1 publish message 
			if (p->message.topic.qos == MQTT_QOS_1_AT_LEAST_ONCE) {
				const struct mqtt_puback_param ack = {
					.message_id = p->message_id
				};
				/* Send acknowledgment. */
				mqtt_publish_qos1_ack(c, &ack);
			}

			if (err >= 0) {
				mqttDataPrint("Received: ", payload_buf, p->message.payload.len, (uint8_t *)p->message.topic.topic.utf8);
			// Payload buffer is smaller than the received data 
			} else if (err == -EMSGSIZE) {
				LOG_ERR("Received payload (%u bytes) is larger than the payload buffer size (%zu bytes).",
					p->message.payload.len, sizeof(payload_buf));
			// Failed to extract data, disconnect 
			} else {
				LOG_ERR("get_received_payload failed: %d", err);
				LOG_INF("Disconnecting MQTT client...");

				err = mqtt_disconnect(c);
				if (err) {
					LOG_ERR("Could not disconnect: %d", err);
				}
			}
		} break;

		case MQTT_EVT_PUBACK:
			if (evt->result != 0) {
				LOG_ERR("MQTT PUBACK error: %d", evt->result);
				break;
			}
			//LOG_INF("PUBACK packet id: %u", evt->param.puback.message_id);
			break;

		case MQTT_EVT_SUBACK:
			if (evt->result != 0) {
				LOG_ERR("MQTT SUBACK error: %d", evt->result);
				break;
			}

			LOG_INF("SUBACK packet id: %u", evt->param.suback.message_id);
			break;

		case MQTT_EVT_PINGRESP:
			if (evt->result != 0) {
				LOG_ERR("MQTT PINGRESP error: %d", evt->result);
			}
			break;

		default:
			LOG_INF("Unhandled MQTT event type: %d", evt->type);
			break;
	}
}

/**
 * @brief Initialize the MQTT broker address.
 *
 * @details This function resolves the hostname of the MQTT broker using
 *          getaddrinfo() and sets the address of the broker in the global
 *          'broker' variable.
 *
 * @return 0 on success, or a negative error code on failure.
 */
static int mqtt_broker_init(void) {
	int err;
	struct addrinfo *result;
	struct addrinfo *addr;
	struct addrinfo hints = {
		.ai_family = AF_INET,
		.ai_socktype = SOCK_STREAM
	};

	err = getaddrinfo(CONFIG_MQTT_BROKER_HOSTNAME, NULL, &hints, &result);
	if (err) {
		LOG_ERR("getaddrinfo failed: %d", err);
		return -ECHILD;
	}

	addr = result;

	/* Look for address of the broker. */
	while (addr != NULL) {
		/* IPv4 Address. */
		if (addr->ai_addrlen == sizeof(struct sockaddr_in)) {
			struct sockaddr_in *broker4 =
				((struct sockaddr_in *)&broker);
			char ipv4_addr[NET_IPV4_ADDR_LEN];

			broker4->sin_addr.s_addr =
				((struct sockaddr_in *)addr->ai_addr)
				->sin_addr.s_addr;
			broker4->sin_family = AF_INET;
			broker4->sin_port = htons(CONFIG_MQTT_BROKER_PORT);

			inet_ntop(AF_INET, &broker4->sin_addr.s_addr,
				  ipv4_addr, sizeof(ipv4_addr));
			LOG_INF("IPv4 Address found %s", (char *)(ipv4_addr));

			break;
		} else {
			LOG_ERR("ai_addrlen = %u should be %u or %u",
				(unsigned int)addr->ai_addrlen,
				(unsigned int)sizeof(struct sockaddr_in),
				(unsigned int)sizeof(struct sockaddr_in6));
		}

		addr = addr->ai_next;
	}

	/* Free the address. */
	freeaddrinfo(result);

	return err;
}

/**
 * @brief Get the client id to use for the MQTT connection.
 *
 * @details If CONFIG_MQTT_CLIENT_ID is set, that value is used.
 *          Otherwise, the function attempts to obtain the IMEI of the device
 *          using the AT+CGSN command and generates a client id string
 *          of the form "nrf-<imei>".
 *
 * @return The client id to use for the MQTT connection.
 */
static const uint8_t* mqttClientIdGet(void) {
	static uint8_t client_id[MAX(sizeof(CONFIG_MQTT_CLIENT_ID),
				     CLIENT_ID_LEN)];

	if (strlen(CONFIG_MQTT_CLIENT_ID) > 0) {
		snprintf(client_id, sizeof(client_id), "%s",
			 CONFIG_MQTT_CLIENT_ID);
		goto exit;
	}

	char imei_buf[CGSN_RESPONSE_LENGTH + 1];
	int err;

	err = nrf_modem_at_cmd(imei_buf, sizeof(imei_buf), "AT+CGSN");
	if (err) {
		LOG_ERR("Failed to obtain IMEI, error: %d", err);
		goto exit;
	}

	imei_buf[IMEI_LEN] = '\0';

	snprintf(client_id, sizeof(client_id), "nrf-%.*s", IMEI_LEN, imei_buf);
	LOG_INF("client_id = %s", (char *)(client_id));

exit:
	LOG_DBG("client_id = %s", (char *)(client_id));

	return client_id;
}

/**
 * @brief Initialize the MQTT client
 *
 * This function initializes the MQTT client instance. It resolves the configured
 * hostname and initializes the MQTT broker structure. It configures the MQTT
 * client with the broker details and the event handler. It also configures the
 * MQTT buffers and the transport type.
 *
 * @param client MQTT client instance to be initialized
 *
 * @return 0 on success, negative error code on failure
 */
int mqttClientInit(struct mqtt_client *client) {
	int err;
	/* Initializes the client instance. */
	mqtt_client_init(client);

	/* Resolves the configured hostname and initializes the MQTT broker structure */
	err = mqtt_broker_init();
	if (err) {
		LOG_ERR("Failed to initialize broker connection");
		return err;
	}

	/* MQTT client configuration */
	client->broker = &broker;
	client->evt_cb = mqttEvtHandler;
	client->client_id.utf8 = mqttClientIdGet();
	client->client_id.size = strlen(client->client_id.utf8);
	client->password = NULL;
	client->user_name = NULL;
	client->protocol_version = MQTT_VERSION_3_1_1;

	/* MQTT buffers configuration */
	client->rx_buf = rx_buffer;
	client->rx_buf_size = sizeof(rx_buffer);
	client->tx_buf = tx_buffer;
	client->tx_buf_size = sizeof(tx_buffer);

	/* We are not using TLS in Exercise 1 */
	client->transport.type = MQTT_TRANSPORT_NON_SECURE;

	return err;
}


/**
 * @brief Initialize pollfd for MQTT client
 *
 * @param c MQTT client instance
 * @param fds pollfd structure to be initialized
 *
 * @return 0 on success, -ENOTSUP if MQTT client is configured to use TLS
 *
 * This function initializes the pollfd structure with the file descriptor
 * of the MQTT client's TCP socket. The events are set to POLLIN.
 */
int mqttFdsInit(struct mqtt_client *c, struct pollfd *fds) {
	if (c->transport.type == MQTT_TRANSPORT_NON_SECURE) {
		fds->fd = c->transport.tcp.sock;
	} else {
		return -ENOTSUP;
	}

	fds->events = POLLIN;

	return 0;
}

/**
 * @brief Initialize the MQTT client and start the connection thread.
 *
 * This function initializes the MQTT client and spawns mqttConnectionThread,
 * which connects to the broker, waits for incoming data, and sends keepalive
 * messages. If the connection is lost, the thread tries to reconnect.
 *
 * @note If initialization fails, the error is logged and no thread is started.
 */
void mqttInit(void) {
	LOG_INF("mqttInit ..");
	int err;

	err = mqttClientInit(&client);
	if (err) {
		LOG_ERR("Failed to initialize MQTT client: %d", err);
		return;
	}

    k_thread_create(&mqttConnection_Thread, mqttConnection_Stack, MQTT_CONNECTION_THREAD_STACK_SIZE,
                    mqttConnectionThread, NULL, NULL, NULL,
                    MQTT_CONNECTION_THREAD_PRIORITY, 0, K_NO_WAIT);
}

void mqttConnectionThread(void *p1, void *p2, void *p3) {

	int err;
	uint16_t connect_attempt = 0;

	while (1) {
		do_connect:
			if (connect_attempt++ > 0) {
				LOG_INF("Reconnecting in %d seconds...",
					CONFIG_MQTT_RECONNECT_DELAY_S);
				k_sleep(K_SECONDS(CONFIG_MQTT_RECONNECT_DELAY_S));
			}

			LOG_INF("Connection to broker using mqtt_connect");
			err = mqtt_connect(&client);
			if (err) {
				LOG_ERR("Error in mqtt_connect: %d", err);
				goto do_connect;
			}

			err = mqttFdsInit(&client,&fds);
			if (err) {
				LOG_ERR("Error in mqttFdsInit: %d", err);
				return;
			}

			while (1) {
				err = poll(&fds, 1, mqtt_keepalive_time_left(&client));
				if (err < 0) {
					LOG_ERR("Error in poll(): %d", errno);
					break;
				}

				err = mqtt_live(&client);
				if ((err != 0) && (err != -EAGAIN)) {
					LOG_ERR("Error in mqtt_live: %d", err);
					break;
				}

				if ((fds.revents & POLLIN) == POLLIN) {
					err = mqtt_input(&client);
					if (err != 0) {
						LOG_ERR("Error in mqtt_input: %d", err);
						break;
					}
				}

				if ((fds.revents & POLLERR) == POLLERR) {
					LOG_ERR("POLLERR");
					break;
				}

				if ((fds.revents & POLLNVAL) == POLLNVAL) {
					LOG_ERR("POLLNVAL");
					break;
				}
			}

			LOG_INF("Disconnecting MQTT client");

			err = mqtt_disconnect(&client);
			if (err) {
				LOG_ERR("Could not disconnect MQTT client: %d", err);
			}
		k_sleep(K_MSEC(50));
		goto do_connect;
	}
}

/**
 * @brief Thread to publish a message periodically to the configured topic.
 *
 * @details This function is run in a separate thread and will publish a message
 *          to the configured topic every n seconds, if the MQTT client is
 *          connected.
 */
static void mqttPublishThread(void) {
    while (1) {
        k_mutex_lock(&mqtt_mutex, K_FOREVER);
        if (mqtt_connected) {
            char status[] = "on";
            int err = mqttDataPublish(&client, MQTT_QOS_1_AT_LEAST_ONCE,
                                      status, sizeof(status)-1);
            if (err) {
                LOG_ERR("Failed to publish message: %d", err);
            }
        }
        k_mutex_unlock(&mqtt_mutex);

        k_sleep(K_SECONDS(CONFIG_MQTT_PUBLISH_PERIOD_S));
    }
}

  • In my experience, such reboots are caused by either a program fault, a watchdog reset, or a "power on" reset when the power supply/battery doesn't work properly.

    > 3. How can I debug this issue and pinpoint the cause of the restarts? Are there specific steps or diagnostics I should be focusing on?

    The reset cause may help to narrow the scope.

    prj.conf:

    CONFIG_HWINFO=y

    src:
    hwinfo_get_reset_cause

    > 1. What could be causing these reboots, particularly when the modem is in Idle mode?

    Once you know the reset cause, the answer may be easier. For example, if it's the watchdog and you use a 60 s watchdog interval, you may fix it by extending the interval a little, say to 70 s. If it's a power-on reset, then charge the battery.

    > 2.  Is it normal for the application to restart if there's an issue with the modem?

    No. Therefore I guess there is an issue with your application.
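
To make the reset-cause check above easier to read in a log, the bitmask can be decoded into names. What follows is only a minimal sketch, not code from the SDK: it builds off-target, so the RESET_* bits are mirrored by hand from <zephyr/drivers/hwinfo.h> (RESET_POR = BIT(3) and the other positions match what is posted later in this thread), and reset_cause_to_string() is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit positions mirrored from Zephyr's <zephyr/drivers/hwinfo.h> so that this
 * sketch builds off-target. In firmware, include that header instead.
 */
#define RESET_PIN            (1u << 0)
#define RESET_SOFTWARE       (1u << 1)
#define RESET_POR            (1u << 3)
#define RESET_WATCHDOG       (1u << 4)
#define RESET_DEBUG          (1u << 5)
#define RESET_LOW_POWER_WAKE (1u << 7)
#define RESET_CPU_LOCKUP     (1u << 8)

struct cause_name {
	uint32_t bit;
	const char *name;
};

static const struct cause_name cause_names[] = {
	{ RESET_PIN, "PIN" },
	{ RESET_SOFTWARE, "SOFTWARE" },
	{ RESET_POR, "POR" },
	{ RESET_WATCHDOG, "WATCHDOG" },
	{ RESET_DEBUG, "DEBUG" },
	{ RESET_LOW_POWER_WAKE, "LOW_POWER_WAKE" },
	{ RESET_CPU_LOCKUP, "CPU_LOCKUP" },
};

/* Hypothetical helper: writes the names of all known set bits into out,
 * separated by '|', and returns how many names were written.
 * A cause of 0 yields an empty string and a return value of 0.
 */
int reset_cause_to_string(uint32_t cause, char *out, size_t out_len)
{
	int found = 0;
	size_t used = 0;

	if (out_len == 0) {
		return 0;
	}
	out[0] = '\0';

	for (size_t i = 0; i < sizeof(cause_names) / sizeof(cause_names[0]); i++) {
		if (!(cause & cause_names[i].bit)) {
			continue;
		}
		int n = snprintf(out + used, out_len - used, "%s%s",
				 found ? "|" : "", cause_names[i].name);
		if (n < 0 || (size_t)n >= out_len - used) {
			break; /* output buffer too small; stop appending */
		}
		used += (size_t)n;
		found++;
	}
	return found;
}
```

In an application, the value passed in would come from hwinfo_get_reset_cause(), and the resulting string could be printed once with LOG_INF right at the start of main().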

  • I am now using a 5V, 1A power supply with a step-down converter to achieve a 3.3V output.

    I briefly adapted your code into my main source, and here is the output I got:


    #include <stdio.h>
    #include <string.h>                 /* memmove */
    #include <ncs_version.h>
    #include <zephyr/kernel.h>
    #include <zephyr/logging/log.h>

    #include <zephyr/drivers/hwinfo.h>
    #include <zephyr/drivers/watchdog.h>
    #include <zephyr/logging/log_ctrl.h>
    #include <zephyr/sys/__assert.h>
    #include <zephyr/sys/byteorder.h>   /* sys_get_be16, sys_get_be48 */
    #include <zephyr/sys/reboot.h>

    #include "lte.h"
    #include "mqtt.h"
    
    LOG_MODULE_REGISTER(main, LOG_LEVEL_INF);
    
    static atomic_t shutdown_delay = ATOMIC_INIT(-1);
    static atomic_t reboot_cause = ATOMIC_INIT(-1);
    
    
    static atomic_t read_reset_cause = ATOMIC_INIT(0);
    
    static const struct device *const wdt = DEVICE_DT_GET_OR_NULL(DT_ALIAS(watchdog0));
    static int wdt_channel_id = -1;
    
    static volatile uint32_t reset_cause = 0;
    static volatile int32_t reset_error = 0;
    
    #define ERROR_CODE_INIT_NO_LTE 0x0000
    #define ERROR_CODE_INIT_NO_DTLS 0x1000
    #define ERROR_CODE_INIT_NO_SUCCESS 0x2000
    #define ERROR_CODE_OPEN_SOCKET 0x3000
    #define ERROR_CODE_TOO_MANY_FAILURES 0x4000
    #define ERROR_CODE_MODEM_FAULT 0x5000
    #define ERROR_CODE_REBOOT_CMD 0x6000
    #define ERROR_CODE_REBOOT_MANUAL 0x7000
    #define ERROR_CODE_UPDATE 0x8000
    #define ERROR_CODE_LOW_VOLTAGE 0x9000
    #define ERROR_CODE_REINIT_CMD 0xA000
    
    #define ERROR_CODE(BASE, ERR) ((BASE & 0xf000) | (ERR & 0xfff))
    #define ERROR_CLASS(ERR) (ERR & 0xf000)
    #define ERROR_DETAIL(ERR) (ERR & 0xfff)
    
    #define ERROR_CLASS_IS_REBOOOT(ERR) (ERROR_CLASS(ERR) == ERROR_CODE_REBOOT_CMD || ERROR_CLASS(ERR) == ERROR_CODE_REINIT_CMD)
    
    #define FLAG_REBOOT_RETRY 1
    #define FLAG_REBOOT_LOW_VOLTAGE 2
    #define FLAG_REBOOT 4
    #define FLAG_RESET 8
    #define FLAG_POWER_ON 16
    
    #define REBOOT_HISTORY 4
    #define REBOOT_TIME_SIZE 6
    #define REBOOT_CODE_SIZE sizeof(uint16_t)
    #define REBOOT_SIZE (REBOOT_TIME_SIZE + REBOOT_CODE_SIZE)
    
    static uint8_t reboot_codes[REBOOT_SIZE * REBOOT_HISTORY];
    
    static K_MUTEX_DEFINE(settings_mutex);
    
    int appl_settings_get_reboot_code(size_t index, int64_t *time, uint16_t *code)
    {
       int res = 0;
       uint8_t buf[sizeof(reboot_codes)];
    
       index *= REBOOT_SIZE;
       if (index + REBOOT_SIZE <= sizeof(buf)) {
          k_mutex_lock(&settings_mutex, K_FOREVER);
          memmove(buf, reboot_codes, sizeof(reboot_codes));
          k_mutex_unlock(&settings_mutex);
          if (sys_get_be48(&buf[index])) {
             if (time) {
                uint64_t time_s = sys_get_be48(&buf[index]);
                if (time_s == 1) {
                   *time = 0;
                } else {
                   *time = time_s * MSEC_PER_SEC;
                }
             }
             if (code) {
                *code = sys_get_be16(&buf[index + REBOOT_TIME_SIZE]);
             }
             res = 1;
          }
       }
    
       return res;
    }
    
    uint32_t appl_reset_cause(int *flags, uint16_t *reboot_code)
    {
       uint32_t cause = 0;
       if (atomic_cas(&read_reset_cause, 0, 1)) {
          reset_error = hwinfo_get_reset_cause(&cause);
          if (!reset_error) {
             hwinfo_clear_reset_cause();
             if (!cause) {
                // the nRF9160 uses 0 (no reset cause) to indicate POR
                uint32_t supported = 0;
                hwinfo_get_supported_reset_cause(&supported);
                if (!(supported & RESET_POR)) {
                   LOG_INF("nRF9160 no reset cause, add POR");
                   cause = RESET_POR;
                }
             }
             reset_cause = cause;
          }
       }
       LOG_INF("Reset cause 0x%04x", reset_cause);
       if (reset_cause) {
          // supported flags: 0x1b3
          if (reset_cause & RESET_PIN) {
             LOG_INF("PIN");
             if (flags) {
                *flags |= FLAG_RESET;
             }
          }
          if (reset_cause & RESET_SOFTWARE) {
             uint16_t code = 0;
             int rc = appl_settings_get_reboot_code(0, NULL, &code);
             int reboot = ERROR_CLASS(code);
             int detail = ERROR_DETAIL(code);
             if (rc > 0 && reboot == ERROR_CODE_TOO_MANY_FAILURES) {
                LOG_INF("Reboot 1.");
                if (!detail) {
                   code = ERROR_CODE(ERROR_CODE_TOO_MANY_FAILURES, 1);
                }
                if (flags) {
                   *flags |= FLAG_REBOOT_RETRY;
                }
             } else if (rc > 0 && reboot == ERROR_CODE_INIT_NO_SUCCESS) {
                LOG_INF("Reboot %u.", detail);
                if (flags) {
                   *flags |= FLAG_REBOOT_RETRY;
                }
             } else if (rc > 0 && reboot == ERROR_CODE_LOW_VOLTAGE) {
                LOG_INF("Reboot low voltage.");
                if (flags) {
                   *flags |= FLAG_REBOOT_LOW_VOLTAGE;
                }
             } else {
                LOG_INF("Reboot");
                if (flags) {
                   *flags |= FLAG_REBOOT;
                }
             }
             if (rc > 0 && reboot_code) {
                *reboot_code = code;
             }
          }
          if (reset_cause & RESET_POR) {
             LOG_INF("Power-On");
             if (flags) {
                *flags |= FLAG_POWER_ON;
             }
          }
          if (reset_cause & RESET_WATCHDOG) {
             LOG_INF("WATCHDOG");
          }
          if (reset_cause & RESET_DEBUG) {
             LOG_INF("DEBUG");
          }
          if (reset_cause & RESET_LOW_POWER_WAKE) {
             LOG_INF("LOWPOWER");
          }
          if (reset_cause & RESET_CPU_LOCKUP) {
             LOG_INF("CPU");
          }
       } else {
          LOG_INF("none");
       }
       return reset_cause;
    }
    
    
    int main(void) {
    
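    	/* Named differently from the file-level reset_cause/reboot_cause to avoid shadowing. */
    	int reset_flags = 0;      /* FLAG_* bits filled in by appl_reset_cause() */
    	uint16_t reboot_code = 0; /* last stored reboot code, if any */
    
    	appl_reset_cause(&reset_flags, &reboot_code);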
    
    	k_sleep(K_MSEC(100));
    
    	lteInit();	
    
    	k_sleep(K_MSEC(100));
    
    	mqttInit();
    
    	return 0;
    }

    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    [00:00:00.253,936] <inf> main: nRF9160 no reset cause, add POR
    [00:00:00.253,936] <inf> main: Reset cause 0x0008
    [00:00:00.253,967] <inf> main: Power-On
    [00:00:00.354,034] <inf> lte: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    [00:00:00.253,784] <inf> main: nRF9160 no reset cause, add POR
    [00:00:00.253,784] <inf> main: Reset cause 0x0008
    [00:00:00.253,814] <inf> main: Power-On
    [00:00:00.353,881] <inf> lte: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    [00:00:00.253,631] <inf> main: nRF9160 no reset cause, add POR
    [00:00:00.253,631] <inf> main: Reset cause 0x0008
    [00:00:00.253,662] <inf> main: Power-On
    [00:00:00.353,729] <inf> lte: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    [00:00:00.253,814] <inf> main: nRF9160 no reset cause, add POR
    [00:00:00.253,814] <inf> main: Reset cause 0x0008
    [00:00:00.253,845] <inf> main: Power-On
    [00:00:00.353,912] <inf> lte: lteInit ..

    Could this still be happening because of a power supply issue?

    For reference, the Zephyr documentation entry for RESET_POR (from <zephyr/drivers/hwinfo.h>):

    #define RESET_POR   BIT(3)

    Power-on reset (POR)

  • If you have doubts, add a logging statement at line 100 and print the raw "cause". If that's 0, it's a POR. And make sure that you do not call hwinfo_clear_reset_cause and then read the cause again ;-). The timestamps in the log itself start over at every boot, so they are not that useful. If possible, enable timestamps in your serial terminal (I use Linux/GTKTerm, which offers that).

    If that confirms that you read 0 as the reset cause, then it's a POR.

    Do you use a custom board? Do you have the capacitors close to the modem? Do you supply both VDD and VDD_GPIO from the regulated 3.3V output (you may, up to a certain voltage, supply VDD directly and only VDD_GPIO via the regulated output)?

    I'm not sure why a POR happens in Idle. I would guess it's triggered by a wakeup.

    Anyway, I guess you have some ideas to investigate.
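
The two points above (a raw cause of 0 means POR on the nRF9160, and the cause must not be read again after it was cleared) can be sketched as a read-once cache. This is a minimal off-target sketch under assumptions: struct reset_hw, struct reset_cache, reset_cause_read_once() and the fake_* functions are hypothetical names, and the hooks stand in for hwinfo_get_reset_cause(), hwinfo_get_supported_reset_cause() and hwinfo_clear_reset_cause(). The supported mask 0x1b3 is the value noted in the code posted earlier in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define RESET_POR (1u << 3) /* same bit as Zephyr's RESET_POR */

/* Hypothetical hardware-access hooks; in firmware these would wrap
 * hwinfo_get_reset_cause(), hwinfo_get_supported_reset_cause() and
 * hwinfo_clear_reset_cause().
 */
struct reset_hw {
	int (*get)(uint32_t *cause);
	int (*get_supported)(uint32_t *supported);
	int (*clear)(void);
};

/* Holds the value from the first successful read. */
struct reset_cache {
	bool valid;
	uint32_t cause;
};

/* Read the reset cause once, then clear the hardware register.
 * Later calls return the cached value instead of re-reading the
 * (now cleared) register.
 */
int reset_cause_read_once(struct reset_cache *cache,
			  const struct reset_hw *hw, uint32_t *cause)
{
	if (!cache->valid) {
		uint32_t raw = 0;
		uint32_t supported = 0;
		int err = hw->get(&raw);

		if (err) {
			return err;
		}
		/* A raw value of 0 is treated as a power-on reset when the
		 * driver does not list POR among its supported causes.
		 */
		if (raw == 0 && hw->get_supported(&supported) == 0 &&
		    !(supported & RESET_POR)) {
			raw = RESET_POR;
		}
		hw->clear();
		cache->cause = raw;
		cache->valid = true;
	}
	*cause = cache->cause;
	return 0;
}

/* Simulated register so the sketch can run on a PC. */
uint32_t fake_reg;

static int fake_get(uint32_t *cause)
{
	*cause = fake_reg;
	return 0;
}

static int fake_get_supported(uint32_t *supported)
{
	*supported = 0x1b3; /* mask noted earlier in this thread */
	return 0;
}

static int fake_clear(void)
{
	fake_reg = 0;
	return 0;
}

const struct reset_hw fake_hw = { fake_get, fake_get_supported, fake_clear };
```

With this shape, every log statement prints the cached value, so a second read after the clear cannot be mistaken for a fresh "0 = POR" result.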

  • Thank you for your reply.

    As you mentioned in the previous conversation #331671, I am also using a Mikroe LTE IoT 4 Click. I reviewed the schematic and compared it with the one provided in the datasheet, and everything seems correct.

    The strange part is that sometimes my application enters a restart loop, while at other times it takes hours for a restart to happen. For instance, yesterday the system restarted every 10 seconds, and sometimes the restarts do not happen at all.

    When I run my application, I get this output:

    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: Current reset cause: 00000003
    I: RESET_PIN
    I: RESET_SOFTWARE
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    I: RRC mode: Connected
    I: Network registration status: Connected - home network
    I: Connected to LTE network
    I: mqttInit ..
    I: IPv4 Address found 20.82.16.164
    I: client_id = nrf-358447177549400
    I: Connection to broker using mqtt_connect
    I: MQTT client connected
    I: Subscribing on "stlab/down/cmd"
    I: SUBACK packet id: 1234
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: RRC mode: Connected
    I: RRC mode: Idle
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    I: RRC mode: Connected
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    I: RRC mode: Connected
    *** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
    *** Using Zephyr OS v3.6.99-100befc70c74 ***
    I: No reset cause or error reading reset cause (error: 0)
    I: lteInit ..
    I: RRC mode: Connected
    I: Network registration status: Connected - home network
    I: Connected to LTE network
    I: mqttInit ..
    I: IPv4 Address found 20.82.16.164
    I: client_id = nrf-358447177549400
    I: Connection to broker using mqtt_connect
    I: MQTT client connected
    I: Subscribing on "stlab/down/cmd"
    I: SUBACK packet id: 1234
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    I: Publishing "1" on "stlab/up/status"
    I: RRC mode: Connected
    I: RRC mode: Idle
    ...

    The reboots don't report a reset cause that maps to any of the flags.

    I ran this same code earlier today, and no reboots occurred then.

    I observed that when I use a 5-second interval between messages, reboots occur only very sporadically. In this case, the modem doesn't go into RRC Idle.

    My setup is:

    main.c

    #include <stdio.h>
    #include <ncs_version.h>
    #include <zephyr/kernel.h>
    #include <zephyr/drivers/hwinfo.h>
    #include <zephyr/drivers/watchdog.h>
    #include <zephyr/logging/log.h>
    #include <zephyr/logging/log_ctrl.h>
    #include <zephyr/sys/__assert.h>
    #include <zephyr/sys/reboot.h>
    
    #include "lte.h"
    #include "mqtt.h"
    
    LOG_MODULE_REGISTER(main, LOG_LEVEL_INF);
    
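    /* Set to 1 once the reset cause has been read, so it is logged only once per boot. */
    static atomic_t read_reset_cause = ATOMIC_INIT(0);
    
    static volatile uint32_t reset_cause = 0;
    static volatile int32_t reset_error = 0;
    
    /* Logs every reset flag reported by hwinfo and returns the raw bit mask.
     * Note: flags and reboot_code are currently unused. */
    uint32_t appl_reset_cause(int *flags, uint16_t *reboot_code) {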
        uint32_t cause = 0;
    
        if (atomic_cas(&read_reset_cause, 0, 1)) {
            reset_error = hwinfo_get_reset_cause(&cause);
            if (reset_error == 0 && cause) {
                LOG_INF("Current reset cause: %08x", cause);
    
                if (cause & RESET_PIN) {
                    LOG_INF("RESET_PIN");
                }
                if (cause & RESET_SOFTWARE) {
                    LOG_INF("RESET_SOFTWARE");
                }
                if (cause & RESET_BROWNOUT) {
                    LOG_INF("RESET_BROWNOUT");
                }
                if (cause & RESET_POR) {
                    LOG_INF("RESET_POR");
                }
                if (cause & RESET_WATCHDOG) {
                    LOG_INF("WATCHDOG");
                }
                if (cause & RESET_DEBUG) {
                    LOG_INF("DEBUG");
                }
                if (cause & RESET_SECURITY) {
                    LOG_INF("RESET_SECURITY");
                }
                if (cause & RESET_LOW_POWER_WAKE) {
                    LOG_INF("LOWPOWER");
                }
                if (cause & RESET_CPU_LOCKUP) {
                    LOG_INF("CPU");
                }
                if (cause & RESET_PARITY) {
                    LOG_INF("RESET_PARITY");
                }
                if (cause & RESET_PLL) {
                    LOG_INF("RESET_PLL");
                }
                if (cause & RESET_CLOCK) {
                    LOG_INF("RESET_CLOCK");
                }
                if (cause & RESET_HARDWARE) {
                    LOG_INF("RESET_HARDWARE");
                }
                if (cause & RESET_USER) {
                    LOG_INF("RESET_USER");
                }
                if (cause & RESET_TEMPERATURE) {
                    LOG_INF("RESET_TEMPERATURE");
                }
            } else {
                LOG_INF("No reset cause or error reading reset cause (error: %d)", reset_error);
            }
        }
        
        return cause;
    }
    
    int main(void) {
    
    	/* Named differently from the file-scope reset_cause to avoid shadowing it. */
    	int reset_flags = 0;
    	uint16_t reboot_code = 0;
    
    	appl_reset_cause(&reset_flags, &reboot_code);
    
    	k_sleep(K_MSEC(100));
    
    	lteInit();	
    
    	k_sleep(K_MSEC(100));
    
    	mqttInit();
    
    	return 0;
    }
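
    For reference, the two values seen in the log above (00000003 on the first boot, 0 on every later one) can be decoded off-target with a small table-driven helper. This is only a sketch: the DEMO_* names and demo_describe_reset() are hypothetical, and the bit positions are assumed to mirror the RESET_* flags in Zephyr's hwinfo.h (in a real build, include that header instead). As far as I know, the nRF9160 has no dedicated POR flag, so a mask of 0 is treated here as "probably power-on or brownout reset".

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: these mirror the RESET_* bit positions in Zephyr's
 * <zephyr/drivers/hwinfo.h>. Only a subset is listed for brevity. */
#define DEMO_RESET_PIN            (1u << 0)
#define DEMO_RESET_SOFTWARE       (1u << 1)
#define DEMO_RESET_WATCHDOG       (1u << 4)
#define DEMO_RESET_DEBUG          (1u << 5)
#define DEMO_RESET_LOW_POWER_WAKE (1u << 7)
#define DEMO_RESET_CPU_LOCKUP     (1u << 8)

struct demo_reset_name {
    uint32_t bit;
    const char *name;
};

static const struct demo_reset_name demo_reset_names[] = {
    { DEMO_RESET_PIN,            "RESET_PIN" },
    { DEMO_RESET_SOFTWARE,       "RESET_SOFTWARE" },
    { DEMO_RESET_WATCHDOG,       "RESET_WATCHDOG" },
    { DEMO_RESET_DEBUG,          "RESET_DEBUG" },
    { DEMO_RESET_LOW_POWER_WAKE, "RESET_LOW_POWER_WAKE" },
    { DEMO_RESET_CPU_LOCKUP,     "RESET_CPU_LOCKUP" },
};

/* Writes the names of all set flags into buf, separated by '|', and
 * returns how many flags were decoded. A mask of 0 yields a hint text
 * and a return value of 0. */
int demo_describe_reset(uint32_t cause, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;

    if (len == 0) {
        return 0;
    }
    buf[0] = '\0';

    if (cause == 0) {
        snprintf(buf, len, "no flag set (likely POR or brownout)");
        return 0;
    }

    for (size_t i = 0; i < sizeof(demo_reset_names) / sizeof(demo_reset_names[0]); i++) {
        if ((cause & demo_reset_names[i].bit) == 0) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? "|" : "", demo_reset_names[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break; /* output truncated, stop decoding */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

    Calling demo_describe_reset(0x3, buf, sizeof(buf)) fills buf with the two flags from the first boot; passing 0 gives the power-supply hint, which matches every later boot in the log.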

  • I didn't realize that the other ticket is yours as well.

    LTE/NB-IOT connection with MIKROE LTE IOT 4 CLICK

    The good news:

    My MIKROE LTE IOT 4 click runs stable over weeks.

    I guess your wiring, and therefore the power supply, is the issue.

    For my experiments I found a USB + power supply board, which works well (see my last comment in the other ticket).

    I also sent you a private message. If you like, we can just exchange our apps and see how they work on the boards.
