Memory allocation problem after successive MQTT connections with TLS

After repeated mqtt connection-disconnection processes,I have the problem of code -12 and it is impossible to connect with the MQTT broker after that, the issue is similar to this post.

The mqtt connection code was based on the google_iot_mqtt example because we were using "GCP IoT Core" as a broker at the beginning of the project development.

I was able to verify in the debug logs that the error is thrown in the mbed TLS library as you can see in the log below.

00> [00:02:35.804,199] <inf> my_mqtt:   Acquired time:  4 -  9 - 2022 11 :  7
00> [00:02:35.804,199] <inf> my_mqtt: ntp time got successfully
00> [00:02:38.399,017] <inf> my_mqtt: Connecting to host: <omitted>
00> [00:02:38.399,200] <dbg> net_sock_tls: tls_alloc: (): Allocated TLS context, 0x20006bb8
00> [00:02:38.399,566] <dbg> net_sock: zsock_socket_internal: (): socket: ctx=0x20007838, fd=1
00> [00:02:39.043,365] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1814: |2| ssl->f_send() returned 114 (-0xffffff8e)
00> 
00> --- 12 messages dropped ---
00> [00:02:39.043,426] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1842: |2| <= flush output
00> 
00> [00:02:39.043,487] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:2514: |2| <= write record
00> 
00> [00:02:39.043,548] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:2346: |2| <= write handshake message
00> 
00> [00:02:39.043,609] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:1346: |2| <= write client hello
00> 
00> [00:02:39.043,670] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1789: |2| => flush output
00> 
00> [00:02:39.043,731] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1801: |2| <= flush output
00> 
00> [00:02:39.043,792] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:4192: |2| client state: 2
00> 
00> [00:02:39.043,853] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:1919: |2| => parse server hello
00> 
00> [00:02:39.043,914] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:3546: |2| => read record
00> 
00> [00:02:39.043,975] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1573: |2| => fetch input
00> 
00> [00:02:39.044,067] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1728: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.807,617] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:4192: |2| client state: 3
00> 
00> --- 14 messages dropped ---
00> [00:02:39.807,678] <dbg> net_sock_tls: tls_debug: (): ssl_tls.c:2291: |2| => parse certificate
00> 
00> [00:02:39.807,739] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:3546: |2| => read record
00> 
00> [00:02:39.807,800] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1573: |2| => fetch input
00> 
00> [00:02:39.807,891] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1728: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.808,013] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.808,135] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 5 (-0xfffffffb)
00> 
00> [00:02:39.808,197] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1776: |2| <= fetch input
00> 
00> [00:02:39.808,258] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1573: |2| => fetch input
00> 
00> [00:02:39.808,319] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1728: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.808,502] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.808,624] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 244 (-0xffffff0c)
00> 
00> [00:02:39.944,793] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x20007838, pkt=0x200213fc, st=0, user_data=(nil)
00> [00:02:39.951,324] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:39.951,446] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 341 (-0xfffffeab)
00> 
00> [00:02:40.356,781] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x20007838, pkt=0x200213fc, st=0, user_data=(nil)
00> [00:02:40.362,792] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:40.362,915] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 341 (-0xfffffeab)
00> 
00> [00:02:40.496,673] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x20007838, pkt=0x200213fc, st=0, user_data=(nil)
00> [00:02:40.502,624] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:40.502,716] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 341 (-0xfffffeab)
00> 
00> [00:02:40.838,104] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x20007838, pkt=0x200213fc, st=0, user_data=(nil)
00> [00:02:40.843,963] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1753: |2| in_left: zu, nb_want: zu
00> 
00> [00:02:40.844,085] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1756: |2| ssl->f_recv(_timeout)() returned 341 (-0xfffffeab)
00> 
00> [00:02:41.363,220] <dbg> net_sock_tls: tls_debug: (): ssl_tls.c:1769: |2| <= write certificate
00> 
00> --- 61 messages dropped ---
00> [00:02:41.363,281] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1789: |2| => flush output
00> 
00> [00:02:41.363,342] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:1801: |2| <= flush output
00> 
00> [00:02:41.363,403] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:4192: |2| client state: 8
00> 
00> [00:02:41.363,464] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:3488: |2| => write client key exchange
00> 
00> [00:02:41.388,488] <dbg> net_sock_tls: tls_debug: (): ssl_cli.c:2730: |1| mbedtls_rsa_pkcs1_encrypt() returned -17040 (-0x4290)
00> 
00> [00:02:41.388,549] <dbg> net_sock_tls: tls_debug: (): ssl_tls.c:5323: |2| <= handshake
00> 
00> [00:02:41.388,549] <err> net_sock_tls: TLS handshake error: -4290
00> [00:02:41.393,859] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20006bb8, fd=0
00> [00:02:41.393,920] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:5378: |2| => write close notify
00> 
00> [00:02:41.393,981] <dbg> net_sock_tls: tls_debug: (): ssl_msg.c:5394: |2| <= write close notify
00> 
00> [00:02:41.399,475] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20007838, fd=1
00> [00:02:41.399,536] <err> my_mqtt: could not connect, error -113
00> [00:02:44.964,019] <inf> my_mqtt: Connecting to host: <omitted>
00> [00:02:44.964,202] <dbg> net_sock_tls: tls_alloc: (): Allocated TLS context, 0x20006bb8
00> [00:02:44.964,599] <dbg> net_sock: zsock_socket_internal: (): socket: ctx=0x20007838, fd=1
00> [00:02:45.578,674] <dbg> net_sock_tls: tls_debug: (): ssl_tls.c:3280: |1| alloc(zu bytes) failed
00> 
00> [00:02:45.578,704] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20006bb8, fd=0
00> [00:02:45.579,254] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20007838, fd=1
00> [00:02:45.585,144] <err> my_mqtt: could not connect, error -12
00> [00:02:49.154,113] <inf> my_mqtt: Connecting to host: <omitted>
00> [00:02:49.154,296] <dbg> net_sock_tls: tls_alloc: (): Allocated TLS context, 0x20006bb8
00> [00:02:49.154,663] <dbg> net_sock: zsock_socket_internal: (): socket: ctx=0x20007838, fd=1
00> [00:02:49.737,396] <dbg> net_sock_tls: tls_debug: (): ssl_tls.c:3280: |1| alloc(zu bytes) failed
00> 
00> [00:02:49.737,426] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20006bb8, fd=0
00> [00:02:49.737,976] <dbg> net_sock: z_impl_zsock_close: (): close: ctx=0x20007838, fd=1
00> [00:02:49.743,774] <err> my_mqtt: could not connect, error -12

Also, I'm attaching the code snippet used in the MQTT communication

void mqtt_startup()
{
	mqttRetries = 5;

	broker4 = (struct sockaddr_in *)&broker;
	client = &client_ctx;

    if(my_mqtt_useTls && my_mqtt_first_exec)
    {
        err = tls_credential_add(1, TLS_CREDENTIAL_CA_CERTIFICATE,
                    my_ca_certificate,
                    sizeof(my_ca_certificate));


        if ((err < 0) && (err != -EEXIST)) {
            LOG_ERR("Failed to register public certificate: %d", err);
        }

		my_mqtt_first_exec = false;
    }



	while (mqttRetries) 
	{
		if(k_sem_take(&my_mqtt_sem_disconnect,K_NO_WAIT) == 0)
		{
			//should stop connection process
			return;
		}

		hints.ai_family = AF_INET;
		hints.ai_socktype = SOCK_STREAM;
		hints.ai_protocol = 0;
		dnssCnt = 0;
		
		#ifndef CONFIG_MQTT_HOST_USE_HOST_WITHOUT_DNS

		while ((err = getaddrinfo(my_broker_address, my_broker_port, &hints,
					  &haddr)) && dnssCnt < 3)
		{
			LOG_ERR("Unable to get address for broker, retrying");
			dnssCnt++;
		}

		if (err != 0)
		{
			LOG_ERR("Unable to get address for broker, error %d",err);
			if(mqttExternalCb != NULL)
			{
				my_mqtt_cb_t params;
				params.event = MY_MQTT_CONNECTION_FAILED;
				mqttExternalCb(&params);
			}
			return;
		}
		
		LOG_INF("DNS resolved for %s:%d",my_broker_address,my_broker_port);

		net_ipaddr_copy(&broker4->sin_addr,&net_sin(haddr->ai_addr)->sin_addr);

		#else

		broker4->sin_family = AF_INET;
		broker4->sin_port = htons(my_broker_port);
		zsock_inet_pton(AF_INET, my_broker_address, &broker4->sin_addr);

		#endif
		mqtt_client_init(client);

		if(my_mqtt_useTls)
        {
            now = my_time_get_timestamp(NULL);

            err = jwt_init_builder(&jb, token, sizeof(token));
            if (err != 0) {
                LOG_ERR("Error with JWT token initialization: %d",err);
                return;
            }

            err = jwt_add_payload(&jb, now + 60 * 60, now,
                        CONFIG_CLOUD_AUDIENCE);
            if (err != 0) {
                LOG_ERR("Error with JWT token payload including: %d",err);
                return;
            }

            err = jwt_sign(&jb, zepfull_private_der,
                    zepfull_private_der_len);

            if (err != 0) {
                LOG_ERR("Error with JWT token signing: %d",err);
                return;
            }
		
        }


		/* MQTT client configuration */
		client->broker = &broker;
		client->evt_cb = mqtt_evt_handler;
		client->client_id.utf8 = client_id;
		client->client_id.size = strlen(client_id);
        client->password = &password;

        if(my_mqtt_useTls)
        {
		    password.size = jwt_payload_len(&jb);
        }
        else
        {
		    password.size = 1;

        }

		client->user_name = &username;
		client->protocol_version = MQTT_VERSION_3_1_1;

		/* MQTT buffers configuration */
		client->rx_buf = rx_buffer;
		client->rx_buf_size = sizeof(rx_buffer);
		client->tx_buf = tx_buffer;
		client->tx_buf_size = sizeof(tx_buffer);

        if(my_mqtt_useTls)
        {
            /* MQTT transport configuration */
            client->transport.type = MQTT_TRANSPORT_SECURE;


            struct mqtt_sec_config *tls_config =
                    &client->transport.tls.config;

            tls_config->peer_verify = TLS_PEER_VERIFY_OPTIONAL; 
            tls_config->cipher_list = NULL;
            tls_config->sec_tag_list = m_sec_tags;
            tls_config->sec_tag_count = ARRAY_SIZE(m_sec_tags);
            tls_config->hostname = my_broker_address;
            
        }
        else
        {
            client->transport.type = MQTT_TRANSPORT_NON_SECURE;
        }

		LOG_INF("Connecting to host: %s", my_broker_address);
		err = mqtt_connect(client);
		if (err != 0) {
			LOG_ERR("could not connect, error %d", err);
			//mqtt_disconnect(client);
			mqtt_abort(client);
			mqttRetries--;
			k_msleep(ALIVE_TIME);
			continue;
		}

		if (wait_for_input(5 * MSEC_PER_SEC) > 0) 
		{
			mqtt_input(client);
			if (!connected) 
			{
				LOG_ERR("failed to connect to mqtt_broker");
				//mqtt_disconnect(client);
				mqtt_abort(client);
				mqttRetries--;
				k_msleep(ALIVE_TIME);
				continue;
			} else {
				break;
			}
		} else {
			LOG_ERR("failed to connect to mqtt broker");
			//mqtt_disconnect(client);
			mqtt_abort(client);
			mqttRetries--;
			k_msleep(ALIVE_TIME);
			continue;
		}
	}

	if (!connected) {
		LOG_ERR("Failed to connect to client, aborting");

		if(mqttExternalCb != NULL)
		{
			my_mqtt_cb_t params;
			params.event = MY_MQTT_CONNECTION_FAILED;
			mqttExternalCb(&params);
		}

		return;
	}
	else
	{
		my_mqtt_connected = true;
	}


	mqtt_live(client);

	next_alive = k_uptime_get() + ALIVE_TIME;

	while (1) {
		
		 // check if we have mqtt message to publish
		if(k_msgq_get(&mqtt_msg_queue, &auxPubStructure, K_NO_WAIT) == 0)
		{


			/* initialize publish structure */
			pub_data.message.topic.topic.utf8 = auxPubStructure.topic;
			pub_data.message.topic.topic.size = strlen(auxPubStructure.topic);
			pub_data.message.topic.qos = MQTT_QOS_1_AT_LEAST_ONCE;
			pub_data.message.payload.data = (uint8_t *)auxPubStructure.payload;
			pub_data.message.payload.len = strlen(auxPubStructure.payload); //manual
			pub_data.message_id = 1U;
			pub_data.dup_flag = 0U;
			pub_data.retain_flag = 1U;

			LOG_INF("Publishing data");

			err = mqtt_publish(client, &pub_data);
			if (err) 
			{
				LOG_ERR("could not publish, error %d", err);
				break;
			}
			else
			{
				if(mqttExternalCb != NULL)
				{
					my_mqtt_cb_t params;
					params.event = MY_MQTT_PUBLISHED_DATA;
					mqttExternalCb(&params);
				}
			}

			memset(&auxPubStructure,0,sizeof(my_mqtt_pub_message_t));

		}

		/* idle and process messages */
		while (k_uptime_get() < next_alive) 
		{
			LOG_INF("... idling ...");
			if (wait_for_input(1 * MSEC_PER_SEC) > 0)
			{
				mqtt_input(client);
			}

			if(k_sem_take(&my_mqtt_sem_disconnect,K_NO_WAIT) == 0)
			{
				mqtt_disconnect(client);
				my_mqtt_connected = false;
				return;

			}

			if(!connected)
			{
				mqtt_abort(client);

				return;
			}
		}

		mqtt_live(client);
		next_alive += ALIVE_TIME;
	}
}

To speed up the understanding of the code, I will describe below the process executed in the mqtt_startup() function.

  1. configure the mqtt client;
  2. sign JWT
  3. Connect to broker;
  4. MQTT Idling

I'm attaching also the proj.conf file

#bluetooh
CONFIG_BT=y
CONFIG_BT_DEBUG_LOG=y
CONFIG_BT_DEVICE_NAME="MyDevice"


#Enable GPIO
CONFIG_GPIO=y

#I2C
CONFIG_I2C=y
CONFIG_NRFX_TWIM0=y

#LOG
CONFIG_LOG=y

CONFIG_USE_SEGGER_RTT=y
CONFIG_LOG_BACKEND_RTT=y
CONFIG_LOG_BACKEND_UART=n

CONFIG_RTT_CONSOLE=y
CONFIG_UART_CONSOLE=n


#log
CONFIG_SERIAL=y
CONFIG_UART_INTERRUPT_DRIVEN=y

#   --- Modem and PPP ---


# GSM modem support
CONFIG_MODEM=y
CONFIG_MODEM_GSM_PPP=y

# PPP networking support
CONFIG_NET_DRIVERS=y
CONFIG_NET_PPP=y
CONFIG_NET_L2_PPP=y
CONFIG_NETWORKING=y

CONFIG_PPP_NET_IF_NO_AUTO_START=y
CONFIG_GSM_PPP_AUTOSTART=n
CONFIG_NET_CONFIG_AUTO_INIT=n


CONFIG_NET_L2_PPP_TIMEOUT=10000

# Wait milliseconds before starting ppp handshakes
CONFIG_NET_L2_PPP_DELAY_STARTUP_MS=5000

# IPv4 enables PPP IPCP support
CONFIG_NET_IPV4=y
CONFIG_NET_IPV6=n

# Network management events
CONFIG_NET_CONNECTION_MANAGER=y

CONFIG_MODEM_SIM_NUMBERS=y
CONFIG_MODEM_CELL_INFO=y

CONFIG_NET_SHELL=y
CONFIG_MODEM_SHELL=y

CONFIG_GSM_MUX=n
CONFIG_UART_MUX=n
CONFIG_MODEM_GSM_APN="virtueyes.com.br"

CONFIG_NET_L2_PPP_OPTION_MRU=y
#CONFIG_GSM_MUX_MRU_DEFAULT_LEN=1024

#PPP
CONFIG_NET_PPP_MTU_MRU=2048
CONFIG_NET_PPP_RX_STACK_SIZE=2048
CONFIG_NET_PPP_RINGBUF_SIZE=512


CONFIG_NET_TCP_WORKQ_STACK_SIZE=2048
# Log buffers, modem and PPP
CONFIG_NET_LOG=y
CONFIG_MODEM_LOG_LEVEL_DBG=y
CONFIG_NET_PPP_LOG_LEVEL_DBG=n
CONFIG_MODEM_CONTEXT_VERBOSE_DEBUG=n

#   --- END of Modem and PPP ---


#enable suport to float pointer in print functions
CONFIG_CBPRINTF_FP_SUPPORT=y



# --- MQTT ---

#select transport stacks
CONFIG_NET_UDP=y
CONFIG_NET_TCP=y
CONFIG_NET_DHCPV4=y

#jwt
CONFIG_JWT=y
CONFIG_JWT_SIGN_RSA=y

#dns
CONFIG_DNS_SERVER_IP_ADDRESSES=y
CONFIG_DNS_SERVER1="8.8.8.8"
CONFIG_DNS_RESOLVER=y
CONFIG_DNS_RESOLVER_ADDITIONAL_BUF_CTR=2

#sntp
CONFIG_SNTP=y


CONFIG_PTHREAD_IPC=n 
CONFIG_NET_SOCKETS=y
CONFIG_MQTT_LIB=y
CONFIG_MQTT_LIB_TLS=y
CONFIG_NET_SOCKETS_SOCKOPT_TLS=y
CONFIG_NET_SOCKETS_TLS_MAX_CONTEXTS=1
CONFIG_MQTT_KEEPALIVE=30

CONFIG_NET_SOCKETS_POSIX_NAMES=y


# Enable MBEDTLS
CONFIG_MBEDTLS=y
CONFIG_MBEDTLS_BUILTIN=y
CONFIG_MBEDTLS_SHELL=y


CONFIG_MBEDTLS_ENABLE_HEAP=y
CONFIG_MBEDTLS_HEAP_SIZE=56240

CONFIG_MBEDTLS_USER_CONFIG_ENABLE=y
CONFIG_MBEDTLS_USER_CONFIG_FILE="user-tls-conf.h"

CONFIG_MBEDTLS_SSL_MAX_CONTENT_LEN=16384

CONFIG_MBEDTLS_DEBUG=y
CONFIG_MBEDTLS_DEBUG_LEVEL=2
CONFIG_MBEDTLS_MEMORY_DEBUG=y

CONFIG_MBEDTLS_KEY_EXCHANGE_ECDHE_ECDSA_ENABLED=y
CONFIG_MBEDTLS_ECP_ALL_ENABLED=y

CONFIG_MBEDTLS_PEM_CERTIFICATE_FORMAT=y


#json
CONFIG_CJSON_LIB=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_MIN_REQUIRED_HEAP_SIZE=8192


# --- END OF MQTT ---




#log
CONFIG_MODEM_LOG_LEVEL_DBG=n
CONFIG_NET_PPP_LOG_LEVEL_DBG=n
CONFIG_NET_L2_PPP_LOG_LEVEL_DBG=n
CONFIG_NET_MGMT_EVENT_LOG_LEVEL_DBG=y
CONFIG_NET_CONNECTION_MANAGER_LOG_LEVEL_DBG=n
CONFIG_NET_DHCPV4_LOG_LEVEL_DBG=n

CONFIG_NET_SOCKETS_LOG_LEVEL_DBG=y

CONFIG_MAIN_STACK_SIZE=4096


#DEBUG
CONFIG_SYS_HEAP_LISTENER=y
CONFIG_NEWLIB_LIBC_HEAP_LISTENER=y
CONFIG_SYS_HEAP_VALIDATE=y
CONFIG_SYS_HEAP_RUNTIME_STATS=y

Some considerations:

  1. The code shown above is executed on a exclusive thread with priority 10 and Stack size of 8096;
  2. I'm using the nRF Connect SDK 2.1.0;
  3. I don't have much experience with Zephyr;
  4. I'm using a nRF52840 module with Telit Modem;

I would like your guys help, and I would to know if the connect/disconnect approach that I'm doing is correct.

Regards

Related