NCS 1.6.1 at_cmd lib memory allocation and freeing question

Hi, 

I realise this is now quite an old version of the SDK and this particular library is no longer supported, but we have not yet migrated our project (I did try once, but seemed easier to completely rewrite it). 

In at_cmd.c line 362

    command.cmd = k_malloc(strlen(cmd) + 1);
	if (command.cmd == NULL) {
		return -ENOMEM;
	}
	strcpy(command.cmd, cmd);

	command.resp = NULL;
	command.callback = handler;
	command.flags = AT_CMD_BUF_CMD;

	ret = k_msgq_put(&commands, &command, K_FOREVER);
	if (ret) {
		return ret;
	}

	load_cmd_and_write();

Memory is allocated to command.cmd using k_malloc(). It is freed in "load_cmd_and_write()", but there is a chance k_msgq_put could return an error. Before it returns should there be a free?

ret = k_msgq_put(&commands, &command, K_FOREVER);
if (ret) {
	k_free(command.cmd);    // EDIT
	return ret;
}

I initially thought that would be a quick fix, but I saw there is a thread - "socket_thread_fn" that runs "load_cmd_and_write()" but I am not sure whether it will actually free this memory or not. When it is freed, it calls to free current_cmd.cmd, which is populated by k_msgq_get(), so it depends on how k_msgq_put behaves when it fails - however I cannot find the source code for this to work it out.

For background - I know there is an issue in either my code or the SDK with a memory leak, and there has been for years, but I have never been able to find it. I get a consistent reboot after a number of network/mqtt connections. The problem with finding it however, is that it takes a lot of connections to happen. If I have a schedule set up sending messages every 5 minutes, it takes about 36 hours to happen, if I set that to every 10 minutes 72 hours. It doesn't seem to make a difference if I set different stack sizes on my threads, and it never happens in the exact same bit of code, so I do think there is something in the sdk. 

Thanks, 

Damien

Parents
  • Hi, 

    Your current NCS version is very old. It is highly recommended that you upgrade to a newer NCS version. 

    Can you provide more information about your application? Is your application based on any of the NCS samples?

    For background - I know there is an issue in either my code or the SDK with a memory leak, and there has been for years, but I have never been able to find it. I get a consistent reboot after a number of network/mqtt connections. The problem with finding it however, is that it takes a lot of connections to happen. If I have a schedule set up sending messages every 5 minutes, it takes about 36 hours to happen, if I set that to every 10 minutes 72 hours. It doesn't seem to make a difference if I set different stack sizes on my threads, and it never happens in the exact same bit of code, so I do think there is something in the sdk. 

    Which modem firmware version do you use? To get more information about the reboots, could you please provide application log and modem trace?

    Best regards,
    Dejan

  • Hi, we use Modem FW 1.3.1. I had to take out the modem trace stuff as it took up a fair amount of quiescent current, and our product is dependant on low power (~20uA quiescent). From memory it  was originally based on the "mqtt-simple" example. I tried upgrading to 2.2 about a year or so ago, but I just couldn't get anything built. Half of the kconfigs didn't exist or had a slight change in name, and it seemed to break VSCode by not showing any error in the problems tab, so just had to look through the build log. 

    The nature of the product is one where we need to be absolutely sure things work and there are no unexpected bugs, so upgrading the sdk would mean months of testing for us, which I am not sure we have the time to do right now. There are bugs we have discovered on 1.6.1, but I have fixed them in my local copy and we are aware of any other minor issues. Take this one as an example - we get the datetime from an NTP server using the datetime library. We had a bug where if a unit hadn't rebooted in 28 days, the timestamp we received from the NTP server suddenly became 3 months in the future. It was due to a variable being defined as 16 bit rather than 64 bit, which was fixed in 1.8.0. 

    Thanks, 

    Damien 

Reply
  • Hi, we use Modem FW 1.3.1. I had to take out the modem trace stuff as it took up a fair amount of quiescent current, and our product is dependant on low power (~20uA quiescent). From memory it  was originally based on the "mqtt-simple" example. I tried upgrading to 2.2 about a year or so ago, but I just couldn't get anything built. Half of the kconfigs didn't exist or had a slight change in name, and it seemed to break VSCode by not showing any error in the problems tab, so just had to look through the build log. 

    The nature of the product is one where we need to be absolutely sure things work and there are no unexpected bugs, so upgrading the sdk would mean months of testing for us, which I am not sure we have the time to do right now. There are bugs we have discovered on 1.6.1, but I have fixed them in my local copy and we are aware of any other minor issues. Take this one as an example - we get the datetime from an NTP server using the datetime library. We had a bug where if a unit hadn't rebooted in 28 days, the timestamp we received from the NTP server suddenly became 3 months in the future. It was due to a variable being defined as 16 bit rather than 64 bit, which was fixed in 1.8.0. 

    Thanks, 

    Damien 

Children
No Data
Related