
SEC_TAG wiped from the modem when battery power is low

Hi,

Out of several Thingy:91 devices that we programmed with our software and security keys/tags, two that had their batteries drained seem to have lost their security tags.

The hardware version is 1.0.2 and software is based on SDK 1.3.1 (or perhaps it was 1.3.0).

The application does the following:

On startup, it forces deletion of all 5 keys in SEC_TAG 1 (calling it 1, 2, 3 for sake of example, but they are 5-digit unique tags) and programs CA Cert #1 for HTTP connection #1
On startup, it forces deletion of all 5 keys in SEC_TAG 2 and programs CA Cert #2 for HTTP connection #2
It makes use of, but does not modify, the keys in SEC_TAG 3, which contains CA Cert #3, the device private key and the cert for the MQTT connection. Each device was previously provisioned with these certs and keys using nRF Connect's LTE Link Monitor Security programming feature.
Once it starts, it connects MQTT and sends some data for 5 minutes.
After 5 minutes, it disconnects MQTT and disconnects from the carrier (lte_lc_offline)
If a button is pressed, the board reboots and repeats the above steps.
The power was drained over the course of perhaps several weeks when our team members just left the device on in LTE offline mode and likely cycled the app with a button press.

Both boards that had issues had previously connected to MQTT successfully multiple times. After the power drain, the MQTT connection stopped working and connect() was returning -95, which indicates that the keys are missing. We reviewed the SEC_TAG list on the modem and found SEC_TAG 3 missing.

After reprogramming the SEC_TAG 3 certs and keys, the devices started working normally again.

We have since modified our app so that it no longer reprograms the existing SEC_TAG 1 and 2 on startup, in the hope that it was the writes to SEC_TAG 1 and 2 that caused SEC_TAG 3 to be lost. I wonder if anyone has observed something similar and can provide any insight into the problem. I was not able to find a similar problem or a solution on the forums.
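A minimal host-side sketch of that change (write a credential only when the stored copy is missing or different) is shown below. It is not our actual application code: the `fake_key_*` helpers, the in-memory `store`, the `write_count` counter and `provision_if_needed()` are all hypothetical stand-ins so the logic can run on a PC. On the device, the modem_key_mgmt_* calls would take their place, and their exact signatures depend on the SDK version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the modem credential store.
 * Sizes are illustrative only. */
#define NUM_TAGS  4   /* tags 0..3, named as in the example above */
#define NUM_TYPES 5   /* five credential slots per tag */
#define MAX_LEN   64

struct cred_slot {
    bool used;
    size_t len;
    unsigned char data[MAX_LEN];
};

static struct cred_slot store[NUM_TAGS][NUM_TYPES];
int write_count; /* number of writes actually performed */

static bool slot_valid(int tag, int type)
{
    return tag >= 0 && tag < NUM_TAGS && type >= 0 && type < NUM_TYPES;
}

/* Stand-in for a credential write; returns 0 on success, -1 on bad input. */
static int fake_key_write(int tag, int type, const void *buf, size_t len)
{
    if (!slot_valid(tag, type) || len > MAX_LEN) {
        return -1;
    }
    store[tag][type].used = true;
    store[tag][type].len = len;
    memcpy(store[tag][type].data, buf, len);
    write_count++;
    return 0;
}

/* Stand-in for a "is this exact credential already stored?" check. */
static bool fake_key_matches(int tag, int type, const void *buf, size_t len)
{
    if (!slot_valid(tag, type)) {
        return false;
    }
    const struct cred_slot *s = &store[tag][type];
    return s->used && s->len == len && memcmp(s->data, buf, len) == 0;
}

/* Write the credential only if it is missing or differs from buf. */
int provision_if_needed(int tag, int type, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (fake_key_matches(tag, type, buf, len)) {
        return 0; /* already present: skip the write entirely */
    }
    return fake_key_write(tag, type, buf, len);
}
```

With this shape, repeated reboots with an unchanged CA cert perform no credential writes at all, which is the behaviour we were aiming for.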

Thanks,

Nik

  • What CMNG command sequence did you initially use to program the SEC_TAGs, and what CMNG command sequence is used at startup?

  • No manual CMNG commands were ever issued for writes/reads. LTE Link Monitor (Certificate manager tab) was used to write SEC_TAG_3 once around August 2020. Startup involved the following code for both SEC_TAG_1 and SEC_TAG_2:

    modem_key_mgmt_delete() in a loop 1..5

    modem_key_mgmt_write() once for CA cert (location #1)

  • Can you run

    nrf9160_mdm_dfu --read 0x20000 0x30000 file.hex

    before and after the problem happens? The file system dump should tell what is going on in the file system.

  • Hi Hakon,

    This nrf9160_mdm_dfu is the one from https://github.com/NordicPlayground/nrf91-mdm-dfu, I presume?

    Can you please be more specific about "before" and "after"?

    Is a poweroff or a software reboot allowed between "before" and "after" the problem occurs?

    It took us several months of using our software to produce this issue, and it happened to non-technical team members, so they would have to ship their Thingies to me so that I can examine them, and hopefully the battery doesn't drain fully during shipping. I would have to design test code specifically for this issue to cause the battery to drain and reproduce the problem. I also suspect that it may be even more complicated than that, because I may have to intervene before the battery is actually drained. I feel that if I am to help here, I'd need more information from you about what Nordic suspects could cause the problem in order to be able to reproduce it at all.

    In either case, I think that this problem may be reproduced as follows:

    1. Run arithmetic operations with random() in a loop for X seconds.
    2. Write SEC_TAG_1.
    3. Check if SEC_TAG_3 exists.
    4. If SEC_TAG_3 does not exist, sleep indefinitely.
    5. Otherwise goto #1
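The five steps above can be sketched as a host-side loop like the one below. This is only an illustration under stated assumptions: `fake_write_sec_tag_1()` and `fake_sec_tag_3_exists()` are hypothetical stand-ins for the real modem credential calls, `fault_after_writes` is a made-up test hook that simulates the tag disappearing, the busy work is bounded by an iteration count rather than X seconds, and the function returns where the device would sleep indefinitely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the modem credential store. */
static bool tag3_present = true;
int writes_done;
int fault_after_writes = -1; /* test hook: -1 means SEC_TAG_3 is never lost */

/* Step 2 stand-in: pretend to rewrite SEC_TAG_1. */
static int fake_write_sec_tag_1(void)
{
    writes_done++;
    if (fault_after_writes >= 0 && writes_done >= fault_after_writes) {
        tag3_present = false; /* simulated loss of SEC_TAG_3 */
    }
    return 0;
}

/* Step 3 stand-in: report whether SEC_TAG_3 is still there. */
static bool fake_sec_tag_3_exists(void)
{
    return tag3_present;
}

/* Step 1: arithmetic on random values; volatile keeps the loop from
 * being optimized away. */
static void busy_arithmetic(unsigned iterations)
{
    volatile unsigned acc = 0;
    for (unsigned i = 0; i < iterations; i++) {
        acc += (unsigned)rand() * 31u + i;
    }
}

/* Returns the 1-based cycle in which SEC_TAG_3 was found missing,
 * 0 if it survived max_cycles, or -1 if the write itself failed. */
int run_repro(int max_cycles)
{
    assert(max_cycles > 0);
    for (int cycle = 1; cycle <= max_cycles; cycle++) {
        busy_arithmetic(1000u);               /* step 1 */
        if (fake_write_sec_tag_1() != 0) {    /* step 2 */
            return -1;
        }
        if (!fake_sec_tag_3_exists()) {       /* step 3 */
            return cycle;                     /* step 4: device would halt here */
        }
    }                                         /* step 5: otherwise repeat */
    return 0;
}
```

On real hardware the loop would be unbounded and would stop touching the credential store as soon as the tag is found missing, so that the modem state is left untouched for inspection.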

    Does that sound about right?

    Regards,

    Nik

  • By "before" I meant the situation before the battery drains and device powers OFF. So the working situation when MQTT connection works. From this file system dump we will most probably see that the SEC_TAG 3 has valid certificates.

    By "after" I meant the situation after the device powers OFF and then it is powered ON (after replacing battery?) when MQTT connection does not work anymore. It should be as close as possible to the power OFF and preferably without app activity (i.e. no more writing of any SEC_TAGs). This will hopefully show what happened with SEC_TAG 3.

    I understand that it is not easy to reproduce and these operations may be practically very difficult to perform.

    OR

    If the problem is reproduced with the 5 steps above, "before" would be when SEC_TAG_3 exists and all is good, and "after" would be step 4, when SEC_TAG_3 no longer exists.
