nRF9160 Modem lock up workaround

Hi, 

I've noticed an issue with the modem locking up on the nRF9160DK. I'm not sure what causes it, but it always gets err 116 - ETIMEDOUT when initialising the modem and trying to connect to a network. 

The easy fix at the moment is to unplug the DK and plug it back in. If I do, the modem connects within seconds. However, my fear is when this is in our product, it will get into this lock up state and not be able to get out of it. 

My code is loosely based on the Simple MQTT sample. When the modem_configure() function fails, it waits for a retry delay and tries again, but without power cycling the board it will never connect. I tried adding lte_lc_power_off() and lte_lc_power_on() with the delay in between, but this didn't help. 

do {
	err = modem_configure();
	if (err) {
		printk("Retrying in %d seconds\n", CONFIG_LTE_CONNECT_RETRY_DELAY_S);
		lte_lc_power_off();
		k_sleep(K_SECONDS(CONFIG_LTE_CONNECT_RETRY_DELAY_S));
		lte_lc_power_on();
	}
} while (err);

I definitely need a soft way to reset the modem when it gets into that state, and I imagine someone else has come across this issue. As for why it gets into that state, I am not sure yet, but I will try to understand why. 
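One way to keep a retry loop like the one above from spinning forever is to escalate: run a light recovery step after each failure and a heavier one after several failures in a row. The following is a minimal host-side sketch of that policy only. The names `recovery_ops`, `connect`, `soft_reset`, `hard_reset` and the `fake_*` helpers are hypothetical and are not nRF Connect SDK APIs; on target the hooks would wrap whichever recovery steps turn out to work (for example a modem power cycle and, as a last resort, a system reboot).

```c
/* Hypothetical hooks: on target these would wrap the real modem calls. */
struct recovery_ops {
	int (*connect)(void);     /* 0 on success, negative errno on failure */
	void (*soft_reset)(void); /* light recovery, e.g. modem off/on */
	void (*hard_reset)(void); /* last resort, e.g. reboot the SoC */
};

/* Try to connect up to max_attempts times. After each failure run the
 * light recovery step; once soft_limit failures have accumulated, run
 * the heavy step instead and start counting again.
 * Returns 0 on success or the last error code.
 */
int connect_with_escalation(const struct recovery_ops *ops,
			    int max_attempts, int soft_limit)
{
	int err = -1;
	int failures = 0;

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		err = ops->connect();
		if (err == 0) {
			return 0;
		}
		failures++;
		if (failures >= soft_limit) {
			ops->hard_reset();
			failures = 0;
		} else {
			ops->soft_reset();
		}
	}
	return err;
}

/* Simulated modem for trying the policy on a PC: it stays stuck
 * (returns -116) until a hard reset has been performed.
 */
int stuck = 1;
int soft_count;
int hard_count;

int fake_connect(void) { return stuck ? -116 : 0; }
void fake_soft_reset(void) { soft_count++; }
void fake_hard_reset(void) { hard_count++; stuck = 0; }
```

With the simulated modem and `soft_limit` set to 3, the first two failures trigger the light step, the third triggers the heavy step, and the fourth attempt connects. The retry delay is left out of the sketch; on target it would sit between attempts as in the original loop.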

Thanks, 

Damien

  • Hello Damien, 

    What version of the nRF9160DK, nRF Connect SDK and modem FW are you running in your setup? Does it work with a clean version of Simple MQTT?

    Kind regards,
    Øyvind

  • It's the latest DK version, and I am using SDK 1.61.

    I'm not sure if it will work with a clean version of MQTT, but the only thing I've changed is the ability to use PSM. I did wonder if the issue occurs when I start a debug session while the modem is in deep sleep or something. Even if it is something of that nature, I'd still be more comfortable with a software reset of the modem. 

  • Are you able to share the log output from the device, from start until it receives the error?

    If this is related to PSM, what happens when you remove PSM? Have you tried e.g. Asset Tracker v2? 

  • Here is an example of what I see on the VCOM port. 

    I: LTE Link Connected! 0
    +CEREG: 5,"6473","0014B70B",9,,,"00000101","00000111"
    Attempting to acquire time and date from date_time library...
    +CSCON: 0
    %XT3412: 4199999
    
    +CEREG: 5,"6473","0014B715",9,,,"00000101","00000111"
    %XMODEMSLEEP: 1,4189987
    
    *** Booting Zephyr OS build v2.6.0-rc1-ncs1  ***
    Flash regions           Domain          Permissions
    00 01 0x00000 0x10000   Secure          rwxl
    02 31 0x10000 0x100000  Non-Secure      rwxl
    
    Non-secure callable region 0 placed in flash region 1 with size 32.
    
    SRAM region             Domain          Permissions
    00 07 0x00000 0x10000   Secure          rwxl
    08 31 0x10000 0x40000   Non-Secure      rwxl
    
    Peripheral              Domain          Status
    00 NRF_P0               Non-Secure      OK
    01 NRF_CLOCK            Non-Secure      OK
    02 NRF_RTC0             Non-Secure      OK
    03 NRF_RTC1             Non-Secure      OK
    04 NRF_NVMC             Non-Secure      OK
    05 NRF_UARTE1           Non-Secure      OK
    06 NRF_UARTE2           Secure          SKIP
    07 NRF_TWIM2            Non-Secure      OK
    08 NRF_SPIM3            Non-Secure      OK
    09 NRF_TIMER0           Non-Secure      OK
    10 NRF_TIMER1           Non-Secure      OK
    11 NRF_TIMER2           Non-Secure      OK
    12 NRF_SAADC            Non-Secure      OK
    13 NRF_PWM0             Non-Secure      OK
    14 NRF_PWM1             Non-Secure      OK
    15 NRF_PWM2             Non-Secure      OK
    16 NRF_PWM3             Non-Secure      OK
    17 NRF_WDT              Non-Secure      OK
    18 NRF_IPC              Non-Secure      OK
    19 NRF_VMC              Non-Secure      OK
    20 NRF_FPU              Non-Secure      OK
    21 NRF_EGU1             Non-Secure      OK
    22 NRF_EGU2             Non-Secure      OK
    23 NRF_DPPIC            Non-Secure      OK
    24 NRF_REGULATORS       Non-Secure      OK
    25 NRF_PDM              Non-Secure      OK
    26 NRF_I2S              Non-Secure      OK
    27 NRF_GPIOTE1          Non-Secure      OK
    
    SPM: NS image at 0x10000
    SPM: NS MSP at 0x2001e3f8
    SPM: NS reset vector at 0x156f1
    SPM: prepare to jump to Non-Secure image.
    
    *** Booting Zephyr OS build v2.6.0-rc1-ncs1  ***
    The MQTT simple sample started
    I: LTE Link Connecting... 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    +CEREG: 0
    +CEREG: 2,"6473","0014B715",9
    I: Network connection attempt timed out
    I: Failed to establish LTE connection: -116
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0


    It definitely seems to happen when I restart a debug session while it's in PSM deep sleep. 

    In regards to your question, I haven't had enough time to test, because it doesn't happen all the time. It's been fine all morning today, then it just randomly did it again. 

    The next time it happens I might load the AT Client and see if I can work out what AT commands might help restart it.
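The last two quoted fields of the +CEREG lines in the log are the PSM timers granted by the network (Active-Time and Periodic-TAU). A minimal sketch of decoding them, assuming the 3GPP TS 24.008 GPRS Timer 2 coding for Active-Time and GPRS Timer 3 coding for Periodic-TAU; the function names are made up for illustration:

```c
#include <string.h>

/* Convert an 8-character bit string such as "00000111" to its value.
 * Returns -1 if the string is not exactly eight '0'/'1' characters.
 */
int bits_to_byte(const char *s)
{
	int v = 0;

	if (s == NULL || strlen(s) != 8) {
		return -1;
	}
	for (int i = 0; i < 8; i++) {
		if (s[i] != '0' && s[i] != '1') {
			return -1;
		}
		v = (v << 1) | (s[i] - '0');
	}
	return v;
}

/* Top 3 bits select the unit, low 5 bits are the multiplier.
 * unit_s[] holds seconds per unit; -1 marks "timer deactivated".
 */
static long decode_timer(const char *s, const long unit_s[8])
{
	int byte = bits_to_byte(s);

	if (byte < 0) {
		return -1;
	}
	long unit = unit_s[(byte >> 5) & 0x7];

	return unit < 0 ? -1 : unit * (byte & 0x1f);
}

/* Active-Time (T3324), assumed GPRS Timer 2 coding. Result in seconds,
 * or -1 if deactivated or malformed.
 */
long decode_active_time(const char *s)
{
	static const long unit_s[8] = { 2, 60, 360, 60, 60, 60, 60, -1 };

	return decode_timer(s, unit_s);
}

/* Periodic-TAU (extended T3412), assumed GPRS Timer 3 coding. Result in
 * seconds, or -1 if deactivated or malformed.
 */
long decode_periodic_tau(const char *s)
{
	static const long unit_s[8] = {
		600, 3600, 36000, 2, 30, 60, 1152000, -1
	};

	return decode_timer(s, unit_s);
}
```

Under these assumptions, "00000101" is 10 s of active time and "00000111" is 4200 s (70 minutes) of periodic TAU, which is consistent with the %XT3412: 4199999 line if that value is in milliseconds.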

  • The log output shows that your device reports that it's in Flight Mode. Can you share your project? Have you tried connecting to network with a standard sample?

    DamoL said:
    It's the latest DK version, and I am using SDK 1.61.

    Please provide the version number on the back of your DK.
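For reading the %XMODEMSLEEP lines in a log like this, a small parser can help. This is a sketch, assuming the notification format `%XMODEMSLEEP: <type>[,<time>]` with the type values as I understand them from the nRF91 AT command reference (1 = PSM, 2 = RF inactivity, 3 = limited service, 4 = flight mode); the function names are hypothetical.

```c
#include <stdio.h>

/* Map the assumed <type> values to readable names. */
const char *modem_sleep_type_name(int type)
{
	switch (type) {
	case 1: return "PSM";
	case 2: return "RF inactivity";
	case 3: return "limited service";
	case 4: return "flight mode";
	default: return "unknown";
	}
}

/* Parse "%XMODEMSLEEP: <type>[,<time_ms>]".
 * Returns the number of fields parsed (1 or 2), or -1 if the line is
 * not an %XMODEMSLEEP notification. *time_ms is only written when the
 * optional second field is present.
 */
int parse_xmodemsleep(const char *line, int *type, long *time_ms)
{
	int n = sscanf(line, "%%XMODEMSLEEP: %d,%ld", type, time_ms);

	return n >= 1 ? n : -1;
}
```

Applied to the log, `%XMODEMSLEEP: 4,0` parses as type 4 (flight mode) with time 0, and `%XMODEMSLEEP: 1,4189987` as type 1 (PSM).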

  • Please provide the version number on the back of your DK

    PCA10090 1.0.0 2021.25

    I assume the modem sleep callback is telling you it's in flight mode?
    Look at the log from a successful connection. It says flight mode, then it registers.

    *** Booting Zephyr OS build v2.6.0-rc1-ncs1  ***
    
    Sensor Test Start!
    
    The MQTT simple sample started
    
    I: LTE Link Connecting... 
    
    %XMODEMSLEEP: 4
    
    %XMODEMSLEEP: 4
    
    %XMODEMSLEEP: 4,0
    
    +CEREG: 2,"6473","0014B715",9
    
    +CSCON: 1
    
    +CEREG: 5,"6473","0014B715",9,,,"11100000","00101000"
    
    D: client_id = nrf-351358811387668
    
    I: LTE Link Connected! 
    
    +CEREG: 5,"6473","0014B715",9,,,"00000101","00000111"

    As for trying a standard sample: yes, and it fails to connect when in that state. A power cycle fixes it. Again, I'm pretty sure it only happens when a debug session is restarted while the modem is in PSM, but even then it doesn't happen every time.

  • DamoL said:
    I assume the modem sleep callback is telling you it's in flight mode?

    Yes, this is correct. I do not expect it to be in flight mode and then connect to an LTE network. 

    In order for us to figure out what is going on, we will need modem traces. This will show what is going on in the modem.

    You can follow this guide or use the Trace Collect v2 (preview) found in the nRF Connect for Desktop. 

    Kind regards,
    Øyvind

  • I will try and get you a trace by the end of the day. I have just spent 30+ minutes trying to get it into that state but it's connecting every time.

    Is there a way I can also view the modem trace.bin file?

  • DamoL said:
    I have just spent 30+ minutes trying to get it into that state but it's connecting every time.

    That is typical.

    DamoL said:
    Is there a way I can also view the modem trace.bin file?

    No, unfortunately, the .bin file is only readable internally. The Trace Collect v2 (preview) should provide readable Wireshark output.

  • Hello again. 

    Strangely, since my last message the modem never failed to register again, until this afternoon. Now I can't seem to get it connected at all. I had made some changes in the init functions of my program, but nothing to do with the modem, so I'm not sure why that would make a difference. 

    Here is the output on my terminal - 

    +CEREG: 2,"6473","0014B715",9
    I: Network connection attempt timed out
    I: Failed to establish LTE connection: -116
    Retrying in 60 seconds
    +CEREG: 0
    %XMODEMSLEEP: 4
    %XMODEMSLEEP: 4,0
    I: LTE Link Connecting... 0
    I: Failed to establish LTE connection: -120
    Retrying in 60 seconds
    +CEREG: 0
    

    And I have attached the bin file from the modem trace. If you could let me know what the trace says that would be helpful.

    Thanks,

    Damien
    trace-2021-12-08T15-24-41.944Z.bin
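Note that the log shows two different error codes: -116 on the first attempt and -120 on every retry. A small lookup, assuming the newlib errno numbering that Zephyr builds commonly use (the values are hard-coded because a desktop `<errno.h>` may assign different numbers; the function name is made up):

```c
/* Name the two error codes seen in the log, accepting either sign.
 * Values assume newlib numbering: 116 = ETIMEDOUT, 120 = EALREADY.
 */
const char *newlib_errno_name(int err)
{
	switch (err) {
	case -116:
	case 116:
		return "ETIMEDOUT";
	case -120:
	case 120:
		return "EALREADY";
	default:
		return "unknown";
	}
}
```

If that numbering applies here, the retries are not timing out again but returning straight away with EALREADY. One possible reading, not confirmed in this thread, is that the retry path repeats an initialisation step that only succeeds once, so the connect attempt is never actually repeated.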

  • Just been doing some more digging in the code. 

    I noticed that when it's trying to connect to a network I can actually query "AT+CEREG?" and receive information that it actually is connected to the network. The problem is that it's not always getting a CEREG notification. It fails at the point where it tries to take the link semaphore (line 661 of lte_lc.c). I wonder if some other thread is somehow blocking the notification from coming through, or how I can get around this. 

    I guess one option is to manually check the CEREG status if modem_configure() fails. If so, I assume I would also have to give the link semaphore in case that is used elsewhere. 
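The manual check could look roughly like this. It is a sketch that only covers parsing the text returned by "AT+CEREG?", assuming the read response has the form `+CEREG: <n>,<stat>[,...]` and that status 1 (home) and 5 (roaming) mean registered. Sending the AT command and giving the link semaphore are not shown, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Extract <stat> from a "+CEREG: <n>,<stat>[,...]" read response.
 * Returns the status value, or -1 if the text does not match.
 */
int cereg_read_status(const char *resp)
{
	int n;
	int stat;

	if (sscanf(resp, "+CEREG: %d,%d", &n, &stat) != 2) {
		return -1;
	}
	return stat;
}

/* Registered on the home network (1) or roaming (5). */
bool cereg_is_registered(int stat)
{
	return stat == 1 || stat == 5;
}
```

Note that the read response carries the extra `<n>` field in front of the status, unlike the unsolicited `+CEREG: 5,"6473",...` notifications shown in the logs, so the two formats need different parsing.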

  • DamoL said:
    And I have attached the bin file from the modem trace. If you could let me know what the trace says that would be helpful.

    Looking through the modem trace, it looks like your device is receiving:

    Cause #17 – Network failure
    This EMM cause is sent to the UE if the MME cannot service an UE generated request because of PLMN failures.

    What SIM are you using? Can you please run the AT Client and issue the following commands:

    AT%XSYSTEMMODE=0,1,0,2
    AT+CFUN=1
    AT+CFUN?

    What happens if you run standard Asset Tracker v2 on your board? Do you see the same issue?

    Can you leave the modem trace running for longer? 
