This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

nrf9160 modem not responding after switching to offline mode

My application uses the nrf9160 in eDRX mode. I regularly have the problem, that when switching the modem to OFFLINE mode (same for POWER-OFF mode), the modem is not responding anymore to any AT commands. I need to switch to OFFLINE mode for battery saving and for switching between Cat-M1 and Cat-NB1.

I wrote a minimalistic application for the demo board (PCA10090 0.8.5, modem Firmware 1.0.0), where I can reproduce the problem.

First the modem is configured to LTE Cat-M1 mode with an eDRX intervall of 82 seconds. Then it is set so ONLINE mode. After a connection is established or a timeout of 100 seconds expires, the modem is set to OFFLINE mode. Then this sequence is repeated.

For the application to run, the AT Command Driver and Logging with synchronous processing must be configured.

Here is the application code for the demo board:

#include <zephyr.h>
#include <stdio.h>
#include <stdlib.h>
#include <uart.h>
#include <string.h>
#include <logging/log.h>
#include <at_cmd.h>

LOG_MODULE_REGISTER(main, LOG_LEVEL_DBG);

volatile int regStatus = 0;

static void modem_NotificationHandler(char* msg)
{
  LOG_DBG("%s", msg);
  if (memcmp(msg, "+CEREG:", strlen("+CEREG:")) == 0)
  {
    char* p = msg + strlen("+CEREG:");
    regStatus = strtol(p, NULL, 10);
  }
}

void main(void)
{
  const u32_t timeout = 100;
  bool connected;
  u32_t t = 0;

	LOG_DBG("application started");

  at_cmd_set_notification_handler(modem_NotificationHandler);
  at_cmd_write("AT+CEREG=3", NULL, 0, NULL);

  while (1)
  {
    regStatus = 0;
    LOG_DBG("configuring modem");
    at_cmd_write("AT%XSYSTEMMODE=1,0,0,0", NULL, 0, NULL); // LTE Cat-M1
    at_cmd_write("AT+CEDRXS=2,4,\"0101\"", NULL, 0, NULL); // 81.92 seconds
    LOG_DBG("modem configured");
    at_cmd_write("AT+CFUN=1", NULL, 0, NULL);
    LOG_DBG("connecting . . .");
    t = 0;
    connected = false;
    while (!connected && t++ < timeout)
    {
      k_sleep(1000);
      connected = regStatus == 1 || regStatus == 5;
    }
    if (connected)
    {
      LOG_DBG("connected");
      k_sleep(10000);
    }
    else
    {
      LOG_DBG("connection timed out");
    }
    LOG_DBG("setting modem to OFFLINE");
    at_cmd_write("AT+CFUN=4", NULL, 0, NULL);
    LOG_DBG("modem set to OFFLINE");
  }
}

After the second connection cycle, when setting the modem to OFFLINE mode, the modem is not responding anymore.

Here is the output:

***** Booting Zephyr OS build v1.14.99-ncs3-snapshot2-1276-g4493a423a645 *****
[00:00:00.365,661] <dbg> main.main: application started
[00:00:00.371,734] <dbg> main.main: configuring modem
[00:00:00.379,455] <dbg> main.main: modem configured
[00:00:00.415,435] <dbg> main.main: connecting . . .
[00:00:30.800,292] <dbg> main.modem_NotificationHandler: +CEREG: 2,"0328","010D2B08",7,0,0

[00:01:05.904,235] <dbg> main.modem_NotificationHandler: +CEREG: 5,"0328","010D2B08",7

[00:01:05.912,994] <dbg> main.modem_NotificationHandler: +CEDRXP: 4,"0101","0101","0000"

[00:01:06.426,910] <dbg> main.main: connected
[00:01:16.431,854] <dbg> main.main: setting modem to OFFLINE
[00:01:16.592,407] <dbg> main.modem_NotificationHandler: +CEREG: 0,"0328","010D2B08",7,0,0

[00:01:18.002,746] <dbg> main.main: modem set to OFFLINE
[00:01:18.008,575] <dbg> main.main: configuring modem
[00:01:18.022,521] <dbg> main.main: modem configured
[00:01:18.064,727] <dbg> main.main: connecting . . .
[00:01:19.749,694] <dbg> main.modem_NotificationHandler: +CEREG: 2,"0328","01158B05",7,0,0

[00:02:58.079,345] <dbg> main.main: connection timed out
[00:02:58.085,144] <dbg> main.main: setting modem to OFFLINE

Any suggestions?

  • Hi,

     

    Cycling between CFUN=1 and CFUN=4 shouldn't be a problem. But, as mentions, please update to the latest modem fw on all boards.

    I have seen behavior like this in LTE Link Monitor, if I get many debug prints incoming and also sending commands back-and-forth, where the GUI can drag on, but haven't seen this sort of behavior when using a terminal (putty, teraterm etc).

    I would recommend checking the return value for the at_cmd calls, in case any of them fail.

    Also try setting CONFIG_LOG=y, and CONFIG_AT_CMD_LOG_LEVEL_DBG=y to get debug logs from the at_cmd module.

     

    Kind regards,

    Håkon

  • Also in Tera Term the following sequence of AT commands results in modem hang-up:

    AT+CFUN?
    AT+CEREG=3
    AT%CESQ=1
    AT%XSYSTEMMODE=1,0,0,0
    AT+CEDRXS=2,4,"0101"
    AT+CFUN=1
    <<< wait until +CEREG: 1 (or 5) shows >>>
    AT+CFUN=4
    AT+CFUN=1
    <<< wait until +CEREG: 2 shows >>>
    AT+CFUN=4
    

    AT_CMD debug messages in the application do not show anything out of the ordinary...

    Could you please try to reproduce it? Maybe you can see what's happening here.

    Our use case does not allow power cycling the device, so a hung up modem is a No-Go for our application. We really need to know what's going on here and how to prevent modem hang-up.

  • Yep.  This is surprisingly easy to reproduce.  I rebuilt at_client with debug logging enabled, loaded it on my dev kit, and did this on my second try (leftmost timestamps from TeraTerm):

    [2019-10-04 12:54:35.031] ***** Booting Zephyr OS build zephyr-v2.0.0-875-gd6e67554cfeb *****
    [2019-10-04 12:54:35.031] The AT host sample started
    [2019-10-04 12:54:36.016] [00:00:00.270,935] <dbg> at_cmd.at_cmd_driver_init: Common AT socket created
    [2019-10-04 12:54:36.016] [00:00:00.270,965] <dbg> at_cmd.at_cmd_driver_init: Common AT socket processing thread created
    [2019-10-04 12:54:36.032] [00:00:00.273,437] <dbg> at_cmd.socket_thread_fn: AT socket thread started
    [2019-10-04 12:54:37.329] AT
    [2019-10-04 12:54:37.846] OK
    [2019-10-04 12:54:38.033] [00:00:03.070,739] <dbg> at_cmd.at_write: Sending command AT
    [2019-10-04 12:54:38.049] 
    [2019-10-04 12:54:38.049] [00:00:03.079,467] <dbg> at_cmd.at_write: Bytes sent: 3
    [2019-10-04 12:54:53.861] AT+CFUN?
    [2019-10-04 12:54:53.877] +CFUN: 4
    [2019-10-04 12:54:53.877] OK
    [2019-10-04 12:54:54.049] [00:00:19.108,978] <dbg> at_cmd.at_write: Sending command AT+CFUN?
    [2019-10-04 12:54:54.064] 
    [2019-10-04 12:54:54.064] [00:00:19.117,828] <dbg> at_cmd.at_write: Bytes sent: 9
    [2019-10-04 12:55:00.743] AT+CEREG=3
    [2019-10-04 12:55:00.758] OK
    [2019-10-04 12:55:01.071] [00:00:25.990,020] <dbg> at_cmd.at_write: Sending command AT+CEREG=3
    [2019-10-04 12:55:01.071] 
    [2019-10-04 12:55:01.071] [00:00:25.998,840] <dbg> at_cmd.at_write: Bytes sent: 11
    [2019-10-04 12:55:05.292] AT%CESQ=1
    [2019-10-04 12:55:05.308] OK
    [2019-10-04 12:55:06.074] [00:00:30.533,386] <dbg> at_cmd.at_write: Sending command AT%CESQ=1
    [2019-10-04 12:55:06.090] 
    [2019-10-04 12:55:06.090] [00:00:30.542,175] <dbg> at_cmd.at_write: Bytes sent: 10
    [2019-10-04 12:55:10.404] AT%XSYSTEMMODE=1,0,0,0
    [2019-10-04 12:55:10.420] OK
    [2019-10-04 12:55:11.092] [00:00:35.654,785] <dbg> at_cmd.at_write: Sending command AT%XSYSTEMMODE=1,0,0,0
    [2019-10-04 12:55:11.107] 
    [2019-10-04 12:55:11.107] [00:00:35.665,130] <dbg> at_cmd.at_write: Bytes sent: 23
    [2019-10-04 12:55:15.952] AT+CEDRXS=2,4,"0101"
    [2019-10-04 12:55:15.952] OK
    [2019-10-04 12:55:16.108] [00:00:41.190,765] <dbg> at_cmd.at_write: Sending command AT+CEDRXS=2,4,"0101"
    [2019-10-04 12:55:16.108] 
    [2019-10-04 12:55:16.108] [00:00:41.199,645] <dbg> at_cmd.at_write: Bytes sent: 21
    [2019-10-04 12:55:21.844] AT+CFUN=1
    [2019-10-04 12:55:21.891] OK
    [2019-10-04 12:55:22.125] [00:00:47.093,719] <dbg> at_cmd.at_write: Sending command AT+CFUN=1
    [2019-10-04 12:55:22.125] 
    [2019-10-04 12:55:22.125] [00:00:47.133,117] <dbg> at_cmd.at_write: Bytes sent: 10
    [2019-10-04 12:55:31.753] AT
    [2019-10-04 12:55:32.065] OK
    [2019-10-04 12:55:32.127] [00:00:57.311,767] <dbg> at_cmd.at_write: Sending command AT
    [2019-10-04 12:55:32.143] 
    [2019-10-04 12:55:32.143] [00:00:57.312,103] <dbg> at_cmd.at_write: Bytes sent: 3
    [2019-10-04 12:56:54.363] %CESQ: 34,1
    [2019-10-04 12:56:54.363] +CEREG: 2,"0411","00F63F11",7,0,0
    [2019-10-04 12:56:56.019] +CEREG: 5,"0411","00F63F11",7
    [2019-10-04 12:56:56.035] +CEDRXP: 4,"0101","0101","0000"
    [2019-10-04 12:57:07.841] AT+CFUN=4
    [2019-10-04 12:57:08.153] [00:02:33.082,122] <dbg> at_cmd.at_write: Sending command AT+CFUN=4
    [2019-10-04 12:57:08.153] 
    [2019-10-04 12:58:05.442] 
    [2019-10-04 12:58:05.886] at
    [2019-10-04 12:58:09.858] AT
    [2019-10-04 12:58:10.427] AT

    In my case, it actually froze up when I was trying to send it the CFUN=4 to disconnect.  Based on the log messages, it looks like the call to send() the command over the socket never returned.  (I let it sit for tens of minutes...)

    I've never been able to prove it, but I've always been suspicious that the AT send command will lockup if there is data coming back the other way from a modem status message at exactly the wrong time.  Early on in our development, while we were still deciding which messages were useful or interesting, we enabled *every* unsolicited message for the modem and just logged everything.  Things used to lock up all the damn time.  Over time, we turned off most of the messages and things got more stable.  Admittedly, this was also as we were moving up through modem firmware versions, so it's likely the fixes in there also resolved most of the lockups, but I'm still suspicious...

  • Hi,

     

    Sorry for me being a bit slow. I see what you're saying now and I see the same behavior.

    So, if you get a notification or similar (Like registering AT%CESQ=1 or any other form of notification), then sending AT+CFUN=4 hangs the application afterwards.

     

    Thanks again for the detailed description, I'll make sure that this is reported back to the developers.

     

    Kind regards,

    Håkon

  • Thanks for reproducing the issue. Btw, even without registering any unsolicited notifications I am able to send the modem into Nirwana...

Related