nrf9160 takes tens of minutes to finish searching and connect to LTE after hard fault

I've had intermittent issues with the nRF9160 modem (mfw 1.3.0, iBASIS SIM) getting stuck in the "searching" state when first powered on. The symptoms are similar to this thread:

https://devzone.nordicsemi.com/f/nordic-q-a/57105/nrf9160-modem-stuck-in-searching

but different enough that it doesn't help.

After many days of debugging, I think I see a pattern:

  1. The modem works fine for a while, set in GPS + LTE mode, connecting and communicating over CAT-M1 and obtaining a GPS fix. This holds through many flash and reboot cycles as I develop the software.
  2. A hard fault occurs during execution due to a software bug. Typically a stack overflow or similar.
  3. On next boot the modem behaviour is the same, except it does not leave "searching" mode and therefore does not connect.
  4. Nothing seems to restore normal behaviour, including re-flashing; rebooting; restoring Asset Tracker, Asset Tracker v2 or the AT client firmware and running the LTE Link Monitor; removing and replacing the SIM; running every AT command I can think of, including every setting and combination of CFUN and XSYSTEMMODE; leaving the unit unpowered for a while; resetting various things in nRF Connect for Cloud.
  5. The only fix I've found is to leave the unit "searching" for around 30 minutes. The unit finally connects as normal and all is well - until I reboot. Then the problem is as it was originally.
  6. The only persistent fix I've found is to leave the unit "searching" for its requisite 30 minutes or so, then issue "AT+CFUN=0". After that I can re-flash whatever firmware I like and a connection occurs within a few seconds of power on. At least until the next hard fault...
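
The workaround in step 6 could be automated from a host script along these lines. This is only a sketch: `send_at` is a hypothetical callable (for example a thin wrapper around a serial port) that sends one AT command and returns the modem's full response text, and the polling interval and timeout are arbitrary. It treats `<stat>` values 1 and 5 in the `AT+CEREG?` read response as "registered".

```python
import re
import time

REGISTERED = {1, 5}  # +CEREG <stat>: 1 = home network, 5 = roaming


def cereg_stat(response):
    """Return <stat> from a '+CEREG: <n>,<stat>,...' read response, or None."""
    match = re.search(r"\+CEREG:\s*\d+,(\d+)", response)
    return int(match.group(1)) if match else None


def detach_when_registered(send_at, poll_s=10.0, timeout_s=3600.0, sleep=time.sleep):
    """Poll AT+CEREG? until the modem is registered, then send AT+CFUN=0.

    send_at is a hypothetical callable: command string in, response text out.
    Returns True if AT+CFUN=0 was sent, False if the timeout expired first.
    """
    waited = 0.0
    while waited <= timeout_s:
        if cereg_stat(send_at("AT+CEREG?")) in REGISTERED:
            send_at("AT+CFUN=0")
            return True
        sleep(poll_s)
        waited += poll_s
    return False
```

The `sleep` parameter exists only so the loop can be exercised without real delays.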

A relevant sample of the log from the LTE Link Monitor appears below:

AT+CFUN?

+CFUN: 1

OK

AT+CGSN=1

+CGSN: "352656106125639"

OK

AT+CGMI

Nordic Semiconductor ASA

OK

AT+CGMM

nRF9160-SICA

OK

AT+CGMR

mfw_nrf9160_1.3.0

OK

AT+CEMODE?

+CEMODE: 2

OK

AT%XCBAND=?

%XCBAND: (1,2,3,4,5,8,12,13,18,19,20,25,26,28,66)

OK

AT+CMEE?

+CMEE: 0

OK

AT+CMEE=1

OK

AT+CNEC?

+CNEC: 0

OK

AT+CNEC=24

OK

AT+CGEREP?

+CGEREP: 0,0

OK

AT+CGDCONT?

OK

AT+CGACT?

OK

AT+CGEREP=1

OK

AT+CIND=1,1,1

OK

AT+CEREG=5

OK

AT+CEREG?

+CEREG: 5,2,"20CA","0808F50D",7

OK

AT%CESQ=1

OK

AT+CESQ

+CESQ: 99,99,255,255,3,45

OK

AT%XSIM=1

OK

AT%XSIM?

%XSIM: 1

OK

AT+CPIN?

+CPIN: READY

OK

AT+CPINR="SIM PIN"

+CPINR: "SIM PIN",3

OK

AT+CIMI

204080813633144

OK

%CESQ: 45,2,3,0

AT+CESQ

+CESQ: 99,99,255,255,5,45

OK

AT+CESQ

+CESQ: 99,99,255,255,2,44
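
For readers less familiar with these responses, here is a small sketch that decodes the `+CEREG` and `+CESQ` lines from the log above. It assumes the standard 3GPP TS 27.007 field layout and index mappings (RSRP in 1 dBm bins starting at -140 dBm, RSRQ in 0.5 dB bins starting at -19.5 dB) and reports the lower edge of each bin. The function names are illustrative, and index 0 means "below the lowest bin", so the value returned for it is nominal only.

```python
import re

CEREG_STAT = {
    0: "not registered, not searching",
    1: "registered, home network",
    2: "not registered, searching",
    3: "registration denied",
    4: "unknown",
    5: "registered, roaming",
}


def decode_cereg(line):
    """Decode a '+CEREG: <n>,<stat>[,"<tac>","<ci>",<AcT>]' read response."""
    m = re.match(
        r'\+CEREG:\s*(\d+),(\d+)(?:,"([0-9A-Fa-f]+)","([0-9A-Fa-f]+)",(\d+))?',
        line,
    )
    if not m:
        return None
    stat = int(m.group(2))
    return {
        "stat": stat,
        "status": CEREG_STAT.get(stat, "other"),
        "tac": m.group(3),
        "cell_id": m.group(4),
        "act": int(m.group(5)) if m.group(5) else None,
    }


def decode_cesq(line):
    """Decode '+CESQ: <rxlev>,<ber>,<rscp>,<ecno>,<rsrq>,<rsrp>'.

    Returns (rsrq_db, rsrp_dbm) as the lower edge of each reported bin,
    with None for an index of 255 (not known or not detectable).
    """
    m = re.match(r"\+CESQ:\s*(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", line)
    if not m:
        return None
    rsrq_idx, rsrp_idx = int(m.group(5)), int(m.group(6))
    rsrq_db = None if rsrq_idx == 255 else -20.0 + 0.5 * rsrq_idx
    rsrp_dbm = None if rsrp_idx == 255 else rsrp_idx - 141
    return rsrq_db, rsrp_dbm
```

Applied to the log, `decode_cereg('+CEREG: 5,2,"20CA","0808F50D",7')` reports status "not registered, searching" on cell `0808F50D`, and `decode_cesq('+CESQ: 99,99,255,255,3,45')` gives an RSRQ of about -18.5 dB with an RSRP of about -96 dBm.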



There don't seem to be any issues with the APN, network selection or band selection, although the AT+CGDCONT? and AT+COPS? commands take many minutes to return.

AT%XCBAND
%XCBAND: 28

Otherwise there don't appear to be any reported issues at all. Even if I turn on AT+CMEE=1, AT+CGEREP=1, AT+CNEC=24 and AT+CEINFO=1, there's never any error.

Any ideas on how to skip this awful inoperable period?

  • This might be caused by your network operator blocking the device because it appears to be misbehaving. A crash might leave the existing connection neither cleaned up properly nor timed out, and the immediate reconnect (and maybe several of those per hour) could make the network mark the device as blocked for a certain amount of time.

    Try to avoid crashing your device frequently.

    Firmware updates gracefully shut down the connection, that's why that works without issues.

  • Oh really? Sounds plausible, but isn't it kind of unrealistic to expect a mobile cellular device to close connections cleanly? There are many more scenarios that would result in a connection being abandoned than deliberately closed.

    I will try to stop crashing so frequently if the Zephyr and ncs developers could just try to stop introducing so many bugs into their libraries ;-)

    Maybe what is more practical is to prevent an immediate reconnect attempt after a hard fault. Surely you don't get banned on your first crash?
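
    One way to picture "prevent an immediate reconnect attempt after a hard fault" is an exponential backoff keyed on how many boots in a row followed a fault. The sketch below shows only the decision logic, in Python for brevity; real firmware would implement it in C and would need to persist the counter across resets. The function name, the 60 second base and the 30 minute cap are all hypothetical, and whether any delay actually avoids the operator-side block is untested.

```python
def reconnect_delay_s(unclean_resets, base_s=60, cap_s=1800):
    """Seconds to wait before enabling LTE after a boot.

    unclean_resets: consecutive boots that followed a fault or watchdog
    reset rather than a clean shutdown (a counter the firmware would have
    to persist, and clear after a clean AT+CFUN=0 shutdown).
    """
    if unclean_resets <= 0:
        return 0  # clean boot: connect immediately
    # Double the wait for each further unclean reset, up to the cap.
    return min(cap_s, base_s * 2 ** (unclean_resets - 1))
```

    With these example values, the first crash costs a one minute delay and repeated crashes converge on the 30 minute cap.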
