Network stability debug help

We are running into a stability issue with our LTE connection. Our nRF9160 makes a call to our backend every 5 minutes. On start-up, the device connects and starts making these calls. After some time (90 minutes to 3 days), we stop seeing communication reach our backend. Once we power cycle the device, the calls start coming in again. The 9160 application has the watchdog enabled and is not hanging, as I'm still seeing heartbeat printouts in RTT Viewer.

I'm getting the system set up to collect modem trace files with nRF Connect Trace Collector. Are there any additional debug outputs I should be collecting to help diagnose this network connection issue?

We're using:

NCS v2.1.2

MFW v1.3.3

Zephyr 3.1.99

On AT&T network

  • Hello,

    Following up on this ticket. We've updated our code a good amount since the original posting, but are seeing possibly the same issue.

    We are seeing an issue where our LTE data stops reaching our backend. From the RTT Viewer prints, the nRF9160 dropped the network connection for ~70 minutes before reconnecting. We didn't have a trace being collected at the time, but I've set up a test board with trace capture running to catch it.

    We have a custom board that has an nRF9160 and an nRF5340. POST and GET commands are sent from the 5340 via UART to the 9160, and the 9160 returns the data from those calls to the 5340. To save battery, we suspend the UART peripherals on both ICs when not in use, and a shared GPIO acts as a wakeup when one of the ICs has data to send over UART. We use the Zephyr ‘pm_device_action_run()’ function to suspend and resume the UART peripherals.

    I will post a modem trace once it's captured, but in the meantime we're wondering 1.) if there are any known issues, or additional areas we should look into regarding the 9160 disconnecting from the network, and 2.) are there any known issues with the UART peripherals on either the 9160 or 5340 that could prevent them from exiting low power mode?

    nRF9160, nRF5340

    NCS v2.4.0

    MFW v1.3.5

    AT&T SIM

  • Hi, sorry for the late reply. 

    ERob said:
    1.) if there are any known issues, or additional areas we should look into regarding the 9160 disconnecting from the network,

    Difficult to tell without modem traces. I'm not aware of any issues. The CEREG AT command might give more information through its reject cause, which contains an EPS Mobility Management (EMM) cause value (see 3GPP TS 24.301 Annex A).
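
    As a rough host-side sketch of how that reject cause could be pulled out of a logged notification: the Python below assumes the unsolicited `+CEREG: <stat>,<tac>,<ci>,<AcT>,<cause_type>,<reject_cause>` layout described in the nRF91 AT commands reference (subscription level 5), where `<cause_type>` 0 means the reject cause is an EMM cause. The function and type names are illustrative, and the `AT+CEREG?` read response (which has a leading `<n>` field) is not handled.

```python
from typing import NamedTuple, Optional


class CeregNotification(NamedTuple):
    """Fields of one unsolicited +CEREG notification (hypothetical helper type)."""
    stat: int
    tac: Optional[str]
    cell_id: Optional[str]
    act: Optional[int]
    cause_type: Optional[int]
    reject_cause: Optional[int]


def _opt_int(text: str) -> Optional[int]:
    # Empty fields are allowed in the notification; map them to None.
    return int(text) if text else None


def parse_cereg_notification(line: str) -> Optional[CeregNotification]:
    """Parse an unsolicited '+CEREG: ...' line; return None if it does not match.

    Assumed field order: stat, tac, ci, AcT, cause_type, reject_cause.
    Any further fields (Active-Time, Periodic-TAU) are ignored.
    """
    prefix = "+CEREG:"
    line = line.strip()
    if not line.startswith(prefix):
        return None
    fields = [f.strip().strip('"') for f in line[len(prefix):].split(",")]
    fields += [""] * (6 - len(fields))  # pad short notifications such as '+CEREG: 1'
    try:
        return CeregNotification(
            stat=int(fields[0]),
            tac=fields[1] or None,
            cell_id=fields[2] or None,
            act=_opt_int(fields[3]),
            cause_type=_opt_int(fields[4]),
            reject_cause=_opt_int(fields[5]),
        )
    except ValueError:
        # Not a numeric field where one was expected; treat as unparseable.
        return None
```

    With a made-up TAC and cell ID, `parse_cereg_notification('+CEREG: 2,"002F","0012BEEF",7,0,11')` yields `stat=2`, `cause_type=0` and `reject_cause=11`.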

    ERob said:
    2.) are there any known issues with the UART peripherals on either the 9160 or 5340 that could prevent them from exiting low power mode?

    Not that I'm aware of. I will need to know more about your applications on both the nRF9160 and the nRF5340.

    Kind regards,
    Øyvind

  • Hey Oyvind,

    We were able to record a modem trace when we saw our issue (attached). Are you able to have this reviewed for any issues towards the end of the file?

    Modem_Trace_2023.11.17.mtrace

    All the best,
    Eric

  • Hi Eric, sorry for the late reply. Have not had the time to forward this to our experts. Hope to forward/progress tomorrow or Friday. 

    Kind regards,
    Øyvind

  • Hello, I apologize again for the late reply. 

    ERob said:
    Are you able to have this reviewed for any issues towards the end of the file?

    Looking through the logs, at the end of the file as you mention, there are attach rejects from the network. 

    Three EPS Mobility Management (EMM) reject causes are currently provided, as per 3GPP TS 24.301 Annex A:

    • Cause #9 – UE identity cannot be derived by the network.
      • This EMM cause is sent to the UE when the network cannot derive the UE's identity from the GUTI/S-TMSI/P-TMSI and RAI, e.g. no matching identity/context in the network or failure to validate the UE's identity due to integrity check failure of the received message.
    • Cause #11 – PLMN not allowed
      • This EMM cause is sent to the UE if it requests service, or if the network initiates a detach request, in a PLMN where the UE, by subscription or due to operator determined barring, is not allowed to operate.
    • Cause #15 – No suitable cells in tracking area
      • This EMM cause is sent to the UE if it requests service, or if the network initiates a detach request, in a tracking area where the UE, by subscription, is not allowed to operate, but when it should find another allowed tracking area or location area in the same PLMN or an equivalent PLMN.

    Sounds like a question to bring back to AT&T. Is your device stationary?
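
    To get an overview of how often each reject occurs, the cause numbers seen in the trace or log can be tallied against the three descriptions above. This is a minimal Python sketch: the names are illustrative, the table only covers the causes listed in this reply (TS 24.301 Annex A defines more), and the input is assumed to be a plain list of cause numbers already extracted from the trace.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Only the three EMM causes named in this thread; Annex A defines many more.
EMM_CAUSES = {
    9: "UE identity cannot be derived by the network",
    11: "PLMN not allowed",
    15: "No suitable cells in tracking area",
}


def describe_emm_cause(cause: int) -> str:
    """Return a short description for an EMM cause number."""
    return EMM_CAUSES.get(
        cause, f"cause #{cause} (not in this table, see 3GPP TS 24.301 Annex A)"
    )


def summarize_rejects(causes: Iterable[int]) -> List[Tuple[int, int, str]]:
    """Return (cause, count, description) rows, most frequent cause first."""
    counts = Counter(causes)
    return [(c, n, describe_emm_cause(c)) for c, n in counts.most_common()]
```

    For instance, `summarize_rejects([11, 15, 11, 9])` puts cause #11 first with a count of 2, which may help when describing the pattern to the operator.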
