nRF9151 Modem Crashed Error

Hello, I'm currently using the nRF9151 DK with modem firmware version 2.0.2 and SDK version 3.1.1.

I built my application on top of lessons 8 and 4 in the Cellular IoT Fundamentals course, using LTE, GNSS, and MQTT. GNSS is only activated by a button press, which happens rarely, roughly once every 2-3 days. I added Memfault to monitor various LTE metrics and to handle OTA updates.

The dev kit was running fine, but I noticed that it stops connecting to the cellular network after running continuously for 3-4 weeks, and it only recovers after a reset. Upon further debugging, I noticed in the log that it normally alternates between:

nRF9151_LTE: RRC mode: Connected

nRF9151_LTE: RRC mode: Idle

and periodic check-ins with Memfault, until the modem crashes:

<err> nrf_modem: Modem has crashed, reason 0x4, PC: 0x12a0e8

I got no further information. I'm not sure how to approach debugging this issue, so any help is appreciated.

If it is helpful, here is the log right before the modem crashed:

[15:07:34.666,625] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 20
[15:07:34.669,555] <inf> nRF9151_LTE: RRC mode: Idle
[15:07:34.741,027] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 16
[15:07:34.744,110] <inf> nRF9151_LTE: LTE cell changed: Cell ID: 91905, Tracking area: 2

[15:08:31.004,241] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 19
[15:08:31.007,263] <inf> nRF9151_LTE: RRC mode: Connected
[15:08:32.256,378] <dbg> mflt: memfault_platform_log: Timer task cycles: 1133858
[15:08:32.256,561] <dbg> mflt: memfault_platform_log: All tasks cycles: 353742855
[15:08:32.256,835] <dbg> mflt: memfault_platform_log: Non-idle tasks cycles: 2082471
[15:08:32.257,019] <dbg> mflt: memfault_platform_log: CPU usage: 0.58%

[15:08:32.257,141] <err> fs: mount point not found!!
[15:08:32.258,392] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: fota_work_q
[15:08:32.258,544] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: thread_analyzer
[15:08:32.258,697] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: date_time_work_q
[15:08:32.258,819] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: mcumgr smp
[15:08:32.258,972] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: mflt_http
[15:08:32.259,124] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: work_q
[15:08:32.259,155] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: sysworkq
[15:08:32.259,307] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: logging
[15:08:32.259,429] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: idle
[15:08:32.259,582] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: main
[15:08:39.847,412] <dbg> mflt: memfault_platform_log: DNS lookup for device-nrf.memfault.com = 18.211.102.48
[15:08:40.504,425] <err> nrf_modem: Modem has crashed, reason 0x4, PC: 0x12a0e8
[15:08:40.504,730] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:08:40.504,943] <err> mflt: Failed to connect socket, errno=110
[15:08:40.505,371] <err> nRF9151_LTE: FOTA check failed: -1
[15:09:40.505,157] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:10:40.505,615] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:11:40.505,950] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:12:40.506,500] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:13:40.506,835] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:14:40.507,293] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:15:22.455,993] <inf> thread_analyzer: Thread analyze:
[15:15:22.456,481] <inf> thread_analyzer:  fota_work_q         : STACK: unused 1096 usage 952 / 2048 (46 %); CPU: 0 %
[15:15:22.456,542] <inf> thread_analyzer:                      : Total CPU cycles used: 366144
[15:15:22.457,031] <inf> thread_analyzer:  thread_analyzer     : STACK: unused 1336 usage 712 / 2048 (34 %); CPU: 0 %
[15:15:22.457,061] <inf> thread_analyzer:                      : Total CPU cycles used: 247354
[15:15:22.457,550] <inf> thread_analyzer:  date_time_work_q    : STACK: unused 848 usage 1200 / 2048 (58 %); CPU: 0 %
[15:15:22.457,580] <inf> thread_analyzer:                      : Total CPU cycles used: 569
[15:15:22.458,251] <inf> thread_analyzer:  mcumgr smp          : STACK: unused 1760 usage 288 / 2048 (14 %); CPU: 0 %
[15:15:22.458,282] <inf> thread_analyzer:                      : Total CPU cycles used: 2
[15:15:22.458,679] <inf> thread_analyzer:  mflt_http           : STACK: unused 1064 usage 984 / 2048 (48 %); CPU: 0 %
[15:15:22.458,831] <inf> thread_analyzer:                      : Total CPU cycles used: 102262
[15:15:22.459,197] <inf> thread_analyzer:  work_q              : STACK: unused 816 usage 1232 / 2048 (60 %); CPU: 0 %
[15:15:22.459,228] <inf> thread_analyzer:                      : Total CPU cycles used: 335116
[15:15:22.460,113] <inf> thread_analyzer:  sysworkq            : STACK: unused 2744 usage 1352 / 4096 (33 %); CPU: 0 %
[15:15:22.460,266] <inf> thread_analyzer:                      : Total CPU cycles used: 394472014
[15:15:22.460,784] <inf> thread_analyzer:  logging             : STACK: unused 1368 usage 680 / 2048 (33 %); CPU: 0 %
[15:15:22.460,815] <inf> thread_analyzer:                      : Total CPU cycles used: 224707607
[15:15:22.461,212] <inf> thread_analyzer:  idle                : STACK: unused 272 usage 48 / 320 (15 %); CPU: 99 %
[15:15:22.461,242] <inf> thread_analyzer:                      : Total CPU cycles used: 125685407577
[15:15:22.834,655] <inf> thread_analyzer:  main                : STACK: unused 2928 usage 1168 / 4096 (28 %); CPU: 0 %
[15:15:22.834,777] <inf> thread_analyzer:                      : Total CPU cycles used: 3702060
[15:15:22.835,296] <inf> thread_analyzer:  ISR0                : STACK: unused 1520 usage 528 / 2048 (25 %)
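
The log above shows the application retrying every 60 seconds after the crash without ever escalating. As a minimal sketch of one possible escalation policy (not what the application currently does), the decision could be separated into a small pure function. The names `recovery_next_action`, `REINIT_AFTER_FAILURES` and `REBOOT_AFTER_FAILURES`, and the threshold values, are hypothetical. On the device, the re-init step would presumably map to `nrf_modem_lib_shutdown()` followed by `nrf_modem_lib_init()` from thread context, and the reboot step to `sys_reboot()`; that mapping is an assumption and is only indicated in comments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical recovery actions; illustrative names, not an nRF Connect SDK API. */
enum recovery_action {
    RECOVERY_RETRY_CONNECT, /* plain reconnect attempt, as in the log above */
    RECOVERY_REINIT_MODEM,  /* assumed: nrf_modem_lib_shutdown() + nrf_modem_lib_init() */
    RECOVERY_REBOOT         /* assumed: sys_reboot(SYS_REBOOT_COLD) as a last resort */
};

/* Illustrative thresholds; tune them for the real reconnect interval. */
#define REINIT_AFTER_FAILURES 3U
#define REBOOT_AFTER_FAILURES 6U

/*
 * Decide what to do before the next reconnect attempt.
 * consecutive_failures: failed attempts since the last successful connection.
 * modem_fault_seen: true if a "Modem has crashed" fault was reported.
 */
enum recovery_action recovery_next_action(uint32_t consecutive_failures,
                                          bool modem_fault_seen)
{
    if (consecutive_failures >= REBOOT_AFTER_FAILURES) {
        return RECOVERY_REBOOT;
    }
    if (modem_fault_seen || consecutive_failures >= REINIT_AFTER_FAILURES) {
        return RECOVERY_REINIT_MODEM;
    }
    return RECOVERY_RETRY_CONNECT;
}
```

With these thresholds, a reported modem fault triggers a modem re-initialization on the next attempt instead of another plain retry, and six consecutive failures trigger a reboot. Keeping the policy free of SDK calls makes it easy to unit-test on a host.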

  • Hi Dejan,

    After the addition of which functionality (LTE, MQTT, GNSS, OTA, Memfault diagnostics) have you first seen "Modem has crashed" error?

    - I only do extended testing once all of these are implemented, so I'm not sure. Memfault OTA was added after I first saw the crash, because I wanted metrics that might show what caused it, but I still haven't been able to figure out the cause.

    How did the modem recover after crashing?

    - The modem did not recover. My program could not establish any further LTE connection until I triggered a reset with the RESET button.

    Is this error reproducible on your end?

    - Yes, I have reproduced this error twice so far. Note that it takes a long time to occur: the first crash happened after about 3 weeks of running, and the second after about 5 weeks. I noticed that the more often I switch between GNSS and LTE, the sooner the crash occurs. In the first run, I requested a GNSS fix once per day; in the second run, once every 3-4 days.

    What happens if you remove GNSS and OTA functionalities? Do you still see the same error?

    - To be honest, this is hard to test because the issue only appears after an extended period of running. Do you have any recommendations on which logs or monitoring methods I should enable before running these tests again?
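
One low-effort option for a multi-week run, assuming the UART log is captured to a file on a host PC, is to scan that file for the crash line and pull out the reason code and program counter so they can be reported together with the trace. The sketch below is host-side C and only relies on the line format seen in the log above; `struct modem_crash_info` and `parse_modem_crash_line` are hypothetical names.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Values carried by a "Modem has crashed" log line. */
struct modem_crash_info {
    unsigned int reason; /* fault reason code, e.g. 0x4 in the log above */
    unsigned int pc;     /* modem program counter, e.g. 0x12a0e8 */
};

/*
 * Parse one captured log line. Returns true and fills *out only if the
 * line contains a complete crash record; *out is untouched otherwise.
 */
bool parse_modem_crash_line(const char *line, struct modem_crash_info *out)
{
    static const char marker[] = "Modem has crashed, reason ";
    unsigned int reason;
    unsigned int pc;

    const char *p = strstr(line, marker);
    if (p == NULL) {
        return false;
    }

    /* %x accepts an optional "0x" prefix, matching the log format. */
    if (sscanf(p + strlen(marker), "%x, PC: %x", &reason, &pc) != 2) {
        return false;
    }

    out->reason = reason;
    out->pc = pc;
    return true;
}
```

A caller would read the capture line by line (for example with `fgets`) and print the timestamped line whenever the function returns true.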

  • Hi,

    Can you provide a modem trace?

    Best regards,
    Dejan

  • Do you want me to re-run the program with modem trace enabled, wait for the issue to reproduce, and then send you the trace, or should I just enable tracing and send a trace now?

  • Hi,

    Hopefully, you can enable tracing and provide the modem trace captured when the issue occurs.

    Best regards,
    Dejan
