nRF9151 Modem Crashed Error

Hello, I'm currently using the nRF9151 DK with modem firmware version 2.0.2 and SDK version 3.1.1.

I built my application on top of lessons 8 and 4 in the Cellular IoT Fundamentals course, using LTE, GNSS, and MQTT. GNSS is only activated by a button press, which doesn't happen often, maybe every 2-3 days. I added Memfault to monitor different LTE metrics and for OTA.

The dev kit was running fine, but I noticed that it stopped connecting to cellular after running continuously for 3-4 weeks, and it did not reconnect until it was reset. Upon further debugging, I noticed in the log that it normally alternates between:

nRF9151_LTE: RRC mode: Connected

nRF9151_LTE: RRC mode: Idle

and periodic check-ins with Memfault, until the modem crashed:

<err> nrf_modem: Modem has crashed, reason 0x4, PC: 0x12a0e8

I got no further information. I'm not sure how to approach debugging this issue; any help is appreciated.

If it is helpful, here is the log right before the modem crashed:

[15:07:34.666,625] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 20
[15:07:34.669,555] <inf> nRF9151_LTE: RRC mode: Idle
[15:07:34.741,027] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 16
[15:07:34.744,110] <inf> nRF9151_LTE: LTE cell changed: Cell ID: 91905, Tracking area: 2

[15:08:31.004,241] <dbg> memfault_ncs_metrics: lte_trace_cb: LTE trace: 19
[15:08:31.007,263] <inf> nRF9151_LTE: RRC mode: Connected
[15:08:32.256,378] <dbg> mflt: memfault_platform_log: Timer task cycles: 1133858
[15:08:32.256,561] <dbg> mflt: memfault_platform_log: All tasks cycles: 353742855
[15:08:32.256,835] <dbg> mflt: memfault_platform_log: Non-idle tasks cycles: 2082471
[15:08:32.257,019] <dbg> mflt: memfault_platform_log: CPU usage: 0.58%

[15:08:32.257,141] <err> fs: mount point not found!!
[15:08:32.258,392] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: fota_work_q
[15:08:32.258,544] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: thread_analyzer
[15:08:32.258,697] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: date_time_work_q
[15:08:32.258,819] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: mcumgr smp
[15:08:32.258,972] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: mflt_http
[15:08:32.259,124] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: work_q
[15:08:32.259,155] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: sysworkq
[15:08:32.259,307] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: logging
[15:08:32.259,429] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: idle
[15:08:32.259,582] <dbg> memfault_ncs_metrics: stack_check: Not relevant stack: main
[15:08:39.847,412] <dbg> mflt: memfault_platform_log: DNS lookup for device-nrf.memfault.com = 18.211.102.48
[15:08:40.504,425] <err> nrf_modem: Modem has crashed, reason 0x4, PC: 0x12a0e8
[15:08:40.504,730] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:08:40.504,943] <err> mflt: Failed to connect socket, errno=110
[15:08:40.505,371] <err> nRF9151_LTE: FOTA check failed: -1
[15:09:40.505,157] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:10:40.505,615] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:11:40.505,950] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:12:40.506,500] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:13:40.506,835] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:14:40.507,293] <inf> nRF9151_LTE: Reconnecting in 60 seconds...
[15:15:22.455,993] <inf> thread_analyzer: Thread analyze:
[15:15:22.456,481] <inf> thread_analyzer:  fota_work_q         : STACK: unused 1096 usage 952 / 2048 (46 %); CPU: 0 %
[15:15:22.456,542] <inf> thread_analyzer:                      : Total CPU cycles used: 366144
[15:15:22.457,031] <inf> thread_analyzer:  thread_analyzer     : STACK: unused 1336 usage 712 / 2048 (34 %); CPU: 0 %
[15:15:22.457,061] <inf> thread_analyzer:                      : Total CPU cycles used: 247354
[15:15:22.457,550] <inf> thread_analyzer:  date_time_work_q    : STACK: unused 848 usage 1200 / 2048 (58 %); CPU: 0 %
[15:15:22.457,580] <inf> thread_analyzer:                      : Total CPU cycles used: 569
[15:15:22.458,251] <inf> thread_analyzer:  mcumgr smp          : STACK: unused 1760 usage 288 / 2048 (14 %); CPU: 0 %
[15:15:22.458,282] <inf> thread_analyzer:                      : Total CPU cycles used: 2
[15:15:22.458,679] <inf> thread_analyzer:  mflt_http           : STACK: unused 1064 usage 984 / 2048 (48 %); CPU: 0 %
[15:15:22.458,831] <inf> thread_analyzer:                      : Total CPU cycles used: 102262
[15:15:22.459,197] <inf> thread_analyzer:  work_q              : STACK: unused 816 usage 1232 / 2048 (60 %); CPU: 0 %
[15:15:22.459,228] <inf> thread_analyzer:                      : Total CPU cycles used: 335116
[15:15:22.460,113] <inf> thread_analyzer:  sysworkq            : STACK: unused 2744 usage 1352 / 4096 (33 %); CPU: 0 %
[15:15:22.460,266] <inf> thread_analyzer:                      : Total CPU cycles used: 394472014
[15:15:22.460,784] <inf> thread_analyzer:  logging             : STACK: unused 1368 usage 680 / 2048 (33 %); CPU: 0 %
[15:15:22.460,815] <inf> thread_analyzer:                      : Total CPU cycles used: 224707607
[15:15:22.461,212] <inf> thread_analyzer:  idle                : STACK: unused 272 usage 48 / 320 (15 %); CPU: 99 %
[15:15:22.461,242] <inf> thread_analyzer:                      : Total CPU cycles used: 125685407577
[15:15:22.834,655] <inf> thread_analyzer:  main                : STACK: unused 2928 usage 1168 / 4096 (28 %); CPU: 0 %
[15:15:22.834,777] <inf> thread_analyzer:                      : Total CPU cycles used: 3702060
[15:15:22.835,296] <inf> thread_analyzer:  ISR0                : STACK: unused 1520 usage 528 / 2048 (25 %)
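
When a capture spans several weeks, it can help to pull the crash line out automatically so the reason and PC can be attached to a support ticket. Below is a minimal host-side sketch in C, not part of the application above. The helper name `parse_modem_crash` and the `crash_info` struct are hypothetical, and the sketch assumes the line format is exactly the one shown in the log ("Modem has crashed, reason 0x4, PC: 0x12a0e8").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields carried by the "Modem has crashed" log line (hypothetical struct). */
struct crash_info {
    unsigned int reason; /* hex value printed after "reason" */
    unsigned int pc;     /* hex value printed after "PC:"    */
};

/*
 * Hypothetical helper: returns 1 and fills *out if `line` contains a modem
 * crash report in the format seen in the log, otherwise returns 0 and leaves
 * *out untouched.
 */
int parse_modem_crash(const char *line, struct crash_info *out)
{
    static const char marker[] = "Modem has crashed, reason ";
    unsigned int reason;
    unsigned int pc;
    const char *p;

    assert(line != NULL && out != NULL);

    p = strstr(line, marker);
    if (p == NULL) {
        return 0;
    }

    /* The %x conversion accepts an optional "0x" prefix. */
    if (sscanf(p + strlen(marker), "%x, PC: %x", &reason, &pc) != 2) {
        return 0;
    }

    out->reason = reason;
    out->pc = pc;
    return 1;
}
```

Feeding each line of a saved log through `parse_modem_crash()` would flag the entry at 15:08:40.504 and skip the surrounding "Reconnecting in 60 seconds..." lines.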

  • Hi,

    After the addition of which functionality (LTE, MQTT, GNSS, OTA, Memfault diagnostics) have you first seen "Modem has crashed" error?

    How did the modem recover after crashing?

    Is this error reproducible on your end?

    What happens if you remove GNSS and OTA functionalities? Do you still see the same error?

    Best regards,
    Dejan

  • Hi Dejan,

    After the addition of which functionality (LTE, MQTT, GNSS, OTA, Memfault diagnostics) have you first seen "Modem has crashed" error?

    - I only do extended testing once all of these are implemented, so I'm not sure. Memfault OTA was added later, after the crash issue first appeared, because I wanted to see the metrics and find out what caused it, but I am still unable to figure out the cause.

    How did the modem recover after crashing?

    - The modem did not recover; my program could not establish any further LTE connection until I triggered a reset with the RESET button.

    Is this error reproducible on your end?

    - Yes, I have reproduced this error twice so far. Note that it takes quite a long time to happen: the first crash occurred after the device had run for 3 weeks, and the second after about 5 weeks. I noticed that if I switch back and forth between GNSS and LTE more often, the time until the crash decreases. In the first run I requested a GNSS fix once per day; in the second run, once every 3-4 days.

    What happens if you remove GNSS and OTA functionalities? Do you still see the same error?

    - To be honest, this issue is hard to reproduce because it only shows up after running for an extended period of time. Any recommendations on what kind of logging or monitoring I should enable before running these tests again?
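
Since the log shows the application repeating "Reconnecting in 60 seconds..." indefinitely after the crash, one thing that may be worth adding before the next long run is an escalation policy, so that a crashed modem no longer depends on someone pressing RESET. The following is only a sketch of such a policy in plain C. The names `link_state`, `recovery_action`, and `choose_recovery`, as well as both thresholds, are hypothetical. On the device, the "reinit" step would presumably map to shutting down and re-initializing the modem library (`nrf_modem_lib_shutdown()` followed by `nrf_modem_lib_init()`) from thread context, and the "reboot" step to `sys_reboot()`; that mapping is an assumption and has not been tested against this application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the application should do next (hypothetical enum). */
enum recovery_action {
    ACTION_RETRY_CONNECT, /* keep the existing 60-second reconnect loop */
    ACTION_REINIT_MODEM,  /* shut the modem library down and start it again */
    ACTION_REBOOT_SYSTEM  /* last resort, same effect as the RESET button */
};

/* Snapshot of the link, updated by the application (hypothetical struct). */
struct link_state {
    bool modem_faulted;             /* set when a modem fault was reported */
    unsigned int failed_reconnects; /* consecutive failed reconnect attempts */
    unsigned int reinit_attempts;   /* modem re-inits tried since last success */
};

/* Hypothetical thresholds; tune them for the actual deployment. */
#define MAX_RETRIES_BEFORE_REINIT 3u
#define MAX_REINITS_BEFORE_REBOOT 2u

/* Decide the next step after a failed reconnect or a reported modem fault. */
enum recovery_action choose_recovery(const struct link_state *s)
{
    assert(s != NULL);

    /* No fault and only a few failures: plain retry is still reasonable. */
    if (!s->modem_faulted &&
        s->failed_reconnects < MAX_RETRIES_BEFORE_REINIT) {
        return ACTION_RETRY_CONNECT;
    }

    /* Re-initializing did not help often enough: give up and reboot. */
    if (s->reinit_attempts >= MAX_REINITS_BEFORE_REBOOT) {
        return ACTION_REBOOT_SYSTEM;
    }

    return ACTION_REINIT_MODEM;
}
```

With these thresholds, a reported modem fault goes straight to a modem re-init instead of waiting through further 60-second retries, and two unsuccessful re-inits escalate to a reboot. Logging each decision (for example as a Memfault metric) would also record how often recovery was needed during a multi-week run.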

  • Hi,

    Can you provide a modem trace?

    Best regards,
    Dejan
