nRF9151 modem lock up, can't reprogram

While debugging our application firmware on the nRF9151, we ran into issues that we suspected were caused by an incorrect modem clock breaking TLS certificate validation. To reproduce the errors, we added calls to date_time_set in our firmware to deliberately desynchronize the modem clock. The date time library was used with CONFIG_DATE_TIME_MODEM=y, and the clock was set to 2015-01-01T00:00:00Z, the earliest time the library would allow.

After this call, the modem core crashed due to a failed assertion and emitted the following log:
[00:00:13.016,876] <err> nrf_modem: Modem has crashed, reason 0x10, PC: 0x3bd84

On subsequent attempts to reset the device, the application firmware hung during initialization. Attaching gdb revealed that the main thread was deadlocked in nrf_modem_lib_init, specifically in an internal call to k_sem_take in rpc_transport_ipc_init. At one point, continuing execution in gdb allowed the application to move past the blocking semaphore, yielding the following logs from the modem:

[00:00:00.700,469] <err> nrf_modem: Modem has crashed, reason 0x11, PC: 0x3b97e
[00:00:00.701,019] <err> nrf_modem: Modem library initialization failed, err -5
[00:00:00.701,049] <err> nrf_modem: Modem library did not initialize: -5
[00:00:00.701,049] <err> modem_antenna: Modem library did not initialize: -5
[00:00:00.701,080] <err> lte_lc: Modem library init error: -5, lte_lc not initialized

The modem initialization issues persist through both software resets and hard power cycles of the unit. Currently the unit is back to deadlocking inside the modem library. Although we have modem trace over RTT enabled, no data is being transferred. Attempting to reprogram the modem firmware with nrfutil stalls at "Setting up modem IPC" with the following error:

[00:00:10] ------   0% [1/2 1051205126] Failed, Timed out waiting for IPC event on RootKeyDigest channel
Error: One or more program tasks failed:
 * 1051205126: Timed out waiting for IPC event on RootKeyDigest channel (Generic)

Is this unit bricked? Are there any other recovery steps that I haven't taken? Any advice would be appreciated. This board is running modem firmware v2.0.4.

  • Hi Redrield!

    Thanks for reaching out. The modem should not behave like this. I've reached out to the developers to see if they can provide some insight.

    Could you share the exact nrfutil command you used to program the modem firmware?

    Best regards,
    Carl Richard


  • Hi Carl,

    I tried to program the modem with `nrfutil device program --firmware mfw_nrf91x1_2.0.4.zip`

  • Hi again!

    Thanks. Could you also share the program that triggered this issue? I want to try to reproduce it.

    We can make the ticket private if you don't want to share this publicly.

    Best regards,
    Carl Richard

  • Our firmware is a lot more expansive than the part that triggered the bug, so I'll summarize it here.

    1. Initialize the modem library and connect to LTE. We do this the same way as the samples: lte_lc_connect_async, with a semaphore given in the callback once the device is connected.
    2. Set up a socket to connect over TLS to some server. The target shouldn't matter.
    3. Before the calls to `socket` and `connect`, I added the following:

    struct tm tm = {
        .tm_sec = 0,
        .tm_min = 0,
        .tm_hour = 0,
        .tm_mon = 0,
        .tm_mday = 1,
        .tm_year = 115,
    }; // Defined above the function call, timestamp of 2015-01-01T00:00:00Z

    date_time_set(&tm);
    sock = socket(...);
    ...
    date_time_set(&tm);
    connect(sock, ...);

    With the Kconfig options CONFIG_DATE_TIME=y and CONFIG_DATE_TIME_MODEM=y (to ensure the timestamp update is sent to the modem).

    Both calls to date_time_set completed successfully, and the modem panic occurred almost immediately afterwards.
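For anyone trying to reproduce this, the following is a minimal host-side sketch in plain C (no Zephyr or nRF Connect SDK dependencies) that checks that the struct tm fields above really encode 2015-01-01T00:00:00Z. The helper names days_from_civil, tm_to_unix_utc and make_test_time are hypothetical and are not part of the date time library; the conversion uses a standard civil-date-to-days calculation and assumes the fields are interpreted as UTC.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Days between 1970-01-01 and the given Gregorian date (month is 1..12). */
static int64_t days_from_civil(int64_t y, int m, int d)
{
    assert(m >= 1 && m <= 12);
    y -= (m <= 2);                                   /* treat Jan/Feb as months of the previous year */
    const int64_t era = (y >= 0 ? y : y - 399) / 400;
    const int64_t yoe = y - era * 400;               /* year of era, 0..399 */
    const int64_t mp  = (m > 2) ? m - 3 : m + 9;     /* March = 0 ... February = 11 */
    const int64_t doy = (153 * mp + 2) / 5 + d - 1;  /* day of (shifted) year */
    const int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Convert a broken-down UTC time to seconds since the Unix epoch. */
static int64_t tm_to_unix_utc(const struct tm *tm)
{
    int64_t days = days_from_civil((int64_t)tm->tm_year + 1900,
                                   tm->tm_mon + 1, tm->tm_mday);
    return days * 86400 + tm->tm_hour * 3600 + tm->tm_min * 60 + tm->tm_sec;
}

/* Same field values as the struct tm passed to date_time_set in the post. */
static struct tm make_test_time(void)
{
    struct tm tm = {
        .tm_sec = 0,
        .tm_min = 0,
        .tm_hour = 0,
        .tm_mon = 0,    /* January (months are 0-based) */
        .tm_mday = 1,
        .tm_year = 115, /* years since 1900, i.e. 2015 */
    };
    return tm;
}
```

Calling tm_to_unix_utc on the value returned by make_test_time should give 1420070400, the Unix timestamp for 2015-01-01T00:00:00Z, which can be compared against whatever time the modem reports after the call.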

  • Thanks. 

    I've tried to reproduce the issue here using the https_client sample, without success. A couple of things you can try while we wait for a response from the developers:

    • Recovering the DK as described here to ensure that it's not an issue with the application processor.
    • After this, try to test the AT Client sample to see if you can communicate with the modem at all. You can also try programming the MFW again.

    Also: are you using a custom board or an nRF9151 DK? In the latter case, which version?

    Best regards,
    Carl Richard
