nRF9160 modem hard fault debugging

  • Hello,

My nRF9160 hard faults (I think) when issuing a command to the modem, specifically at this line:

query_modem("AT+CGSN", imei_buf, sizeof(imei_buf));

which then calls the NCS function:

/**
 * @brief Send a formatted AT command to the modem
 *	  and receive the response into the supplied buffer.
 *
 * @param buf Buffer to receive the response into.
 * @param len Buffer length.
 * @param fmt Command format.
 * @param ... Format arguments.
 *
 * @retval  0 On "OK" responses.
 * @returns A positive value On "ERROR", "+CME ERROR", and "+CMS ERROR" responses.
 *	    The type of error can be distinguished using @c nrf_modem_at_err_type.
 *	    The error value can be retrieved using @c nrf_modem_at_err.
 * @retval -NRF_EPERM The Modem library is not initialized.
 * @retval -NRF_EFAULT @c buf or @c fmt are @c NULL.
 * @retval -NRF_ENOMEM Not enough shared memory for this request.
 * @retval -NRF_E2BIG The response is larger than the supplied buffer @c buf.
 * @retval -NRF_EINVAL If @c len is zero.
 */
int nrf_modem_at_cmd(void *buf, size_t len, const char *fmt, ...);
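
For reference, the return convention documented above can be sketched as a small helper. This is only an illustration: `at_result_kind` and `classify_at_result` are hypothetical names and not part of the modem library.

```c
/* Hypothetical helper (not part of nrf_modem): sorts the return value of
 * nrf_modem_at_cmd() into the three classes described in its doc comment. */
enum at_result_kind {
	AT_RESULT_OK,          /* 0: the modem answered "OK" */
	AT_RESULT_MODEM_ERROR, /* > 0: "ERROR", "+CME ERROR" or "+CMS ERROR" */
	AT_RESULT_LIB_ERROR    /* < 0: negative NRF_E* code from the library */
};

enum at_result_kind classify_at_result(int ret)
{
	if (ret == 0) {
		return AT_RESULT_OK;
	}
	return (ret > 0) ? AT_RESULT_MODEM_ERROR : AT_RESULT_LIB_ERROR;
}
```

In the application, the value to classify would come from a call such as `nrf_modem_at_cmd(imei_buf, sizeof(imei_buf), "AT+CGSN")`. Logging that result right after the call shows whether the function returned at all before the reset.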

This leads to what I think is a hard fault, and the board resets. The modem never returns any values, so I can't see what is happening.

NCS 1.9.1

mfw 1.3.2

What might cause this?

How can I debug this?

Thank you

  • This is what I can extract when the debugger hits the breakpoint in VS Code:


    0x20002ea0 <z_interrupt_stacks+1984>

  • Hi,

     

    If you convert the decimal numbers to hex, specifically the contents of LR and PC, you should be able to do a lookup of the addresses using arm-none-eabi-addr2line:

    LR: arm-none-eabi-addr2line -e path/to/build/zephyr/zephyr.elf 0x4a17b

    PC: arm-none-eabi-addr2line -e path/to/build/zephyr/zephyr.elf 0x27326

     

    The content of R2, 537000870 (0x2001fba6), is an address in RAM. It must be looked up manually in build/zephyr/zephyr.map to see which section, thread, etc. it belongs to.
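
    As a rough sketch of the conversion and range check described here: the helper below prints a register value as hex and says whether it falls in flash or RAM. The memory map (1 MB flash at 0x00000000, 256 KB SRAM at 0x20000000) is an assumption about the nRF9160 and should be checked against the product specification. `mem_region`, `classify_address` and `print_register` are hypothetical names.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

enum mem_region { REGION_FLASH, REGION_RAM, REGION_OTHER };

/* Assumed nRF9160 memory map (verify against the product specification):
 * flash 0x00000000..0x000FFFFF, SRAM 0x20000000..0x2003FFFF. */
enum mem_region classify_address(uint32_t addr)
{
	if (addr < 0x00100000u) {
		return REGION_FLASH;
	}
	if (addr >= 0x20000000u && addr < 0x20040000u) {
		return REGION_RAM;
	}
	return REGION_OTHER;
}

/* Print a register value copied from the debugger (decimal or hex literal)
 * as hex, together with the region it points into. */
void print_register(const char *name, uint32_t value)
{
	static const char *const region_names[] = { "flash", "RAM", "other" };

	printf("%s = 0x%08" PRIx32 " (%s)\n", name, value,
	       region_names[classify_address(value)]);
}
```

    Values that land in flash (such as LR and PC) are candidates for arm-none-eabi-addr2line, while values in RAM are the ones to search for in zephyr.map.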

     

    Could you post the output of the above?

     

    Kind regards,

    Håkon

  • Thank you, this is the output:

    LR: ncs/v1.9.1/zephyr/include/arch/arm/aarch32/asm_inline_gcc.h:95
    PC: ncs/v1.9.1/zephyr/include/drivers/adc.h:386

    I got R2 = 0xD (hex) this time I ran it, and in zephyr.map I found 8 occurrences of this value:

     .rodata.lwm2m_engine_delete_obj_inst.str1.1
                    0x00000000        0xd zephyr/subsys/net/lib/lwm2m/libsubsys__net__lib__lwm2m.a(lwm2m_engine.c.obj)
                    
    .rodata.str1.1
                    0x00058cf4        0xd zephyr/libzephyr.a(stream_flash.c.obj)
                    
    .rodata.cmd_read.str1.1
                    0x0005d6b3        0xd zephyr/drivers/flash/libdrivers__flash.a(flash_shell.c.obj)
                    
    .rodata.str1.1
                    0x0005da85        0xd modules/nrf/lib/nrf_modem_lib/lib..__nrf__lib__nrf_modem_lib.a(nrf91_sockets.c.obj)
                    
    .rodata.fota_update_counter_update.str1.1
                    0x0005fcbd        0xd modules/nrf/subsys/net/lib/lwm2m_client_utils/lib..__nrf__subsys__net__lib__lwm2m_client_utils.a(settings.c.obj)
                    
    .rodata.str1.1
                    0x00060328        0xe modules/nrf/subsys/net/lib/lwm2m_client_utils/lib..__nrf__subsys__net__lib__lwm2m_client_utils.a(lwm2m_device.c.obj)
                                      0xd (size before relaxing)
                                      
                                      
     .rodata.str1.1
                    0x00060c47        0xd modules/mcuboot/boot/bootutil/zephyr/libmcuboot_util.a(bootutil_public.c.obj)
                    
    .rodata._dtoa_r.str1.1
                    0x00061de4        0xd c:/Users/Robert/ncs/v1.9.1/toolchain/opt/arm-none-eabi/lib/thumb/v8-m.main/nofp\libc_nano.a(lib_a-dtoa.o)
                                      0xf (size before relaxing)

  • Hi,

     

    Robert K said:
    I got R2 = 0xD (hex) this time I ran it, and in zephyr.map I found 8 occurrences of this value:

    0xd can just be a random value that was used within one scope; it is not guaranteed to resolve back to a specific function or RAM area. Note also that the matches you found are in the size column of the map file, not the address column.

     

    Robert K said:

    Thank you, this is the output:

    LR: ncs/v1.9.1/zephyr/include/arch/arm/aarch32/asm_inline_gcc.h:95
    PC: ncs/v1.9.1/zephyr/include/drivers/adc.h:386

    Was this output with the latest assert messages? Each time you compile and flash, the content of the CPU registers will change.

    If so, it indicates that the assert is hit while accessing the ADC or unlocking an IRQ:

    https://github.com/nrfconnect/sdk-zephyr/blob/v2.7.99-ncs1-1/include/arch/arm/aarch32/asm_inline_gcc.h#L95

    https://github.com/nrfconnect/sdk-zephyr/blob/v2.7.99-ncs1/include/drivers/adc.h#L386

    Which thread was running? Have you tried debugging to see between which functions it faults? I.e., set a breakpoint, see if it hits, step to the next function, and so on until it faults.

     

    Kind regards,

    Håkon

    Yes, this was the latest output.

    I'm not quite sure how to see which thread was running. I'm using VS Code with the nRF Connect extensions.

    If I debug with Ozone and save the snapshot after the fault, would that be of any use? 
