Having issues saving a coredump to flash, or at all

Hi Nordic

I am working with the nRF52840 and nRF52832 using NCS v2.8.0.

I am trying to save a coredump to flash according to the instructions at this link: https://docs.nordicsemi.com/bundle/ncs-2.8.0/page/zephyr/services/debugging/coredump.html

I added this to my pm_static_my_board.yml

coredump_partition:
  address: 0xCF000
  size: 0x8000
  region: flash_primary

And this to my_board.overlay

&flash0 {
    /*
     * For more information, see:
     * http://docs.zephyrproject.org/latest/guides/dts/index.html#flash-partitions
     */
    partitions {
        compatible = "fixed-partitions";
        #address-cells = <1>;
        #size-cells = <1>;

      ...
        /* Not a legitimate address (end of flash), but it is not taken into
         * account because pm_static is used instead. */
        coredump_partition: partition@000080000 {
            label = "coredump-partition";
            reg = <0x000080000 DT_SIZE_K(4)>;
        };
    };
};

As a side note, it seems strange that I need to set this in the overlay, which is basically ignored because the pm_static partitions are the ones that actually matter (unless I got something wrong?).
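Since the Partition Manager file is what actually places the partition, it can help to sanity-check its numbers against the chip's flash. Below is a minimal sketch in plain C that runs on a PC, not on the target. It assumes 1 MB of flash for the nRF52840, 512 KB for the nRF52832, and 4 KB pages; `partition_fits()` is a hypothetical helper, not an NCS or Zephyr API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flash geometry; check the product specification for your variant. */
#define NRF52840_FLASH_SIZE 0x100000u /* 1 MB */
#define NRF52832_FLASH_SIZE 0x80000u  /* 512 KB */
#define NRF52_PAGE_SIZE     0x1000u   /* 4 KB erase page */

/* Hypothetical helper: true if [address, address + size) is non-empty,
 * page aligned, and lies entirely inside a flash of flash_size bytes. */
bool partition_fits(uint32_t address, uint32_t size, uint32_t flash_size)
{
    assert(flash_size % NRF52_PAGE_SIZE == 0);

    if (size == 0 || address % NRF52_PAGE_SIZE != 0 ||
        size % NRF52_PAGE_SIZE != 0) {
        return false;
    }
    /* Written this way to avoid overflow in address + size. */
    return address < flash_size && size <= flash_size - address;
}
```

With the pm_static values above, `partition_fits(0xCF000, 0x8000, NRF52840_FLASH_SIZE)` is true, while the same call with `NRF52832_FLASH_SIZE` is false. Under these assumptions, an nRF52832 build that reused the same file would need a different address.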

And these configs to my prj.conf

# Coredump 
CONFIG_DEBUG_COREDUMP=y
CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y
CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_THREADS=y

In my my_board/my_app/zephyr/.config I see these coredump-related configs

CONFIG_ARCH_SUPPORTS_COREDUMP=y
CONFIG_ARCH_SUPPORTS_COREDUMP_THREADS=y

# CONFIG_COREDUMP_DEVICE is not set

CONFIG_DEBUG_THREAD_INFO=y
CONFIG_DEBUG_COREDUMP=y
# CONFIG_DEBUG_COREDUMP_BACKEND_LOGGING is not set
CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y
# CONFIG_DEBUG_COREDUMP_BACKEND_OTHER is not set
# CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_MIN is not set
CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_THREADS=y
# CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_LINKER_RAM is not set
CONFIG_DEBUG_COREDUMP_FLASH_CHUNK_SIZE=64
CONFIG_DEBUG_COREDUMP_THREADS_METADATA=y

I am generating a coredump using this implementation 

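#include <zephyr/sys/__assert.h>

/* Note: __ASSERT() compiles to nothing unless CONFIG_ASSERT=y. */
void trigger_coredump(void)
{
    __ASSERT(0, "Forcing coredump");
}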

When I try to read the flash area after generating the coredump with nrfjprog --memrd 0xCF000 --w 32 --n 0x8000,
I get all 0xFF.

What am I missing?
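An all-0xFF readout is what erased flash looks like, so it suggests that nothing was written to that range. When the readout is saved to a file, a small host-side check can confirm this for the whole partition. A minimal sketch, assuming the dump has already been loaded into a buffer; `count_programmed_bytes()` is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts bytes that differ from the erased value 0xFF.
 * A result of 0 means the whole range still looks erased. */
size_t count_programmed_bytes(const uint8_t *buf, size_t len)
{
    size_t programmed = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xFF) {
            programmed++;
        }
    }
    return programmed;
}
```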

I also tried to check myself by replacing CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y with CONFIG_DEBUG_COREDUMP_BACKEND_LOGGING=y, hoping to see the coredump on my open RTT session, but nothing shows up. When the coredump is triggered, the prints just stop.

  1. What am I missing? Why can't I find a coredump in the flash partition or in the RTT log?
  2. Could it be that the device does not have time to write the coredump before the actual crash? If so, how can I handle that?
  3. Is the flash partition erased automatically so that new coredumps can be saved, or is that something I have to manage myself after I read the coredump from flash?

Hope to read you soon

Best regards

Ziv

  • It's a bit hard, since I assume a hard fault or stack overflow blocks all further instructions from running. In my experience, you can inject your own messages near your point of interest and see which branch the code goes through, what the adjacent branches are, what conditions trigger them, and so on.

    For example, given this usage fault:

    **** Using Zephyr OS v4.0.99-a0e545cb437a ***
    [00:00:00.297,698] <inf> flashdisk: Initialize device NAND
    [00:00:00.297,729] <inf> flashdisk: offset 300000, sector size 512, page size 4096, volume size 4194304
    [00:00:14.148,559] <err> os: ***** USAGE FAULT *****
    [00:00:14.148,559] <err> os:   Attempt to execute undefined instruction
    [00:00:14.148,590] <err> os: r0/a1:  0x0bad0000  r1/a2:  0x00000000  r2/a3:  0x00000000
    [00:00:14.148,590] <err> os: r3/a4:  0xffffffff r12/ip:  0x0004e4bb r14/lr:  0x0001b203
    [00:00:14.148,620] <err> os:  xpsr:  0x49100000
    [00:00:14.148,620] <err> os: s[ 0]:  0x200099e4  s[ 1]:  0x00000000  s[ 2]:  0x00000009  s[ 3]:  0x00021cc7
    [00:00:14.148,651] <err> os: s[ 4]:  0x00000001  s[ 5]:  0x00000030  s[ 6]:  0x0005ac30  s[ 7]:  0x0004dd57
    [00:00:14.148,651] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x200099e0  s[10]:  0x20009ae0  s[11]:  0x00001972
    [00:00:14.148,681] <err> os: s[12]:  0x20005a94  s[13]:  0x0001df3b  s[14]:  0x20005a94  s[15]:  0x00001000
    [00:00:14.148,681] <err> os: fpscr:  0xffffffff
    [00:00:14.148,681] <err> os: Faulting instruction address (r15/pc): 0x00017a88
    [00:00:14.148,712] <err> os: >>> ZEPHYR FATAL ERROR 36: Unknown error on CPU 0
    [00:00:14.148,742] <err> os: Current thread: 0x20002758 (mp_main)
    [00:00:14.273,651] <err> os: Halting system

    I wanted to see what triggers it, so I searched for " Attempt to execute undefined instruction" and found it here:

    /opt/nordic/ncs/v3.0.0/zephyr/arch/arm/core/cortex_m/fault.c:550:

    PR_FAULT_INFO(" Attempt to execute undefined instruction");
    inside the function "static uint32_t usage_fault(const struct arch_esf *esf)".

    It might seem very basic/ rudimentary, but it helped me overcome various hurdles when working with different Zephyr/ Nordic features.
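    For context on where such a message comes from: on Cortex-M, the usage fault status bits sit in the upper half of the CFSR register, and a fault handler can report one line per bit that is set. Below is a minimal host-side sketch of that kind of decoding, using the bit positions from the ARMv7-M architecture. `describe_usage_fault()` and its message strings are illustrative and are not Zephyr's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* UFSR occupies CFSR[31:16]; bit positions per the ARMv7-M architecture. */
#define CFSR_UNDEFINSTR (1u << 16) /* undefined instruction */
#define CFSR_INVSTATE   (1u << 17) /* invalid EPSR state */
#define CFSR_INVPC      (1u << 18) /* invalid EXC_RETURN loaded into PC */
#define CFSR_NOCP       (1u << 19) /* coprocessor absent or disabled */
#define CFSR_UNALIGNED  (1u << 24) /* unaligned access trapped */
#define CFSR_DIVBYZERO  (1u << 25) /* divide by zero trapped */

static_assert(CFSR_UNDEFINSTR == 0x00010000u, "UFSR starts at CFSR bit 16");

/* Illustrative helper: prints one line per set usage-fault bit and
 * returns how many of those bits were set. */
int describe_usage_fault(uint32_t cfsr)
{
    static const struct {
        uint32_t mask;
        const char *text;
    } causes[] = {
        { CFSR_UNDEFINSTR, "undefined instruction" },
        { CFSR_INVSTATE,   "invalid execution state" },
        { CFSR_INVPC,      "invalid EXC_RETURN value" },
        { CFSR_NOCP,       "no coprocessor" },
        { CFSR_UNALIGNED,  "unaligned access" },
        { CFSR_DIVBYZERO,  "divide by zero" },
    };
    int hits = 0;

    for (size_t i = 0; i < sizeof(causes) / sizeof(causes[0]); i++) {
        if (cfsr & causes[i].mask) {
            printf("usage fault: %s\n", causes[i].text);
            hits++;
        }
    }
    return hits;
}
```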

    ==============================

    Also worth mentioning is this post:
    RE: Saving coredumps to external flash 

    Where they mention that:
    "To get the ESF you can override Zephyr's fatal function and simply store the values in retained memory (as the above example shows). 

    void k_sys_fatal_error_handler(unsigned int reason, const z_arch_esf_t *esf_input)"

    Which in my interpretation means that "k_sys_fatal_error_handler()" is the function that you're looking to debug.

    ==============================

    I looked into Memfault conceptually, but it's a much bigger feature (from a ROM and RAM consumption perspective) than simply having the Coredump being saved to flash/ external flash.

    I can provide you a working sample for printing it to the serial CLI, using:

    CONFIG_DEBUG_COREDUMP_BACKEND_LOGGING=y
    Would that be of any use?
  • "For example, given this usage fault:"

    This is not helping my case. First, I don't even get this built-in log on my current branch (by branch I mean a git branch off my main development branch, which currently works with Memfault). I am generating the assertion with __ASSERT(0, ..), so I know where it happens; I just want to confirm that I can save the coredump to flash, and currently it does not get saved. And, as I mentioned, it does not even print the logs you mentioned, so unfortunately this is of no use to me at the moment.

    "Also worth mentioning is this post:
    RE: Saving coredumps to external flash"

    I also tried what the author of that post did in his attempt, and it did not work for me.

    And I am not sure why he set CONFIG_ASSERT=n; I want to be able to catch asserts as well.

  • Hi Ziv, and sorry for the delay. From my testing, it seems I'm currently not able to catch asserts with coredump. Triggering another fatal error works like a charm, so I suspect there is something off with the configuration of the error/fault handler.

    Regards

    Runar

  • For me, the post that I linked was a keystone to getting the _LOGGING version working. It's interesting that you say:

    "so I know where it happens; I just want to confirm that I can save the coredump to flash"

    When the Coredump works, you shouldn't do any saving yourself. The Coredump feature itself saves to flash/ external flash.

    The way you're generating the error doesn't matter as long as you can get the coredump working. This is why I suggested doing the _LOGGING option first, since then you know for sure that the Coredump feature itself works. Afterwards it's just a matter of switching the Coredump feature from _LOGGING to CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION or CONFIG_DEBUG_COREDUMP_BACKEND_OTHER. This is where I'm stuck at. :)
  • Hi Runsiv

    "catch asserts with coredump"

    I wonder why there is no problem saving a coredump on an assertion when using Memfault, compared with only trying to capture a coredump?

    Also, I tried to generate a crash another way, with

    void trigger_coredump(void)
    {
        /* Deliberately write to an invalid address to force a fault. */
        *(volatile uint32_t *)0xFFFFFFFF = 1;

        // __ASSERT(0, "Forcing coredump");
    }

    instead of using assert.

    Same results.

    None of the logs that are usually built into Zephyr to tell you where the crash happened, and of course no coredump saved to memory.

    If there is any further data I can share to get some direction on this, please let me know.

    I am kind of stuck on this feature, which is supposed to be a built-in, supported feature in both Nordic's SDK and Zephyr.

    https://docs.nordicsemi.com/bundle/ncs-2.4.2/page/zephyr/services/debugging/coredump.html

    docs.zephyrproject.org/.../coredump.html

    hope to read you soon

    best regards

    Ziv
