Having issues saving a coredump to flash, or at all

Hi Nordic

I am working with the nRF52840 and nRF52832 using NCS v2.8.0.

I am trying to save a coredump to flash following the instructions at https://docs.nordicsemi.com/bundle/ncs-2.8.0/page/zephyr/services/debugging/coredump.html

I added this to my pm_static_my_board.yml

coredump_partition:
  address: 0xCF000
  size: 0x8000
  region: flash_primary

And this to my_board.overlay

&flash0 {
    /*
     * For more information, see:
     * http://docs.zephyrproject.org/latest/guides/dts/index.html#flash-partitions
     */
    partitions {
        compatible = "fixed-partitions";
        #address-cells = <1>;
        #size-cells = <1>;

      ...
        coredump_partition: partition@000080000 { // THIS IS NOT A LEGITIMATE ADDRESS (END OF FLASH), BUT IT IS NOT TAKEN INTO ACCOUNT BECAUSE PM_STATIC IS USED
            label = "coredump-partition";
            reg = <0x000080000 DT_SIZE_K(4)>;
        };
    };
};

As a side note, it is strange that I need to set this in the overlay at all, since it is basically ignored: the pm_static partitions are the ones that actually matter (unless I got something wrong?).

And these configs to my prj.conf

# Coredump 
CONFIG_DEBUG_COREDUMP=y
CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y
CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_THREADS=y

In my my_board/my_app/zephyr/.config I see these coredump-related configs

CONFIG_ARCH_SUPPORTS_COREDUMP=y
CONFIG_ARCH_SUPPORTS_COREDUMP_THREADS=y

# CONFIG_COREDUMP_DEVICE is not set

CONFIG_DEBUG_THREAD_INFO=y
CONFIG_DEBUG_COREDUMP=y
# CONFIG_DEBUG_COREDUMP_BACKEND_LOGGING is not set
CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y
# CONFIG_DEBUG_COREDUMP_BACKEND_OTHER is not set
# CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_MIN is not set
CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_THREADS=y
# CONFIG_DEBUG_COREDUMP_MEMORY_DUMP_LINKER_RAM is not set
CONFIG_DEBUG_COREDUMP_FLASH_CHUNK_SIZE=64
CONFIG_DEBUG_COREDUMP_THREADS_METADATA=y

I am generating a coredump using this implementation

void trigger_coredump(void)
{
    __ASSERT(0, "Forcing coredump");
}

When I try to read the flash area after generating the coredump with nrfjprog --memrd 0xCF000 --w 32 --n 0x8000,
I get all 0xFF.

What am I missing?

I also tried to check myself by replacing CONFIG_DEBUG_COREDUMP_BACKEND_FLASH_PARTITION=y

with CONFIG_DEBUG_COREDUMP_BACKEND_LOGGING=y

hoping to see the coredump on my open RTT, but nothing: when the coredump is triggered, the prints just stop.

  1. What am I missing? Why can't I find a coredump on the flash partition or in the RTT log?
  2. Can it be that the device does not have time to write the coredump before the actual crash? If so, how can I manage that?
  3. Is there some automatic deletion of the coredump in the flash partition so that new coredumps can be saved, or is it something I have to manage myself after I read the coredump from flash?

Hope to hear from you soon

Best regards

Ziv

  • Hi

    I will look into your case. Just a quick question to start with. Are you using MCUBOOT also?
    Regards

    Runar

  • Hi Vidar

    What is relevant here is whether you have CONFIG_ASSERT enabled in your build or not

    We use assert a lot in our code, and Zephyr also uses it internally, which is why this is weird to me and also why I try to avoid it. Besides, I don't seem to be reaching the point where the CD is written to flash at the moment. Anyway, I'll try to run with it disabled and see.

    I cannot use RAM for saving logs or the CD, since the devices in the field are configured with logs disabled and I need to know what happened there in retrospect, after they reset and reconnect with a node.

    I don't mind using the internal flash or the external one (external will obviously be more convenient, but internal will work for now as well).

    You can simply redefine the weakly defined k_sys_fatal_error_handler() function in your application

    The API seems to take a fatal error reason and an exception context. Are those filled in automatically?

    Just to be sure: if I override this API, then I do not have to set CONFIG_ASSERT=n?

    P.S. This API seems to be called from inside z_fatal_error, which I don't think I am reaching, since I don't see its prints and a breakpoint at the beginning of it did not trigger when debugging.
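On the question of whether the reason and exception context are filled in automatically: as far as I know, the kernel's fatal error path passes both into the handler, and the default assert action ends in a kernel panic, so a failed `__ASSERT` would be expected to arrive with `K_ERR_KERNEL_PANIC`. Below is a minimal sketch, not a definitive implementation. It assumes the handler signature used by recent Zephyr (`const struct arch_esf *`) and the Cortex-M frame layout (`esf->basic.pc`); `fatal_reason_name()`, `last_fatal_reason` and `last_fatal_pc` are hypothetical names. The non-Zephyr branch only mirrors the generic reason codes so the helper can be compiled off-target.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __ZEPHYR__
#include <zephyr/kernel.h>
#include <zephyr/fatal.h>
#else
/* Host-only mirror of the generic reason codes from <zephyr/fatal.h>. */
enum k_fatal_error_reason {
    K_ERR_CPU_EXCEPTION = 0,
    K_ERR_SPURIOUS_IRQ = 1,
    K_ERR_STACK_CHK_FAIL = 2,
    K_ERR_KERNEL_OOPS = 3,
    K_ERR_KERNEL_PANIC = 4,
};
#endif

/* Hypothetical helper: map a fatal error reason to a short name. */
const char *fatal_reason_name(unsigned int reason)
{
    switch (reason) {
    case K_ERR_CPU_EXCEPTION:  return "cpu exception";
    case K_ERR_SPURIOUS_IRQ:   return "spurious irq";
    case K_ERR_STACK_CHK_FAIL: return "stack check fail";
    case K_ERR_KERNEL_OOPS:    return "kernel oops";
    case K_ERR_KERNEL_PANIC:   return "kernel panic";
    default:                   return "arch-specific or unknown";
    }
}

#ifdef __ZEPHYR__
/* Hypothetical globals, only meant to be inspected from a debugger. */
volatile unsigned int last_fatal_reason;
volatile uint32_t last_fatal_pc;

/* Overrides the weak default; reason and esf are supplied by the kernel. */
void k_sys_fatal_error_handler(unsigned int reason, const struct arch_esf *esf)
{
    last_fatal_reason = reason;
    if (esf != NULL) {
        /* Assumption: Cortex-M exception frame layout. */
        last_fatal_pc = esf->basic.pc;
    }
    k_fatal_halt(reason); /* does not return */
}
#endif
```

A breakpoint on this handler, rather than on z_fatal_error, may be an easier way to check whether the fatal path is reached at all.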

  •  Well, we save the CD, then we reset, we get the reset reason, check the ext flash for CD's presence, save a correlated file with reset reason next to it. Something like:

    crash_nr_231_CD.txt

    crash_nr_231_reset_reason.txt

    This also brings another question: can we also save other things the moment the hard fault happens? Sensor states, total runtime, battery level, etc?

    But again, since you said that CD can't really save to external -> should we simply override the implementation for k_sys_fatal_error_handler() and disable CD completely?

     I did not ignore your point, Ziv, but this ties in to my main question to Vidar: can CD write to external? Is CD extendable to be able to write to external and write custom data? If not, then the best course of action is to make the mechanism ourselves inside k_sys_fatal_error_handler()?

  • we use assert a lot in our code and also zephyr uses it internally so this is why it is wird for me and also why i try to avoid it plus i don't seem to be getting to

    Please read my previous comments where I try to explain what the issue is. If you are going to have ASSERTs enabled you must use the flash storage backend introduced by the commit I linked.

    i can not use RAM for saving logs or CD since the devices in the field are configured with logs disabled and i need

    My suggestion is to not use the CD functionality at all but rather store relevant information to RAM from the k_sys_fatal_error_handler(). What you do with the RAM content on subsequent reboot is up to you. You can store it to flash, transfer it over BLE, etc.

  • store relevant information to RAM from the k_sys_fatal_error_handler(). What you do with the RAM content on subsequent reboot is up to you. You can store it to flash, transfer it over BLE, etc.

    I think there is something fundamental I am missing here. As far as I know, whatever I save in RAM, to a variable or otherwise, is gone after reset, right? So if I do not use logs and can only get info on the crash from the device via OTA, after it resets back to normal, then how does saving things to RAM help me?

  • Hey Ziv.

    There is a special no_init area of the RAM which persists through a hot (soft) reset. Hot reset = reset command to the MCU; cold reset = complete power cycle. There is a CONFIG_ option for the system to do a hot reset in case of stack overflows or hard faults, so the no_init area of the RAM is kept.
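The no_init idea can be sketched roughly as follows. This is an illustration under assumptions, not a tested implementation: `crash_record`, `crash_record_store()`, `crash_record_take()`, `CRASH_MAGIC` and the XOR checksum are all hypothetical, and on target the variable is placed in the no-init section with Zephyr's `__noinit` tag. Because no-init RAM holds random data after a cold power-up, the record carries a magic value and a checksum so that garbage is not mistaken for a crash. The non-Zephyr branch uses ordinary static storage so the logic can be exercised on a host.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#ifdef __ZEPHYR__
#include <zephyr/linker/section_tags.h>
#define CRASH_NOINIT __noinit   /* not zeroed by the startup code */
#else
#define CRASH_NOINIT            /* host build: plain static storage */
#endif

#define CRASH_MAGIC 0xC0DEDEADu /* hypothetical marker value */

struct crash_record {
    uint32_t magic;
    uint32_t reason;   /* fatal error reason passed to the handler */
    uint32_t pc;       /* faulting program counter, if available */
    uint32_t lr;       /* link register, if available */
    uint32_t checksum; /* simple XOR over the fields above */
};

static CRASH_NOINIT struct crash_record crash_rec;

static uint32_t crash_checksum(const struct crash_record *r)
{
    return r->magic ^ r->reason ^ r->pc ^ r->lr ^ 0xA5A5A5A5u;
}

/* Call from the fatal error handler, just before the soft reset. */
void crash_record_store(uint32_t reason, uint32_t pc, uint32_t lr)
{
    crash_rec.magic = CRASH_MAGIC;
    crash_rec.reason = reason;
    crash_rec.pc = pc;
    crash_rec.lr = lr;
    crash_rec.checksum = crash_checksum(&crash_rec);
}

/*
 * Call early after boot. Returns true and copies the record out if a valid
 * one survived the reset; the stored record is cleared either way so it is
 * reported only once.
 */
bool crash_record_take(struct crash_record *out)
{
    bool valid = crash_rec.magic == CRASH_MAGIC &&
                 crash_rec.checksum == crash_checksum(&crash_rec);

    if (valid && out != NULL) {
        *out = crash_rec;
    }
    memset(&crash_rec, 0, sizeof(crash_rec));
    return valid;
}
```

After the reboot, the application could call `crash_record_take()` and, if it returns true, write the record to internal or external flash or send it over BLE from normal thread context, where the flash drivers are safe to use.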
