Call to sys_reboot() on nRF52840 causes hard fault

Currently on SDK v2.3.0 but this problem has persisted no matter the SDK version. Here's the fault info.

[00:00:34.898,223] <err> mpsl_init: MPSL ASSERT: 112, 2195
[00:02:03.002,807] <err> os: ***** HARD FAULT *****
[00:02:03.002,807] <err> os: Fault escalation (see below)
[00:02:03.002,838] <err> os: ARCH_EXCEPT with reason 3

[00:02:03.002,838] <err> os: r0/a1: 0x00000003 r1/a2: 0x00000000 r2/a3: 0x000000bf
[00:02:03.002,838] <err> os: r3/a4: 0x00000000 r12/ip: 0x20001918 r14/lr: 0x00000000
[00:02:03.002,868] <err> os: xpsr: 0x61000018
[00:02:03.002,868] <err> os: Faulting instruction address (r15/pc): 0x0002caf8
[00:02:03.002,899] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
[00:02:03.002,929] <err> os: Fault during interrupt handling

[00:02:03.002,960] <err> os: Current thread: 0x20002d18 (unknown)
[00:02:04.186,157] <err> fatal_error: Resetting system

I have CONFIG_REBOOT=y listed in the prj.conf file. I try to step into sys_reboot and it immediately jumps to arch_irq_lock and faults. I look through the sample projects that reboot and I don't see anything that I'm missing in the prj.conf. Not sure what to think.

Any insight would be appreciated.

  • Hi,

    The log here shows an MPSL assert, which is a direct consequence of the MPSL losing track of time when you continue from a breakpoint or step through code. So it is caused by your debugging and is not the error you are trying to debug. Instead of stepping or continuing from a breakpoint, you can get the same effect by setting a breakpoint, then moving it further along and resetting each time.

    Can you say more about the issue you are seeing and what you find by debugging/logging?

  • Hi Einar,

    Thanks for taking out the time to help!

    Trying to reboot just appears to freeze the 52840, but what I discovered recently is that if I let it sit long enough, it does eventually reboot. It takes about 2 minutes, 30 seconds.

    Within Ozone, it will sit there and not show any signs of anything wrong. Not sure what to make of all this.

  • I added a log print to the RTT console every 10 seconds as a heartbeat, so I know it's still active.

    for(;;)
    {
        static uint32_t counter = 0;
    
        if(counter == 5)
        {
            LOG_INF("Will now reboot in 2 seconds.");
            k_sleep(K_SECONDS(2));
            NVIC_SystemReset();
            //sys_reboot(SYS_REBOOT_WARM);
        }
    
        LOG_DBG("Alive 10 second counter: %d", counter++);
        k_sleep(K_SECONDS(10));
    }

    I then added a counter so that it reboots when it reaches 5. Calling NVIC_SystemReset() instead of sys_reboot() made no difference. The behavior is still the same.

    Here's my prj.conf:

    # C++ enable flags. Adds in up to C++20 
    CONFIG_CPLUSPLUS=n
    CONFIG_LIB_CPLUSPLUS=n
    
    # Optimize for debug
    CONFIG_NO_OPTIMIZATIONS=n
    CONFIG_DEBUG_OPTIMIZATIONS=n
    CONFIG_DEBUG=n
    
    # General config
    CONFIG_NEWLIB_LIBC=y
    CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=n
    CONFIG_ASSERT=n
    CONFIG_REBOOT=y
    CONFIG_FPU=n
    
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_XTAL=n
    
    # Enable the UART driver
    CONFIG_UART_ASYNC_API=y
    CONFIG_NRFX_UARTE0=y
    CONFIG_SERIAL=y
    CONFIG_UART_NRFX=y
    CONFIG_UART_INTERRUPT_DRIVEN=y
    
    # Make sure printk is printing to the UART console
    CONFIG_PRINTK=n
    
    # Heap and stacks
    CONFIG_HEAP_MEM_POOL_SIZE=4096
    CONFIG_MAIN_STACK_SIZE=2048
    CONFIG_MAIN_THREAD_PRIORITY=-10
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
    
    CONFIG_BT=y
    CONFIG_BT_PERIPHERAL=y
    CONFIG_BT_DEVICE_NAME="Orbis_UART_Service"
    CONFIG_BT_DEVICE_APPEARANCE=833
    CONFIG_BT_MAX_CONN=1
    CONFIG_BT_MAX_PAIRED=1
    
    # Enable the NUS service
    CONFIG_BT_NUS=y
    
    # Enable bonding
    CONFIG_BT_SETTINGS=y
    CONFIG_FLASH=y
    CONFIG_FLASH_PAGE_LAYOUT=y
    CONFIG_FLASH_MAP=y
    CONFIG_NVS=y
    CONFIG_SETTINGS=y
    
    # Enable DK LED and Buttons library
    CONFIG_DK_LIBRARY=y
    
    # Config logger
    CONFIG_LOG=y
    CONFIG_USE_SEGGER_RTT=y
    CONFIG_LOG_BACKEND_RTT=y
    CONFIG_LOG_BACKEND_UART=n
    
    # Image manager
    CONFIG_IMG_MANAGER=y
    CONFIG_FLASH=y
    CONFIG_IMG_ERASE_PROGRESSIVELY=y
    
    # DFU Target
    CONFIG_DFU_TARGET=y
    
    # Application Upgrade support
    CONFIG_BOOTLOADER_MCUBOOT=y

    This is the output from the RTT Console:

    *** Booting Zephyr OS build v3.2.99-ncs2 ***
    [00:00:00.007,171] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:00.007,171] <inf> fs_nvs: alloc wra: 0, fc8
    [00:00:00.007,171] <inf> fs_nvs: data wra: 0, 2c
    [00:00:00.007,263] <inf> bt_sdc_hci_driver: SoftDevice Controller build revision:
                                                d8 0c 2d 2f 36 ae e2 5c  80 26 80 4c 3f 4d 16 53 |..-/6..\ .&.L?M.S
                                                50 96 c7 73                                      |P..s
    [00:00:00.010,131] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:00.010,131] <inf> MAIN: Bluetooth initialized
    [00:00:00.010,192] <inf> MAIN: Firmware Version: D1.4.24
    [00:00:00.010,192] <dbg> MAIN: main: Settings loaded
    [00:00:00.010,833] <dbg> BLE: uart_init: UART initialized
    [00:00:00.010,864] <dbg> MAIN: main: Alive 10 second counter: 0
    [00:00:10.010,925] <dbg> MAIN: main: Alive 10 second counter: 1
    [00:00:14.005,340] <inf> BLE: Advertising started.
    [00:00:20.010,986] <dbg> MAIN: main: Alive 10 second counter: 2
    [00:00:30.011,047] <dbg> MAIN: main: Alive 10 second counter: 3
    [00:00:40.011,108] <dbg> MAIN: main: Alive 10 second counter: 4
    [00:00:50.011,169] <inf> MAIN: Will now reboot in 2 seconds.
    *** Booting Zephyr OS build v3.2.99-ncs2 ***
    [00:00:00.007,476] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:00.007,507] <inf> fs_nvs: alloc wra: 0, fc8
    [00:00:00.007,507] <inf> fs_nvs: data wra: 0, 2c
    [00:00:00.007,598] <inf> bt_sdc_hci_driver: SoftDevice Controller build revision:
                                                d8 0c 2d 2f 36 ae e2 5c  80 26 80 4c 3f 4d 16 53 |..-/6..\ .&.L?M.S
                                                50 96 c7 73                                      |P..s
    [00:00:00.010,467] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:00.010,467] <inf> MAIN: Bluetooth initialized
    [00:00:00.010,498] <inf> MAIN: Firmware Version: D1.4.24
    [00:00:00.010,528] <dbg> MAIN: main: Settings loaded
    [00:00:00.011,169] <dbg> BLE: uart_init: UART initialized
    [00:00:00.011,169] <dbg> MAIN: main: Alive 10 second counter: 0
    [00:00:10.011,260] <dbg> MAIN: main: Alive 10 second counter: 1
    [00:00:20.011,322] <dbg> MAIN: main: Alive 10 second counter: 2
    [00:00:30.011,383] <dbg> MAIN: main: Alive 10 second counter: 3
    [00:00:40.011,444] <dbg> MAIN: main: Alive 10 second counter: 4
    [00:00:50.011,505] <inf> MAIN: Will now reboot in 2 seconds.
    *** Booting Zephyr OS build v3.2.99-ncs2 ***
    [00:00:00.007,141] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:00.007,171] <inf> fs_nvs: alloc wra: 0, fc8
    [00:00:00.007,171] <inf> fs_nvs: data wra: 0, 2c
    [00:00:00.007,263] <inf> bt_sdc_hci_driver: SoftDevice Controller build revision:
                                                d8 0c 2d 2f 36 ae e2 5c  80 26 80 4c 3f 4d 16 53 |..-/6..\ .&.L?M.S
                                                50 96 c7 73                                      |P..s
    [00:00:00.010,131] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:00.010,131] <inf> MAIN: Bluetooth initialized
    [00:00:00.010,192] <inf> MAIN: Firmware Version: D1.4.24
    [00:00:00.010,192] <dbg> MAIN: main: Settings loaded
    [00:00:00.010,833] <dbg> BLE: uart_init: UART initialized
    [00:00:00.010,864] <dbg> MAIN: main: Alive 10 second counter: 0
    [00:00:10.010,925] <dbg> MAIN: main: Alive 10 second counter: 1
    [00:00:20.010,986] <dbg> MAIN: main: Alive 10 second counter: 2
    [00:00:30.011,047] <dbg> MAIN: main: Alive 10 second counter: 3
    [00:00:40.011,108] <dbg> MAIN: main: Alive 10 second counter: 4
    [00:00:50.011,169] <inf> MAIN: Will now reboot in 2 seconds.
    *** Booting Zephyr OS build v3.2.99-ncs2 ***
    [00:00:00.007,476] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:00.007,507] <inf> fs_nvs: alloc wra: 0, fc8
    [00:00:00.007,507] <inf> fs_nvs: data wra: 0, 2c
    [00:00:00.007,598] <inf> bt_sdc_hci_driver: SoftDevice Controller build revision:
                                                d8 0c 2d 2f 36 ae e2 5c  80 26 80 4c 3f 4d 16 53 |..-/6..\ .&.L?M.S
                                                50 96 c7 73                                      |P..s
    [00:00:00.010,437] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:00.010,467] <inf> MAIN: Bluetooth initialized
    [00:00:00.010,498] <inf> MAIN: Firmware Version: D1.4.24
    [00:00:00.010,498] <dbg> MAIN: main: Settings loaded
    [00:00:00.011,138] <dbg> BLE: uart_init: UART initialized
    [00:00:00.011,169] <dbg> MAIN: main: Alive 10 second counter: 0
    [00:00:10.011,230] <dbg> MAIN: main: Alive 10 second counter: 1
    [00:00:20.011,291] <dbg> MAIN: main: Alive 10 second counter: 2
    [00:00:30.011,352] <dbg> MAIN: main: Alive 10 second counter: 3
    [00:00:40.011,413] <dbg> MAIN: main: Alive 10 second counter: 4
    [00:00:50.011,474] <inf> MAIN: Will now reboot in 2 seconds.
    
  • Hi,

    I see from the log that there are repeated resets, but is it still so that there is a long delay between each of them (you wrote about 2 minutes, 30 seconds)? I wonder what happens during this time...

    Can you read, print and clear the RESETREAS register so that we see that in the log? (You can for instance copy-paste the reset_reason_print() from nrf/samples/bluetooth/peripheral_power_profiling/src/main.c). Are you testing on your custom HW? If so, can you describe it? Also, it would be good if you could test this on a DK to see if you see the same behavior there as on your HW.
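
For reference, here is a minimal sketch of how a RESETREAS value can be decoded. It is not the sample's reset_reason_print(); the helper name resetreas_decode() and the flag names it prints are hypothetical, and the bit positions are redefined locally from the nRF52840 product specification so that the snippet also builds on a host. On the target, the value would come from NRF_POWER->RESETREAS, and writing the same value back clears the flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* POWER.RESETREAS bit masks for the nRF52840, per the product
 * specification. Redefined here (assumption: not using the nrfx
 * headers) so the sketch compiles without the SDK. */
#define RESETREAS_RESETPIN (1u << 0)   /* pin reset */
#define RESETREAS_DOG      (1u << 1)   /* watchdog reset */
#define RESETREAS_SREQ     (1u << 2)   /* soft reset (SYSRESETREQ) */
#define RESETREAS_LOCKUP   (1u << 3)   /* CPU lockup */
#define RESETREAS_OFF      (1u << 16)  /* wake from System OFF via GPIO */
#define RESETREAS_LPCOMP   (1u << 17)  /* wake from System OFF via LPCOMP */
#define RESETREAS_DIF      (1u << 18)  /* wake from System OFF via debug interface */
#define RESETREAS_NFC      (1u << 19)  /* wake from System OFF via NFC field */
#define RESETREAS_VBUS     (1u << 20)  /* wake from System OFF via VBUS */

struct reset_flag {
    uint32_t mask;
    const char *name;
};

static const struct reset_flag reset_flags[] = {
    { RESETREAS_RESETPIN, "RESETPIN" }, { RESETREAS_DOG, "DOG" },
    { RESETREAS_SREQ, "SREQ" },         { RESETREAS_LOCKUP, "LOCKUP" },
    { RESETREAS_OFF, "OFF" },           { RESETREAS_LPCOMP, "LPCOMP" },
    { RESETREAS_DIF, "DIF" },           { RESETREAS_NFC, "NFC" },
    { RESETREAS_VBUS, "VBUS" },
};

/* Hypothetical helper: writes the space-separated names of all set
 * flags into out and returns how many flags were set. A value of 0
 * means no source was flagged, which the product specification
 * attributes to a power-on or brownout reset. */
static int resetreas_decode(uint32_t reas, char *out, size_t out_len)
{
    int count = 0;

    if (out_len == 0) {
        return 0;
    }
    out[0] = '\0';

    if (reas == 0) {
        snprintf(out, out_len, "POWER_ON_OR_BROWNOUT");
        return 0;
    }

    for (size_t i = 0; i < sizeof(reset_flags) / sizeof(reset_flags[0]); i++) {
        if (reas & reset_flags[i].mask) {
            if (count > 0) {
                strncat(out, " ", out_len - strlen(out) - 1);
            }
            strncat(out, reset_flags[i].name, out_len - strlen(out) - 1);
            count++;
        }
    }
    return count;
}
```

The flags are cumulative until cleared, so a value with both DOG and SREQ set would mean a watchdog reset and a soft reset have both happened since the register was last cleared. That is why clearing it after each read matters here.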

  • I see from the log that there are repeated resets, but is it still so that there is a long delay between each of them (you wrote about 2 minutes, 30 seconds)? I wonder what happens during this time...

    Once I enter sys_reboot() or NVIC_SystemReset(), it hangs for 2 minutes, 30 seconds. Chances are the watchdog is resetting the chip.

    Can you read, print and clear the RESETREAS register

    Yep. This is all I got from it.
    [00:00:00.000,244] <inf> MAIN: Application soft reset detected

    Are you testing on your custom HW?

    Yes.

    If so, can you describe it?

    Here's the schematic.

  • Are you able to reproduce this on a DK? If so, can you upload a minimal project that reproduces it here so that I can test on my end?

  • Are you able to reproduce this on a DK?

    No. It works as expected on a DK, and I tried it on all 3 that I have.

  • I found this ticket, "sd_nvic_System_Reset() causes my application to hang until WDT timeout", and it was the solution to the problem, which is a bug in the bootloader. Who would have thought? I removed all support for the bootloader and DFU and, lo and behold, rebooting works. Now I just need to figure out a workaround.

    Thanks for all your help!

  • Hi,

    Good catch! I did not think about it, but this is a known issue which is only seen after doing a soft reset with the WDT enabled and using the LFRC as low frequency clock source.

    The workaround is to modify timer_init() from components/libraries/bootloader/nrf_bootloader_dfu_timers.c like this (notice the commented out lines):

    static void timer_init(void)
    {
        static bool m_timer_initialized;
    
        if (!m_timer_initialized)
        {
            // if (!nrf_clock_lf_is_running())
            // {
                nrf_clock_task_trigger(NRF_CLOCK_TASK_LFCLKSTART);
            // }
    
            nrf_rtc_event_clear(RTC_STRUCT, NRF_RTC_EVENT_TICK);
            nrf_rtc_event_clear(RTC_STRUCT, NRF_RTC_EVENT_COMPARE_0);
            nrf_rtc_event_clear(RTC_STRUCT, NRF_RTC_EVENT_COMPARE_1);
            NRFX_IRQ_PRIORITY_SET(RTC_IRQn, 5);
            NRFX_IRQ_ENABLE(RTC_IRQn);
            nrf_rtc_prescaler_set(RTC_STRUCT, RTC_PRESCALER);
            nrf_rtc_task_trigger(RTC_STRUCT, NRF_RTC_TASK_CLEAR);
            nrf_rtc_task_trigger(RTC_STRUCT, NRF_RTC_TASK_START);
            nrf_rtc_int_enable(RTC_STRUCT, RTC_INTENSET_OVRFLW_Msk);
    
            m_timer_initialized = true;
        }
    }

    The reason this is needed is that when the WDT is running, it forces the LFRC to keep running during and after a soft reset, so the LFCLKSTAT register that is checked by nrf_clock_lf_is_running() will indicate that the clock is running. However, the clock is only routed to the WDT in this case, so it still needs to be started by triggering the LFCLKSTART task. The if statement is therefore never needed, and it only causes problems in this specific case.

    Edit: Apparently I forget a lot of things during a weekend. The post you linked to was about the nRF5 SDK bootloader, but I remember now that you are using the nRF Connect SDK, which is completely different (using MCUBoot as a bootloader). There might still be an issue here though, but I am not aware of it, and it does not seem like you are using a Watchdog?

  • The post you linked to was about the nRF5 SDK bootloader, but I remember now that you are using the nRF Connect SDK, which is completely different (using MCUBoot as a bootloader).

    Yep. I'm developing under nRF Connect SDK v2.3.0. The file specified in the link, which is the one you highlighted, is not in v2.3.0. I did some searches and came up empty-handed.

  • Are you using a watchdog? And if so, do you still reproduce this issue when not using the watchdog? I am asking because that is an easy way to find out whether what you are seeing could be similar to the other case (even though the SW is completely different).
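
One way to test the watchdog theory against the reported delay is to compare the WDT reload value with 150 seconds. The sketch below assumes the on-chip nRF52840 WDT, whose timeout follows timeout [s] = (CRV + 1) / 32768 according to the product specification. Nothing in this thread confirms that the on-chip WDT is enabled, and the helper names are hypothetical.

```c
#include <stdint.h>

/* The WDT counts down at the 32.768 kHz low-frequency clock rate. */
#define WDT_LFCLK_HZ 32768u

/* Hypothetical helper: CRV register value for a timeout given in whole
 * seconds. Precondition: seconds >= 1 (the hardware also has a small
 * minimum CRV, which this sketch does not enforce). */
static uint32_t wdt_crv_from_seconds(uint32_t seconds)
{
    return (uint32_t)((uint64_t)seconds * WDT_LFCLK_HZ - 1u);
}

/* Hypothetical helper: timeout in whole seconds (rounded down) for a
 * CRV value read back from the WDT. */
static uint32_t wdt_seconds_from_crv(uint32_t crv)
{
    return (uint32_t)(((uint64_t)crv + 1u) / WDT_LFCLK_HZ);
}
```

With these helpers, a 150-second timeout corresponds to CRV = 150 * 32768 - 1 = 4915199 (0x4AFFFF). If a CRV near that value shows up in the WDT registers in the debugger, the on-chip watchdog would be a plausible source of the 2 minute, 30 second delay. If the WDT is not running at all, the reset more likely comes from somewhere else, such as external hardware.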

  • Are you using a watchdog?

    I assume there's a hardware watchdog because the schematic shows we are using the reset pins. There's currently no code in place for "petting the dog". The 52840 resets at exactly 2 minutes and 30 seconds each time. It reboots properly once I comment out the bootloader and DFU config entries in prj.conf.

    I opened a private ticket and received some feedback from a Nordic tech engineer. He's having me try some tests. I'll update you on the progress.
