
HardFault and watchdog?

We've got an application (not using the SoftDevice or SDK functions) that locks up the nRF51822 and does not reset, even though the watchdog is configured and confirmed to work in every situation we can create. I thought I'd found a clue here, but in practice, when I write code that forces entry to HardFault_Handler, the watchdog does reset the MCU after the timeout expires. I tested this with both v2 and v3 nRF51822 chips. Unlike the other question, we normally run with NRF_WDT->CONFIG.HALT cleared (the power-up default), but I get the same behavior when it's set.

Is it true that the HardFault exception priority is high enough to block the watchdog? If so, what might explain why it works for me in that situation? Is there any other situation where the watchdog might fail?

  • Hi

    Thank you for your observation. I have tested this on my nRF51-DK board (third revision nRF51) as well, and the watchdog works fine after a hardfault. That is also logical: the watchdog reset is a HW reset, so it is supposed to work at all times.

    The only thing I can think of is that you write to the NRF_WDT->CONFIG register to set/clear the HALT bit but accidentally clear the SLEEP bit at the same time. If SLEEP is cleared, the watchdog pauses while the CPU is asleep. The question then is whether the CPU is asleep after entering the hardfault handler. I can go to sleep after a hardfault by implementing a hardfault handler in the application with

    void HardFault_Handler(void)  { }
    

    and by going to sleep in the main loop

    power_manage();
    

    Could that be the case in your application?

    How is the hardfault generated in your case?
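
    To illustrate the CONFIG point above, here is a minimal sketch of changing only the HALT bit while leaving SLEEP untouched. The helper names are hypothetical (they are not SDK functions), and the bit positions (SLEEP = bit 0, HALT = bit 3) are my reading of the nRF51 WDT register layout; on target you would normally use the masks from the device header instead.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed nRF51 WDT CONFIG bit masks: SLEEP is bit 0, HALT is bit 3.
       These local names are illustrative stand-ins for the device header masks. */
    #define WDT_CFG_SLEEP_MSK (UINT32_C(1) << 0)
    #define WDT_CFG_HALT_MSK  (UINT32_C(1) << 3)

    /* Hypothetical helper: return `current` with only the HALT bit changed,
       so the SLEEP bit keeps whatever value it already had. */
    uint32_t wdt_config_with_halt(uint32_t current, bool run_while_halted)
    {
        if (run_while_halted) {
            return current | WDT_CFG_HALT_MSK;
        }
        return current & ~WDT_CFG_HALT_MSK;
    }

    /* Hypothetical helper: true if this CONFIG value keeps the watchdog
       counting while the CPU is asleep. */
    bool wdt_runs_in_sleep(uint32_t config)
    {
        return (config & WDT_CFG_SLEEP_MSK) != 0;
    }
    ```

    On target this would look something like NRF_WDT->CONFIG = wdt_config_with_halt(NRF_WDT->CONFIG, true); done before the watchdog is started. By contrast, assigning only the HALT mask to CONFIG overwrites the whole register and clears SLEEP, which is the accidental case described above.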

  • Good; a watchdog that couldn't recover from a hard fault seemed pretty useless.

    I generate a hard fault with:

       *(uint32_t *)~0 = 4;
    

    The ultimate cause of the failure I was diagnosing (not directly related to this question) appears to have been a race condition where TASKS_HFCLKSTOP was issued before TASKS_CAL completed. That left LFCLK out of spec, which killed both the watchdog and the timer that normally woke the application. This is only a theory, consistent with the behavior described for PAN #14 (which doesn't apply to the chips I was using), but the anomalous lock-ups went away when I refactored the request mediation.
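
    For anyone hitting something similar, the kind of mediation described above can be sketched roughly as follows. This is not the actual refactored code: every name here is hypothetical, and the hardware is replaced by plain flags so the ordering logic can be read on its own. The idea is simply that an HFCLK stop requested while a calibration is in flight is deferred until the calibration reports done.

    ```c
    #include <stdbool.h>

    /* Hypothetical mediator state; these fields stand in for real hardware state. */
    typedef struct {
        bool cal_in_progress; /* calibration started, completion not yet seen */
        bool stop_pending;    /* an HFCLK stop was requested during calibration */
        bool hfclk_running;   /* stand-in for the actual HFCLK state */
    } clock_mediator_t;

    /* Begin a calibration. On target this is where TASKS_CAL would be triggered. */
    void mediator_start_cal(clock_mediator_t *m)
    {
        m->hfclk_running = true;  /* calibration needs HFCLK running */
        m->cal_in_progress = true;
    }

    /* Request an HFCLK stop. Returns true if the clock was stopped immediately,
       false if the stop was deferred because a calibration is still running. */
    bool mediator_request_hfclk_stop(clock_mediator_t *m)
    {
        if (m->cal_in_progress) {
            m->stop_pending = true;
            return false;
        }
        m->hfclk_running = false; /* on target: trigger TASKS_HFCLKSTOP here */
        return true;
    }

    /* Call when the calibration-done event is handled. Performs any stop that
       was deferred while the calibration was in flight. */
    void mediator_on_cal_done(clock_mediator_t *m)
    {
        m->cal_in_progress = false;
        if (m->stop_pending) {
            m->stop_pending = false;
            m->hfclk_running = false; /* deferred TASKS_HFCLKSTOP */
        }
    }
    ```

    In a real application the request and the done handler may run at different interrupt priorities, so the flag updates would also need to be protected against preemption.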
