How to add a fallback hardware watchdog to a devicetree/overlay file

Hello,

I am using the task watchdog sample to test the task watchdog subsystem. It is working fine.

I would like to use a hardware watchdog as a fallback; for this I need to define it in the devicetree or an overlay file.

How can I do either of the above?

I am using the following hardware/build environment:

- A proprietary board built around the nRF52833 SoC
- SEGGER Embedded Studio for ARM
  Release 5.60  Build 2021081102.47262
  Nordic Edition
  Windows x64
- Zephyr OS build v2.6.99-ncs1
- SDK 1.7.0

Thank you.
Kind regards
Mohamed
  • Hi Susheel,

    I do not understand what fallback means here? 

    According to the comment in Nordic's example, it means that if the task watchdog (which is based on the kernel's timers) does not time out and does not perform as expected, then the hardware watchdog timer comes into action. You should be able to find the sample code at:

    ...\v1.7.0\zephyr\samples\subsys\task_wdt

    I am sorry, Susheel, but I am not allowed to share the whole main() with you. However, here is a skeleton version of it. Note that the functions dealing with the watchdog have the prefix 'wdgt_'.

    void main( void )
    {
        ...[snip]
        LOG_INF( "Booting-up time [%u s]\n", time_s );
    
        /* Set up debug port first */
        system_SetDebugPortAccess();
    
        if ( initialise( device_mode ) == RESULT_OK )   
        {
            ...[snip]
            
            if ( device_mode )
            {
                /* Initialise only the i2c and mp2695 charger to be able to detect 'mains' supply. */
                io_initialise_batt_charger();
    
                /* Initialise the rest of the peripheral devices after we've detected 'mains' supply. */
                io_initialise();
    
                LOG_INF("Waiting for DC supply - Plug In");
    
                /* Initialise the Watch Dog Timer Task */
                wdgt_init();
    
                /* wait for the charger to be plugged in to signal installation */
                for(;;)
                {
                    batt_charger_get_info();
    
                    if ( ( io.charger->plugged_in == true ) && ( ( io.charger->charging == true ) || ( io.charger->ChargeDone == true ) ) )
                    {
                        /* Update the battery voltage value read from the ADC */
                        io_update_batteryvoltage();
    
                        if ( io.battery.mv_raw >= 2700 ) /*!!!TBD replace magic number 2700 with BATT_DISCHARGE_PROTECT */
                        {
                            LOG_INF( "Batt_raw = %d mV", io.battery.mv_raw );
    
                            break;
                        }
                    }
    
                    LOG_INF( "Plug In DC - Charger Status: %d\n", io.charger->charging | io.charger->ChargeDone );
    
                    /* Feed the watchdog channel timer. */
                    wdgt_refresh();
    
                    k_sleep(K_MSEC(10)); 
                }
    
                LOG_INF("Plugged");
                k_sleep(K_MSEC(10));
    
                /* Initialise the radio state machine. */
                hb_radio_initialise();
    
                k_work_init( &homebeacon_radio_status_check, homebeacon_radio_check_status_tasks );
    
                /* Initialise the tamper button */
                io_tamper_button_initialise();
    
                /* Also, we can seed random number effectively */
                srand( time( NULL ) );
    
                io_led_off( RED_LED );
    
                while ( 1 )
                {
                    ...[snip]
                    
                    if ( flash_static_data_copy_hb.shutdown_flag == POWER_SHUTDOWN_SOURCE_NONE )
                    {
                        /* This is the path for normal operation */
                        if ( b_one_sec_tasks )
                        {
                            b_one_sec_tasks = FALSE;
    
                            /* Refresh the watchdog channel timer. */
                            wdgt_refresh();
    
                        ...[snip]
                        
                        }
                    }/* POWER_SHUTDOWN_SOURCE_NONE */
                    else
                    {
                        /* This is the path when the device is in SHUTDOWN mode */
                        ...[snip]
                    }
                }
            }
            else    /* device mode NOT known */
            {
                ...[snip]
    
                /* What else can we do? */
                LOG_INF( "Unsupported device mode\n" );
            }
        }
        else
        {
            /* Initialisation error condition */
            ...[snip]
    
            /* What else can we do? */
            LOG_INF( "Initialisation Failed\n" );
        }
    
    }
    

    Note that the same code works fine if the hardware watchdog is not defined, i.e. if this line is commented out:

    #define WDT_NODE DT_COMPAT_GET_ANY_STATUS_OKAY(nordic_nrf_watchdog)

    Kind regards
    Mohamed
  • Learner said:
    According to the comment in Nordic's example, it means that if the task watchdog (which is based on the kernel's timers) does not time out and does not perform as expected, then the hardware watchdog timer comes into action. You should be able to find the sample code at:

    I see, this is called the callback feature not the fallback feature.

    Learner said:
    According to the comment in Nordic's example, it means that if the task watchdog (which is based on the kernel's timers) does not time out and does not perform as expected, then the hardware watchdog timer comes into action. You should be able to find the sample code at:

    I think you misunderstood this feature. The callback passed to task_wdt_add() is triggered only after the watchdog has already expired. That means that calling wdt_feed() from this callback will not reset the watchdog timer. Use this callback to put your device into a safe state, on the expectation that a device reset will happen soon.

    If you look at the file zephyr\subsys\task_wdt\task_wdt.c, you will see that the callback is invoked from task_wdt_trigger() through channels[channel_id].callback. The description of this function is given below.

    /**
     * @brief Task watchdog timer callback.
     *
     * If the device operates as intended, this function will never be called,
     * as the timer is continuously restarted with the next due timeout in the
     * task_wdt_feed() function.
     *
     * If all task watchdogs have longer timeouts than the hardware watchdog,
     * this function is called regularly (via the background channel). This
     * should be avoided by setting CONFIG_TASK_WDT_MIN_TIMEOUT to the minimum
     * task watchdog timeout used in the application.
     *
     * @param timer_id Pointer to the timer which called the function
     */
    static void task_wdt_trigger(struct k_timer *timer_id)

  • Hi Susheel,

    Thank you.

    I think you misunderstood this feature. The callback in the task_wdt_add will be triggered when the watchdog has already expired

    Yes, I agree.

    I see, this is called the callback feature not the fallback feature.

    Not according to the comment in Nordic's example:

    ...\v1.7.0\zephyr\samples\subsys\task_wdt

    /*
    * To use this sample, either the devicetree's /aliases must have a
    * 'watchdog0' property, or one of the following watchdog compatibles
    * must have an enabled node.
    *
    * If the devicetree has a watchdog node, we get the watchdog device
    * from there. Otherwise, the task watchdog will be used without a
    * hardware watchdog fallback.
    */
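
    For reference, the condition described in this comment could be met with a devicetree overlay along the following lines. This is only a sketch under assumptions: it assumes the SoC's watchdog node carries the label `wdt` (worth checking in the generated zephyr.dts of your build) and that the overlay is placed where the build system picks it up. The file name in the comment is hypothetical. On some SoCs the watchdog node may already be enabled by the SoC include file, in which case only the alias is new.

    ```dts
    /* Hypothetical file: <application>/boards/<your_board>.overlay
     * Assumption: the watchdog node label is `wdt`.
     */
    / {
        aliases {
            watchdog0 = &wdt;
        };
    };

    &wdt {
        status = "okay";
    };
    ```

    The watchdog driver and the task watchdog subsystem also have to be enabled in prj.conf (CONFIG_WATCHDOG=y and CONFIG_TASK_WDT=y, as far as the sample's configuration goes).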

    If you see in the file zephyr\subsys\task_wdt\task_wdt.c, you will see that the callback is called with task_wdt_trigger->channels[channel_id].callback

    Does the function static void task_wdt_trigger(struct k_timer *timer_id) get called when the hardware watchdog timer times out?

    In my prj.conf file I have,

    CONFIG_TASK_WDT_MIN_TIMEOUT=100

    whereas my task watchdog timeout is set to 30000 ms:

    task_wdt_id = task_wdt_add(30000U, (task_wdt_callback_t)task_wdt_callback, NULL);

    Must I set CONFIG_TASK_WDT_MIN_TIMEOUT=30000 to avoid the problem I am seeing?
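
    The relationship between the channel timeouts and CONFIG_TASK_WDT_MIN_TIMEOUT can be sketched in plain C, runnable on a host PC. The helper names are hypothetical. The sketch makes two assumptions that should be checked against the Kconfig help and task_wdt.c: that the option is capped at 10000 ms, and that the hardware watchdog window is the minimum timeout plus CONFIG_TASK_WDT_HW_FALLBACK_DELAY.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed upper limit of CONFIG_TASK_WDT_MIN_TIMEOUT, in ms. */
    #define MIN_TIMEOUT_LIMIT_MS 10000u

    /* Suggest a CONFIG_TASK_WDT_MIN_TIMEOUT value: the smallest channel
     * timeout used by the application, capped at the assumed limit.
     */
    static uint32_t suggest_min_timeout_ms(const uint32_t *timeouts_ms, size_t count)
    {
        assert(timeouts_ms != NULL && count > 0);

        uint32_t smallest = timeouts_ms[0];
        for (size_t i = 1; i < count; i++) {
            if (timeouts_ms[i] < smallest) {
                smallest = timeouts_ms[i];
            }
        }
        return (smallest < MIN_TIMEOUT_LIMIT_MS) ? smallest : MIN_TIMEOUT_LIMIT_MS;
    }

    /* Assumption: hardware watchdog window = minimum timeout + fallback delay. */
    static uint32_t hw_wdt_window_ms(uint32_t min_timeout_ms, uint32_t fallback_delay_ms)
    {
        return min_timeout_ms + fallback_delay_ms;
    }
    ```

    Under these assumptions, a single 30000 ms channel would suggest the capped value of 10000 rather than 30000.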

    Kind regards

    Mohamed

  • Hi Susheel,

    For some obscure reason I am not seeing the watchdog timer timing out when I enable the hardware timer. It is rather worrying because I think it will come back at some point. However, let's forget about it for now.

    My question now is this:

    How do I check that the hardware timer will time out if the task watchdog does not work as expected? In other words, I would like to simulate the scenario where the hardware watchdog fallback is used instead of the task watchdog, as suggested in this Nordic example; see its description below.

    ...\v1.7.0\zephyr\samples\subsys\task_wdt

    /*
    * To use this sample, either the devicetree's /aliases must have a
    * 'watchdog0' property, or one of the following watchdog compatibles
    * must have an enabled node.
    *
    * If the devicetree has a watchdog node, we get the watchdog device
    * from there. Otherwise, the task watchdog will be used without a
    * hardware watchdog fallback.
    */

    Also, I have set my task watchdog timer to 600000 (one minute), but the maximum value CONFIG_TASK_WDT_MIN_TIMEOUT can take is 10000, so I am setting it to 10000 in prj.conf.

    According to the description of static void task_wdt_trigger() you sent me above, I was expecting this function to be called regularly, but it is not. Is it because I am refreshing the task watchdog timer every second?
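
    The interaction between feeding and the background channel can be explored with a small host-side model. This is a simplified sketch that loosely follows how task_wdt.c appears to re-arm its kernel timer on every feed. That behaviour is an assumption about the implementation, not a copy of it, and all names below are hypothetical. Times are in milliseconds.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* One task watchdog channel plus the background channel. */
    struct wdt_model {
        int64_t channel_timeout;   /* value passed to task_wdt_add() */
        int64_t min_timeout;       /* CONFIG_TASK_WDT_MIN_TIMEOUT */
        int64_t channel_deadline;  /* absolute expiry time of the channel */
        int64_t timer_deadline;    /* when the modelled kernel timer fires next */
        bool timer_is_background;  /* true if that firing is the background channel */
    };

    struct wdt_result {
        int background_triggers;   /* trigger calls that only re-armed the timer */
        int channel_expiries;      /* trigger calls that would run the callback */
    };

    /* Arm the timer for whichever comes first: background wake-up or channel expiry. */
    static void schedule_next(struct wdt_model *m, int64_t now)
    {
        m->timer_deadline = now + m->min_timeout;
        m->timer_is_background = true;
        if (m->channel_deadline < m->timer_deadline) {
            m->timer_deadline = m->channel_deadline;
            m->timer_is_background = false;
        }
    }

    /* Feeding pushes the channel deadline out and re-arms the timer. */
    static void feed(struct wdt_model *m, int64_t now)
    {
        m->channel_deadline = now + m->channel_timeout;
        schedule_next(m, now);
    }

    /* Run the model for `duration` ms; feed_interval == 0 means "never feed". */
    static struct wdt_result simulate(int64_t channel_timeout, int64_t min_timeout,
                                      int64_t feed_interval, int64_t duration)
    {
        assert(channel_timeout > 0 && min_timeout > 0 && feed_interval >= 0);

        struct wdt_model m = { .channel_timeout = channel_timeout,
                               .min_timeout = min_timeout };
        struct wdt_result r = { 0, 0 };

        feed(&m, 0); /* stands in for task_wdt_add() at t = 0 */
        for (int64_t now = 1; now <= duration; now++) {
            if (feed_interval > 0 && now % feed_interval == 0) {
                feed(&m, now);
            }
            if (now >= m.timer_deadline) {
                if (m.timer_is_background) {
                    r.background_triggers++;
                    schedule_next(&m, now);
                } else {
                    r.channel_expiries++;
                    break; /* the real callback (or a reboot) would happen here */
                }
            }
        }
        return r;
    }
    ```

    In this model, calling simulate(600000, 10000, 1000, 60000) never reaches a timer deadline, because each feed re-arms the timer before the 10000 ms background wake-up is due. That would be consistent with task_wdt_trigger() not being called while the channel is refreshed every second. With feed_interval set to 0 the model wakes on the background channel every 10000 ms.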

    I am printing the reset reason in my callback function task_wdt_callback() (of type task_wdt_callback_t), expecting it to be 2 since the reset was caused by a task watchdog timeout; instead I am getting reason = 0.

    printf( "Reas = %u\n", NRF_POWER->RESETREAS );

    Can you please explain why this is the case?
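
    To make the printed value easier to interpret, the register can be decoded bit by bit. The sketch below is host-runnable C with hypothetical helper names. The bit positions (RESETPIN = bit 0, DOG = bit 1, SREQ = bit 2, LOCKUP = bit 3) are taken from memory of the nRF52 product specification and should be verified there; further bits exist and are left out.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Assumed RESETREAS bit positions; verify against the product specification. */
    #define RESETREAS_RESETPIN (1u << 0)  /* reset pin */
    #define RESETREAS_DOG      (1u << 1)  /* hardware watchdog */
    #define RESETREAS_SREQ     (1u << 2)  /* soft reset request */
    #define RESETREAS_LOCKUP   (1u << 3)  /* CPU lock-up */

    /* True if the hardware watchdog flag is set in the given register value. */
    static bool reset_was_watchdog(uint32_t resetreas)
    {
        return (resetreas & RESETREAS_DOG) != 0;
    }

    /* Write the names of all set flags into buf, separated by '|', or "none". */
    static void describe_reset_reasons(uint32_t resetreas, char *buf, size_t len)
    {
        static const struct { uint32_t mask; const char *name; } flags[] = {
            { RESETREAS_RESETPIN, "RESETPIN" },
            { RESETREAS_DOG,      "DOG" },
            { RESETREAS_SREQ,     "SREQ" },
            { RESETREAS_LOCKUP,   "LOCKUP" },
        };

        assert(buf != NULL && len > 0);
        buf[0] = '\0';

        size_t used = 0;
        for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
            if ((resetreas & flags[i].mask) != 0 && used < len) {
                int n = snprintf(buf + used, len - used, "%s%s",
                                 (used > 0) ? "|" : "", flags[i].name);
                if (n < 0) {
                    break;
                }
                used += (size_t)n; /* may exceed len on truncation, which ends the loop */
            }
        }
        if (used == 0) {
            snprintf(buf, len, "none");
        }
    }
    ```

    Two points may be worth checking with this in hand. The register describes the reset that preceded the current boot, so a value read inside the callback, before any reset has happened, would not yet reflect the pending one. Also, if the task watchdog resets the device through a software reboot rather than the hardware watchdog, the flag to expect would presumably be SREQ (4) rather than DOG (2).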

    Finally, when I try to write to flash in the callback function task_wdt_callback(), nvs_write() returns the error ECANCELED (140), meaning the write operation has been cancelled. Is it because, as stated in the task_wdt example comments, the callback has only 61.2 us to complete before the reset occurs, and writing to flash takes longer than this?

    Kind regards
    Mohamed
  • Hi Mohamed,

    We are severely understaffed this week because of the Christmas holidays, and Susheel will have to get back to you on this next week.

    Sorry for the inconvenience.
