Zephyr Task WDT doesn't cause a system reset, hangs system

Hello, 

I am trying to add a watchdog timer to my code, which consists of three threads, so I referred to the task_wdt sample in zephyr/samples/subsys for SDK 1.6.1.

Based on the example, I initialize the WDT and, after a count of 50, sleep for longer than the watchdog timeout. But instead of calling the WDT callback or resetting the system, my code just gets stuck in an undefined state.

My config and code snippet are given below:

CONFIG_WATCHDOG=y
CONFIG_WDT_LOG_LEVEL_DBG=y
CONFIG_WDT_DISABLE_AT_BOOT=y

CONFIG_TASK_WDT=y
CONFIG_TASK_WDT_MIN_TIMEOUT=50
CONFIG_TASK_WDT_CHANNELS=5

Code Snippet: 

**********************************************************************************************

int count = 0;
const struct device *hw_wdt_dev = DEVICE_DT_GET(WDT_NODE);

if (!device_is_ready(hw_wdt_dev)) {
    printk("Hardware watchdog %s is not ready; ignoring it.\n",
           hw_wdt_dev->name);
    hw_wdt_dev = NULL;
}
task_wdt_init(hw_wdt_dev);

task_wdt_id = task_wdt_add(100U, task_wdt_callback,
                           (void *)k_current_get());

while (true) {
    if (count == 50) {
        printk("Control thread getting stuck...\n");
        k_sleep(K_FOREVER);
    }

    task_wdt_feed(task_wdt_id);
    k_sleep(K_MSEC(50));
    count++;
}

*********************************************************************************

So here are my questions: 

1. Does the task watchdog work the same way as the system watchdog in terms of system reset, or does it only kill the thread and let the rest of the system keep running?

2. Is there a way to test this in debug mode? For example, by setting a breakpoint on main() and checking whether it is hit again after the watchdog fires?

3. My requirement is that if any thread is stuck for longer than usual, the system should reboot and restart all threads. Am I missing something in my config or my understanding?
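
As a rough sanity check of when the watchdog in the snippet above should fire, here is a small host-side model in plain C. It is not Zephyr code and only assumes the general behavior of a watchdog channel: the deadline starts one timeout after the channel is added, and every feed that arrives in time pushes it one timeout into the future. The function names `model_expiry_ms` and `model_periodic_feeds` are made up for this illustration, and thread scheduling and execution time are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of a single watchdog channel (illustration only).
 * The channel is added at t = 0 ms. feed_ms[] holds the feed times in
 * ascending order. Returns the time at which the channel expires.
 */
int64_t model_expiry_ms(int64_t timeout_ms, const int64_t *feed_ms,
                        size_t n_feeds)
{
    assert(timeout_ms > 0);

    int64_t deadline = timeout_ms; /* first deadline after adding the channel */

    for (size_t i = 0; i < n_feeds; i++) {
        if (feed_ms[i] >= deadline) {
            break; /* this feed comes too late, the channel already expired */
        }
        deadline = feed_ms[i] + timeout_ms; /* a feed in time re-arms it */
    }
    return deadline;
}

/* Fill out[] with n feed times: start_ms, start_ms + period_ms, ... */
void model_periodic_feeds(int64_t *out, size_t n, int64_t start_ms,
                          int64_t period_ms)
{
    assert(period_ms > 0);

    for (size_t i = 0; i < n; i++) {
        out[i] = start_ms + (int64_t)i * period_ms;
    }
}
```

For the snippet (100 ms timeout, 50 feeds spaced 50 ms apart starting at t = 0), the last feed lands at 2450 ms, so the model puts the expiry at 2550 ms, about 50 ms after the "getting stuck" message is printed. If nothing at all happens at that point, the problem is not the timing.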

  • Hi,

    Based on the example, I initialize the WDT and, after a count of 50, sleep for longer than the watchdog timeout. But instead of calling the WDT callback or resetting the system, my code just gets stuck in an undefined state.

    I tried reproducing this, but it's working fine here. Could you post your whole main.c and prj.conf?

    Does the unmodified sample work as expected? What board target did you build for?

  • /*
     * Copyright (c) 2016 Intel Corporation
     * Copyright (c) 2020 Nordic Semiconductor ASA
     *
     * SPDX-License-Identifier: Apache-2.0
     */
    
    /**
     * @file main runtime loop for Gem.
     */
    
    // MAIN LIBS
    #include <zephyr.h>
    #include <sys/printk.h>
    #include <device.h>
    #include <string.h>
    #include <date_time.h>
    #include <stdio.h>
    
    // LOGGING
    #include <logging/log.h>
    #include <logging/log_ctrl.h>
    
    // PERIPHERALS
    #include <drivers/pwm.h>
    #include <drivers/i2c.h>
    #include <drivers/spi.h>
    #include <drivers/gpio.h>
    #include <drivers/led_strip.h>
    
    // HAL Drivers
    #include <pcf8575.h>
    #include <veml7700.h>
    #include <status_led.h>
    #include <ws2812_driver.h>
    #include <networking.h>
    #include <nvs.h>
    #include <update.h>
    #include <sys/reboot.h>
    #include <semver.h>
    
    // DFU
    #include <dfu/mcuboot.h>
    #include <dfu/dfu_target_mcuboot.h>
    #include <modem/nrf_modem_lib.h>
    #include <net/fota_download.h>
    
    // RUNTIME CONTROLLERS
    #include <server_controller.h>
    #include <ui_controller.h>
    
    // version 1.0.0
    #include "config.h"
    #include <drivers/watchdog.h>
    #include <task_wdt/task_wdt.h>
    
    // Check every minute to see if the heartbeat has been flatline
    #define HEARTBEAT_TIMEOUT_CHECK (1000 * 60)
    #define WDT_NODE DT_COMPAT_GET_ANY_STATUS_OKAY(nordic_nrf_watchdog)
    
    LOG_MODULE_REGISTER(main);
    
    void main(void)
    {
            // Read the latched reset reason, then clear it. RESETREAS bits are
            // cleared by writing 1 to them, so write the value back.
            uint32_t Reset_reason = NRF_POWER->RESETREAS;
            LOG_INF("Previous Reset Reason: 0x%08x", Reset_reason);
            NRF_POWER->RESETREAS = Reset_reason;

            LOG_INF("Booting FW Version: %s", log_strdup(FW_VERSION));

            const struct device *gpio_dev;

            gpio_dev = device_get_binding("GPIO_0");

            // Turns on the 5V rail
            gpio_pin_configure(gpio_dev, 6, GPIO_OUTPUT); // P0.06
            gpio_pin_set(gpio_dev, 6, 0);

            const struct device *hw_wdt_dev = DEVICE_DT_GET(WDT_NODE);

            if (!device_is_ready(hw_wdt_dev)) {
                    printk("Hardware watchdog %s is not ready; ignoring it.\n",
                           hw_wdt_dev->name);
                    hw_wdt_dev = NULL;
            }

            task_wdt_init(hw_wdt_dev);

            // 2000 ms timeout, no callback and no user data
            int task_wdt_id = task_wdt_add(2000, NULL, NULL);

            int attempts = 0;

            // 3x attempts to connect to LTE network, then retry again every minute.
            // The channel is not fed while this loop runs.
            do {
                    LOG_INF("Connecting to server (attempt: %d)", attempts);
                    connect_to_server();
                    attempts++;
            } while ((!get_server_status()) && (attempts < 3));

            task_wdt_feed(task_wdt_id);

            while (1) {
                    task_wdt_feed(task_wdt_id);
                    k_sleep(K_SECONDS(1U));
            }
    }
    

    4338.prj.conf

    So here are the two files. In my main, I initialize and start the WDT at bootup. Then comes the connect do-while loop, which takes roughly 10 seconds to complete, so for testing I expect the WDT to fire and reboot the system. Instead, it just hangs indefinitely.

    And yes, the example worked for me. Not sure what I'm missing here.

  • Hi,

    I think you are running into an issue that is fixed in the latest release.

    See this commit: 

    https://github.com/nrfconnect/sdk-zephyr/commit/df1125a58c8a313b6518ef44054d6c25f37e15a4

    There are also some other bugs in task_wdt that are fixed in the latest release.
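
On questions 1 and 3 from the original post, here is a minimal sketch modeled on the callback in the Zephyr task_wdt sample. As far as I understand the Zephyr API documentation, the task watchdog does not terminate the stuck thread. Passing NULL as the callback to task_wdt_add() selects the default handler, which resets the system, while a non-NULL callback is called instead, so it has to trigger the reboot itself. The helper name `register_current_thread` is hypothetical, the include paths are the old-style ones used in the main.c above, and the code only builds inside a Zephyr application.

```c
#include <zephyr.h>
#include <sys/printk.h>
#include <sys/reboot.h>
#include <task_wdt/task_wdt.h>

/* Called when the channel was not fed within its timeout.
 * user_data carries the thread ID that registered the channel.
 */
static void task_wdt_callback(int channel_id, void *user_data)
{
    printk("Task watchdog channel %d expired (thread %p)\n",
           channel_id, user_data);
    printk("Resetting device...\n");

    /* Cold reboot: every thread starts again from scratch */
    sys_reboot(SYS_REBOOT_COLD);
}

/* Hypothetical helper: register the calling thread with the task
 * watchdog. Returns the channel ID, or a negative error code.
 */
int register_current_thread(uint32_t timeout_ms)
{
    return task_wdt_add(timeout_ms, task_wdt_callback,
                        (void *)k_current_get());
}
```

Each thread would call the helper once and then pass the returned channel ID to task_wdt_feed() in its loop. Reading the reset reason at boot, as main.c above already does, is one way to confirm after the fact that a reset really happened.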
