WDT Times Out Prematurely ( nRF9160 )

We're experiencing occasional unexpected hardware watchdog resets. The watchdog is set to 10 minutes, yet sometimes a reset is triggered just 2 minutes after feeding the WDT, with the watchdog reported as the reset reason on boot. Are there any known issues with the WDT? Any thoughts on what might be causing this?

  • Have you tried setting the WDT to a smaller value to observe how the WDT behaves?

    I have WDT set to 5 seconds and observe no unexpected issues with the WDT.

  • Yes, we have, but it's not always practical in our code. Also, the 10-minute timeout seems to work 95% of the time, so I'm concerned about masking the issue by simply decreasing the period.

  • When you say 95% of the time, over how much time on average does the issue manifest? Immediately after boot, after hours, after days?

  • There's no real predictable pattern. We've seen it minutes after boot and days after boot.

  • What exactly is the device doing when it trips, if it is possible to know? Does it trip at the same point every time, or randomly?

  • It has definitely happened in different parts of the code, but there's no way to know exactly when, so there's a chance it was performing similar types of tasks each time it happened.

  • How is it reloaded? When a specific task completes, on an event, or perpetually throughout the code at a rate that is definitely going to be every few seconds? I know you mentioned that it can happen after only 2 minutes, but with mine the code runs the reload every few lines and I haven't noticed any odd behaviour.

  • We have a work item reloading one channel and the main thread reloading the other channel. The work item runs on a 5-minute interval, whereas the main thread reloads whenever it processes something. I've now changed the work item to 30 seconds, which may resolve our issue, but it seems like this is a bug regardless.

  • The other issue is that on boot we block on connecting to the cellular network, which may take up to several minutes in some cases. During this time nothing else is running to kick the WDT.
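
One thing that may be worth ruling out with the two-channel setup described above: as far as I understand the Nordic WDT, the counter is only reloaded once every enabled reload channel has been fed, so feeding a single channel does not restart the timeout on its own. The following is a minimal software model of that behaviour, not driver code. The names `wdt_model`, `wdt_model_feed` and so on are hypothetical, and the two-channel count and 600-second timeout are taken from this thread. Please check the product specification before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a multi-channel watchdog (not a driver). */
#define WDT_MODEL_CHANNELS 2u

struct wdt_model {
    uint32_t timeout_s;     /* configured timeout in seconds */
    uint32_t last_reload_s; /* time at which the counter last restarted */
    uint8_t  pending;       /* bitmask of channels fed since the last reload */
};

static void wdt_model_init(struct wdt_model *w, uint32_t timeout_s, uint32_t now_s)
{
    w->timeout_s = timeout_s;
    w->last_reload_s = now_s;
    w->pending = 0;
}

/* Feed one channel. Returns true only if this feed completed the set of
 * channels and therefore actually restarted the counter. */
static bool wdt_model_feed(struct wdt_model *w, unsigned channel, uint32_t now_s)
{
    const uint8_t all = (uint8_t)((1u << WDT_MODEL_CHANNELS) - 1u);

    w->pending |= (uint8_t)(1u << channel);
    if (w->pending == all) {
        w->last_reload_s = now_s;
        w->pending = 0;
        return true;
    }
    return false;
}

/* True once the timeout has elapsed since the last real reload. */
static bool wdt_model_expired(const struct wdt_model *w, uint32_t now_s)
{
    return (now_s - w->last_reload_s) >= w->timeout_s;
}
```

In this model, feeding only the main-thread channel at 480 seconds leaves the counter running from time 0, so it still expires at 600 seconds, which would look like a reset 2 minutes after a feed. Whether this is what happens in the firmware above depends on when each channel is actually fed.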

  • This is probably a bad idea. Have you thought about using a non-blocking method instead?

    For example, set your modem state, activate the modem, and then poll the CEREG response periodically to determine the connection status. That leaves your device in a state in which it can still respond to events and reload the WDT.
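
A rough sketch of the polling approach suggested above, assuming the modem answers the `AT+CEREG?` read command with `+CEREG: <n>,<stat>`, where a `<stat>` of 1 (home) or 5 (roaming) means registered. Everything else here is hypothetical: `wait_for_network`, the hook types, and the `sim_*` functions are stand-ins so the loop can be run on a PC. On a real device the hooks would wrap whatever AT-command, watchdog-feed, and sleep calls your SDK provides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Parse a "+CEREG: <n>,<stat>" read response.
 * Registered means stat == 1 (home network) or stat == 5 (roaming). */
static bool cereg_is_registered(const char *resp)
{
    int n = 0;
    int stat = 0;

    if (resp == NULL || sscanf(resp, "+CEREG: %d,%d", &n, &stat) != 2) {
        return false;
    }
    return stat == 1 || stat == 5;
}

/* Hypothetical hooks: replace with your own modem, watchdog and sleep calls. */
typedef const char *(*query_cereg_fn)(void *ctx);
typedef void (*feed_wdt_fn)(void *ctx);
typedef void (*sleep_fn)(void *ctx, unsigned ms);

/* Poll for registration instead of blocking, feeding the watchdog on every
 * pass. Returns false if still unregistered after max_polls attempts. */
static bool wait_for_network(query_cereg_fn query, feed_wdt_fn feed,
                             sleep_fn sleep_ms, void *ctx,
                             unsigned poll_ms, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        feed(ctx);
        if (cereg_is_registered(query(ctx))) {
            return true;
        }
        sleep_ms(ctx, poll_ms);
    }
    return false;
}

/* Simulated modem for trying the loop off-target: reports "searching"
 * (stat 2) until the configured number of polls, then "registered". */
struct sim {
    unsigned polls_until_registered;
    unsigned polls;
    unsigned feeds;
};

static const char *sim_query(void *ctx)
{
    struct sim *s = ctx;

    s->polls++;
    return (s->polls >= s->polls_until_registered) ? "+CEREG: 0,1"
                                                   : "+CEREG: 0,2";
}

static void sim_feed(void *ctx)
{
    ((struct sim *)ctx)->feeds++;
}

static void sim_sleep(void *ctx, unsigned ms)
{
    (void)ctx;
    (void)ms; /* no real delay in the simulation */
}
```

Calling `wait_for_network(sim_query, sim_feed, sim_sleep, &s, 1000, 10)` with `s.polls_until_registered = 3` feeds the simulated watchdog once per poll until registration is reported. The poll interval only needs to be comfortably shorter than the watchdog timeout.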