Ending up in WDT_IRQHandler() without knowingly using the watchdog timer

Hi.

I have a custom board based on the nRF51822. The application is a BLE peripheral using the S130 SoftDevice and SDK 11.0.0, built with gcc.

I'm finding that when field testing my device, it will often stop running. If I then bring it back to the desk and attach gdb without resetting, I can see I'm in the watchdog timer IRQ handler:

$ make gdb-client-no-reset
...
0x0001edce in WDT_IRQHandler ()
...
(gdb) bt full
#0  0x0001ed84 in Reset_Handler ()
No symbol table info available.
(gdb) info symbol 0x0001edce
WDT_IRQHandler in section .text

Why on earth would that be? I'm not setting up the watchdog timer in my application at all. In nrf_drv_config.h:

#define WDT_ENABLED 0

However, my .map file shows that a bunch of the IRQ handlers end up at that same address:

 .text          0x000000000001ed84       0x4c _build/gcc_startup_nrf51.os
                0x000000000001ed84                Reset_Handler
                0x000000000001edc4                NMI_Handler
                0x000000000001edc8                SVC_Handler
                0x000000000001edca                PendSV_Handler
                0x000000000001edcc                SysTick_Handler
                0x000000000001edce                SWI4_IRQHandler
                0x000000000001edce                TEMP_IRQHandler
                0x000000000001edce                QDEC_IRQHandler
                0x000000000001edce                SWI5_IRQHandler
                0x000000000001edce                TIMER0_IRQHandler
                0x000000000001edce                TIMER1_IRQHandler
                0x000000000001edce                ECB_IRQHandler
                0x000000000001edce                Default_Handler
                0x000000000001edce                SWI3_IRQHandler
                0x000000000001edce                CCM_AAR_IRQHandler
                0x000000000001edce                WDT_IRQHandler
                0x000000000001edce                RNG_IRQHandler
                0x000000000001edce                TIMER2_IRQHandler
                0x000000000001edce                SWI1_IRQHandler
                0x000000000001edce                RTC0_IRQHandler
                0x000000000001edce                POWER_CLOCK_IRQHandler
                0x000000000001edce                RADIO_IRQHandler
                0x000000000001edce                LPCOMP_IRQHandler
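
For what it's worth, the shared address is what one would expect if the startup file declares every unimplemented IRQ handler as a weak alias of Default_Handler. A debugger asked for the symbol at 0x0001edce can then print any one of those names. Below is a minimal sketch of the same mechanism in C, assuming GCC on an ELF target. The SDK's startup file is assembly, so this is an illustration and not its actual source:

```c
/* Illustration only: GCC weak aliases on an ELF target. */

/* Catch-all handler: spins so a debugger can find the core here. */
void Default_Handler(void)
{
    for (;;) {
    }
}

/* Each alias is a second name for Default_Handler's address unless
 * another object file supplies a strong definition of the same name. */
void WDT_IRQHandler(void)    __attribute__((weak, alias("Default_Handler")));
void TIMER0_IRQHandler(void) __attribute__((weak, alias("Default_Handler")));
```

Since all the aliases resolve to one address, the name gdb reports for that address says nothing about which interrupt actually fired.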

(Edit) The IPSR is 0x18 at the time:

(gdb) mon regs
R0 = 00000060, R1 = 0001EDCF, R2 = 680ED406, R3 = 00008F7B
R4 = 20007F87, R5 = 3E894F53, R6 = FFFFFFFF, R7 = 00000000
R8 = FFFFFFFF, R9 = FFFFFFFF, R10= 1FFF8000, R11= 00000000
R12= FFFF0001, R13= 20007F38, MSP= 20007F38, PSP= FFFFFFFC
R14(LR) = FFFFFFF9, R15(PC) = 0001ED84
XPSR 91000018, APSR 90000000, EPSR 01000000, IPSR 00000018
CFBP 00000000, CONTROL 00, FAULTMASK 00, BASEPRI 00, PRIMASK 00

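(Edit 2) IPSR 0x18 is exception number 24. Subtracting the 16 Cortex-M system exceptions gives IRQ 8, which is the Timer 0 IRQ according to nrf51.h: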

/* ----------------------  nrf51 Specific Interrupt Numbers  ---------------------- */
  POWER_CLOCK_IRQn              =   0,              /*!<   0  POWER_CLOCK                                                      */
  RADIO_IRQn                    =   1,              /*!<   1  RADIO                                                            */
  UART0_IRQn                    =   2,              /*!<   2  UART0                                                            */
  SPI0_TWI0_IRQn                =   3,              /*!<   3  SPI0_TWI0                                                        */
  SPI1_TWI1_IRQn                =   4,              /*!<   4  SPI1_TWI1                                                        */
  GPIOTE_IRQn                   =   6,              /*!<   6  GPIOTE                                                           */
  ADC_IRQn                      =   7,              /*!<   7  ADC                                                              */
  TIMER0_IRQn                   =   8,              /*!<   8  TIMER0                                                           */
  TIMER1_IRQn                   =   9,              /*!<   9  TIMER1                                                           */
  TIMER2_IRQn                   =  10,              /*!<  10  TIMER2                                                           */
  RTC0_IRQn                     =  11,              /*!<  11  RTC0                                                             */
  TEMP_IRQn                     =  12,              /*!<  12  TEMP                                                             */
  RNG_IRQn                      =  13,              /*!<  13  RNG                                                              */
  ECB_IRQn                      =  14,              /*!<  14  ECB                                                              */
  CCM_AAR_IRQn                  =  15,              /*!<  15  CCM_AAR                                                          */
  WDT_IRQn                      =  16,              /*!<  16  WDT                                                              */
  RTC1_IRQn                     =  17,              /*!<  17  RTC1                                                             */
  QDEC_IRQn                     =  18,              /*!<  18  QDEC                                                             */
  LPCOMP_IRQn                   =  19,              /*!<  19  LPCOMP                                                           */
  SWI0_IRQn                     =  20,              /*!<  20  SWI0                                                             */
  SWI1_IRQn                     =  21,              /*!<  21  SWI1                                                             */
  SWI2_IRQn                     =  22,              /*!<  22  SWI2                                                             */
  SWI3_IRQn                     =  23,              /*!<  23  SWI3                                                             */
  SWI4_IRQn                     =  24,              /*!<  24  SWI4                                                             */
  SWI5_IRQn                     =  25               /*!<  25  SWI5                                                             */
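
The IPSR-to-IRQ arithmetic can be sketched as a small helper. The function name ipsr_to_irqn and the mask macro are made up for this illustration. The 6-bit exception field is what I understand the Cortex-M0 (ARMv6-M) to use, so treat that as an assumption on other cores:

```c
#include <stdint.h>

/* Exception number field of IPSR/xPSR, assumed 6 bits wide (ARMv6-M). */
#define IPSR_EXCEPTION_MASK 0x3Fu

/* Hypothetical helper: convert an IPSR (or full xPSR) value to a
 * CMSIS-style IRQ number. Exceptions 0..15 are system exceptions, so
 * external interrupts start at 16 and system exceptions come out negative. */
static int ipsr_to_irqn(uint32_t ipsr)
{
    uint32_t exception_number = ipsr & IPSR_EXCEPTION_MASK;
    return (int)exception_number - 16;
}
```

With the register dump above, ipsr_to_irqn(0x18) gives 8, which matches TIMER0_IRQn in the list.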

(Edit 3) On another occurrence of the same behaviour in the field, I reattached gdb at the desk and found myself in the reset handler with the IPSR set to zero:

$ make gdb-client-no-reset

Reading symbols from _build/biketracker_app_s130.elf...done.
0x0002b236 in nrf_delay_us (number_of_us=999) at /Users/Eliot/dev/nRF5_SDK_11.0.0_89a8197/components/drivers_nrf/delay/nrf_delay.h:166
166	__ASM volatile (
...
(gdb) mon regs
R0 = 000000CC, R1 = 00000003, R2 = 00000754, R3 = 000003E7
R4 = 00000000, R5 = 00000000, R6 = FFFFFFFF, R7 = 00000000
R8 = FFFFFFFF, R9 = FFFFFFFF, R10= 1FFF8000, R11= 00000000
R12= FFFFFFFF, R13= 20007F88, MSP= 20007F88, PSP= FFFFFFFC
R14(LR) = 0002067B, R15(PC) = 0001ED84
XPSR 21000000, APSR 20000000, EPSR 01000000, IPSR 00000000
CFBP 00000000, CONTROL 00, FAULTMASK 00, BASEPRI 00, PRIMASK 00
(gdb) bt full
#0  0x0001ed84 in Reset_Handler ()
No symbol table info available.

(Edit 4)

My gdb-client-no-reset make target does this:

gdb-client-no-reset:
	printf "target remote localhost:2331\nload\neval \"monitor exec SetRTTAddr %%p\", &_SEGGER_RTT\n" > $(OUTPUT_PATH).gdbinit-noreset
	$(GDB) -x $(OUTPUT_PATH).gdbinit-noreset $(OUTPUT_PATH)*.elf

(Edit 5)

Attaching my error handling functions. error.c

(Edit 6)

Since this was working for me in SDK 10 and is not in SDK 11, here's the diff between the TWI library across SDKs:

diff -r nRF51_SDK_10.0.0_dc26b5e/components/libraries/twi/app_twi.c nRF5_SDK_11.0.0_89a8197/components/libraries/twi/app_twi.c
13d12
< #include <stdbool.h>
16a16
> #include "sdk_common.h"
74c74
< static ret_code_t start_transfer(app_twi_t const * p_app_twi)
---
> static ret_code_t start_transfer(app_twi_t * p_app_twi)
85,89c85,112
<     if (APP_TWI_IS_READ_OP(p_transfer->operation))
<     {
<         return nrf_drv_twi_rx(&p_app_twi->twi, address,
<             p_transfer->p_data, p_transfer->length,
<             (p_transfer->flags & APP_TWI_NO_STOP));
---
>     nrf_drv_twi_xfer_desc_t xfer_desc;
>     uint32_t                flags;
> 
>     xfer_desc.address       = address;
>     xfer_desc.p_primary_buf = p_transfer->p_data;
>     xfer_desc.primary_length = p_transfer->length;
> 
>     /* If it is possible try to bind two transfers together. They can be combined if:
>      * - there is no stop condition after current transfer.
>      * - current transfer is TX.
>      * - there is at least one more transfer in the transaction.
>      * - address of next trnasfer is the same as current transfer.
>      */
>     if ((p_transfer->flags & APP_TWI_NO_STOP) &&
>         !APP_TWI_IS_READ_OP(p_transfer->operation) &&
>         ((current_transfer_idx+1) < p_app_twi->p_current_transaction->number_of_transfers) &&
>         APP_TWI_OP_ADDRESS(p_transfer->operation) ==
>         APP_TWI_OP_ADDRESS(p_app_twi->p_current_transaction->p_transfers[current_transfer_idx+1].operation)
>     )
>     {
>         app_twi_transfer_t const * p_second_transfer =
>             &p_app_twi->p_current_transaction->p_transfers[current_transfer_idx+1];
>         xfer_desc.p_secondary_buf = p_second_transfer->p_data;
>         xfer_desc.secondary_length = p_second_transfer->length;
>         xfer_desc.type = APP_TWI_IS_READ_OP(p_second_transfer->operation) ? NRF_DRV_TWI_XFER_TXRX :
>                                                                             NRF_DRV_TWI_XFER_TXTX;
>         flags = (p_second_transfer->flags & APP_TWI_NO_STOP) ? NRF_DRV_TWI_FLAG_TX_NO_STOP : 0;
>         p_app_twi->current_transfer_idx++;
93,95c116,120
<         return nrf_drv_twi_tx(&p_app_twi->twi, address,
<             p_transfer->p_data, p_transfer->length,
<             (p_transfer->flags & APP_TWI_NO_STOP));
---
>         xfer_desc.type = APP_TWI_IS_READ_OP(p_transfer->operation) ? NRF_DRV_TWI_XFER_RX :
>                 NRF_DRV_TWI_XFER_TX;
>         xfer_desc.p_secondary_buf = NULL;
>         xfer_desc.secondary_length = 0;
>         flags = (p_transfer->flags & APP_TWI_NO_STOP) ? NRF_DRV_TWI_FLAG_TX_NO_STOP : 0;
96a122,123
> 
>     return nrf_drv_twi_xfer(&p_app_twi->twi, &xfer_desc, flags);
180c207
<     if (p_event->type != NRF_DRV_TWI_ERROR)
---
>     if (p_event->type == NRF_DRV_TWI_EVT_DONE)
238,241c265,266
<     if (err_code != NRF_SUCCESS)
<     {
<         return err_code;
<     }
---
>     VERIFY_SUCCESS(err_code);
> 
339,342c364
<         if (result != NRF_SUCCESS)
<         {
<             return result;
<         }
---
>         VERIFY_SUCCESS(result);

This is only at the app_twi_() level, of course. At the nrf_drv_() level there are about 2K lines of differences, too much to post here.

  • Well I do use the app timer library all over the place. But doesn't that use the RTC peripheral instead? In any case, I'm not explicitly enabling the Timer 0 interrupt anywhere that I'm aware of. I recently upgraded from SDK 10 to SDK 11. Could that have anything to do with it?

  • Stop and think: what uses TIMER0? The softdevice uses TIMER0, and you are using the softdevice, so the thing turning TIMER0 on is probably the softdevice. So why is it forwarding to your interrupt handler instead of handling it itself? One possibility: you're enabling and disabling the softdevice and you've found a bug where disabling it leaves a TIMER0 interrupt outstanding (despite what sd_softdevice_disable() says). Does your code bang sd_softdevice_enable()/disable()? There are other possibilities, for example that you really are enabling it in your code but have forgotten where or why. That would be unlikely, as it would require both INTENSET on the peripheral and the interrupt enabled. Obviously the SDK doesn't use TIMER0, because it's a restricted peripheral in the softdevice and the SDK avoids those, so that's not a good guess.

    Also go check what's stacked by the interrupt: where did you come from?

  • @RK OK, thanks. My application will generally cycle through these states: sys off -> on, but with the SD not yet enabled -> SD enabled and then BLE advertising or connected -> sys off again. I never explicitly call sd_softdevice_disable(), I just go straight to sys off. Nowhere in my application code do I deal with an INTENSET explicitly. Added another edit above. Can't get any stack trace from gdb here. Is there another way to look at the stack that I'm missing?

  • In this case you are in the reset handler. In fact, looking at the PC from the original register dump, you were in the reset handler too, so I don't know where the whole WDT question came from. Why did you originally ask for the symbol at 0x0001edce, or is the register dump after that from a different occasion?

    In the reset handler there IS no more stack trace. The chip has reset and jumped to the first instruction and is executing (what's actually at 0x0001ED84). It looks here more like you're in a reset loop, crashing very early in the code and resetting, then doing it over and over again.

    It makes no sense that you are at the start of the reset handler in interrupt context, nor with the stack at 0x20007f88.

    Are you also sure that gdb-client-no-reset does what you think? I'd attach a J-Link directly and just do h and regs to see where the thing is.

  • Indeed, apologies for the confusion. When I first attached gdb it said "0x0001edce in WDT_IRQHandler ()" and I believed it. It now looks like I was in fact in the reset handler there. So I now have two questions: 1. Why am I not progressing from the reset handler into my code? 2. Why am I resetting in the first place? What I'll do now is alter my assert handling functions to not reset but instead just sit there waiting for me to reconnect gdb, so that I can get a stack trace. That'll help with #2, but I'm still confused by #1. Will report back. Thanks.
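
The "sit there and wait for gdb" idea from the last comment could look roughly like the sketch below. All names here (fault_record_t, g_fault_record, g_fault_hold, fault_halt) are hypothetical. Hooking fault_halt into the SDK's assert and error handlers is left out, as is any decision about masking interrupts first:

```c
#include <stdint.h>

/* Hypothetical record of the last fault, kept in RAM for the debugger. */
typedef struct {
    uint32_t    error_code;
    uint32_t    line;
    const char *file;
} fault_record_t;

static volatile fault_record_t g_fault_record;

/* While non-zero, fault_halt() keeps spinning. Clear it from gdb to resume. */
static volatile uint32_t g_fault_hold = 1;

/* Save the fault details, then wait instead of resetting the chip. */
static void fault_halt(uint32_t error_code, uint32_t line, const char *file)
{
    g_fault_record.error_code = error_code;
    g_fault_record.line       = line;
    g_fault_record.file       = file;

    while (g_fault_hold != 0u) {
        /* Spin here so an attached debugger finds the original call stack. */
    }
}
```

After attaching without a reset, bt should then show the path into fault_halt, print g_fault_record shows the saved details, and set var g_fault_hold = 0 lets execution continue.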
