APP_TIMER problem, handler is not always called

Hi nrf masters!

We have a problem quite late in a large development project and urgently need assistants. We have spent hour's reading all posts about this and debugging to our best knowledge. The problem is that the a timer (app_timer2.c) is not always fired, the handler for the timer is not always called. I will do me best to give necessary information.

We have an external hardware that gives an input signal to the processor, this signal is handled by a GPIOTE that is triggering an interrupt with sense TOGGLE. At our custom circuit board we have external pull up handling therefore it is not configured in code during pin GPIOTE configuration.  

static void alarm_pin_init()
{
    ret_code_t err_code;

    /* err_code = nrf_drv_gpiote_init(); Called elsewhere in the code 
    APP_ERROR_CHECK(err_code); */

    nrf_drv_gpiote_in_config_t in_config = GPIOTE_CONFIG_IN_SENSE_TOGGLE(true);
    in_config.pull = NRF_GPIO_PIN_NOPULL;

    err_code = nrf_drv_gpiote_in_init(ALARM_SIGNAL_2, &in_config, alarm_pin_handler);
    APP_ERROR_CHECK(err_code);

    nrf_drv_gpiote_in_event_enable(ALARM_SIGNAL_2, true);
}

The GPIOTE interrupt (function alarm_pin_handler) handler looks like this:

static void alarm_pin_handler(nrf_drv_gpiote_pin_t pin, nrf_gpiote_polarity_t action)
{
    /**@requirement SW8, SW45*/
    ret_code_t err_code;
    
    static uint32_t prev_cnt;
    static uint32_t curr_cnt;
    uint32_t alarm_pulses = 0;
    uint32_t no_alarm_pulses = 0;
    

    
    curr_cnt = app_timer_cnt_get();

    if (nrf_drv_gpiote_in_is_set(pin))
    {
        low_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
        //NRF_LOG_DEBUG("ticks LOW: %d   %d", low_time[pulses], pulses);
    }
    else
    {
        high_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
        //NRF_LOG_DEBUG("ticks HIGH: %d   %d", high_time[pulses], pulses);
        pulses++;
        if(!(pulses%SAMPLE_DEPTH))
        {
            comm_au_hb_relay(); /**@requirement SW14*/
            pulses = 0;
        }
    }

    for (uint16_t i=0; i<SAMPLE_DEPTH; i++)
    {
        //if (high_time[i] < low_time[i] && high_time[i]!=0)
        if (high_time[i] > low_time[i] && low_time[i]!=0)
        {
            alarm_pulses++;
            
        }
        else
        {
            no_alarm_pulses++;
        }
    }

    if (alarm_pulses == SAMPLE_DEPTH)
    {  
      if(get_au_status() == 100)
      {
	    if(!is_high_alert_signalled())
	    {
	        NRF_LOG_INFO("******************ALARM**************************");
	        logging_write_log_post(LOG_ALARM, NULL);
	    }
	    set_alert_level(BLE_CHAR_ALERT_LEVEL_HIGH_ALERT); /**@requirement SW21 */
        }
    }
    if (no_alarm_pulses == SAMPLE_DEPTH)
    {
        if(is_high_alert_signalled())
        {
            NRF_LOG_INFO("ALARM CLEARED (AU signal OK)");
        }
        set_alert_level(BLE_CHAR_ALERT_LEVEL_NO_ALERT);

        if(!treatment_ongoing)
        {
            logging_write_log_post(LOG_TREATMENT_START, NULL);
            treatment_ongoing = true;
            led_set_intense(HEARTBEAT_USER, LED_LOW, true);
        }

        if(ble_conn_connected())
        {
          set_keep_alive(true); /**@requirement SW8*/
        }
    }

    prev_cnt = curr_cnt;
    
    err_code = app_timer_stop(m_alarm_pin_timer);
    APP_ERROR_CHECK(err_code);

    err_code = app_timer_start(m_alarm_pin_timer, APP_TIMER_TICKS(2000), NULL);
    APP_ERROR_CHECK(err_code);
}

GPIOTE interrupt are generated 3 times every 600ms. Each time a timer is restarted with app_timer_stop and then restarted with app_timer_start (see the bottom of the alarm_pin_handler function interrupt handler. The logic is so that if the signal disappears the timer will fire and call the timer handler.

The timer is created like this:

APP_TIMER_DEF(m_alarm_pin_timer);

err_code = app_timer_create(&m_alarm_pin_timer, APP_TIMER_MODE_SINGLE_SHOT, alarm_pin_timer_handler);
APP_ERROR_CHECK(err_code);

  

The bug occurs when the following happens: The external hardware that is connected is disconnected, physically removed. Meaning that no GPIOTE events will be triggered since the signal has been removed.

The Softdevice is enabled so the LFCLK should be active. When the error occurs the rest of the system continues to work, other timers that handle LED blinks and so on works fine. The RTC1 cunter is also counting and have not stopped. If we connect back the external hardware so that new GPIOTE events are raised it continues to work and the timer is started again as expected.          

We cannot predict the behavior and it is frustrating to debug, it works sometimes and sometimes not.

no APP_ERROR_CHECK are reported from the SDK or softdevice. We have tested to use the APP_TIMER_CONFIG_USE_SCHEDULER and APP_TIMER_WITH_PROFILER but with no success either. IDE SEGGER embedded studio is used together with SEGGER J-link Ultra + for programming.

What could cause this behavior? The interrupt priority is set to 6 on both GPIOTE and APP_TIMER, could some events be triggered so that the SDK misses a timer?

SDK config file:

166878.sdk_config.h

Parents
  • We have continued debugging and testing different solutions. There seems to be one thing that have solved the problem or at least greatly improved it. As of late last week we cannot reproduce the problem, hopefully it is fixed.  

    This is what we have done: 

    In the alarm_pin_handler (GPIOTE event handler) the first thing we do is disabling new events with nrf_drv_gpiote_in_event_disable(ALARM_SIGNAL_2);

    New events are first enabled again after 100ms. The idea is to use this as a simple filter to prevent the event handler to be called multiple times during disconnection of the external hardware, debouncing. In the alarm_pin_handler we are sending a BLE package with ble_nus_c_string_send() at line 27 (inside the comm_au_hb_relay()). Perhaps this can cause a problem if it is called to fast or that it gets interrupted in some unlucky situation?

    static void alarm_pin_handler(nrf_drv_gpiote_pin_t pin, nrf_gpiote_polarity_t action)
    {   
        nrf_drv_gpiote_in_event_disable(ALARM_SIGNAL_2);
        /**@requirement SW8, SW45*/
        ret_code_t err_code;
        
        static uint32_t prev_cnt;
        static uint32_t curr_cnt;
        uint32_t alarm_pulses = 0;
        uint32_t no_alarm_pulses = 0;
        
        /* Store current RTC1 value, later compared in main() to detect the bug. */
        au_rtc1_counter = curr_cnt = app_timer_cnt_get();
    
        if (nrf_drv_gpiote_in_is_set(pin))
        {
            low_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
            NRF_LOG_DEBUG("ticks LOW: %d   %d", low_time[pulses], pulses);
        }
        else
        {
            high_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
            NRF_LOG_DEBUG("ticks HIGH: %d   %d", high_time[pulses], pulses);
            pulses++;
            if(!(pulses%SAMPLE_DEPTH))
            {
                comm_au_hb_relay(); /**@requirement SW14*/
                pulses = 0;
            }
        }
    
        for (uint16_t i=0; i<SAMPLE_DEPTH; i++)
        {
            //if (high_time[i] < low_time[i] && high_time[i]!=0)
            if (high_time[i] > low_time[i] && low_time[i]!=0)
            {
                alarm_pulses++;
                
            }
            else
            {
                no_alarm_pulses++;
            }
        }
    
        if (alarm_pulses == SAMPLE_DEPTH)
        {  
          if(get_au_status() == 100)
          {
    	    if(!is_high_alert_signalled())
    	    {
    	        NRF_LOG_INFO("******************ALARM**************************");
    	        logging_write_log_post(LOG_ALARM, NULL);
    	    }
    	    set_alert_level(BLE_CHAR_ALERT_LEVEL_HIGH_ALERT); /**@requirement SW21 */
            }
        }
        if (no_alarm_pulses == SAMPLE_DEPTH)
        {
            if(is_high_alert_signalled())
            {
                NRF_LOG_INFO("ALARM CLEARED (AU signal OK)");
            }
            set_alert_level(BLE_CHAR_ALERT_LEVEL_NO_ALERT);
    
            if(!treatment_ongoing)
            {
                logging_write_log_post(LOG_TREATMENT_START, NULL);
                treatment_ongoing = true;
                led_set_intense(HEARTBEAT_USER, LED_LOW, true);
                set_keep_alive(true); /**@requirement SW8*/
            }
        }
    
        prev_cnt = curr_cnt;
        
        err_code = app_timer_stop(m_alarm_pin_timer);
        APP_ERROR_CHECK(err_code);
    
        err_code = app_timer_start(m_alarm_pin_timer, APP_TIMER_TICKS(2000), NULL);
        APP_ERROR_CHECK(err_code);
    
        err_code = app_timer_start(m_gpiote_timer, APP_TIMER_TICKS(100), NULL);
        APP_ERROR_CHECK(err_code);
    }

    But since the app_timer is the problem and if the new timer fails, it will not re-enable the events and that would be very bad.

    We have also added an extra supervision/monitor of the alarm_pin_handler from main loop where we check the last time the the event handler was called. In this way we can determine if the event handler have been called and how long it have been since the last event was generated, and therefore we can detect if the timer should have been fired or not.

    if(au_rtc1_counter != 0 && (app_timer_cnt_diff_compute(app_timer_cnt_get() ,au_rtc1_counter) > AU_RTC1_SHUTDOWN_BUG)) /* 1 minute passed since last AU signal was detected. */
    {
      set_keep_alive(false);
    }

    What is frustrating is that we do not know why this works since we cannot find why the problem occurs in the first place. We still need some support from you at NRF on this manner or if someone has any suggestion.

Reply
  • We have continued debugging and testing different solutions. There seems to be one thing that have solved the problem or at least greatly improved it. As of late last week we cannot reproduce the problem, hopefully it is fixed.  

    This is what we have done: 

    In the alarm_pin_handler (GPIOTE event handler) the first thing we do is disabling new events with nrf_drv_gpiote_in_event_disable(ALARM_SIGNAL_2);

    New events are first enabled again after 100ms. The idea is to use this as a simple filter to prevent the event handler to be called multiple times during disconnection of the external hardware, debouncing. In the alarm_pin_handler we are sending a BLE package with ble_nus_c_string_send() at line 27 (inside the comm_au_hb_relay()). Perhaps this can cause a problem if it is called to fast or that it gets interrupted in some unlucky situation?

    static void alarm_pin_handler(nrf_drv_gpiote_pin_t pin, nrf_gpiote_polarity_t action)
    {   
        nrf_drv_gpiote_in_event_disable(ALARM_SIGNAL_2);
        /**@requirement SW8, SW45*/
        ret_code_t err_code;
        
        static uint32_t prev_cnt;
        static uint32_t curr_cnt;
        uint32_t alarm_pulses = 0;
        uint32_t no_alarm_pulses = 0;
        
        /* Store current RTC1 value, later compared in main() to detect the bug. */
        au_rtc1_counter = curr_cnt = app_timer_cnt_get();
    
        if (nrf_drv_gpiote_in_is_set(pin))
        {
            low_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
            NRF_LOG_DEBUG("ticks LOW: %d   %d", low_time[pulses], pulses);
        }
        else
        {
            high_time[pulses] = app_timer_cnt_diff_compute(curr_cnt, prev_cnt);
            NRF_LOG_DEBUG("ticks HIGH: %d   %d", high_time[pulses], pulses);
            pulses++;
            if(!(pulses%SAMPLE_DEPTH))
            {
                comm_au_hb_relay(); /**@requirement SW14*/
                pulses = 0;
            }
        }
    
        for (uint16_t i=0; i<SAMPLE_DEPTH; i++)
        {
            //if (high_time[i] < low_time[i] && high_time[i]!=0)
            if (high_time[i] > low_time[i] && low_time[i]!=0)
            {
                alarm_pulses++;
                
            }
            else
            {
                no_alarm_pulses++;
            }
        }
    
        if (alarm_pulses == SAMPLE_DEPTH)
        {  
          if(get_au_status() == 100)
          {
    	    if(!is_high_alert_signalled())
    	    {
    	        NRF_LOG_INFO("******************ALARM**************************");
    	        logging_write_log_post(LOG_ALARM, NULL);
    	    }
    	    set_alert_level(BLE_CHAR_ALERT_LEVEL_HIGH_ALERT); /**@requirement SW21 */
            }
        }
        if (no_alarm_pulses == SAMPLE_DEPTH)
        {
            if(is_high_alert_signalled())
            {
                NRF_LOG_INFO("ALARM CLEARED (AU signal OK)");
            }
            set_alert_level(BLE_CHAR_ALERT_LEVEL_NO_ALERT);
    
            if(!treatment_ongoing)
            {
                logging_write_log_post(LOG_TREATMENT_START, NULL);
                treatment_ongoing = true;
                led_set_intense(HEARTBEAT_USER, LED_LOW, true);
                set_keep_alive(true); /**@requirement SW8*/
            }
        }
    
        prev_cnt = curr_cnt;
        
        err_code = app_timer_stop(m_alarm_pin_timer);
        APP_ERROR_CHECK(err_code);
    
        err_code = app_timer_start(m_alarm_pin_timer, APP_TIMER_TICKS(2000), NULL);
        APP_ERROR_CHECK(err_code);
    
        err_code = app_timer_start(m_gpiote_timer, APP_TIMER_TICKS(100), NULL);
        APP_ERROR_CHECK(err_code);
    }

    But since the app_timer is the problem and if the new timer fails, it will not re-enable the events and that would be very bad.

    We have also added an extra supervision/monitor of the alarm_pin_handler from main loop where we check the last time the the event handler was called. In this way we can determine if the event handler have been called and how long it have been since the last event was generated, and therefore we can detect if the timer should have been fired or not.

    if(au_rtc1_counter != 0 && (app_timer_cnt_diff_compute(app_timer_cnt_get() ,au_rtc1_counter) > AU_RTC1_SHUTDOWN_BUG)) /* 1 minute passed since last AU signal was detected. */
    {
      set_keep_alive(false);
    }

    What is frustrating is that we do not know why this works since we cannot find why the problem occurs in the first place. We still need some support from you at NRF on this manner or if someone has any suggestion.

Children
No Data
Related