GPIOTE interrupt stops firing after thousands of successful calls

Hi,

We're running into a frustrating, tricky-to-debug issue with interrupt handling on the nRF52832. We have an active-low interrupt on pin 28 that is asserted periodically (usually at 1 Hz, but pushed to 1 kHz to reproduce this bug more quickly). The interrupt is handled in firmware using the nrf_drv_gpiote driver:

const bool accuracy = false;  // true = "high accuracy"
drv->gpiote_cfg = (nrf_drv_gpiote_in_config_t)GPIOTE_CONFIG_IN_SENSE_HITOLO(accuracy);
drv->gpio_interrupt_pin = gpio_interrupt_pin;
drv->gpio_interrupt_fired = false;

// Initialize the GPIO interrupt and start handling interrupts.
const ret_code_t err_code =
    nrf_drv_gpiote_in_init(drv->gpio_interrupt_pin,
                           &drv->gpiote_cfg,
                           &interrupt_handler);
ASSERT(err_code == NRF_SUCCESS);

nrf_drv_gpiote_in_event_enable(drv->gpio_interrupt_pin, true);

and the interrupt handler looks like:

void interrupt_handler(nrf_drv_gpiote_pin_t pin,
                       nrf_gpiote_polarity_t action) {
  (void)pin;
  (void)action;
  s_drv->gpio_interrupt_fired = true;
}

In normal operation, interrupt_handler is called on every interrupt and we do some I2C operations in response. This works fine for a while. After some time, ranging from a few seconds to a few minutes, interrupt_handler simply stops being called, even though we have verified on a scope that the interrupt line is going low as expected.

The only way we have been able to make this work consistently is to change the accuracy argument to GPIOTE_CONFIG_IN_SENSE_HITOLO(accuracy) from false to true. Putting it in high accuracy mode seems to make things work reliably (but perhaps just masks the failure/makes it much less likely).

We are using the nRF5 SDK 14.0.0.
Reading the documentation about low vs. high accuracy (infocenter.nordicsemi.com/index.jsp), it doesn't seem like low accuracy should ever cause us to miss an interrupt entirely; it should only make the timing of the interrupt less accurate.

Any help or debugging advice would be appreciated!

Thanks,

Robbie

  • Perhaps a race hazard, even though the transactions are slow. What is driving the 'scope trace? Is s_drv volatile? How is it related to drv? Even if this behaviour is never observed, a periodic check doesn't hurt, typically driven by an independent timed event:

            // Test for interrupt signal from active-low /DRDY signal
            if (!mAfeSampleDataReady)
            {
                // Check for missed interrupt, perhaps due to hardware race hazard
                if (TimeSinceLastPacket > 10)
                {
                    // Looks suspicious, check if the signal indicates data ready
                    if (nrf_gpio_pin_read(ADS_DRDY_PIN) == 0)
                    {
                        // Set sample pending signal, as it should have been set
                        mAfeSampleDataReady = true;
                        // Update error log
                        mAfeLateDrdyCount++;
                    }
                    else
                    {
                        // Looks like AFE has ceased streaming, try to recover it
                        amInitialized = false;
                        // Update error log
                        mAfeRestartCount++;
                    }
                }
            }
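On the volatile question, here is a minimal sketch of one way to share the flag between the handler and the main loop. It is not the original poster's code: `drv_t`, `fake_isr` and `consume_interrupt_flag` are hypothetical names, and the critical section that would normally protect the test-and-clear on target is only described in a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver state; only the flag matters for this sketch. */
typedef struct {
    volatile bool gpio_interrupt_fired; /* written in the ISR, read in the main loop */
} drv_t;

static drv_t s_drv_storage;
static drv_t *s_drv = &s_drv_storage;

/* Stand-in for the handler in the question: it only sets the flag. */
static void fake_isr(void) {
    s_drv->gpio_interrupt_fired = true;
}

/* Test-and-clear the flag. Returns true if an interrupt was pending.
 * The flag is cleared BEFORE the caller starts servicing (e.g. the TWI
 * read), so an edge that arrives during servicing sets it again and is
 * kept. On target, the read and the clear would normally sit inside a
 * short critical section; that part is omitted here. */
static bool consume_interrupt_flag(drv_t *drv) {
    assert(drv != NULL);
    if (!drv->gpio_interrupt_fired) {
        return false;
    }
    drv->gpio_interrupt_fired = false;
    return true;
}
```

The main loop would call `consume_interrupt_flag(s_drv)` and only start the TWI read when it returns true. Without `volatile`, an optimizing compiler may keep the flag in a register and never see the handler's write.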
    

  • A periodic app_timer is triggering the TWI write. The peripheral asserts a pin that connects to the nRF52832, which we see in "interrupt goes low". The pin is only de-asserted when we issue a subsequent TWI read to read the results, so it's effectively a latching mechanism.

    It's very interesting to us that setting high accuracy mode on the GPIOTE configuration "fixes" this, and we never see any "missed" interrupts. This leads to a question about the data sheet: we expect low-accuracy GPIOTE events to be able to miss pulses, but can they ever miss a simple step (an edge after which the line stays low)?
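One way a plain step could go unnoticed in low-accuracy mode, sketched as a toy model rather than driver code. It assumes the behaviour described in the nRF52832 Product Specification: low-accuracy mode relies on the PORT event, which is generated on the rising edge of the combined DETECT signal from all sense-enabled pins. Whether this applies here is an open question; pin 5 below is a hypothetical second sense-enabled pin, and all names (`port_model_t`, `set_levels`, `model_init`) are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: each bit is one GPIO pin. */
typedef struct {
    uint32_t sense_low_mask; /* pins configured to sense a low level */
    uint32_t pin_levels;     /* current input levels, 1 = high */
    bool     detect;         /* combined DETECT signal */
    unsigned port_events;    /* number of PORT events generated so far */
} port_model_t;

/* DETECT is the OR of every sense-enabled pin whose condition is met. */
static bool compute_detect(const port_model_t *m) {
    return (m->sense_low_mask & ~m->pin_levels) != 0u;
}

/* Start with the given configuration and no events counted. */
static void model_init(port_model_t *m, uint32_t sense_low_mask, uint32_t levels) {
    assert(m != NULL);
    m->sense_low_mask = sense_low_mask;
    m->pin_levels = levels;
    m->detect = compute_detect(m);
    m->port_events = 0;
}

/* Apply new pin levels; a PORT event fires only on a rising edge of DETECT. */
static void set_levels(port_model_t *m, uint32_t levels) {
    assert(m != NULL);
    m->pin_levels = levels;
    bool now = compute_detect(m);
    if (now && !m->detect) {
        m->port_events++;
    }
    m->detect = now;
}
```

In this model, if pin 5 is already low when pin 28 steps low, DETECT is already high, so no rising edge occurs and no PORT event is counted for pin 28. The same holds if pin 28 itself has not yet returned high when the sense condition is evaluated again. A high-accuracy (IN event) channel detects edges on its own pin and does not share this combined signal.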
