This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

SDK 15.3.0: Multiple single-shot timers not running

In migrating a project from SDK 14.2.0 to SDK 15.3.0, I've found that one of my drivers has broken. It's a driver for a GPS module and uses SPI. My test for the driver is getting stuck in a loop waiting for a single shot timer to run. The same test passed on SDK 14.2.0. I can step through the code and see a recurring timer running, with the timeout handler being called. But at the same point I'm running two single shot timers and neither of those are getting their timeout handlers called ever.

Here's the relevant bit of the GPS driver:

// Read up to 255 bytes worth of NMEA sentences from the receiver, split it into lines, and parse each line.
static void gps_rx(void)
{
    ret_code_t err_code = 0;

    // Send no TX. Just listen for the RX.
    memset(m_gps.response, 0, GPS_MAX_CHARS);
    m_gps.spi_transfer_done = false;
    m_gps.spi_transfer_timed_out = false;
    err_code = nrf_drv_spi_transfer(&m_spi, NULL, 0, m_gps.response, GPS_MAX_CHARS);

    // Common errors here are:
    // NRF_ERROR_INVALID_ADDR: If the TX buffer is in flash, not RAM. Assert on this one.
    // NRF_ERROR_BUSY: If we're going too fast. Log this one but carry on.
    if (err_code == NRF_ERROR_BUSY)
    {
        NRF_LOG_WARNING("SPI busy. Slow down.");
    }
    else
    {
        APP_ERROR_CHECK(err_code);
    }

    // Stop the SPI timeout timer, ignoring any error because it will only be that we haven't started it yet the first time, then start it again.
    app_timer_stop(m_gps_spi_timeout_timer_id);
    err_code = app_timer_start(m_gps_spi_timeout_timer_id, GPS_OUTPUT_TIMEOUT, NULL);
    APP_ERROR_CHECK(err_code);

    // TODO: Remove when debugged:
    uint32_t i = 0;

    // Wait here until gps_spi_event_handler() is called by the SPI driver, or it times out.
    while (!m_gps.spi_transfer_done && !m_gps.spi_transfer_timed_out)
    {
        i++;

        // This will check whether nrf_sdh_is_enabled() and use sd_app_evt_wait() if so, but __WFE() and __SEV() if not.
        nrf_pwr_mgmt_run();
    }

    if (m_gps.spi_transfer_timed_out)
    {
        NRF_LOG_ERROR("SPI transfer timed out.");
    }

    // Stop the timeout timer, whether we timed out or not. If we did time out, this is harmless.
    m_gps.spi_transfer_timed_out = false;
    app_timer_stop(m_gps_spi_timeout_timer_id);
    // ...
}

Here's the test:

void test_gps_comms(void)
{
    NRF_LOG_INFO("test_gps_comms()");

    uint32_t err_code = 0;
    extern char m_device_id[DEVICE_ID_CHARACTERS + 1];

    led_init();

    err_code = app_timer_init();
    APP_ERROR_CHECK(err_code);
    APP_SCHED_INIT(sizeof(ble_bts_t), APP_SCHED_QUEUE_SIZE);

    // NRF_ERROR_INVALID_STATE here just means another test has done this already.
    err_code = nrf_drv_gpiote_init();
    if (err_code != NRF_ERROR_INVALID_STATE)
    {
        APP_ERROR_CHECK(err_code);
    }

    pwr_init();

    start_clock();

    gps_init();

    pwr_modem_gps_on();
    gps_power_on();

    // Starting the GPS involves instructing it to give us sentences every second.
    gps_start(NULL, 0);

    APP_TIMER_DEF(test_timer_id);
    err_code = app_timer_create(&test_timer_id, APP_TIMER_MODE_SINGLE_SHOT, test_timeout_handler);
    APP_ERROR_CHECK(err_code);

    // Wait a bit more than a second. 2s seems reasonable.
    err_code = app_timer_start(test_timer_id, APP_TIMER_TICKS(2000), NULL); // 2s
    APP_ERROR_CHECK(err_code);

    NRF_LOG_INFO("test_gps_comms() Waiting up to 2s for sentences");

    // Assert that gps_got_sentences() returns true within time.
    while (!m_timed_out)
    {
        if (gps_got_sentences())
        {
            NRF_LOG_INFO("PASS test_gps_comms() Got NMEA sentences within 2s");
            m_timed_out = false;
            led_off();
            pwr_uninit();

            return;
        }

        tracking_run();
    }

    m_timed_out = false;
    led_off();
    pwr_uninit();

    // Timer timed out without us seeing any NMEA sentences. Fail.
    NRF_LOG_ERROR("FAIL test_gps_comms() No NMEA sentences in time");
}

This test doesn't reach the FAIL line, it just gets stuck in the loop forever, when it should exit that after 2s. Further, the while loop in the GPS driver also gets stuck because m_gps.spi_transfer_timed_out is never true because the timer with ID m_gps_spi_timeout_timer_id never gets its handler called.

My app_config.h is attached and the sdk_config.h is the one from <SDK>/config/nrf52832/config. My timer prescaler is 31. The delay for the single shot timer in the driver is:

#define GPS_OUTPUT_TIMEOUT APP_TIMER_TICKS(2100)

app_config.h

Parents
  • Hi,

    I have not seen or heard about such an issue with the app_timer in SDK 15.3, so I have a few basic questions just to rule out a few things:

    • Have you verified that the timeout handler is not being called by building the project without optimization and setting a breakpoint in the timeout handler?
    • Is m_timed_out (which I assume you set to true in the timeout handler) volatile?

    Another thing that came to mind was that you could see a problem with the operation queue being overflowed? It does not seem likely since you check the return value of the calls to app_timer_start() in the snippets you included in the questions, but I notice that you do not check the return value of your calls to app_timer_stop().

    In which conditions do you trigger this issue? Do you have any code that can reproduce this on a DK?

  • Yes, I use -O0 and -g3 in CFLAGS and I can confirm that a breakpoint set in the timeout handler is not hit.

    m_timed_out is volatile, yes.

    As the comment says, the only reason I don't check the err_code on app_timer_stop() is that I know it'll fail the first time, before the timer has ever been started. If there was a problem with starting the timer right after that, app_timer_start() would return a non-zero value and I'd catch that.

    This issue always triggers. Both in my tests and in production code. It's 100% reproducible.

    I can't run this exact code on a DK because there's isn't any GPS on the DK and the driver needs to be talking SPI to something.

Reply
  • Yes, I use -O0 and -g3 in CFLAGS and I can confirm that a breakpoint set in the timeout handler is not hit.

    m_timed_out is volatile, yes.

    As the comment says, the only reason I don't check the err_code on app_timer_stop() is that I know it'll fail the first time, before the timer has ever been started. If there was a problem with starting the timer right after that, app_timer_start() would return a non-zero value and I'd catch that.

    This issue always triggers. Both in my tests and in production code. It's 100% reproducible.

    I can't run this exact code on a DK because there's isn't any GPS on the DK and the driver needs to be talking SPI to something.

Children
Related