SDK14, S132 v5.0.0, nRF52832: Unexpectedly high RTC drift?

I've modified the app_timer library slightly to keep total system uptime, which I can use to keep calendar/epoch time if initialized with the current time from an external source. We're using the nRF52832 with SDK14 and softdevice S132 v5.0.0. Our 32.768kHz crystal is rated at 10ppm, so I'd expect at most just over 5 minutes of drift per year. However, we've measured approximately 15 minutes of drift over three months, which extrapolates to about 60 minutes per year (roughly 115ppm), far outside the tolerance of our crystal.
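For anyone who wants to check the arithmetic behind these figures, here is a minimal C sketch. The function names are made up for illustration, and "three months" is assumed to be about 91 days.

```c
/* Minutes of drift accumulated over one 365-day year at a given error in ppm. */
double ppm_to_minutes_per_year(double ppm)
{
    const double minutes_per_year = 365.0 * 24.0 * 60.0; /* 525600 */
    return ppm * 1e-6 * minutes_per_year;
}

/* Clock error in ppm implied by an observed drift over an elapsed interval.
 * Both arguments must use the same unit (minutes here). */
double drift_to_ppm(double drift, double elapsed)
{
    return drift / elapsed * 1e6;
}
```

With these, ppm_to_minutes_per_year(10.0) comes out a little above 5 minutes, and drift_to_ppm(15.0, 91.0 * 24.0 * 60.0) lands at roughly 114 ppm, which is more than ten times the crystal's rating.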

So far, we've tested about ten devices with various system uptimes, and they all seem to be running too fast, putting the nRF52-maintained time ahead of NTP server time. It looks like they're all drifting at roughly the same rate and in the same direction, which seems unlikely for a crystal tolerance issue.

Is there anything I might be doing wrong in the firmware? Here are the relevant settings in sdk_config.h:

#define NRF_SDH_CLOCK_LF_SRC 1	// NRF_CLOCK_LF_SRC_XTAL
#define NRF_SDH_CLOCK_LF_RC_CTIV 0
#define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 0
#define NRF_SDH_CLOCK_LF_XTAL_ACCURACY 7	// NRF_CLOCK_LF_XTAL_ACCURACY_20_PPM even though ours is 10ppm

And here are the edits I made to app_timer.c. Basically I've edited it so that the RTC doesn't stop when there are no timers active, and I count overflow interrupt events. Then when system uptime is requested, I calculate uptime in ticks (number of overflows multiplied by ticks per overflow, then add current value of the RTC1 counter), then convert number of ticks to milliseconds.

modified functions:
static void rtc1_start(void)
{
    /*
     * SDK_edit: enable RTC1 overflow events and interrupts so overflow events
     * can be counted, allowing long-term system uptime calculations.
     */
    NRF_RTC1->EVTENSET = RTC_EVTEN_COMPARE0_Msk | RTC_EVTEN_OVRFLW_Msk;
    NRF_RTC1->INTENSET = RTC_INTENSET_COMPARE0_Msk | RTC_INTENSET_OVRFLW_Msk;

    NVIC_ClearPendingIRQ(RTC1_IRQn);
    NVIC_EnableIRQ(RTC1_IRQn);

    NRF_RTC1->TASKS_START = 1;
    nrf_delay_us(MAX_RTC_TASKS_DELAY);

    m_rtc1_running = true;
}

static void rtc1_stop(void)
{
    NVIC_DisableIRQ(RTC1_IRQn);

    /*
     * SDK_edit: Even though RTC1 should never be disabled with these SDK edits,
     * allow rtc1_stop() to disable RTC1 overflow events and interrupts just to be
     * consistent.
     */
    NRF_RTC1->EVTENCLR = RTC_EVTEN_COMPARE0_Msk | RTC_EVTEN_OVRFLW_Msk;
    NRF_RTC1->INTENCLR = RTC_INTENSET_COMPARE0_Msk | RTC_INTENSET_OVRFLW_Msk;

    NRF_RTC1->TASKS_STOP = 1;
    nrf_delay_us(MAX_RTC_TASKS_DELAY);

    NRF_RTC1->TASKS_CLEAR = 1;
    m_ticks_latest        = 0;
    nrf_delay_us(MAX_RTC_TASKS_DELAY);

    m_rtc1_running = false;
}

(in timer_list_remove()):
    // Timer is the first in the list
    if (p_previous == p_current)
    {
        mp_timer_id_head = mp_timer_id_head->next;

        // No more timers in the list. Reset RTC1 in case Start timer operations are present in the queue.
        if (mp_timer_id_head == NULL)
        {
            /*
             * SDK_edit: Don't reset RTC1 even if there are no more timers. RTC1 needs to run continuously
             * to maintain system uptime.
             */
            //NRF_RTC1->TASKS_CLEAR = 1;
            //m_ticks_latest        = 0;
            //m_rtc1_reset          = true;
            //nrf_delay_us(MAX_RTC_TASKS_DELAY);
        }
    }
    
void RTC1_IRQHandler(void)
{
    /*
     * SDK_edit: Before clearing all events, check to see if the event was
     * caused by an RTC1 overflow. If so, increment the overflow counter.
     */
    if(NRF_RTC1->EVENTS_OVRFLW != 0)
    {
        m_rtc1_overflows++;
    }

    // Clear all events (also unexpected ones)
    NRF_RTC1->EVENTS_COMPARE[0] = 0;
    NRF_RTC1->EVENTS_COMPARE[1] = 0;
    NRF_RTC1->EVENTS_COMPARE[2] = 0;
    NRF_RTC1->EVENTS_COMPARE[3] = 0;
    NRF_RTC1->EVENTS_TICK       = 0;
    NRF_RTC1->EVENTS_OVRFLW     = 0;

    // Check for expired timers
    timer_timeouts_check();
}

new functions:
static uint64_t get_total_ticks(void)
{
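    /*
     * m_rtc1_overflows itself must be declared volatile, since it is written
     * in RTC1_IRQHandler(). The local copies do not need to be volatile.
     */
    uint32_t overflow_before, overflow_after;
    uint32_t current_ticks;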

    /*
     * There's a small chance that the overflow could be captured and then
     * _between_ the overflow read and the ticks read the actual RTC1 ticks
     * count could roll over without a corresponding overflow increment to
     * account for the overflow. To attempt to account for this, capture the
     * overflow count both before and after the ticks read, and if they don't
     * match, then that implies there was an extremely recent rollover. In
     * that case, repeat the overflow->ticks->overflow process again to have
     * confidence that the overflow count is synchronized.
     */
    do
    {
        overflow_before = m_rtc1_overflows;
        current_ticks = rtc1_counter_get();
        overflow_after = m_rtc1_overflows;
    } while(overflow_before != overflow_after);

    #define TICKS_PER_OVERFLOW  (1<<24)

    return ((uint64_t)overflow_before * TICKS_PER_OVERFLOW) + current_ticks;
}

uint64_t app_timer_get_uptime_milliseconds(void)
{
    return TICKS_TO_MILLISECONDS(get_total_ticks());
}
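The definition of TICKS_TO_MILLISECONDS isn't shown above, so the following is only a sketch of what such a conversion might look like. It assumes RTC1 runs with prescaler 0 (32768 ticks per second); ticks_to_milliseconds and RTC_TICKS_PER_SECOND are hypothetical names, not SDK identifiers.

```c
#include <stdint.h>

/* Assumption: RTC1 prescaler is 0, so the counter advances at 32768 Hz. */
#define RTC_TICKS_PER_SECOND 32768ULL

/* Hypothetical stand-in for TICKS_TO_MILLISECONDS (not shown in the post). */
uint64_t ticks_to_milliseconds(uint64_t ticks)
{
    /* Multiply before dividing so the only loss is the final truncation to
     * whole milliseconds. Everything stays in 64-bit integer math. */
    return (ticks * 1000ULL) / RTC_TICKS_PER_SECOND;
}
```

The point of doing it this way is that a conversion which rounds the tick period first (for example, treating 1 ms as 33 ticks) would introduce a systematic error of its own, on top of any crystal error.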

  • Sounds like the 32 kHz crystal isn't loaded correctly. Check the CL value in the crystal's datasheet. The pin capacitance of the nRF51 is 4 pF. This gives the value of the load caps to be (each):

    Ccap = CL * 2 - 4 pF

    If you're using the wrong value, you will pull the frequency of the crystal.
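The formula in the reply above is easy to evaluate for a given crystal. A minimal sketch, assuming the 4 pF pin capacitance quoted there and ignoring PCB stray capacitance (load_cap_pf is a made-up name):

```c
/* Value of each load capacitor in pF, for a crystal with load capacitance
 * cl_pf and a pin capacitance of pin_cap_pf. PCB stray capacitance is
 * ignored here, which is an assumption. */
double load_cap_pf(double cl_pf, double pin_cap_pf)
{
    return 2.0 * cl_pf - pin_cap_pf;
}
```

For example, load_cap_pf(9.0, 4.0) gives 14 pF and load_cap_pf(12.5, 4.0) gives 21 pF.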

  • Good call. My best guess currently is that our original board designer copied the dev kit, which uses 12pF loading caps. But the dev kit uses a crystal with CL=9pF, which according to this equation would mean 14pF capacitors should've been used, which I guess is close enough.

    But our designer must've changed the crystal without changing the loading caps accordingly. Our crystal has CL=12.5pF, which works out to 21pF loading caps. We'll get this fixed in future runs of the board, but in the meantime, we'll probably need to apply some sort of scale factor in firmware.

    On a related note, any idea how this might be affecting BLE performance? As mentioned above, we've configured the softdevice with an LFCLK accuracy of 20ppm, but our effective accuracy is more like 100-120ppm.
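As a rough idea of what the interim firmware scale factor could look like, here is a sketch that corrects a raw uptime value by a measured error in ppm. The function name and the idea of a single per-board calibration value are assumptions for illustration; the 114 ppm figure is simply what 15 minutes over about three months works out to.

```c
#include <stdint.h>

/* Correct a raw uptime reading for a known clock error.
 * error_ppm is positive if the device clock runs fast, negative if slow.
 * This is a first-order correction, true ~= raw * (1 - ppm/1e6); the
 * neglected term is of order ppm^2 (about 0.01 ppm for a 114 ppm error). */
uint64_t uptime_correct_ms(uint64_t raw_ms, int32_t error_ppm)
{
    int64_t adjust = ((int64_t)raw_ms * error_ppm) / 1000000;
    return (uint64_t)((int64_t)raw_ms - adjust);
}
```

Something like uptime_correct_ms(app_timer_get_uptime_milliseconds(), 114) would then take about 59.9 minutes off a full year of raw uptime. Since all tested devices drift in the same direction at a similar rate, one shared constant may be good enough until the load caps are fixed, but that would need to be verified per board.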

  • The width of the receive window is set by the configured clock tolerance. You can use a narrower window with a more accurate crystal. If you configure a tighter tolerance than the crystal actually has, you will miss the window more often and packet loss will increase.
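To put numbers on the point about the receive window: as I understand the Bluetooth Core Specification, the window widening on each side of the expected anchor point scales with the sum of both devices' sleep clock accuracies and the time since the last anchor. The sketch below only computes that ppm term and ignores any fixed jitter allowance. How the SoftDevice implements this internally isn't shown in this thread, and the function name is made up.

```c
#include <stdint.h>

/* Window widening in microseconds on each side of the expected anchor,
 * given both sleep clock accuracies in ppm and the time since the last
 * anchor point in microseconds. Fixed jitter allowances are ignored. */
uint32_t window_widening_us(uint32_t central_sca_ppm,
                            uint32_t peripheral_sca_ppm,
                            uint32_t time_since_anchor_us)
{
    uint64_t total_ppm = (uint64_t)central_sca_ppm + peripheral_sca_ppm;
    return (uint32_t)((total_ppm * time_since_anchor_us) / 1000000ULL);
}
```

Assuming a 50 ppm central and 1 second between anchors, a declared 20 ppm gives window_widening_us(50, 20, 1000000), i.e. 70 us, whereas a clock that is really off by 120 ppm would need 170 us. The difference is the margin by which packets can fall outside the window.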
