
Softdevice Assert at PC=0x15810 (S132 7.2.0) / RTC clock drift when using timeslot

I'm trying to narrow down the cause of a Softdevice assertion happening in S132 7.2.0 at PC=0x15810.

We set up a proprietary RF project which utilises parts of the SDK for Mesh (specifically, the timeslot implementation and bearer_handler) because it provides a safe base to run high performance timeslot applications on. Unfortunately I do have one device which runs into a softdevice assertion at instruction 0x15810. I feel that it is a timing issue - maybe the device is operating at the outer limits of the clock accuracy, because while the issue appears sporadically on Development Kits or other devices, this specific device does trigger it quite often.

  • Which exact assertion fails at PC=0x15810?
  • Does the Softdevice shut down TIMER0 before doing this test or after an assertion fails?
  • Are timing assertions made by the softdevice based on RTC0?
  • Is there a reason why TIMER0 in the mesh stack is running in 24-bit mode as opposed to 32-bit mode?

Any help is greatly appreciated.

EDIT: In the meantime I think I found the cause of the issue. Assuming that the timing assertions by the Softdevice are done using RTC0, there seems to be a rather large discrepancy between the RTC0 timing and TIMER0 timing. After 9'999'249us have passed on TIMER0, RTC0 has counted 10'000'732us, so they're almost 1.5ms apart (roughly 148ppm)!
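To put the two readings on a common scale, the discrepancy can be expressed in ppm. The following is only a minimal sketch: `drift_ppm` is a hypothetical helper name, and it assumes TIMER0 (clocked from the HFCLK) is treated as the reference while the RTC0-derived time is the measured value.

```c
#include <stdint.h>

/* Relative drift of a measured interval against a reference interval, in ppm.
 * Positive result: the measured clock counted more time than the reference.
 * reference_us must be non-zero. Hypothetical helper, not part of any SDK. */
static int32_t drift_ppm(uint32_t reference_us, uint32_t measured_us)
{
    /* 64-bit intermediate so that diff * 1e6 cannot overflow. */
    int64_t diff = (int64_t)measured_us - (int64_t)reference_us;
    return (int32_t)((diff * 1000000) / (int64_t)reference_us);
}
```

With the values above, `drift_ppm(9999249u, 10000732u)` evaluates to 148, i.e. a difference of 1483us over roughly 10s.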

The device in question is running the LFCLK from the RC oscillator and we do usually have BLE deactivated. I did assume that the softdevice takes care of adjusting for clock drift, but could it be that I have to somehow take care of this manually?

EDIT2: Note that - as we're using the nRF SDK for Mesh as a codebase - when calculating the available time on the timeslot, we should already account for clock drift per the following calculation:

(p_timeslot->length_us * (m_lfclk_ppm + HFCLOCK_PPM_WORST_CASE)) / 1000000;
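As a rough illustration of what this expression yields, here is a self-contained sketch of the same arithmetic. The function name `drift_margin_us` and the example ppm values are assumptions for illustration only; the real values of `m_lfclk_ppm` and `HFCLOCK_PPM_WORST_CASE` come from the mesh stack and its configuration. The sketch uses a 64-bit intermediate, since a 32-bit product can overflow for long timeslots.

```c
#include <stdint.h>

/* Worst-case drift (in us) accumulated over a timeslot of length_us, given
 * the LFCLK and HFCLK tolerances in ppm. Mirrors the expression quoted
 * above; hypothetical helper, not the mesh stack's actual function. */
static uint32_t drift_margin_us(uint32_t length_us,
                                uint32_t lfclk_ppm,
                                uint32_t hfclk_ppm)
{
    /* Widen before multiplying: 10 s * 540 ppm already exceeds 2^32. */
    uint64_t scaled = (uint64_t)length_us * (lfclk_ppm + hfclk_ppm);
    return (uint32_t)(scaled / 1000000u);
}
```

For example, assuming 500ppm for the LFCLK and 40ppm for the HFCLK, `drift_margin_us(10000000u, 500u, 40u)` gives 5400us for a 10s window. Under those assumed tolerances, the observed RTC0/TIMER0 difference over 10s would still sit inside the margin.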

EDIT3: I previously wrote that we "do usually have BLE deactivated". What I actually mean by this is that most of the time the device is neither connected nor is it currently advertising. So there is no BLE activity to schedule by the softdevice. Timeslots are always active, though.

  • Hi Mike,

    m.wagner said:

    Are there any downsides to using this apart from current consumption? For our setup with power supply and maximised radio uptime (i.e. HFCLK is always running), it seems like this is actually the "better" solution for this issue than using the RC oscillator.

    Considering the symptoms of the issue, I think this should resolve it from a software POV.

    The downside is that the SoftDevice is not tested with SYNT clock source, which is why we state that it is not supported. That said, my understanding is that there is no reason to expect any issues with SYNT other than current consumption, so it could be a good choice as long as you accept the risk of it not being properly tested.

    m.wagner said:

    In the meantime, unfortunately it looked like the unwanted drift still occurs even with #define NRF_SDH_CLOCK_LF_RC_CTIV 8 and NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 0.

    The frequency of it happening is massively reduced, though. While it seemed to happen once in 10 - 15mins with NRF_SDH_CLOCK_LF_RC_CTIV 16 and NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 2, it only happened about twice or thrice a day with the above configuration.

    I would expect you should see a further improvement by using RC_CTIV 1 to calibrate even more often (and still TEMP_CTIV 0)?
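For reference, the effect of these two settings on calibration timing can be sketched as below. This assumes the documented SoftDevice semantics: RC_CTIV is given in units of 250ms, TEMP_CTIV 0 means "always calibrate", and values of 2 or more force a calibration every TEMP_CTIV intervals even at stable temperature. The helper names are hypothetical, and TEMP_CTIV 1 (nRF51 legacy) is not modelled.

```c
#include <stdint.h>

/* Period of the calibration timer in ms; rc_ctiv is in 250 ms units. */
static uint32_t rc_check_interval_ms(uint8_t rc_ctiv)
{
    return rc_ctiv * 250u;
}

/* Longest time between two calibrations at stable temperature, in ms.
 * rc_temp_ctiv == 0: calibrate on every calibration timer period.
 * rc_temp_ctiv >= 2: calibrate at least every rc_temp_ctiv periods.
 * rc_temp_ctiv == 1 (nRF51 legacy) is not handled by this sketch. */
static uint32_t rc_max_cal_interval_ms(uint8_t rc_ctiv, uint8_t rc_temp_ctiv)
{
    uint32_t periods = (rc_temp_ctiv == 0u) ? 1u : rc_temp_ctiv;
    return rc_check_interval_ms(rc_ctiv) * periods;
}
```

Under these assumptions, RC_CTIV 16 with TEMP_CTIV 2 calibrates at least every 8000ms, RC_CTIV 8 with TEMP_CTIV 0 every 2000ms, and RC_CTIV 1 with TEMP_CTIV 0 every 250ms.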

    m.wagner said:
    I also looked into Erratum 192, which looked like it may be interfering here, but the fact that it happens less frequently with more frequent calibration seems to contradict this...

    The workaround for erratum 192 is used in the SoftDevice you use (7.2.0), so it should not be relevant in this case.

    m.wagner said:

    I do have one suspicion as to the cause of the issue: Is it possible that the cause could be a mismatch of the HFXO load capacitance? It does not look right to me, but I'll have to talk to our HW engineer. We do not seem to have any issues with BLE, though.

    EDIT2:

    It does indeed seem that C_pin_HFXO and C_PCB have been neglected when calculating the required capacitance. The XTAL requires a C_L of 6pF while we currently do have a C_L in the range of 8pF to 9pF (assuming C_PCB between 0pF and 2pF).

    Without knowing your HW I would not think this could be the main issue. If the HFXO is significantly off you would not get BLE working (frequencies would be shifted). It is good to look into though.

    I have one other question. As you do not care about current consumption I was wondering if your product is a type of product that could see significant rapid temperature changes (like a lightbulb when it is being turned on or off). Do you see this issue mostly around times when temperature changes, or also when the temperature is roughly stable?

  • The downside is that the SoftDevice is not tested with SYNT clock source, which is why we state that it is not supported.

    Does that have any implications on product certification?

    I would expect you should see a further improvement by using RC_CTIV 1 to calibrate even more often (and still TEMP_CTIV 0)?

    I assume so, but I'll have to test this in long-term tests. I deliberately tested with NRF_SDH_CLOCK_LF_RC_CTIV 8, to see if the issue still persists and intend to further reduce the value and see how it behaves.

    Without knowing your HW I would not think this could be the main issue. If the HFXO is significantly off you would not get BLE working (frequencies would be shifted).

    That confirms what I discussed with our HW engineer.

    if your product is a type of product that could see significant rapid temperature changes (like a lightbulb when it is being turned on or off). Do you see this issue mostly around times when temperature changes, or also when the temperature is roughly stable?

    It would surprise me if there were any significant temperature variations. There is a motor with all the necessary periphery on the board as well, but my testing was done without any motor activity whatsoever.

    Also - I think I mentioned it earlier - while I was still running the RC oscillator with its default configuration, I did not observe any recalibration due to temperature drift. I.e. I traced NRF_CLOCK->EVENTS_DONE via PPI and never observed recalibration occurring after 4s instead of 8s.

  • Hi Mike,

    m.wagner said:
    Does that have any implications on product certification?

    No, the choice of clock source has no impact on qualification or certifications. So the question is basically whether this works well in your product or not.

    m.wagner said:

    It would surprise me if there were any significant temperature variations. There is a motor with all the necessary periphery on the board as well, but my testing was done without any motor activity whatsoever.

    Also - I think I mentioned it earlier - while I was still running the RC oscillator with its default configuration, I did not observe any recalibration due to temperature drift. I.e. I traced NRF_CLOCK->EVENTS_DONE via PPI and never observed recalibration occurring after 4s instead of 8s.

    I see, thanks for confirming. I just wanted to rule out any temperature change effects. (You are of course also right that this should have triggered more calibrations in that case.)

    Thanks,

    Einar
