Crash in sd_clock_hfclk_is_running on Soft Device S140, 7.3.0

Hi, I recently noticed a crash in sd_clock_hfclk_is_running() on an nRF52840 using SoftDevice S140 7.3.0. This is the callstack:

??@0x00000ac4 (Unknown Source:0)

<signal handler called>@0xffffffe9 (Unknown Source:0)

sd_clock_hfclk_is_running@0x000276ae (.../nRF5_SDK_17.1.0_ddde560/components/softdevice/s140/headers/nrf_soc.h:720)

I'm using the following to request the HFCLK whenever I enable QSPI, to work around erratum 244:

sd_clock_hfclk_request();

uint32_t isHfclkRunning = 0;
do {
    APP_ERROR_CHECK(sd_clock_hfclk_is_running(&isHfclkRunning));
} while (!isHfclkRunning);

I can trigger this somewhat reliably if I unplug and replug USB power while this code runs.

Any tips on how I can avoid this issue?

Thanks,

Jeff

Top Replies

Priyanka 11 months ago in reply to Priyanka +1

Hi, from the callstack, it looks like the crash happened in the application and not inside sd_clock_hfclk_is_running, since the address is 0x000276ae, which is outside the SoftDevice. One possibility is that…

0 jthlim 11 months ago

Is there any way to avoid the erratum other than turning on the HFCLK? It would be nice to avoid the extra power draw of this workaround.

0 Priyanka 11 months ago in reply to jthlim

I am yet to hear from them. I will check your suggestion too in the meantime.

-Priyanka

0 Martin Tverdal 10 months ago in reply to jthlim

Yes, good point: PAN-201 should not be a concern for you then!

0 jthlim 10 months ago in reply to Martin Tverdal

I tried this again today, and the `NRF_CLOCK->EVENTS_HFCLKSTARTED == 0` approach works, but testing with the old sd_clock_hfclk_request() approach again gave me a different callstack.

The call hung, and when I halted execution it had stopped at 0x25ed8:

0x00025ed2: mrs r1, MSP
0x00025ed6: ldr r0, [r1, #24]
0x00025ed8: subs r0, #2
0x00025eda: ldrb r0, [r0, #0]
0x00025edc: cmp r0, #16
0x00025ede: blt.n 0x25f08

Sharing this in case it helps at all.
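
For what it's worth, these instructions resemble the usual Cortex-M idiom for finding the SVC number: load the stacked return address (offset 24 in the exception frame on MSP), step back 2 bytes to the `svc` instruction, and read its low byte. Below is a minimal host-side sketch of that decoding, assuming this is what the handler is doing. The `mem` array stands in for the address space, and `read_u32` and `svc_number_from_frame` are illustrative names, not SoftDevice code.

```c
#include <stdint.h>

/* Byte offset of the stacked PC in a basic Cortex-M exception frame:
 * r0, r1, r2, r3, r12, lr, pc, xPSR (8 words, 4 bytes each). */
#define FRAME_PC_OFFSET 24u

/* Read a little-endian 32-bit word from simulated memory. */
static uint32_t read_u32(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1u] << 8)
         | ((uint32_t)mem[addr + 2u] << 16)
         | ((uint32_t)mem[addr + 3u] << 24);
}

/* Hypothetical helper mirroring the disassembly:
 *   ldr  r0, [r1, #24]  -> stacked return address (r1 = MSP)
 *   subs r0, #2         -> address of the 16-bit SVC instruction
 *   ldrb r0, [r0, #0]   -> low byte = SVC immediate
 * `frame` is the simulated MSP value at exception entry. */
static uint8_t svc_number_from_frame(const uint8_t *mem, uint32_t frame)
{
    uint32_t return_addr = read_u32(mem, frame + FRAME_PC_OFFSET);
    return mem[return_addr - 2u];
}
```

Note that the code reads the frame from MSP unconditionally. If that reading is right, the `cmp r0, #16` that follows would be comparing the decoded SVC number against 16.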

0 jthlim 10 months ago in reply to jthlim

> One possible explanation is corruption of the instruction in flash itself. Does this happen easily for you on many different boards? If you have only reproduced it on one board, is it possible that you have exhausted the number of flash erase cycles the nRF52840 supports? Off the top of my head I think that is 10,000.

To answer this question: this happens on every board I've tested, some with single-digit erase cycles.

0 jthlim 10 months ago in reply to jthlim

> Another explanation might be corruption of the callstack somehow. Maybe you can try increasing the interrupt callstack a bit and see if the problem goes away?

Just to make sure I'm not misunderstanding: is the stack here shared between user code and interrupts, or do interrupts have a separate stack?

I use an 8 KB stack on the nRF52840, and the code (minus the nRF5-specific parts) runs on a 2 KB stack on another MCU.

0 Martin Tverdal 10 months ago in reply to jthlim

Thanks for sharing the latest stack frame. I'm not able to see anything interesting in it, other than that the fault now seems to happen somewhere in your application (because 0x25ed8 is higher than the highest address in S140). Or is 0x25ed8 maybe just an address in your hardfault handler?

Since it happens on many different boards, including fairly fresh ones, I think we can rule out flash wear too.

Cortex-M has the concept of two different stacks, MSP and PSP: one for "main mode" (also called "thread mode"), which the CPU uses when it is not handling interrupts, and another that it uses for interrupts. Using two different stacks for interrupts and "main mode" is optional, so I'm not sure what your application does.

But all of the code in the SoftDevice runs in interrupt mode anyway, since all calls to the SoftDevice are implemented as SVC interrupts. And since there is only one callstack for all interrupts, the SoftDevice and the application end up sharing the same callstack.
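
As a rough way to tell which stack thread mode is using: on Cortex-M, bit 1 (SPSEL) of the CONTROL register selects PSP when set and MSP when clear, and handler mode always uses MSP. Below is a minimal sketch, assuming the register value is read on target with the CMSIS intrinsic `__get_CONTROL()`. The helper name `thread_mode_uses_psp` is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* CONTROL.SPSEL (bit 1): 0 = thread mode uses MSP, 1 = thread mode uses PSP. */
#define CONTROL_SPSEL_BIT (1u << 1)

/* Hypothetical helper: decode a CONTROL register value that was
 * sampled while running in thread mode (e.g. from main()). */
static bool thread_mode_uses_psp(uint32_t control)
{
    return (control & CONTROL_SPSEL_BIT) != 0u;
}
```

On target this would be called as `thread_mode_uses_psp(__get_CONTROL())` from thread mode. If it returns false, the application, its interrupt handlers, and the SoftDevice all run on the single MSP stack.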

0 jthlim 10 months ago in reply to Martin Tverdal

My linker file is set up like this:

MEMORY
{
  FLASH (rx) : ORIGIN = 0x27000, LENGTH = 0x7d000
  RAM (rwx) : ORIGIN = 0x20008000, LENGTH = 0x38000
  CODE_RAM (rx) : ORIGIN = 0x808000, LENGTH = 0x38000
}

And this was based on the notes in s140_nrf52_7.2.0_release-notes-update-1.pdf.

The callstack clearly had `sd_clock_hfclk_is_running` in it, which looks like it's going into the SVC.

Is 0x25ed8 not in the SoftDevice memory space?
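
A quick way to sanity-check this is sketched below, assuming the SoftDevice occupies flash from the end of the MBR (taken here as 0x1000) up to the application's FLASH ORIGIN of 0x27000 from the linker file above. The enum and function names are only illustrative.

```c
#include <stdint.h>

/* Assumption: the MBR occupies 0x0000..0x0FFF and the SoftDevice follows it. */
#define MBR_END_ADDR      0x00001000u
/* From the linker file: FLASH ORIGIN = 0x27000, LENGTH = 0x7d000. */
#define APP_FLASH_ORIGIN  0x00027000u
#define APP_FLASH_END     (APP_FLASH_ORIGIN + 0x0007d000u)

typedef enum {
    REGION_MBR,
    REGION_SOFTDEVICE,
    REGION_APPLICATION,
    REGION_OTHER
} flash_region_t;

/* Hypothetical helper: classify a code address by flash region. */
static flash_region_t classify_flash_address(uint32_t addr)
{
    if (addr < MBR_END_ADDR) {
        return REGION_MBR;
    }
    if (addr < APP_FLASH_ORIGIN) {
        return REGION_SOFTDEVICE;
    }
    if (addr < APP_FLASH_END) {
        return REGION_APPLICATION;
    }
    return REGION_OTHER;
}
```

Under these assumptions, 0x25ed8 falls in the SoftDevice range, while 0x276ae from the first callstack falls in the application range.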

> it is optional to use two different stacks for interrupts and "main mode", so I'm not sure what your application does.

What flag/code would help identify whether two different stacks are being used?