
nRF52833 + external 32 kHz TCXO: soft reset hangs

alexdr5398 9 months ago

I’m working with a custom board based on the nRF52833. It uses an external 32 kHz TCXO (SiT1552) as the LFCLK source.

In my main Zephyr app I set CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_FULL_SWING=y, and in mcuboot I set CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y. This boots fine if I power cycle the board, but when the app does a software reboot (sys_reboot(...) or NVIC_SystemReset()), the system hangs indefinitely. Power-cycling causes it to boot into the primary image normally again.
If I switch mcuboot to also use EXT_FULL_SWING, then after reset I see the UART log:
```
mcuboot: Jumping to the first image slot
```
but then nothing. I investigated with a debugger, and the app never seems to reach main() or even any SYS_INIT routines. Whenever I break, it's just in the Zephyr kernel's idle thread.

So, to summarize:

The app works with the TCXO only if mcuboot uses RC, but any software reboot causes a hang.
If both use the TCXO, the app never boots.
If both use RC, then everything works normally (but I of course want to use the external TCXO).

Versions:

zephyr: nrfconnect sdk-zephyr v3.5.99
nrf: nrfconnect sdk-nrf v2.6.1

Any help on this is greatly appreciated.

Thanks

EDIT: I forgot to mention that I have indeed verified that the program is getting past the `lfclk_spinwait` function, so it does seem to be detecting that the clock has started. I placed a breakpoint at `clock_control_nrf.c:540` after the `!nrf_clock_is_running` loop, and it does indeed reach that point.

0 Vidar Berg 9 months ago

Hello,

I don't have a TCXO to test with here, so I can't easily reproduce this, but I assume it hangs because the LF clock started event is never triggered, so the program never exits the lfclk_spinwait() function in clock_control_nrf.c. You may be able to confirm this by looking at the call stack.

alexdr5398 said:
If I switch mcuboot to also use EXT_FULL_SWING, then after reset I see the UART log:

As a test, please try and see if you get the same result if you build the app with CONFIG_SYSTEM_CLOCK_NO_WAIT=y.

Best regards,

Vidar

0 alexdr5398 9 months ago in reply to Vidar Berg

Small update:

I have confirmed that I do hit breakpoints in PRE_KERNEL_X SYS_INITs and in POST_KERNEL SYS_INITs with priority 0, *but not* with priority 99.

I'm currently narrowing down exactly which SYS_INIT is causing the issue.

0 alexdr5398 9 months ago in reply to alexdr5398
I have confirmed that this breakpoint is hit:

```c
static int test_init(void)
{
    return 0;
}
SYS_INIT(test_init, POST_KERNEL, 79);
```

but this one never is:
```c
static int test_init(void)
{
    return 0;
}
SYS_INIT(test_init, POST_KERNEL, 80);
```

0 Vidar Berg 9 months ago in reply to alexdr5398

Interesting. You can view the list of init functions here if you're using VS Code:

alexdr5398 said:
CONFIG_SYSTEM_CLOCK_NO_WAIT=y had no effect. I'm assuming this just skips that spinwait function?

Correct.

0 alexdr5398 9 months ago in reply to Vidar Berg

I'm not using VS Code, unfortunately. Do you know what command-line tool gives access to that info?

I can maybe try to get vscode working for this project.

0 Vidar Berg 9 months ago in reply to alexdr5398

I don't know how they are making this list; I would have to ask them. I don't immediately see any way to check this manually. However, you can try to search for this priority in the .config file for your app (build/<app name>/zephyr/.config).

0 alexdr5398 9 months ago in reply to Vidar Berg

I managed to find the sys init that caused the issue. It's in a custom driver init that hangs when it calls 'k_sleep'.

When I put `k_msleep` in my test_init function, it hangs there too. I believe that Zephyr uses the LFCLK for its scheduling, so it would seem that the LFCLK is not actually started?

0 Vidar Berg 9 months ago in reply to alexdr5398

Yes, the scheduler relies on the lfclk, so I agree, this indicates that the clock is not running. Not sure what could be causing this yet.

0 alexdr5398 9 months ago in reply to Vidar Berg
EDIT: this whole reply is inaccurate, see next reply

Okay I managed to get it to work reliably by swapping from k_sleep -> k_busy_wait in another driver init and by adding this function that explicitly starts the LF clock:

```c
static int start_clock(void)
{
    int err;
    struct onoff_manager *clk_mgr;
    struct onoff_client clk_cli;

    clk_mgr = z_nrf_clock_control_get_onoff(CLOCK_CONTROL_NRF_SUBSYS_LF);
    if (!clk_mgr) {
        LOG_ERR("Unable to get the Clock manager");
        return -ENXIO;
    }

    sys_notify_init_spinwait(&clk_cli.notify);

    err = onoff_request(clk_mgr, &clk_cli);
    if (err < 0) {
        LOG_ERR("Clock request failed: %d", err);
        return err;
    }

    int res;
    do {
        err = sys_notify_fetch_result(&clk_cli.notify, &res);
        // printk("err %d res %d\n", err, res);
        if (!err && res) {
            // LOG_ERR("Clock could not be started: %d", res);
            return res;
        }
    } while (err);

    return 0;
}
SYS_INIT(start_clock, POST_KERNEL, 0);
```

The strange part is that before updating that driver to use k_busy_wait, if I changed the log level in the module with the `start_clock` sys_init from LOG_LEVEL_WRN to LOG_LEVEL_DBG, it did boot normally.

I really don't like how unreliable this seems, especially since these are devices that we will definitely not be able to update in the field if they don't boot.

Is there a way to simply use the internal RC for both mcuboot and the primary image, but then switch to the external TCXO at runtime? That way, if it doesn't work (or if the component breaks or falls off), we can always fall back to the internal RC.

0 alexdr5398 9 months ago in reply to alexdr5398

Never mind. It appears it wasn't using the TCXO at all when it was booting successfully. It must have fallen back to the internal RC, since the drift I measured was far too high to be the TCXO, even though it was configured with EXT_FULL_SWING.

EDIT: Confirmed. I printed the LFCLKSRC register. If mcuboot is configured with CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_FULL_SWING=y, this register is 0, meaning it's using the internal RC.

If mcuboot is using the internal RC, then this register is 0x00030001, meaning it's using the external TCXO.
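
For readers following along, the two register values above can be decoded by hand. The sketch below is only an illustration, assuming the usual nRF52 LFCLKSRC layout (SRC in bits 1:0, BYPASS in bit 16, EXTERNAL in bit 17); the struct and function names are hypothetical and not part of any SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of LFCLKSRC (assumed layout, see lead-in). */
struct lfclksrc_fields {
    uint32_t src;      /* bits 1:0: 0 = RC, 1 = Xtal, 2 = Synth */
    bool     bypass;   /* bit 16: bypass the crystal oscillator */
    bool     external; /* bit 17: external clock source enabled */
};

/* Split a raw LFCLKSRC value into its fields. */
static struct lfclksrc_fields lfclksrc_decode(uint32_t reg)
{
    struct lfclksrc_fields f = {
        .src      = reg & 0x3u,
        .bypass   = ((reg >> 16) & 0x1u) != 0,
        .external = ((reg >> 17) & 0x1u) != 0,
    };
    return f;
}

/* Human-readable name for the selected source. */
static const char *lfclksrc_name(uint32_t reg)
{
    struct lfclksrc_fields f = lfclksrc_decode(reg);

    switch (f.src) {
    case 0:
        return "RC";
    case 1:
        if (f.external && f.bypass) {
            return "external full swing";
        }
        if (f.external) {
            return "external low swing";
        }
        return "crystal";
    case 2:
        return "synth";
    default:
        return "reserved";
    }
}
```

With this layout, 0x00030001 decodes to the Xtal source with both BYPASS and EXTERNAL set, which matches EXT_FULL_SWING, and 0 decodes to the internal RC.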

0 alexdr5398 9 months ago in reply to alexdr5398
So I managed to make some progress.

From what I can tell the LFCLK is not turned off when it resets or when mcuboot passes execution off to the primary image (can you confirm?), and if it's already running from an external source, it does not get configured correctly. So it's able to go from RC->TCXO, but not TCXO->RC or TCXO->TCXO. This would explain the behaviour in the OP.

When it's in this state where it's not configured correctly, I can attach with a debugger and read out the LFCLKSRC, LFCLKSRCCOPY and LFCLKSTAT registers:

```
(gdb) x/wx 0x40000518
0x40000518: 0x00000000
(gdb) x/wx 0x4000041c
0x4000041c: 0x00000000
(gdb) x/wx 0x40000418
0x40000418: 0x00010001
```

So LFCLKSRC and LFCLKSRCCOPY are configured to use the internal RC, while LFCLKSTAT reports that the LFCLK is running from the xtal source.
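
The inconsistency described above can be expressed as a small check. This is a minimal sketch, assuming that LFCLKSTAT carries the active source in bits 1:0 and a running flag in bit 16, and that LFCLKSRC uses the same SRC encoding; the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define LFCLK_SRC_MASK      0x3u        /* SRC field, bits 1:0 (assumed, both registers) */
#define LFCLKSTAT_STATE_BIT (1u << 16)  /* set while the LFCLK is running (assumed) */

/* True when the LFCLK is running from a source other than the one
 * currently selected in LFCLKSRC. A stopped clock never counts as a mismatch. */
static bool lfclk_source_mismatch(uint32_t lfclksrc, uint32_t lfclkstat)
{
    bool running = (lfclkstat & LFCLKSTAT_STATE_BIT) != 0;

    return running &&
           ((lfclkstat & LFCLK_SRC_MASK) != (lfclksrc & LFCLK_SRC_MASK));
}
```

For the values read out here, `lfclk_source_mismatch(0x00000000, 0x00010001)` is true: the clock is running from Xtal while RC is selected.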

I can set LFCLKSRC manually, but this then switches LFCLKSTAT to use the RC oscillator:

```
(gdb) set {int}0x40000518 = 0x00030001
(gdb) x/wx 0x40000518
0x40000518: 0x00030001
(gdb) x/wx 0x40000418
0x40000418: 0x00010000
```

BUT now if I let execution continue, my app boots normally, but with the RC clock.

Please advise on what I should do. Is this a bug in clock_control_nrf.c?

I can likely stop the LFCLK in my application and configure it from scratch, but this doesn't solve the case where it reboots and enters mcuboot while running off the TCXO.
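
The "stop and configure from scratch" idea could look roughly like the sketch below. It is not taken from clock_control_nrf.c or from nrfx: the `lfclk_regs` struct is a hypothetical stand-in that only borrows the CLOCK register names (it is not the real memory layout), and the poll limit is an invented parameter so the loops cannot spin forever. It also assumes, as I understand the product specification, that LFCLKSRC should only be changed while the LFCLK is stopped. In a real Zephyr app the clock control driver owns these registers, so touching them directly may conflict with it.

```c
#include <stdint.h>

/* Hypothetical stand-in for the CLOCK registers used in this sketch.
 * Field names follow the nRF52 register names; the layout is NOT real. */
struct lfclk_regs {
    volatile uint32_t TASKS_LFCLKSTART;
    volatile uint32_t TASKS_LFCLKSTOP;
    volatile uint32_t EVENTS_LFCLKSTARTED;
    volatile uint32_t LFCLKSTAT;
    volatile uint32_t LFCLKSRC;
};

#define LFCLKSTAT_RUNNING (1u << 16) /* assumed STATE bit */

/* Stop the LFCLK, select `src`, then start it again.
 * Returns 0 on success, -1 if either wait exceeds `max_polls`. */
static int lfclk_restart(struct lfclk_regs *clk, uint32_t src, uint32_t max_polls)
{
    uint32_t n;

    /* Request a stop and wait until the clock reports it is no longer running. */
    clk->TASKS_LFCLKSTOP = 1;
    for (n = 0; (clk->LFCLKSTAT & LFCLKSTAT_RUNNING) != 0; n++) {
        if (n >= max_polls) {
            return -1;
        }
    }

    /* Only now select the new source, clear the stale event, and start. */
    clk->LFCLKSRC = src;
    clk->EVENTS_LFCLKSTARTED = 0;
    clk->TASKS_LFCLKSTART = 1;
    for (n = 0; clk->EVENTS_LFCLKSTARTED == 0; n++) {
        if (n >= max_polls) {
            return -1;
        }
    }

    return 0;
}
```

The -1 return leaves room for the caller to retry with the internal RC (source value 0) if the external clock never reports started, which is the kind of fallback asked about earlier in the thread.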