nRF52833 + external 32 kHz TCXO: soft reset hangs

I’m working with a custom board based on the nRF52833. It uses an external 32 kHz TCXO (SiT1552) as the LFCLK source.

  • In my main Zephyr app I set CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_FULL_SWING=y, and in mcuboot I set CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y. This boots fine after a power cycle, but when the app does a software reboot (sys_reboot(...) or NVIC_SystemReset()), the system hangs indefinitely. Power-cycling again boots it into the primary image normally.

  • If I switch mcuboot to also use EXT_FULL_SWING, then after reset I see the UART log:

    mcuboot: Jumping to the first image slot
    

    but then nothing. I investigated with a debugger, and the app never seems to reach main() or even any SYS_INIT routines. Whenever I break, it's just in the Zephyr kernel's idle thread.

So, to summarize:

  • The app works with the TCXO only if mcuboot uses RC, but any software reboot causes a hang.

  • If both use the TCXO, the app never boots.

  • If both use RC, everything works normally (but I of course want to use the external TCXO).

Versions:

zephyr: nrfconnect sdk-zephyr v3.5.99
nrf: nrfconnect sdk-nrf v2.6.1

Any help on this is greatly appreciated. 

Thanks

EDIT: I forgot to mention that I have indeed verified that the program is getting past the `lfclk_spinwait` function, so it does seem to be detecting that the clock has started. I placed a breakpoint at `clock_control_nrf.c:540` after the `!nrf_clock_is_running` loop, and it does indeed reach that point. 

  • I managed to find the sys init that caused the issue. It's in a custom driver init that hangs when it calls 'k_sleep'. 

    When I put `k_msleep` in my test_init function, it hangs there too. I believe that zephyr uses the LFCLK for its scheduling, so it would seem that the LFCLK is not actually started? 

  • Yes, the scheduler relies on the lfclk, so I agree, this indicates that the clock is not running. Not sure what could be causing this yet.

  • EDIT: this whole reply is inaccurate, see next reply

    Okay I managed to get it to work reliably by swapping from k_sleep -> k_busy_wait in another driver init and by adding this function that explicitly starts the LF clock:

    static int start_clock(void) {
    	int err;
    	struct onoff_manager *clk_mgr;
    	struct onoff_client clk_cli;
    
    	clk_mgr = z_nrf_clock_control_get_onoff(CLOCK_CONTROL_NRF_SUBSYS_LF);
    	if (!clk_mgr) {
    		LOG_ERR("Unable to get the Clock manager");
    		return -ENXIO;
    	}
    
    	sys_notify_init_spinwait(&clk_cli.notify);
    
    	err = onoff_request(clk_mgr, &clk_cli);
    	if (err < 0) {
    		LOG_ERR("Clock request failed: %d", err);
    		return err;
    	}
    
    	int res;
    	do {
    		err = sys_notify_fetch_result(&clk_cli.notify, &res);
    		// printk("err %d res %d\n", err, res);
    		if (!err && res) {
    			// LOG_ERR("Clock could not be started: %d", res);
    			return res;
    		}
    	} while (err);
    
    	return 0;
    }
    SYS_INIT(start_clock, POST_KERNEL, 0);
    



    The strange part is that before updating that driver to use k_busy_wait, if I changed the log level in the module with the `start_clock` sys_init from LOG_LEVEL_WRN to LOG_LEVEL_DBG, it did boot normally.  

    I really don't like how unreliable this seems, especially since these are devices that we will definitely not be able to update in the field if they don't boot.

    Is there a way to simply use the internal RC for both mcuboot and the primary image, but then switch to the external TCXO at runtime? That way, if it doesn't work (or if the component breaks or falls off), we can always fall back to the internal RC.

  • Never mind. It actually appears that it wasn't using the TCXO at all when it was booting successfully. It must have fallen back to the internal RC, since the drift I measured was far too high to be the TCXO, even though it was configured with EXT_FULL_SWING.

    EDIT: confirmed, I printed the LFCLKSRC register. If mcuboot is configured with CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_FULL_SWING=y, this register is 0, meaning it's using the internal RC.

    If mcuboot is using the internal RC, then this register is 0x00030001, meaning it's using the external TCXO.
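For anyone checking these values by hand, here is a minimal decoding sketch. It assumes the nRF52833 LFCLKSRC layout (SRC in bits 1:0, BYPASS in bit 16, EXTERNAL in bit 17). The macro and function names are my own for illustration and are not taken from nrfx or Zephyr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CLOCK.LFCLKSRC field layout on nRF52833 (illustrative names). */
#define LFCLKSRC_SRC_MASK     0x3u        /* bits 1:0: 0 = RC, 1 = Xtal, 2 = Synth */
#define LFCLKSRC_BYPASS_BIT   (1u << 16)  /* bypass the crystal oscillator */
#define LFCLKSRC_EXTERNAL_BIT (1u << 17)  /* external (full swing) source on XL1 */

/* Return a short name for the source selected in an LFCLKSRC value. */
static const char *lfclksrc_src_name(uint32_t reg)
{
    switch (reg & LFCLKSRC_SRC_MASK) {
    case 0:  return "RC";
    case 1:  return "Xtal";
    case 2:  return "Synth";
    default: return "reserved";
    }
}

/* True when the value selects Xtal with both BYPASS and EXTERNAL set,
 * which is what the EXT_FULL_SWING option is expected to program. */
static bool lfclksrc_is_ext_full_swing(uint32_t reg)
{
    return (reg & LFCLKSRC_SRC_MASK) == 1u &&
           (reg & LFCLKSRC_BYPASS_BIT) != 0u &&
           (reg & LFCLKSRC_EXTERNAL_BIT) != 0u;
}
```

With the two values I read, 0 decodes to RC with neither bit set, and 0x00030001 decodes to Xtal with both BYPASS and EXTERNAL set.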


  • So I managed to make some progress.

    From what I can tell, the LFCLK is not turned off when the chip resets or when mcuboot passes execution to the primary image (can you confirm?), and if it's already running from an external source, it does not get configured correctly. So it's able to go from RC->TCXO, but not TCXO->RC or TCXO->TCXO. This would explain the behaviour in the OP.

    When it's in this state where it's not configured correctly, I can attach with a debugger and read out the LFCLKSRC, LFCLKSRCCOPY and LFCLKSTAT registers:

    (gdb) x/wx 0x40000518
    0x40000518:     0x00000000
    (gdb) x/wx 0x4000041c
    0x4000041c:     0x00000000
    (gdb) x/wx 0x40000418
    0x40000418:     0x00010001


    So LFCLKSRC and LFCLKSRCCOPY are configured for the internal RC, while LFCLKSTAT reports that the clock is running from the xtal source.

    I can set LFCLKSRC manually, but this then switches LFCLKSTAT to use the RC oscillator:

    (gdb) set {int}0x40000518 = 0x00030001
    (gdb) x/wx 0x40000518
    0x40000518:     0x00030001
    (gdb) x/wx 0x40000418
    0x40000418:     0x00010000

    But now if I let execution continue, my app boots normally, although on the RC clock.
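To make the inconsistent state easier to spot, here is a small sketch of the check I am doing by eye on these dumps. It assumes the nRF52833 LFCLKSTAT layout (SRC in bits 1:0, STATE in bit 16) and that the SRC field uses the same encoding as in LFCLKSRC. The names are hypothetical and do not come from nrfx or Zephyr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout shared by LFCLKSRC and LFCLKSTAT (illustrative names). */
#define LFCLK_SRC_FIELD_MASK  0x3u        /* bits 1:0: 0 = RC, 1 = Xtal, 2 = Synth */
#define LFCLKSTAT_STATE_BIT   (1u << 16)  /* 1 = LFCLK running */

/* True when LFCLKSTAT reports the clock as running. */
static bool lfclkstat_running(uint32_t lfclkstat)
{
    return (lfclkstat & LFCLKSTAT_STATE_BIT) != 0u;
}

/* True when the clock is running from a source other than the one
 * currently selected in LFCLKSRC. */
static bool lfclk_src_mismatch(uint32_t lfclksrc, uint32_t lfclkstat)
{
    return lfclkstat_running(lfclkstat) &&
           (lfclksrc & LFCLK_SRC_FIELD_MASK) !=
           (lfclkstat & LFCLK_SRC_FIELD_MASK);
}
```

For the values above, LFCLKSRC 0x00000000 with LFCLKSTAT 0x00010001 is flagged as a mismatch, and so is LFCLKSRC 0x00030001 with LFCLKSTAT 0x00010000 after the manual write.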

    Please advise on what I should do. Is this a bug in clock_control_nrf.c? 

    I can likely stop the LFCLK in my application and configure it from scratch, but this doesn't solve the case where it reboots and enters mcuboot while running off the TCXO.
