MPSL Assert 109, 496

I am developing an application based on nRF desktop, SDK 2.9.0. In normal build it works as expected. In release build, which includes MCUMgr/MCUBoot and no logging or asserts (which are the only differences, and which work normally), after ~30m connected to my iPhone I get "MPSL ASSERT: 109, 496". I'm not using multiple radio protocols, and I've not fundamentally altered the application in a way that I can think would cause an MPSL error. Nothing should happen after 30m – there are no timeouts that occur, and nothing happens in the debug build. I am using nRF52832, so I modified the nRF52kb configuration. What could be causing that error?

Parents
  • Hello,

    I don’t think I’ve seen any reports of this exact assertion before. If it’s possible to start a debug session before or after the assertion is raised, could you please post a picture of the call stack? The developers were interested in having a look at the backtrace.

    Best regards,

    Vidar

    Edit: they were also wondering if you were using bt_disable() in your app and if you are using the internal RC oscillator as your LF clock source.

  • Yes I am using the RC oscillator, because that config was copied from the keyboard. However, the board does actually have an external crystal I could use.

    I have not added bt_disable() and it doesn't look like it was used by nrf_desktop anywhere.

    Stack trace attached.

  • The issue seems to be that LF clock calibration events fails to be configured properly when the bootloader and the application select different LF clock sources (LFXO and LFRC). The MPSL team is looking into this. If you plan to use the RC oscillator, it is best to set it as the default clock source in your board file. That way it will default to the same clock source for other FW images such as bootloader.

    nrbrook said:
    In what cases would I want to enable "CONFIG_SYSTEM_CLOCK_WAIT_FOR_STABILITY"?

    This is the default option, but I can't think of a good example where it is strictly required. There are some samples which enables SYSTEM_CLOCK_NO_WAIT to reduce the startup time. MPSL will ensure LFXO is running, or LFRC is calibrated before BLE is started.

    nrbrook said:
    Should I be modifying any other configuration to ensure the RC oscillator does not get out of sync? Existing boards e.g. nRF52DK and nRF52kbd don't seem to.

    It should be sufficient to enable CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC. CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION is automatically enabled when the RC osc. is selected.

    nrbrook said:
    How can I ensure the firmware reboots on MPSL assert?

    CONFIG_RESET_ON_FATAL_ERROR=y

    nrbrook said:
    Possibly also related: delayed work (using k_work_reschedule) seems to be submitted before expected when compared to the log timestamp. This happens even with XTAL. For example:

    I'm not sure about this. I thought both were using the same system clock. I need to see if I can reproduce this.

  • Thanks for your responses.

    To assist reproduction of the timestamp issue, here is the code to print the timestamps:

    static struct k_work_delayable power_off_timeout_work;
    
    static void power_off_timeout(struct k_work *work) {
        LOG_WRN("Power off timeout");
        int64_t current_time = k_uptime_get();
        {
            int16_t ms = current_time % MSEC_PER_SEC;
            int16_t s = (current_time / MSEC_PER_SEC) % SEC_PER_MIN;
            int16_t m = (current_time / (MSEC_PER_SEC * SEC_PER_MIN)) % MIN_PER_HOUR;
            int16_t h = current_time / (MSEC_PER_SEC * SEC_PER_MIN * MIN_PER_HOUR);
            LOG_DBG("Current time: %02u:%02u:%02u.%03u", h, m, s, ms);
        }
    }
    
    static void reschedule_power_off_after(uint32_t delay_s) {
        int64_t current_time = k_uptime_get();
        {
            int16_t ms = current_time % MSEC_PER_SEC;
            int16_t s = (current_time / MSEC_PER_SEC) % SEC_PER_MIN;
            int16_t m = (current_time / (MSEC_PER_SEC * SEC_PER_MIN)) % MIN_PER_HOUR;
            int16_t h = current_time / (MSEC_PER_SEC * SEC_PER_MIN * MIN_PER_HOUR);
            LOG_DBG("Current time: %02u:%02u:%02u.%03u", h, m, s, ms);
        }
        int64_t completed_time = current_time + delay_s * MSEC_PER_SEC;
        int16_t ms = completed_time % MSEC_PER_SEC;
        int16_t s = (completed_time / MSEC_PER_SEC) % SEC_PER_MIN;
        int16_t m = (completed_time / (MSEC_PER_SEC * SEC_PER_MIN)) % MIN_PER_HOUR;
        int16_t h = completed_time / (MSEC_PER_SEC * SEC_PER_MIN * MIN_PER_HOUR);
        LOG_DBG("Rescheduling power off timeout in %d seconds at %02u:%02u:%02u.%03u", delay_s, h, m, s, ms);
        k_work_reschedule(&power_off_timeout_work, K_SECONDS(delay_s));
    }

  • Thank you for the code snippets. Unfortunately, I have not managed to reproduce the issue here. Both are using the kernel's system clock (RTC) to produce timestamps. Here is the project I used for test:

    4263.timestamp_test.zip

    Were you able to reproduce this with any of the SDK samples?

  • I can confirm your example works correctly. However, it doesn't in nRF Desktop. Modify main.c as shown:

    #include <app_event_manager.h>
    
    #define MODULE main
    #include <caf/events/module_state_event.h>
    
    #include <zephyr/logging/log.h>
    LOG_MODULE_REGISTER(MODULE);
    
    static struct k_work_delayable delay_work;
    
    static void delay_work_handler(struct k_work *work)
    {
    	int64_t current_time = k_uptime_get();
        {
            int16_t ms = current_time % MSEC_PER_SEC;
            int16_t s = (current_time / MSEC_PER_SEC) % SEC_PER_MIN;
            int16_t m = (current_time / (MSEC_PER_SEC * SEC_PER_MIN)) % MIN_PER_HOUR;
            int16_t h = current_time / (MSEC_PER_SEC * SEC_PER_MIN * MIN_PER_HOUR);
            LOG_INF("Current time: %02u:%02u:%02u.%03u", h, m, s, ms);
        }
    }
    
    int main(void)
    {
    	if (app_event_manager_init()) {
    		LOG_ERR("Application Event Manager not initialized");
    	} else {
    		module_set_state(MODULE_STATE_READY);
    	}
    
    	int64_t current_time = k_uptime_get();
        {
            int16_t ms = current_time % MSEC_PER_SEC;
            int16_t s = (current_time / MSEC_PER_SEC) % SEC_PER_MIN;
            int16_t m = (current_time / (MSEC_PER_SEC * SEC_PER_MIN)) % MIN_PER_HOUR;
            int16_t h = current_time / (MSEC_PER_SEC * SEC_PER_MIN * MIN_PER_HOUR);
            LOG_INF("Current time: %02u:%02u:%02u.%03u", h, m, s, ms);
        }
    
    	k_work_init_delayable(&delay_work, delay_work_handler);
    	k_work_schedule(&delay_work, K_SECONDS(60));
    
    	return 0;
    }
    

    Result:

    [00:00:00.004,821] <inf> main: Current time: 00:00:00.004
    ...
    [00:00:58.598,693] <inf> main: Current time: 00:01:00.005

  • It appears to be a division error caused by CONFIG_SYS_CLOCK_TICKS_PER_SEC being set to '1000' instead of the default '32768' value in this project. Commenting this line in the prj.conf file made the problem go away. I will check with the team why this was added in the first place. Git history shows it has been there from the start.

Reply Children
No Data
Related