NCS v3.3.0: mpsl_init: MPSL ASSERT: 109, 615

Hey Team!
We are developing a radio-heavy BLE application on the nRF SDK version v.3.3.0. In normal operation, it is working fine, but after some time, we are encountering a MPSL fault always on the BLE central.

00> [ 1948.118408] <err> mpsl_init: MPSL ASSERT: 109, 615

00> [ 1948.118408] <err> os:   Fault escalation (see below)
00> [ 1948.118408] <err> os: ARCH_EXCEPT with reason 3
00> 
00> [ 1948.118469] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
00> [ 1948.118469] <err> os: Fault during interrupt handling
00> 
00> [ 1948.118499] <err> os: Current thread: 0x2000ce30 (unknown)


We don't have any custom MPSL slots active (CONFIG_MPSL_TIMESLOT_SESSION_COUNT=0) and as it is only happening on the BLE central.

The application is running on a custom board with a nRF52840 which only has a RC oscillator as our LF clock source. CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y and CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION=y are set.


For now, as the assert only happens after an undefined time (which can be hours), we only managed to reproduce it using a production build. We are currently trying to get more information by also trying to reproduce it with a debug build. Also, afterwards, we will then disable the channel survey feature to see if the assert also happens without it. For now, it would really help us to understand what the reason of the MPSL assert could be.

I will add more information as soon as we have it.

Parents Reply Children
  • Hi, I'm currently working with  on this problem and we are still able to reproduce this problem with QoS Channel Survey disabled and also on nRF SDK 3.3.0. 

    Can you maybe explain how this specific MPSL assertion is triggered or possible problems?

    Thanks.

  • This was a backtrace from today running NCS 3.3.0 (the error is now MPSL ASSERT: 109, 603)

    #0 arch_system_halt (reason=reason@entry=4) at /Users/user/Repos/project/zephyr/kernel/fatal.c:30
    #1 0x00054a2c in k_sys_fatal_error_handler (reason=reason@entry=4, esf=esf@entry=0x20021618 <z_interrupt_stacks+1816>)
    at /Users/user/Repos/project/zephyr/kernel/fatal.c:44
    #2 0x00054ad8 in z_fatal_error (reason=4, esf=0x20021618 <z_interrupt_stacks+1816>)
    at /Users/user/Repos/project/zephyr/kernel/fatal.c:119
    #3 0x0003e632 in z_arm_fault (msp=<optimized out>, psp=<optimized out>, exc_return=<optimized out>, callee_regs=<optimized out>)
    at /Users/user/Repos/project/zephyr/arch/arm/core/cortex_m/fault.c:1083
    #4 0x0003e77c in z_arm_usage_fault () at /Users/user/Repos/project/zephyr/arch/arm/core/cortex_m/fault_s.S:102
    #5 <signal handler called>
    #6 assert_post_action (file=file@entry=0x76c28 "WEST_TOPDIR/nrf/subsys/mpsl/init/mpsl_init.c", line=line@entry=333)
    at /Users/user/Repos/project/zephyr/lib/os/assert.c:43
    #7 0x00052860 in m_assert_handler (file=<optimized out>, line=603) at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:333
    #8 0x00023fa6 in sym_S2UAPMFVIQXDUOA6CV7GJMB33TYHEUH5D6LHO5Q ()
    #9 0x0002165a in sym_KDG2QV44ALYOJ3VXK5EAFUKKY3P3LYTXXEA7CSI ()
    #10 0x000228c2 in sym_VXQCALTLFQP2RYOCOYFZIKONUJ65GN3KDIZZUXQ ()
    #11 0x000229fa in sym_BAQIADYWPZENQ44VKMPSGYJS2KDZRY46H3J2JFY ()
    #12 0x0002b1a4 in sym_V5ILJEYQUHITD7K5U5BVWARUBFYNVMXXWVRAE5Q ()
    #13 0x0001fb3a in sym_EMR2H7KEGRFFKS6A4P3ZPPGOJSTVXLGRI25MEYQ ()
    #14 0x00052816 in mpsl_radio_isr_wrapper_body () at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:312
    #15 mpsl_radio_isr_wrapper () at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:310
    #16 <signal handler called>
    #17 0x0003ec2c in arch_cpu_idle () at /Users/user/Repos/project/zephyr/arch/arm/core/cortex_m/cpu_idle.c:106
    #18 0x00055138 in k_cpu_idle () at /Users/user/Repos/project/zephyr/include/zephyr/kernel.h:6608
    #19 idle (unused1=<optimized out>, unused2=<optimized out>, unused3=<optimized out>) at /Users/user/Repos/project/zephyr/kernel/idle.c:75
    #20 0x000355f2 in z_thread_entry (entry=0x55119 <idle>, p1=<optimized out>, p2=<optimized out>, p3=0x0)
    at /Users/user/Repos/project/zephyr/lib/os/thread_entry.c:60

    And another backtrace, maybe it helps:

    #0  arch_system_halt (reason=reason@entry=4) at /Users/user/Repos/project/zephyr/kernel/fatal.c:30
    #1  0x00054a2c in k_sys_fatal_error_handler (reason=reason@entry=4, 
        esf=esf@entry=0x20021598 <z_interrupt_stacks+1688>) at /Users/user/Repos/project/zephyr/kernel/fatal.c:44
    #2  0x00054ad8 in z_fatal_error (reason=4, esf=0x20021598 <z_interrupt_stacks+1688>)
        at /Users/user/Repos/project/zephyr/kernel/fatal.c:119
    #3  0x0003e632 in z_arm_fault (msp=<optimized out>, psp=<optimized out>, exc_return=<optimized out>, 
        callee_regs=<optimized out>) at /Users/user/Repos/project/zephyr/arch/arm/core/cortex_m/fault.c:1083
    #4  0x0003e77c in z_arm_usage_fault () at /Users/user/Repos/project/zephyr/arch/arm/core/cortex_m/fault_s.S:102
    #5  <signal handler called>
    #6  assert_post_action (file=file@entry=0x76c28 "WEST_TOPDIR/nrf/subsys/mpsl/init/mpsl_init.c", 
        line=line@entry=333) at /Users/user/Repos/project/zephyr/lib/os/assert.c:43
    #7  0x00052860 in m_assert_handler (file=<optimized out>, line=603)
        at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:333
    #8  0x00023fa6 in sym_S2UAPMFVIQXDUOA6CV7GJMB33TYHEUH5D6LHO5Q ()
    #9  0x0002165a in sym_KDG2QV44ALYOJ3VXK5EAFUKKY3P3LYTXXEA7CSI ()
    #10 0x000228c2 in sym_VXQCALTLFQP2RYOCOYFZIKONUJ65GN3KDIZZUXQ ()
    #11 0x000229fa in sym_BAQIADYWPZENQ44VKMPSGYJS2KDZRY46H3J2JFY ()
    #12 0x0002b1a4 in sym_V5ILJEYQUHITD7K5U5BVWARUBFYNVMXXWVRAE5Q ()
    #13 0x0001dbb2 in sym_R7R7SVVRUJQKXPLQZZ5BIUOS4GQQKFIX2YPMRXY ()
    #14 0x0000dc90 in sym_D7VILJSB67T37R6D555P4BIMJ7P743FM46AZ55A ()
    #15 0x0000e622 in sym_EM7WJZPDUM5SP6C4AQ433UOIX6YWTRYFRCULQ4A ()
    #16 0x0001fb3a in sym_EMR2H7KEGRFFKS6A4P3ZPPGOJSTVXLGRI25MEYQ ()
    #17 0x00052816 in mpsl_radio_isr_wrapper_body ()
        at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:312
    #18 mpsl_radio_isr_wrapper () at /Users/user/Repos/project/nrf/subsys/mpsl/init/mpsl_init.c:310
    #19 <signal handler called>
    #20 0x0006bb02 in memcpy ()
    #21 0x00055444 in z_impl_k_msgq_get (msgq=msgq@entry=0x20003250 <gatt_rx_msgq>, 
        data=data@entry=0x200199c8 <thread_stack_area_gatt_rx+2632>, timeout=...)
        at /Users/user/Repos/project/zephyr/kernel/msg_q.c:289
    #22 0x0002e488 in k_msgq_get (timeout=..., data=0x200199c8 <thread_stack_area_gatt_rx+2632>, msgq=0x20003250 <gatt_rx_msgq>) at /Users/user/Repos/project/project/applications/aerobot/build/aerobot/zephyr/include/generated/zephyr/syscalls/kernel.h:1329
    #23 gatt_rx_thread_entry () at /Users/user/Repos/project/project/lib/bt/gatt.c:85
    #24 0x000355f2 in z_thread_entry (entry=0x2e469 <gatt_rx_thread_entry>, p1=<optimized out>, p2=<optimized out>, p3=0x0) at /Users/user/Repos/project/zephyr/lib/os/thread_entry.c:60

  • Something else we just discovered, is that we get the error always after the same time.
    Reducing the clock calibration period (CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD) triggers this error faster.

    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD = 4000 (default) -> Assertion after approx. 25mins
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD = 1000 -> Assertion after approx. 7mins
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD = 500 -> Assertion after approx. 3.5mins
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD=500 and CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_MAX_SKIP=0 -> Assertion after approx. 2mins
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD=250 and CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_MAX_SKIP=0 -> Assertion after approx. 1min

    Edit: I added some logs to `
    mpsl_calibration_work_handler` and it seems like the assertion always triggers shortly after the 254th calibration is done.
  • Hello,

    Is it possible you have MCUBOOT enabled? If yes: Can you check what is the clock source (LFXO or LFRC) that is used for MCUBOOT?

    We suspect you may have configured LFXO for MCUBOOT and LRFC for application.

    Kenneth

  • Hi,

    Yes, we use MCUboot. The clock source is defined in our board files, therefore it should be used for both images.

    A look into .config of the mcuboot image also shows:

    CONFIG_NRFX_CLOCK=y
    # CONFIG_NRFX_CLOCK_USE_LFRC_CALIBRATION is not set
    # CONFIG_NRFX_CLOCK_LFXO_TWO_STAGE_ENABLED is not set
    CONFIG_NRFX_CLOCK_LF_CAL_ENABLED=y
    CONFIG_NRFX_CLOCK_LF_SRC_RC=y
    # CONFIG_NRFX_CLOCK_LF_SRC_XTAL is not set
    # CONFIG_NRFX_CLOCK_LF_SRC_SYNTH is not set
    # CONFIG_NRFX_CLOCK_LF_SRC_LOW_SWING is not set
    # CONFIG_NRFX_CLOCK_LF_SRC_FULL_SWING is not set
    
    ...
    
    CONFIG_CLOCK_CONTROL_NRF=y
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_FREQUENCY=32768
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC=y
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_XTAL is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_SYNTH is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_LOW_SWING is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_EXT_FULL_SWING is not set
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_RC_CALIBRATION=y
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_LF_ALWAYS_ON=y
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_PERIOD=4000
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_MAX_SKIP=1
    CONFIG_CLOCK_CONTROL_NRF_CALIBRATION_TEMP_DIFF=2
    CONFIG_CLOCK_CONTROL_NRF_K32SRC_500PPM=y
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_250PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_150PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_100PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_75PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_50PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_30PPM is not set
    # CONFIG_CLOCK_CONTROL_NRF_K32SRC_20PPM is not set
    CONFIG_CLOCK_CONTROL_NRF_ACCURACY=500


    Thanks.

Related