mpsl assert file "107", line 292

Hi guys,

For a client of mine, I'm working on an OpenThread RCP implementation using a SolidRun N8 that contains the nRF52833.

I'm using ncs v2.9.0-nRF54H20-1.

The problem is that after only ~2 minutes of pinging at 5 packets per second, I get the following assert (seen here in gdb):

Breakpoint 1, m_assert_handler (file=0x20005ebc <z_interrupt_stacks+636> "107", line=292) at /home/cristic/ncs/v2.9.0-nRF54H20-1/nrf/subsys/mpsl/init/mpsl_init.c:304

The code is pretty basic, based on an older sample. It initializes a single OpenThread instance and runs an endless while loop:

    while (!otSysPseudoResetWasRequested())
    {
        otTaskletsProcess(instance);
        otSysProcessDrivers(instance);
    }

Here's the full project .config file.

6886.config.txt

Any hint on what I could change to make this work?

  • Hi,

    that contains the NRF52833.
    I'm using ncs v2.9.0-nRF54H20-1.

    The release you are using is for nRF54H20 only. It's not a qualified release for nRF52833.

    Please use e.g. NCS v2.9.2 instead, and see if that solves your issue.

  • Hi Sigurd,

    Thanks a lot for the reply and pointing out the version problem. That was the latest release when I started working with it.

    I have switched to v2.9.2 and I get the same behavior.

    (gdb) bt
    #0  m_assert_handler (file=0x20005ebc <z_interrupt_stacks+636> "107", line=292) at /home/cristic/ncs/v2.9.2/nrf/subsys/mpsl/init/mpsl_init.c:304
    #1  0x000035e0 in sym_S2UAPMFVIQXDUOA6CV7GJMB33TYHEUH5D6LHO5Q ()

    8358.config.txt

    From what I understand it happens during RTC0 interrupt handling. Is there something in my clock configuration that is wrong?

  • (Updated)

    Please also try with:

    CONFIG_MAIN_STACK_SIZE=4096

    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096

     

    Regarding the crash at the svc instruction: it is caused by a call to k_panic(). Unfortunately, k_panic() is a macro, so a breakpoint cannot be set on it directly.

     

    Please set breakpoints at the following functions:

    nrf_802154_assert_handler 

    z_thread_abort 

     

    Then run the reproduction scenario to see whether either breakpoint is hit. If so, please collect the gdb backtrace.

    --

    Hi, 

    Sorry for the late reply. Sigurd is out of the office, so I will take over this case.

    We are seeing a similar issue in another case, and R&D is trying to reproduce it to figure out what might cause this assertion. I will update you once I have collected enough information. Please give us more time. Thanks for your patience.

    Regards,
    Amanda H.

  • Hi Amanda,

    Due to unknown issues, I can no longer reproduce the svc exception in the open-source (non-MPSL) case. When I start pinging from the router, the nRF child detaches with logs like this:

    ot-daemon-ncs[980]: 00:37:45.420 [I] Mac-----------: Frame tx attempt 16/16 failed, error:NoAck, len:71, seqnum:227, type:Data, src:a611f14676479c9b, dst:6a4d5db8ee09725a, sec:no, ackreq:yes
    ot-daemon-ncs[980]: 00:37:45.420 [D] Mac-----------: ============================[TX ERR len=016]============================
    ot-daemon-ncs[980]: 00:37:45.420 [D] Mac-----------: | 61 DC E3 22 22 5A 72 09 | EE B8 5D 4D 6A 9B 9C 47 | a..""Zr...]Mj..G |
    ot-daemon-ncs[980]: 00:37:45.420 [D] Mac-----------: ------------------------------------------------------------------------
    ot-daemon-ncs[980]: 00:37:45.420 [D] SubMac--------: RadioState: Transmit -> Receive
    ot-daemon-ncs[980]: 00:37:45.420 [D] Mac-----------: ==============================[TX len=071]==============================
    ot-daemon-ncs[980]: 00:37:45.420 [D] Mac-----------: | 61 DC E3 22 22 5A 72 09 | EE B8 5D 4D 6A 9B 9C 47 | a..""Zr...]Mj..G |
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: | 76 46 F1 11 A6 7F 33 F0 | 4D 4C 4D 4C E7 3B 00 15 | vF....3.MLML.;.. |
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: | CA 66 00 00 00 00 00 01 | 02 8A DE 42 13 98 B3 A6 | .f.........B.... |
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: | 24 8B 0B D6 2B FE B9 BF | B7 11 42 88 DB 4F 50 B8 | $...+.....B..OP. |
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: | F1 7D D0 4C 42 F3 EF    |                         | .}.LB..          |
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: ------------------------------------------------------------------------
    ot-daemon-ncs[980]: 00:37:45.421 [D] Mac-----------: Finishing operation "TransmitDataDirect"
    ot-daemon-ncs[980]: 00:37:45.421 [N] MeshForwarder-: Failed to send IPv6 UDP msg, len:87, chksum:e73b, ecn:no, to:6a4d5db8ee09725a, sec:no, error:NoAck, prio:net
    


    The router says it drops them as duplicates:

    Jul 08 15:10:13 raspberrypi otbr-agent[1320]: 02:27:01.194 [I] MeshForwarder-: Received IPv6 UDP msg, len:87, chksum:16eb, ecn:no, from:a611f14676479c9b, sec:no, prio:net>
    Jul 08 15:10:13 raspberrypi otbr-agent[1320]: 02:27:01.194 [I] MeshForwarder-:     src:[fe80:0:0:0:a411:f146:7647:9c9b]:19788
    Jul 08 15:10:13 raspberrypi otbr-agent[1320]: 02:27:01.194 [I] MeshForwarder-:     dst:[fe80:0:0:0:684d:5db8:ee09:725a]:19788
    Jul 08 15:10:13 raspberrypi otbr-agent[1320]: 02:27:01.194 [W] Mle-----------: Failed to process UDP: Duplicated
    


    Can we please focus on the MPSL case? We will need this to work with MPSL in the end. With MPSL and the stack sizes you mention, I can run ping at a 200 ms interval for several minutes (~5 minutes in the latest tests).
    Then I hit the original problem, with m_assert_handler being called for file 107, line 292.

  • cristic said:
    Can we please focus on the MPSL case? We will need this to work with MPSL in the end. With MPSL and the stack sizes you mention, I can run ping at a 200 ms interval for several minutes (~5 minutes in the latest tests).
    Then I hit the original problem, with m_assert_handler being called for file 107, line 292.

    Yes. Could you provide a simple project to help us reproduce the issue on the nRF52833 DK?

  • Hi Amanda,

    As I've explained before, I don't have a DK available. My setup is in Israel and nobody delivers there nowadays. Did you try modifying the coprocessor example with the files I attached above and running it on a DK?

    The dataset I use is:

    > dataset active
    dataset active
    Active Timestamp: 1
    Channel: 15
    Channel Mask: 0x07fff800
    Ext PAN ID: c0de1ab5c0de1ab5
    Mesh Local Prefix: fdde:ad00:beef:0::/64
    Network Key: 1234c0de1ab51234c0de1ab51234c0de
    Network Name: SleepyEFR32
    PAN ID: 0x2222
    PSKc: 992c3b39534992571a6a9045db5319e3
    Security Policy: 672 onrc 0
    Done
    

    I have routereligible disabled, and the device connects to a Raspberry Pi running vanilla OTBR with a Silabs module as RCP. I then run the following from the Raspberry Pi:

    ping -i 0.2 fdde:ad00:beef:0:0:ff:fe00:40f

    where 0x040f is the RLOC16 of the nRF child.
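
    For reference, the ping target is the child's RLOC address: Thread forms it from the mesh-local prefix followed by the interface identifier 0000:00ff:fe00:<RLOC16>. Below is a minimal C sketch of that construction. The function name format_rloc_address is hypothetical (it is not an OpenThread API), and the output is not compressed with "::" as RFC 5952 would suggest; it only drops leading zeros in each group.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * Hypothetical helper, not part of OpenThread.
     * Builds the textual RLOC address from the four 16-bit groups of the
     * /64 mesh-local prefix and the 16-bit RLOC16. The interface identifier
     * of an RLOC address is 0000:00ff:fe00:<RLOC16>.
     * Returns 0 on success, -1 if the output buffer is too small.
     */
    int format_rloc_address(const uint16_t prefix[4], uint16_t rloc16,
                            char *out, size_t out_len)
    {
        assert(prefix != NULL && out != NULL);

        int n = snprintf(out, out_len, "%x:%x:%x:%x:0:ff:fe00:%x",
                         (unsigned)prefix[0], (unsigned)prefix[1],
                         (unsigned)prefix[2], (unsigned)prefix[3],
                         (unsigned)rloc16);

        /* snprintf reports the length it needed; treat truncation as an error. */
        if (n < 0 || (size_t)n >= out_len) {
            return -1;
        }
        return 0;
    }
    ```

    With the prefix groups {0xfdde, 0xad00, 0xbeef, 0x0000} from the dataset and RLOC16 0x040f, a 40-byte buffer receives fdde:ad00:beef:0:0:ff:fe00:40f, which is the address used in the ping command.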

    Does this sound simple enough?

  • Hi Cristian, 

    We were unable to reproduce the reported issue.

    In the latest NCS release (v3.2.0), the REM scheduler has been reworked. We recommend migrating to this version, as the updated implementation may resolve the issue.

    Regards,
    Amanda H.
