Multiprotocol application - MPSL ASSERT: 106, 684 and MPSL ASSERT: 112, 2094

Hello,

Version: nRF Connect SDK v2.7.0

The background to the system I am developing is described in two earlier threads:

Multiprotocol Service Layer (MPSL) - BLE coexistence with proprietary communication stack

Non-volatile storage with BLE and custom wireless stack using MPSL

In short, I have an application containing a BLE Peripheral, NVS and our custom 6TiSCH wireless stack. All three components run simultaneously, with access to the hardware managed by MPSL. The application works fine in our test environment, except for occasional critical errors that I want to discuss with you.

The first two are MPSL asserts that I have no clue how to interpret:

(1) Assert from MPSL. The stack pointer is on the idle thread, the Link Register points to the assert function, and the Program Counter is in C:/ncs/v2.7.0/zephyr/lib/os/printk.c:209

ASSERTION FAIL [0] @ WEST_TOPDIR/nrf/subsys/mpsl/init/mpsl_init.c:301
MPSL ASSERT: 112, 2094

[18:15:50.145,141] <err> os: ***** HARD FAULT *****
[18:15:50.145,172] <err> os: Fault escalation (see below)
[18:15:50.145,172] <err> os: ARCH_EXCEPT with reason 4
.....
[14:24:44.787,170] <err> os: xpsr: 0x01000018

(2) Assert from MPSL. The stack pointer is in the thread that manages the NVM storage of the firmware image, the Link Register points to the assert function, and the Program Counter is in C:/ncs/v2.7.0/zephyr/lib/os/printk.c:209

ASSERTION FAIL [0] @ WEST_TOPDIR/nrf/subsys/mpsl/init/mpsl_init.c:301
MPSL ASSERT: 106, 684

[14:24:44.787,017] <err> os: ***** HARD FAULT *****
[14:24:44.787,048] <err> os: Fault escalation (see below)
[14:24:44.787,048] <err> os: ARCH_EXCEPT with reason 4
....
[14:24:44.787,170] <err> os: xpsr: 0x01000000

The third is directly connected with our software. It can be caught in the same place and under the same circumstances.

(3) This one has the Link Register set to our function that iterates through a buffer within a critical section. The work done there is not excessive for the MCU.

[18:17:40.708,801] <err> os: ***** USAGE FAULT *****
[18:17:40.708,831] <err> os: Illegal use of the EPSR
...
[18:17:40.708,923] <err> os: xpsr: 0x60000200

Q1: How should I interpret the numbers reported in the MPSL assertions MPSL ASSERT: 112, 2094 and MPSL ASSERT: 106, 684?
Q2: How long can I keep the MCU in the critical section with interrupts disabled?
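
As a reading aid for the three fault logs above, the xpsr values can be decoded by hand. The following is a minimal sketch, assuming the standard Armv7-M/Armv8-M layout of the stacked xPSR: bit 24 is the Thumb state bit (EPSR.T), bits [8:0] hold the active exception number, and external interrupts start at exception number 16. The names `xpsr_info_t`, `xpsr_decode` and `xpsr_decode_examples` are hypothetical and not part of Zephyr or MPSL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: fields extracted from a stacked xPSR value. */
typedef struct {
    bool thumb_bit_set;     /* EPSR.T (bit 24); must be 1 on Cortex-M */
    uint32_t exception_num; /* IPSR (bits [8:0]); 0 means Thread mode */
    int32_t irq_num;        /* exception_num - 16, or -1 if not an external IRQ */
} xpsr_info_t;

xpsr_info_t xpsr_decode(uint32_t xpsr)
{
    xpsr_info_t info;

    info.thumb_bit_set = (xpsr & (1u << 24)) != 0u;
    info.exception_num = xpsr & 0x1FFu;
    info.irq_num = (info.exception_num >= 16u)
                       ? (int32_t)info.exception_num - 16
                       : -1;
    return info;
}

/* Usage: the three xpsr values quoted in the logs. */
void xpsr_decode_examples(void)
{
    /* (1) 0x01000018: exception 24 was active, i.e. external IRQ 8. */
    assert(xpsr_decode(0x01000018u).irq_num == 8);

    /* (2) 0x01000000: Thread mode, Thumb bit set as expected. */
    assert(xpsr_decode(0x01000000u).exception_num == 0u);
    assert(xpsr_decode(0x01000000u).thumb_bit_set);

    /* (3) 0x60000200: Thread mode, but the Thumb bit is clear. */
    assert(!xpsr_decode(0x60000200u).thumb_bit_set);
}
```

Read this way, case (3) is consistent with the "Illegal use of the EPSR" message: the core attempted to execute with the Thumb bit cleared, which typically points to a corrupted return address or function pointer rather than to a timing problem. Which peripheral sits behind IRQ 8 in case (1) depends on the SoC and has to be looked up in its interrupt table.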

  • One more clarification. Our library has porting requirements, among them critical section entry and exit functions. They can be called from ISR context and also from the proc function in the Zephyr thread context. My current implementation:

    static int volatile irqNestCounter;
    static uint32_t volatile previousIrqState;

    void MYLIB_CRITICAL_SECTION_Enter(void) {
        uint32_t irqState = __get_PRIMASK();
        __disable_irq();
        if (0 == irqNestCounter) {
            previousIrqState = irqState;
        }
        ++irqNestCounter;
    }

    void MYLIB_CRITICAL_SECTION_Exit(void) {
        --irqNestCounter;
        if (irqNestCounter < 0) {
            irqNestCounter = 0;
        }
        if (0 == irqNestCounter && 0 == previousIrqState) {
            __enable_irq();
        }
    }
  • I am waiting for some feedback internally, but I would expect that the maximum time you can disable interrupts is in the range of a very few µs if the SoftDevice is involved.

    Kenneth

  • Thank you! In the upcoming days I will try to benchmark all critical sections used.

    Is it also possible to say something more on the subject of the MPSL assertions? Is there any option to increase the verbosity of the assert messages?

  • Hi, 

    Some comments internally:

    To give a general answer, you could for example disable interrupts in your MPSL timeslot or for a few instructions. If interrupts are disabled it might conflict with the radio interrupts that are required for the BLE to work. 

    It's not easy to give a definitive number on how long interrupts can be disabled. For example, say the peripheral has an ACL link with a 4 second connection interval. There is a long time between connection events where interrupts can be disabled without interfering with the radio interrupt for the BLE role, but if the interrupts happen to be disabled, even for a short time, right before the next connection event, there is a high likelihood it will result in problems.

    Kenneth

  • Perfectly understandable. The operations of the BLE controller have to be performed in strictly defined time intervals.
    I have a fallback solution to my problem. I can identify exactly which of the peripherals I'm using interact with which pieces of code, so I don't need to disable interrupts completely. I'm going to try narrowing down the range of blocked interrupts and see if that works.

    Thanks.

    I wanted to ask again about how to interpret the assertion output.
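
The idea of narrowing down the range of blocked interrupts, discussed in the replies above, can be sketched with the Cortex-M BASEPRI register, which masks only interrupts at or below a chosen priority level instead of all of them. The following is a host-side model, not a drop-in port: `basepri_get`/`basepri_set` stand in for the CMSIS intrinsics `__get_BASEPRI()`/`__set_BASEPRI()`, and `MYLIB_LOCK_BASEPRI` is a hypothetical level that would have to be chosen so that the time-critical radio and timer interrupts keep a higher priority (numerically lower value) than the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the BASEPRI register. On target, these two
 * helpers would map to __get_BASEPRI() and __set_BASEPRI(). */
static uint32_t sim_basepri;
static uint32_t basepri_get(void) { return sim_basepri; }
static void basepri_set(uint32_t value) { sim_basepri = value; }

/* Hypothetical mask level (assumption). The real value depends on the
 * number of implemented priority bits and on the priorities in use. */
#define MYLIB_LOCK_BASEPRI 0x40u

static int nest_count;
static uint32_t saved_basepri;

/* Only code running at or below the masked priority may call these two
 * functions; an ISR that stays unmasked would race on nest_count. */
void MYLIB_CRITICAL_SECTION_Enter(void)
{
    uint32_t previous = basepri_get();

    /* BASEPRI == 0 means "no masking"; a smaller non-zero value masks
     * more interrupts. Never relax a stricter mask that is already set. */
    if (previous == 0u || previous > MYLIB_LOCK_BASEPRI) {
        basepri_set(MYLIB_LOCK_BASEPRI);
    }
    if (nest_count == 0) {
        saved_basepri = previous; /* remember the outermost state only */
    }
    ++nest_count;
}

void MYLIB_CRITICAL_SECTION_Exit(void)
{
    assert(nest_count > 0); /* catch unbalanced Exit calls early */
    if (--nest_count == 0) {
        basepri_set(saved_basepri); /* restore the outermost state */
    }
}
```

Compared with a PRIMASK-based lock, interrupts with a priority above the mask level are never delayed by the library's critical sections. Asserting on an unbalanced exit, rather than silently clamping the counter, is a design choice that makes mismatched Enter/Exit pairs visible during testing.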
