Upgrading through DFU Target results in MPSL Assertion (112, 2142)

Using ncs 2.1.2 on nrf52840. Softdevice controller, bluetooth mesh and bluetooth gatt coexisting application.

I have been having an issue attempting to perform DFU using the dfu_target interface on our nrf52840. For whatever reason, an MPSL assertion is being raised during this process. It is more consistently raised the larger I make the buffer passed through dfu_target_mcuboot_set_buf(). I've enabled the maximum (debug) logging from all the MPSL sources in KConfig, but I have not seen any additional logs. CONFIG_SOC_FLASH_NRF_RADIO_SYNC_MPSL is enabled, so my understanding is that should deal with any synchronization required. A log is shown below.

[00:01:26.549,102] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0007e000
[00:01:27.718,017] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0007f000
[00:01:28.861,694] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00080000
[00:01:29.995,269] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00081000
[00:01:31.164,489] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00082000
[00:01:32.339,294] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00083000
[00:01:33.468,994] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00084000
[00:01:34.672,882] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00085000
[00:01:35.801,696] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00086000
[00:01:36.970,489] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00087000
[00:01:38.137,115] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00088000
[00:01:39.272,125] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00089000
[00:01:40.406,616] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008a000
[00:01:41.575,500] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008b000
[00:01:42.738,739] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008c000
[00:01:43.869,415] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008d000
[00:01:45.040,130] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008e000
[00:01:46.189,117] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0008f000
[00:01:47.317,932] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00090000
[00:01:48.473,114] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00091000
[00:01:49.641,021] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00092000
[00:01:50.785,522] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00093000
[00:01:51.945,953] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00094000
[00:01:53.111,816] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00095000
[00:01:54.249,938] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00096000
[00:01:55.416,595] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00097000
[00:01:56.581,695] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00098000
[00:01:57.776,275] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x00099000
[00:01:58.952,484] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009a000
[00:02:00.088,195] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009b000
[00:02:01.219,482] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009c000
[00:02:02.381,866] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009d000
[00:02:03.550,964] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009e000
[00:02:04.718,780] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x0009f000
[00:02:05.891,052] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a0000
[00:02:07.029,693] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a1000
[00:02:08.200,744] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a2000
[00:02:09.368,774] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a3000
[00:02:10.502,593] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a4000
[00:02:11.673,065] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a5000
[00:02:12.861,755] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a6000
[00:02:13.996,246] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a7000
[00:02:15.163,269] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a8000
[00:02:16.329,406] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000a9000
[00:02:17.466,461] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000aa000
[00:02:18.637,481] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ab000
[00:02:19.793,518] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ac000
[00:02:20.917,968] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ad000
[00:02:22.090,728] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ae000
[00:02:23.252,349] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000af000
[00:02:24.394,104] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b0000
[00:02:25.563,507] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b1000
[00:02:26.725,860] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b2000
[00:02:27.947,998] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b3000
[00:02:29.115,051] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b4000
[00:02:30.250,396] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b5000
[00:02:31.415,008] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b6000
[00:02:32.586,853] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b7000
[00:02:33.717,224] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b8000
[00:02:34.886,779] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000b9000
[00:02:36.052,154] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ba000
[00:02:37.188,446] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000bb000
[00:02:38.356,781] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000bc000
[00:02:39.524,169] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000bd000
[00:02:40.659,301] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000be000
[00:02:41.822,753] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000bf000
[00:02:43.003,967] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c0000
[00:02:44.174,316] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c1000
[00:02:45.343,017] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c2000
[00:02:46.472,106] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c3000
[00:02:47.609,558] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c4000
[00:02:48.774,597] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c5000
[00:02:49.943,969] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c6000
[00:02:51.076,110] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c7000
[00:02:52.244,079] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c8000
[00:02:53.388,488] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000c9000
[00:02:54.521,545] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ca000
[00:02:55.689,453] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000cb000
[00:02:56.856,262] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000cc000
[00:02:57.995,483] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000cd000
[00:02:59.164,947] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000ce000
[00:03:00.331,756] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000cf000
[00:03:01.470,123] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d0000
[00:03:02.640,655] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d1000
[00:03:03.795,196] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d2000
[00:03:04.929,748] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d3000
[00:03:06.175,964] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d4000
[00:03:07.311,157] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d5000
[00:03:08.481,323] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d6000
[00:03:09.646,759] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d7000
[00:03:10.788,269] <dbg> STREAM_FLASH: stream_flash_erase_page: Erasing page at offset 0x000d8000
[00:03:10.908,721] <err> mpsl_init: MPSL ASSERT: 112, 2142
ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/zephyr/spinlock.h:142
        Recursive spinlock 0x2000a7f0
[00:03:10.908,813] <err> os: ***** HARD FAULT *****
[00:03:10.908,813] <err> os:   Fault escalation (see below)
[00:03:10.908,843] <err> os: ARCH_EXCEPT with reason 4

[00:03:10.908,843] <err> os: r0/a1:  0x00000004  r1/a2:  0x0000008e  r2/a3:  0x00000003
[00:03:10.908,874] <err> os: r3/a4:  0x0000008e r12/ip:  0x00000000 r14/lr:  0x0005d773
[00:03:10.908,874] <err> os:  xpsr:  0x61000018
[00:03:10.908,905] <err> os: Faulting instruction address (r15/pc): 0x000637dc
[00:03:10.908,935] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
[00:03:10.908,935] <err> os: Fault during interrupt handling

[00:03:10.908,966] <err> os: Current thread: 0x20003c88 (MPSL Work)
[00:03:11.283,386] <err> fatal_error: Resetting system

Since the MPSL is completely opaque, I really have no idea what's going on here. My best guess is that some time critical behavior is being violated, but I am not sure what. I'm also not sure what to do about it, since the flash erase and write times are essentially out of my control.

Parents
  • Hieu,

    Of course. We started with the Mesh and Peripheral Coexistence sample. We replaced the light button service with our own service (which is fairly similar to the NUS), replaced the mesh model with our own model, added a serial connection to the host processor on our system, and enabled CONFIG_DFU_TARGET_MCUBOOT=y. We then use serial commands to write packets of the update image to flash. These are handled by an application thread which is reading from a ring buffer populated using the async serial API.

    I want to add that we seem to be having issues with delayable work/timeouts in the async serial uarte driver and the bluetooth mesh extended advertiser. Essentially, it seems like the timeout takes an entire RTC counter wrap around to occur. I'm going to trial using the SysTick as a source to see if the behavior is within the RTC timer driver. It may be related, but I don't see the MPSL assertion when we are not writing to flash.

  • Peter,

    I plan to replicate your issue here and try some debugging. It seems I will make do with the Mesh and Peripheral Coexistence sample as is then.

    Could you please give some more details on the "added a serial connection" part? Which peripheral (1/2/3) did you use, and what Kconfig and DTS config did you set it up with?

    If you feel like it can be quite complicated to replicate your issue, please let me know if you want to send us your project.

  • I did some further testing and was able to reproduce the SysTick issue on the u-blox evaluation kit. If you'd like to recreate the sort of behavior I'm seeing, consider building the following:

    west build --pristine -b nrf52840dk_nrf52840 tests/kernel/timer/timer_behavior -- -DCONFIG_NRF_RTC_TIMER=n -DCONFIG_CORTEX_M_SYSTICK=y -DCONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=64000000 -DCONFIG_SYS_CLOCK_TICKS_PER_SEC=10000

    For this to build successfully, you will either need to apply a device tree overlay containing the following or modifying dts/arm/nordic/nrf_common.dtsi to enable the systick node.

    &systick {
        status = "okay";
    };

    If you view the serial output, you will notice that it takes an extremely long time to run the tests. However, if you connect to the RTT using J-Link RTT Viewer, the test will run in real time. For whatever reason, it also happens to fail this test. To ensure this works correctly, perform the following procedure:

    1. Build and flash the test.
    2. Power cycle the board.
    3. Connect your favorite serial viewer to VCOM0 to monitor test output.
    4. Whenever you're satisfied that the test is taking way too long to complete, connect to the RTT. I've used the nRF Terminal plugin for VSCode and the J-Link RTT Viewer.

    I have not tried other samples, but I imagine the experience is similar. It may be possible to cobble together a timer based hello_world to use for examining this issue.

  • Hi Peter,

    Thank you very much for the details. My plan was to understand the issue so that I can work with R&D to get some support. Thanks to your comprehensive explanation, I believe I reached that point now.

    However, I also concluded that I will not be able to be efficiently continue with you and R&D due to my lacking knowledge in this area. Thus, instead of going to R&D now, I will find another support engineer with the right expertise to take over the case.

    My sincere apology that it has taken two whole weeks, and the progress is only that much.

    Hieu

  • Hi  ,

    There is an ongoing PR regarding the nrf_rtc_timer to fix a bug with description very similar to yours. Would you like to take a look at it and see if it can help your above problems with the SysTicks?

    Here is the link to it: github.com/.../54780

    Hieu

  • Hieu,

    That does look like exactly the issue we were having. I know Andrzej has put in a couple of PRs relating to the nrf_rtc_timer code recently. I'm really glad he's looked into this specific bug addressed in #54780 as I think that's exactly what was going on. Since our UART Rx timeout period was on the order of a tick or two we would be getting an excessive amount of timeouts due on the next tick.

    We've done some testing with v2.2.99-dev3 and v2.3.0-rc1 and the issue seems much improved. I suspect this is at least in some part due to the UART fixes released in v2.1.3 and/or the previous improvements to the robustness of the timer. If we can we'll also try pulling in the changes from the PR and going some overnight tests to see if the heartbeat period is consistent.

    Do you have any vision on when this PR would be in an official ncs release version?

  • As you already saw, v2.3.0 is already at rc1, while the fix is still not pulled from the Zephyr upstream. I know there are some internal discussions, but I don't know the details, nor am I allowed to comment about it on DevZone. Sorry for the inconvenience.

    If it is of strong interest to you, perhaps a Regional Sales Manager (RSM) can help.

    The RSM contact for your location is: [REDACTED (to other DevZone users, please open a ticket if you wish to know your location's RSM contact)] 

Reply
  • As you already saw, v2.3.0 is already at rc1, while the fix is still not pulled from the Zephyr upstream. I know there are some internal discussions, but I don't know the details, nor am I allowed to comment about it on DevZone. Sorry for the inconvenience.

    If it is of strong interest to you, perhaps a Regional Sales Manager (RSM) can help.

    The RSM contact for your location is: [REDACTED (to other DevZone users, please open a ticket if you wish to know your location's RSM contact)] 

Children
No Data
Related