Environment
- SoC: nRF54LM20A (cpuapp, non-secure / TF-M)
- nRF Connect SDK: v3.3.0
- sQSPI soft peripheral driving an AMOLED display via the FLPR
- BLE active via MPSL (Multiprotocol Service Layer)
We've hit two related issues using the sQSPI soft peripheral for continuous display transfers while BLE is running. Hoping for guidance on expected behaviour and best practice.
Issue 1 — frequent DMA aborts when MPSL/BLE is active
When the BLE stack is running we see a steady rate of transfer aborts (NRF_SQSPI_RESULT_* abort events) on otherwise valid sQSPI transfers — both data and command transfers. With the radio idle the aborts effectively disappear, so they correlate strongly with MPSL/radio activity.
Our working assumption is bus contention or scheduling delays between the FLPR and the radio, but we don't have a clear picture of why a transfer aborts. Questions:
- Is a non-trivial abort rate expected when sQSPI runs concurrently with MPSL?
- Is there a recommended way to reduce it (priority, timeslot awareness, buffering, transfer sizing)?
- Is there any status/cause information exposed beyond the abort result to tell us which condition triggered it?
Issue 2 — secure fault (MPC → TF-M halt) when re-triggering a transfer too soon after an abort
The more serious one. When we re-submit a transfer immediately after an abort, we intermittently get a secure-side memory protection fault (MPC, master port = FLPR) for a read from a bad/0x0 address. Under TF-M this becomes a silent system halt — the core halts, the display freezes on its last frame, and the debugger loses access (J-Link returns all-zero registers / 0xDEADBEEF sentinels), so it's very hard to diagnose in the field.
Our hypothesis is that immediately after an abort the FLPR/sQSPI is still tearing down the aborted DMA, and re-arming a new transfer in that window leaves the DMA source register in an invalid state, so it reads from 0x0.
Workaround that has stabilised it for us: serialise all transfers onto a single worker thread (one in flight at a time) and insert a short yielding backoff after any abort before re-submitting, to let the FLPR settle. This works, but it's a blind timing delay.
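As a rough illustration of the workaround described above, here is a simplified sketch rather than real driver code. Everything named in it is a hypothetical stand-in and not part of the sQSPI API: `submit_fn_t` models a blocking submit, `backoff_fn_t` models a yielding delay, and the `fake_*` helpers exist only so the logic can be run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; NOT the real NRF_SQSPI_RESULT_* values. */
typedef enum { XFER_OK, XFER_ABORTED } xfer_result_t;

/* Hypothetical hooks: a blocking submit and a yielding delay. */
typedef xfer_result_t (*submit_fn_t)(const uint8_t *buf, size_t len, void *ctx);
typedef void (*backoff_fn_t)(uint32_t delay_us, void *ctx);

typedef struct {
    submit_fn_t  submit;
    backoff_fn_t backoff;
    void        *ctx;
    uint32_t     backoff_us;   /* blind settle delay applied after an abort */
    unsigned     max_retries;  /* re-submissions allowed per chunk */
} xfer_worker_t;

/* Intended to be called only from the single worker thread, so at most
 * one transfer is ever in flight. Returns true once the chunk is sent. */
static bool worker_send(const xfer_worker_t *w, const uint8_t *buf, size_t len)
{
    assert(w != NULL && w->submit != NULL && w->backoff != NULL);
    for (unsigned attempt = 0; attempt <= w->max_retries; attempt++) {
        if (w->submit(buf, len, w->ctx) == XFER_OK) {
            return true;
        }
        /* Aborted: yield before anything is re-armed, including the
         * next chunk if this was the last permitted attempt. */
        w->backoff(w->backoff_us, w->ctx);
    }
    return false;
}

/* Host-side fake transport: aborts the first `aborts_left` submits. */
typedef struct {
    unsigned aborts_left;
    unsigned submits;
    unsigned backoffs;
    uint32_t last_delay_us;
} fake_t;

static xfer_result_t fake_submit(const uint8_t *buf, size_t len, void *ctx)
{
    fake_t *f = ctx;
    (void)buf;
    (void)len;
    f->submits++;
    if (f->aborts_left > 0) {
        f->aborts_left--;
        return XFER_ABORTED;
    }
    return XFER_OK;
}

static void fake_backoff(uint32_t delay_us, void *ctx)
{
    fake_t *f = ctx;
    f->backoffs++;
    f->last_delay_us = delay_us;
}
```

On target, the backoff hook would map to a yielding sleep on the worker thread. The sketch still depends on a fixed delay, which is the part we would like to replace with a deterministic idle indication.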
Questions:
- Is there a deterministic "transfer torn down / engine idle" status we can poll after an abort, instead of relying on a fixed delay?
- What is the recommended minimum settling time, or the correct re-arm sequence, after an aborted sQSPI transfer?
- Should an aborted transfer ever be able to trigger a read from an invalid address? That feels like it should be contained within the peripheral rather than surfacing as an MPC fault.
Any insight on the intended abort-recovery sequence would be much appreciated.
Feature request — pre-queue the next buffer / watermark callback
Both issues above ultimately stem from the potential gap between one transfer ending and the next being armed: that window is where aborts happen and where the early re-trigger fault happens. A clean way to eliminate it would be the ability to hand the peripheral the next buffer ahead of time, so transfers chain seamlessly without the driver re-arming into a teardown window. Two possible forms:
- Queued / ping-pong descriptors: let us submit the next transfer (buffer pointer + length) while the current one is still in flight, so the FLPR advances to it automatically with no idle/re-arm gap.
- Watermark callback: a callback fired when the current DMA buffer drains past a configurable watermark, from which we can supply the next buffer pointer. This would let us keep the engine continuously fed for streaming workloads like a display framebuffer.
For our use case (continuous slice/chunk streaming to a display under concurrent BLE), either mechanism would let us keep the pipeline full, avoid the abort-recovery round-trips, and sidestep the early re-arm fault entirely — rather than serialising and inserting timing backoffs as we do today.
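To make the queued / ping-pong form concrete, here is a small software model of the behaviour we have in mind. It is purely illustrative: `pingpong_t`, `pingpong_queue`, `pingpong_on_done` and `pingpong_current` are hypothetical names, not an existing or proposed sQSPI API, and `pingpong_on_done` stands in for whatever the FLPR would do at end-of-transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One transfer descriptor: buffer pointer + length. */
typedef struct {
    const uint8_t *buf;
    size_t         len;
} xfer_desc_t;

/* Two-slot (ping-pong) descriptor queue. Zero-initialise before use. */
typedef struct {
    xfer_desc_t slot[2];
    bool        valid[2];
    unsigned    active;    /* slot currently being transferred */
} pingpong_t;

/* Queue a descriptor: into the active slot if the engine is idle,
 * otherwise into the other slot. Returns false if both are occupied. */
static bool pingpong_queue(pingpong_t *q, const uint8_t *buf, size_t len)
{
    assert(q != NULL && buf != NULL && len > 0);
    unsigned idx = q->valid[q->active] ? (q->active ^ 1u) : q->active;
    if (q->valid[idx]) {
        return false;
    }
    q->slot[idx] = (xfer_desc_t){ buf, len };
    q->valid[idx] = true;
    return true;
}

/* Model of end-of-transfer: retire the active slot and, if the other
 * slot was pre-queued, switch to it immediately. Returns true when the
 * next transfer can start without a re-arm gap, false on underrun. */
static bool pingpong_on_done(pingpong_t *q)
{
    assert(q != NULL);
    q->valid[q->active] = false;
    unsigned next = q->active ^ 1u;
    if (q->valid[next]) {
        q->active = next;
        return true;
    }
    return false;
}

/* Descriptor currently in flight, or NULL when the engine is idle. */
static const xfer_desc_t *pingpong_current(const pingpong_t *q)
{
    return q->valid[q->active] ? &q->slot[q->active] : NULL;
}
```

With something like this in the peripheral, the application would only have to refill the free slot from the completion event, and the re-arm would never race a teardown.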
Many thanks, and I look forward to your reply!
Regards, Dominic