Framing Errors seen on nRF54L15 UART. Is it super sensitive to stop bit timing?

Hi,

I am getting framing error notifications from the nrfx UARTE driver on a Zephyr platform when receiving data from a certain external device, but only when there are large bursts of data.

The data is at 115200bps, 8 bit, no parity, so not particularly fast.

The device I am receiving from has very tight timing (the stop bit is exactly one bit time, well within tolerance), and often sends large streams (multiple hundreds of bytes) with no idle time between a stop bit and the following start bit. This device cannot have its serial settings modified (for example, to increase the number of stop bits to 2).
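For a sense of scale, the arithmetic behind the stop-bit question can be sketched as below. It assumes an idealised receiver that samples every bit at its centre, counted from the start edge; that is a textbook model, not a description of the UARTE internals. The helper names `bit_time_us` and `stop_sample_drift_bits` are hypothetical.

```c
#include <assert.h>

/* Duration of one bit in microseconds at the given baud rate. */
static double bit_time_us(double baud)
{
    assert(baud > 0.0);
    return 1e6 / baud;
}

/*
 * Distance (in transmitter bit times) between the receiver's stop-bit
 * sample and the true centre of the stop bit in an 8N1 frame.
 * Assumption: the receiver samples at bit centres timed from the start
 * edge using its own clock. Positive means the sample is late
 * (receiver slower than transmitter).
 */
static double stop_sample_drift_bits(double tx_baud, double rx_baud)
{
    assert(tx_baud > 0.0 && rx_baud > 0.0);
    const double stop_centre = 9.5; /* start + 8 data + half the stop bit */
    return stop_centre * (tx_baud / rx_baud - 1.0);
}
```

In this model the sample stays inside the stop bit while the drift is below 0.5 bit, which is roughly a 5% rate mismatch. A receiver that also needs some time after the stop-bit sample before it can detect the next start edge would have less margin on back-to-back frames.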

Looking at the data with both an oscilloscope and a very oversampled logic analyser (which will report formatting problems), the actual data on the line seems fine, but I can't be sure that it is consistent.

When I look at the data received from the driver, I generally just see missing bytes rather than bad data (with a few exceptions). The missing bytes seem to come from data being dropped when the driver cancels the RX, and I suspect that I can't re-enable it fast enough.

As far as I can tell, I have the external oscillator enabled, and constant latency mode enabled, but I am not sure if there is any interplay with the Nordic/Zephyr code that may be altering this.

I am using the Zephyr async API with a slab of eight 128-byte receive buffers, which seems to me to be the highest-performing approach.

I have previously seen some UARTs which required ever so slightly more than a single bit time for the STOP before a new START bit, with very similar symptoms to what I have seen here.

My questions are:

  • Is the baud rate divisor in SDK 2.9.1 for 115200 bps still correct? I have noticed that there are forum posts that state that this shouldn't be modified, but also that it doesn't line up with the formula also reported on the forums. Is the specially adjusted baud rate divisor only for running from the internal RC oscillator (and therefore tuned to it)?
  • Is there a surefire way to disable any change on clock source/latency settings so that I can be sure that I am testing under the correct conditions?
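To make the divisor question concrete, here is a minimal sketch of the formula most often quoted on the forums for nRF52-series parts: BAUDRATE = baud × 2^32 / 16 MHz, sometimes rounded to a multiple of 0x1000. Whether the nRF54L15 UARTE uses the same formula or base clock is an assumption here, not something this thread confirms, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 16 MHz base clock, as quoted for nRF52-series parts. */
#define ASSUMED_BASE_CLOCK_HZ 16000000ULL

/* Raw forum formula: baud * 2^32 / base clock, truncated. */
static uint32_t baudrate_reg_from_formula(uint32_t baud)
{
    return (uint32_t)(((uint64_t)baud << 32) / ASSUMED_BASE_CLOCK_HZ);
}

/* Same value rounded to the nearest multiple of 0x1000. */
static uint32_t baudrate_reg_rounded(uint32_t baud)
{
    return (baudrate_reg_from_formula(baud) + 0x800u) & 0xFFFFF000u;
}

/* Baud rate a register value would imply IF the formula held exactly. */
static uint32_t implied_baud(uint32_t reg)
{
    return (uint32_t)(((uint64_t)reg * ASSUMED_BASE_CLOCK_HZ) >> 32);
}
```

For 115200 the rounded formula gives 0x01D7E000. Assuming the stock 115200 setting is 0x01D60000, as listed for the nRF52-series UARTE (not verified here for the nRF54L15, so check the MDK header for your part), `implied_baud` puts it at about 114746 baud, roughly 0.4% low by the formula. That gap is the kind of mismatch the first question is asking about; it does not by itself show that the stock value is wrong.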

My concern is that effectively the same code was talking to a cellular modem at a much higher baud rate with total reliability, suggesting to me that the UARTE hardware might be a bit fussy about stop bit length.

Regards,

Nathan Boyd

  • Hello Nathan,

    We encountered a very similar issue and had been following this thread closely. In our case, a framing error occurred unexpectedly, and a few bytes were lost in the following frame.

    After analyzing the timing more closely, we noticed that the issue consistently happened when the interval between the end of the current frame and the start of the next frame was close to the configured frame timeout. The behavior then made sense: the RX reception stops and restarts when the frame timeout is triggered - right around the time the next frame begins transmission.
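The condition described above can be written as a simple window check. This is only a sketch: `reenable_latency_us` is a hypothetical parameter for however long the driver takes to restart RX after the timeout, which the thread does not quantify.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the next frame's start bit arrives while RX is stopped,
 * i.e. after the frame timeout has fired but before RX is re-enabled.
 * All values are in microseconds, measured from the end of the
 * previous frame.
 */
static bool start_bit_hits_restart_window(uint32_t idle_gap_us,
                                          uint32_t frame_timeout_us,
                                          uint32_t reenable_latency_us)
{
    return idle_gap_us >= frame_timeout_us &&
           idle_gap_us < frame_timeout_us + reenable_latency_us;
}
```

Gaps shorter than the timeout never stop RX, and gaps longer than the timeout plus the restart latency find RX already running again; only gaps inside the window are at risk.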

    I'm not certain if this matches the issue you're seeing, but I thought it might be good to share our experience.

    Best regards,

    Sathiya

  • I am seeing the same behavior on NCS 2.9.0. If the delay between two bursts of data is equal to the frame timeout then the receiver will drop about 5 bits of data at 115200 baud, which can lead to framing errors and/or corrupted data. I believe that the issue is that the duration between the RX being stopped by the FRAME_TIMEOUT_STOPRX short to the subsequent re-enable is too long, but I haven't had an opportunity to dig deeply into this yet.
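At 115200 baud, 5 bits is about 43 µs. The host-side model below is a sketch of why a gap of that size can show up as a missing byte, as corrupted data, or as a framing error, depending on the data. It assumes a receiver that is re-enabled `missed_bits` into a back-to-back 8N1 stream and treats the first 1-to-0 transition it sees as a start bit. It drops bytes whose stop bit reads low. It is not the UARTE or driver logic, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_FRAME 10u /* 1 start + 8 data + 1 stop (8N1) */

/* Line level in bit slot idx for back-to-back frames; idle high afterwards. */
static int line_level(const uint8_t *tx, size_t tx_len, size_t idx)
{
    size_t frame = idx / BITS_PER_FRAME;
    size_t pos = idx % BITS_PER_FRAME;

    if (frame >= tx_len) {
        return 1; /* idle line */
    }
    if (pos == 0) {
        return 0; /* start bit */
    }
    if (pos == BITS_PER_FRAME - 1) {
        return 1; /* stop bit */
    }
    return (tx[frame] >> (pos - 1)) & 1; /* data bits, LSB first */
}

/*
 * Decode the stream as seen by a receiver that was blind for the first
 * missed_bits bit slots. Returns the number of bytes stored in rx and
 * writes the number of frames with a low stop bit to *framing_errors.
 */
static size_t model_rx_after_gap(const uint8_t *tx, size_t tx_len,
                                 size_t missed_bits, uint8_t *rx,
                                 size_t rx_cap, size_t *framing_errors)
{
    size_t total = tx_len * BITS_PER_FRAME;
    size_t idx = missed_bits;
    size_t n = 0;
    int prev = (idx == 0) ? 1 : line_level(tx, tx_len, idx - 1);

    *framing_errors = 0;
    while (idx < total && n < rx_cap) {
        int cur = line_level(tx, tx_len, idx);

        if (!(prev == 1 && cur == 0)) { /* no falling edge yet */
            prev = cur;
            idx++;
            continue;
        }

        uint8_t byte = 0;
        for (size_t bit = 0; bit < 8; bit++) {
            byte |= (uint8_t)(line_level(tx, tx_len, idx + 1 + bit) << bit);
        }

        int stop = line_level(tx, tx_len, idx + 9);
        if (stop == 1) {
            rx[n++] = byte;
        } else {
            (*framing_errors)++; /* model choice: drop the byte */
        }
        prev = stop;
        idx += BITS_PER_FRAME;
    }
    return n;
}
```

With 5 missed bits, three 0x00 bytes decode as two clean 0x00 bytes (one byte simply missing). Three 0x0F bytes decode as 0xE8, 0xE8, 0xF8 with no framing error. The sequence 0x0F, 0x00, 0x00 produces one framing error and a single 0x00.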

  • Hi Nick, Sathiya,

    Nick's comments about the handling of the frame timeout seem to be closer to what I am seeing. I did try tuning the rx timeout, but the device I am receiving from has inconsistent timing. I haven't 100% tracked down which portion of data (and associated idle periods) are triggering my problem.

    I may take a look at the driver code more closely, as I did see a similar problem on an Energy Micro device that didn't have an in-built RX timeout, but I was able to use a general purpose timer (reset on each byte received) to detect a timeout and force the DMA to switch buffers. When I didn't have this working properly (incorrect relative priority of the UART interrupt and the TIMER interrupt), I got lost data due to some of the DMA state being incorrect (data was received, but not processed), mainly because the DMA engine didn't keep track of received bytes per buffer, just the state of the active DMA transfer.

    I am going to try:

    - Getting rid of the disabling of the RX on an error. I didn't get this working (or it didn't help) before.

    - Not handling the FRAME_TIMEOUT_STOPRX and just trying to forward the partial data seen.

      - I suspect there may be a race condition to avoid here to make sure the partial data is retrieved, but the frame timeout is restarted.
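The second idea, forwarding partial data without stopping RX, comes down to remembering how much of the active buffer has already been reported. A minimal sketch follows, assuming some readable count of bytes written into the active buffer is available. `bytes_written` and all names are hypothetical, and the sketch does not solve the race mentioned above: in a real driver the count read and the timeout restart would have to be protected against the buffer-complete interrupt.

```c
#include <stddef.h>

struct rx_progress {
    size_t reported; /* bytes of the active buffer already forwarded */
};

/*
 * Called on a frame timeout while RX keeps running. Returns the number
 * of new bytes and stores their start offset in *offset; returns 0 if
 * nothing new has arrived since the last call.
 */
static size_t take_new_bytes(struct rx_progress *p, size_t bytes_written,
                             size_t *offset)
{
    if (bytes_written <= p->reported) {
        return 0;
    }
    *offset = p->reported;
    size_t len = bytes_written - p->reported;
    p->reported = bytes_written;
    return len;
}

/* Called when the hardware moves on to the next buffer. */
static void on_buffer_switch(struct rx_progress *p)
{
    p->reported = 0;
}
```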

    I have previously been trying to track down behaviour with some pretty heavy debug output, but I am considering trying to reproduce the problem with plain ASCII, and putting some non-ASCII data in the stream to indicate when things have occurred (framing error, break).

    I think it might be useful to know more about the 5 byte HW FIFO as well, which seems to feed the receive DMA. It isn't really documented, but can be inferred by looking at the driver code (where, upon stopping RX, the DMA is completed, then the remains of the FIFO are drained manually).
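The stop sequence described above (DMA completes, then the FIFO remainder is drained) can be modelled on the host as follows. This is a sketch of the idea only. The 5-byte depth is the figure inferred in this thread, `finish_rx` is a hypothetical name, and counting bytes that do not fit as lost is a simplification of this model, not a claim about what the driver does with them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HW_FIFO_DEPTH 5u /* depth inferred in the thread; an assumption */

/*
 * Append whatever is left in the FIFO to the bytes the DMA already
 * placed in buf. Returns the total byte count in buf and stores the
 * number of FIFO bytes that did not fit in *lost.
 */
static size_t finish_rx(uint8_t *buf, size_t buf_cap, size_t dma_count,
                        const uint8_t *fifo, size_t fifo_count,
                        size_t *lost)
{
    if (dma_count > buf_cap) {
        dma_count = buf_cap;
    }
    if (fifo_count > HW_FIFO_DEPTH) {
        fifo_count = HW_FIFO_DEPTH;
    }

    size_t room = buf_cap - dma_count;
    size_t take = (fifo_count < room) ? fifo_count : room;

    memcpy(&buf[dma_count], fifo, take);
    *lost = fifo_count - take;
    return dma_count + take;
}
```

The model shows why the drain step matters: at 115200 baud a full 5-byte FIFO holds about 434 µs of line data, so skipping the drain or running out of buffer room at stop time would appear as a short run of missing bytes.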

    Regards,

    Nathan Boyd.
