UART TX timing differs between ISR callback and workqueue

Hello Nordic team,

Environment:

SDK 3.1.0
nRF54L15
UART / RS-232

I am seeing different UART TX timing depending on whether I transmit from inside the UART RX callback vs using a k_work handler.

On our nRF‑based device at 9600 baud, each byte correctly takes ~1.04 ms.
However, the gap between bytes (idle time between stop bit of one byte and start bit of the next) changes drastically depending on context:

1. UART RX callback → UART TX

If we call our transmit function directly inside the UART RX callback (or a parser function invoked from it), the inter‑byte gap is only ~53 µs.

2. k_work_delayable handler → UART TX

If we schedule the exact same transmit function using k_work_schedule(), the inter‑byte gap increases to ~313 µs, and the target device (RS‑232 peer) accepts the frame reliably.

Nothing else changes; same bytes, same baud rate, same framing.

This difference in inter‑byte spacing is causing a downstream RS‑232 device (non‑Nordic) to fail to parse frames correctly unless TX happens from a workqueue instead of inside the RX callback.

I will attach logic analyzer screenshots showing both cases.

Questions:

Is it expected that UART TX behaves differently when called inside the RX callback vs from a workqueue?
Is there a known restriction that UART TX should not be triggered directly inside the RX callback?

Any guidance or clarification would be appreciated. I will provide LA captures and source snippets in the attachments. Thank you.

// This pseudocode sits at the same location within a UART RX callback.
// My device receives a UART message and then transmits a UART acknowledgement.

// Non-working code
void ProcessUartRx()
{
    SendAck(message->header.message_id);
}

// Working code
void delayed_ack_handler(struct k_work *work);
static uint8_t g_pending_ack_msg_id;
K_WORK_DELAYABLE_DEFINE(g_delayed_ack_work, delayed_ack_handler);

void delayed_ack_handler(struct k_work *work)
{
    SendAck(g_pending_ack_msg_id);
}

void ProcessUartRx()
{
    g_pending_ack_msg_id = message->header.message_id;
    k_work_schedule(&g_delayed_ack_work, K_MSEC(0));
}

Saleae Logic Analyzer captures

failed-uart.sal working-uart.sal

Screenshots

0 Kenneth 24 days ago

Hello,

Not sure which API you are using, but normally you send the entire string or buffer in one chunk instead of byte by byte. Is there any reason why you need to send individual bytes? That doesn't really explain the difference; I'm just mentioning the intended usage.

Kenneth

0 kamaln16 24 days ago in reply to Kenneth

Hi Kenneth,

Thanks for the clarification. There wasn’t any specific reason we were using uart_poll_out(). The original implementation just evolved that way. Based on your suggestion, we’ve now switched over to using uart_tx() to send the whole buffer at once, and now I see the correct timing on the logic analyzer capture.

I’m still interested in understanding why the timing difference occurs between calling TX inside the RX callback versus calling it from a workqueue/different thread, so any insight from the Nordic side on that behavior would be appreciated.

Thanks again,
Kamal

0 Vidar Berg 23 days ago in reply to kamaln16

Hi Kamal,

When uart_poll_out() is called from an interrupt context (your RX callback in this case), the driver will enter a spin loop if a transfer is already in progress and busy wait until the UART transmitter is ready again. However, when the same function is called from thread context, it will instead enter the wait_tx_ready() function which will call k_msleep(1) if it has to wait for more than 100 us for the transmitter to become ready, which I expect to happen at this baud rate. I think this call to k_msleep(1) is the likely explanation for why you see the extra delay between each transfer.
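The back-off behaviour described above can be sketched as a small host-side timing model. This is not the driver source: the function names, and the idea of treating the sleep duration as an exact parameter, are assumptions made for illustration. On real hardware the duration of k_msleep(1) depends on tick resolution and scheduling, so the numbers are only indicative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (hypothetical, not driver code) of a thread-context wait loop:
 * busy-wait for up to busy_wait_us; if the transmitter is still busy,
 * sleep for sleep_us and try again.
 * Returns the time, in microseconds after the call, at which the loop exits.
 */
uint32_t model_thread_wait_return_us(uint32_t tx_ready_at_us,
                                     uint32_t busy_wait_us,
                                     uint32_t sleep_us)
{
    /* Both zero would never make progress in this model. */
    assert(busy_wait_us + sleep_us > 0);

    uint32_t now = 0;

    for (;;) {
        /* Busy-wait phase: exits as soon as the transmitter is ready. */
        if (tx_ready_at_us <= now + busy_wait_us) {
            return (tx_ready_at_us > now) ? tx_ready_at_us : now;
        }
        now += busy_wait_us; /* busy-wait budget used up */
        now += sleep_us;     /* back off before polling again */
    }
}

/* Toy model of the interrupt-context path: spin until the transmitter is ready. */
uint32_t model_isr_wait_return_us(uint32_t tx_ready_at_us)
{
    return tx_ready_at_us;
}
```

With a byte that finishes 1042 us after the call, a 100 us busy-wait, and an idealized 1000 us sleep, the thread model returns at 1100 us, while the spin model returns at 1042 us. Raising the busy-wait budget to 2000 us makes the thread model return at 1042 us as well.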

Best regards,

Vidar

0 kamaln16 23 days ago in reply to Vidar Berg

Hi Vidar,

Thanks for your input. I understand what you are saying, but if k_msleep(1) were called in the working case, the gap between bytes would be at least 1 ms, whereas the logic analyzer shows a 313 µs gap.

Also, in the failing case, if the UART driver enters a spin loop and busy-waits until TX is ready, it must wait at least ~104 µs (one bit time) to respect the 9600 baud rate. My theory is that the receiving device does not correctly process my failing-case UART traffic because the gap between bytes is shorter than ~104 µs. In other words, the receiving device may not see the ~53 µs gap, resulting in failed UART decoding.

Best,
Kamal

0 Vidar Berg 23 days ago in reply to kamaln16

Hi Kamal,

If my math is not completely off today, it should take about 1.04 ms to transmit one byte at 9600 baud: (1 start bit + 8 data bits + 1 stop bit) / 9600. And I expect the UART transmitter to continue transmitting while the thread is sleeping in k_msleep(), since the busy wait is only 100 us, so this time should not be completely wasted either. I think a simple test you could do is to increase the busy wait from 100 us to 2000 us and see how this affects the gap between the bytes.
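The arithmetic above can be checked with a few lines of C. This is a minimal sketch with hypothetical helper names; it assumes 8N1 framing (10 bits per frame) and uses the gap values reported from the logic analyzer earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Time to shift out one UART frame, in microseconds, rounded to nearest. */
uint32_t uart_frame_time_us(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000u + baud / 2u) / baud);
}

/* Start-bit-to-start-bit period when each frame is followed by an idle gap. */
uint32_t uart_byte_period_us(uint32_t baud, uint32_t bits_per_frame,
                             uint32_t gap_us)
{
    return uart_frame_time_us(baud, bits_per_frame) + gap_us;
}
```

At 9600 baud, `uart_frame_time_us(9600, 10)` gives 1042 us per byte and `uart_frame_time_us(9600, 1)` gives 104 us per bit. With the measured gaps, the byte period comes to 1095 us in the RX-callback case (53 us gap) and 1355 us in the workqueue case (313 us gap).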

Best regards,

Vidar

0 kamaln16 17 days ago in reply to Vidar Berg

I have to move on to a new task, so I can't test different wait times. Going back to the original question: why is uart_poll_out() not following the baud rate? I should have no issue using uart_poll_out(), since the lower levels (Zephyr/Nordic/hardware) should be handling the timing between bytes to ensure the 9600 baud rate is followed.

0 Kenneth 16 days ago in reply to kamaln16

I think part of the problem here is that the END event that triggers the UART callback is raised on the last bit of the transmitted byte, while the UART hardware is only just starting to transmit the STOP bit. It looks like triggering a new byte while the STOP bit is still in progress may cause some conflict in the hardware itself. I guess the reason you see this in particular is that you are running at a very low baud rate and don't use the EasyDMA functionality that most do.

We might need to look at this deeper after the summer holidays, but for now write the entire buffer of bytes.

Kenneth
