nRF5340 function bt_nus_send() taking 40ms to execute, how to speed up?

I am looking to modify the sample code for peripheral UART and central UART so that the central takes data received from the peripheral via BLE and transmits that data to an external processor via UART. The external component must receive 10-byte packets with at most 1 ms between the end of one packet and the start of the next.

I am currently using 2 nRF5340-DKs to accomplish this. One of them is flashed with firmware from a direct copy of the central sample code (see nrf\samples\bluetooth\central_uart). The other has modified peripheral code (see nrf\samples\bluetooth\peripheral_uart), with the following changes:

  • main.c, defined array of hardcoded data - const uint8_t data[] (with size of roughly 16,000)
  • main.c, modified function ble_write_thread() to look like:

void ble_write_thread(void)
{
	/* Don't go any further until BLE is initialized */
	k_sem_take(&ble_init_ok, K_FOREVER);

	const uint16_t len2 = 10;
	uint16_t ndx = 0;
	int send_err;

	for (;;) {
		/* send 1 packet; LED3 is off while bt_nus_send() runs */
		dk_set_led(DK_LED3, 0);
		if ( (send_err = bt_nus_send(NULL, data + ndx, len2)) ) {
			LOG_WRN("Failed to send data over BLE connection");
			printk("BLE, FAILED TO SEND (ERROR=%d)\n", send_err);
		}
		/* prepare for next time */
		else {
			dk_set_led(DK_LED3, 1);
			if (ndx % (len2 * 100) == 0) {
				printk("BLE, DATA PROGRESS (LASTBYTE=0x%02x)\n", data[ndx + len2 - 1]);
			}
			ndx += len2;
			if (ndx >= sizeof(data)) {
				printk("\n\nBLE, ALL DATA SENT; RESTARTING\n\n");
				ndx = 0;
				break;
			}
		}
		/* wait 1ms before sending next pkt */
		k_sleep(K_MSEC(1));
	}
}

K_THREAD_DEFINE(ble_write_thread_id, STACKSIZE, ble_write_thread, NULL, NULL,
		NULL, PRIORITY, 0, 0);

I set up a logic analyzer to read the data on the central device's UART lines, and am getting the data with no loss. But there is a 45ms delay between the end of one packet and the start of another. I noticed a similar delay on the peripheral device (used the logic analyzer on the GPIO pin for LED3, and added dk_set_led() calls to use LED's off-time to measure it).

Any ideas, thoughts, suggestions on what I can do to reduce the delay (down to at most 1ms) and speed up the BLE communication? Thanks in advance.

------------------------------------------------------------------------------------------

Development Setup (for both peripheral and central devices):

  • Board: nRF5340-DK
  • Development Environment: VS Code
  • SDK: nRF Connect 2.5.0
  • OS: Windows 10
  • Hi,

    Could you give us some more information about the application?
    Please be aware that the minimum connection interval in BLE is 7.5 ms, so if you need lower latency than that you would have to use a proprietary protocol. In your case, the connection interval was most likely 45 ms.

    Is it possible to combine multiple packets into one and process them at 7.5 ms latency?

    If the 1ms latency is a hard requirement, then you may want to take a look at our LLPM Bluetooth proprietary extension here: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/samples/bluetooth/llpm/README.html

    It is made for applications such as gaming mice and keyboards.
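The arithmetic behind the connection-interval suggestion can be sketched in plain C. This is only an illustration under stated assumptions: the function names are hypothetical, and the 1 ms production period is taken from the k_sleep(K_MSEC(1)) in the original post. BLE connection intervals are expressed in units of 1.25 ms, which is the unit that Zephyr's BT_LE_CONN_PARAM() macro and bt_conn_le_param_update() expect for the minimum and maximum interval.

```c
#include <assert.h>
#include <stdint.h>

/* BLE connection intervals are expressed in units of 1.25 ms. */
#define CONN_INTERVAL_UNIT_US 1250u

/* Convert an interval in microseconds to 1.25 ms units (rounding down). */
uint16_t conn_interval_units(uint32_t interval_us)
{
	assert(interval_us >= 7500u); /* 7.5 ms is the BLE minimum */
	return (uint16_t)(interval_us / CONN_INTERVAL_UNIT_US);
}

/*
 * Number of packets that accumulate during one connection interval when
 * the application produces one packet every period_us microseconds
 * (rounded up so that no packet is left behind).
 */
uint32_t packets_per_interval(uint32_t interval_us, uint32_t period_us)
{
	assert(period_us > 0u);
	return (interval_us + period_us - 1u) / period_us;
}
```

With these assumptions, a 7.5 ms interval corresponds to 6 units, and about 8 ten-byte packets (80 bytes) would need to travel together in each connection event, so the ATT MTU would have to be large enough for the combined payload.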

  • We are using the nRF's in a medical application; as such, we need the device to be operating essentially in real-time. That's about as much info as I can provide at this time.

    I will need to investigate further on whether the multi-packet, 7.5ms latency setup is acceptable. Our 1ms delay restriction is only the external component/processor end's input, so we do have some flexibility with the wireless communication. Would you be able to point me to some examples that show how the connection interval can be changed please? I've checked a couple examples so far and haven't found anything yet. I think the interval change in the LLPM example also relates to HCI which seems specific to LLPM and not the typical BLE functions.

    I took a look at the LLPM example, but after reading the README, I found the SoftDevice Controller info stating that "Low Latency Packet mode is not supported on the nRF53 Series". So is achieving low latency on the nRF53 impossible? From my limited understanding, SoftDevice is a library; would it technically be possible to write "our own library" (custom lower-level code) to make it happen?

    --------

    EDIT1: Additional question - do you anticipate support for LLPM on the nRF53 in the future?

    EDIT2: More info on the application - this portion of BLE in the project is anticipated to be only for the nRF devices specifically. As we don't need to handle other devices, going for proprietary is definitely an option.
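Since the 1 ms restriction applies only at the external processor's UART input, one option is to receive several 10-byte packets in a single BLE notification and split them again on the central before forwarding. The following is a minimal sketch of that splitting step only, assuming a hypothetical batching format in which frames are simply concatenated; the names FRAME_LEN, frame_count and frame_at are illustrative and not part of any SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LEN 10u /* packet size required by the external processor */

/* Number of complete frames contained in a received buffer of len bytes. */
size_t frame_count(size_t len)
{
	return len / FRAME_LEN;
}

/*
 * Pointer to the idx-th frame inside buf, or NULL if the buffer does not
 * hold that many complete frames. Trailing partial data is ignored.
 */
const uint8_t *frame_at(const uint8_t *buf, size_t len, size_t idx)
{
	assert(buf != NULL);
	if (idx >= frame_count(len)) {
		return NULL;
	}
	return buf + idx * FRAME_LEN;
}
```

The central could then loop over the frames and write each one to the UART, so the gap between packets on the wire is set by that loop rather than by the radio. The UART write itself is left out because it depends on how the UART driver is configured.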

  • Hi Afok,

    afok-esmart said:
    Given my team's timeline, we are also considering using the nRF54. Do you have any updates on when and/or how we could get our hands on nRF54 eval. boards (at least 2 of them)?

    Please contact our local Sales representative for the request. If you are located in California please contact Mr. Mike Obot at: mike.obot [at] nordicsemi.no

    Regarding the question about intervals lower than 10 ms, I would need to check with our ESB expert. Please give us more details on the error you see.

  • I'll get in touch w/ Mr. Obot; thanks for the contact info!

    Sure. So if I changed the TX's k_sleep() call to delay 1ms, then connect the nRF Serial Terminal to the nRF52 w/ TX code, I'm seeing the terminal output usually alternate between "TX SUCCESS EVENT" and "TX FAILED EVENT" (seeing failed message roughly 50% of the time). I hooked up a logic analyzer to the LED GPIO pins of the nRF52 w/ RX code, and notice the LEDs not toggling at a regular interval (most of the time there's no activity, then some randomly-spaced toggling every couple seconds or so). Let me know if you need more details.

  • Hi Afok, 

    Could you try disabling logging?


    CONFIG_LOG=n

    It could be that the PRX was so slow printing the content of each received packet to the log that it couldn't serve the radio quickly enough.


    I did a test here with k_sleep(K_MSEC(1)); and don't see any failed packets. You should turn off the TX_SUCCESS printout on the PTX as well.

    I assume you are testing with two nRF52840 DKs?

  • Yes, I believe you're correct. Disabling PRX logging (w/ CONFIG_LOG=n) and turning off PTX success messages helped with that. I still saw PTX outputting "TX FAILED EVENT" but based on the logic analyzer, I believe it corresponds to longer time between packets that is needed when PTX retransmits the packet (since the LED sequence continues as normal i.e., the packet was not "lost"). I did also notice disabling PTX logging also helped a little more in allowing me to shorten the delay.

    But at least for my project's purposes, and because I'm just quickly testing the fastest speed I can send packets/events, the "failed events" are fine for now. As an FYI, I was able to bring down the time between packets to as low as roughly 0.5ms, by primarily using k_sleep(K_USEC()) for that PTX delay. I also played around with ESB configuration options that may have helped, such as: 

    • config.retransmit_count = 0; (in PTX, esb_initialize())
    • config.selective_auto_ack = false; (in PTX and PRX, esb_initialize())
    • config.retransmit_delay = 400; (in PTX, esb_initialize())
    • CONFIG_ESB_NEVER_DISABLE_TX=y (in prj.conf)

    The above was indeed from using 2 nRF52840 DKs. Also happy to report that the code ran just as well on 2 nRF5340 DKs (build configuration set to use network core).

    I'll need to investigate some of the more intricate details for ESB in the following days. I'll circle back to this post again if/once I have more questions or have come to a resolution.
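To put the sub-millisecond figures above in context, here is a rough back-of-the-envelope sketch of on-air time per packet. Everything in it is an assumption for illustration and is not taken from the thread: a 1-byte preamble, 5-byte address, 9-bit packet control field and 2-byte CRC, with the bitrate and the radio ramp-up time passed in as parameters. It ignores CPU processing time, so it is a lower bound at best and should be checked against the ESB and radio documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed on-air framing (illustrative values, see lead-in). */
#define PREAMBLE_BYTES 1u
#define ADDRESS_BYTES  5u
#define PCF_BITS       9u /* packet control field */
#define CRC_BYTES      2u

/* On-air time of one packet in nanoseconds at the given bitrate (bits/s). */
uint32_t packet_airtime_ns(uint32_t payload_len, uint32_t bitrate_bps)
{
	assert(bitrate_bps > 0u);
	uint32_t bits = (PREAMBLE_BYTES + ADDRESS_BYTES + payload_len +
			 CRC_BYTES) * 8u + PCF_BITS;
	return (uint32_t)(((uint64_t)bits * 1000000000u) / bitrate_bps);
}

/*
 * Rough lower bound for one acknowledged exchange: the radio ramps up
 * once to transmit the packet and once more around the empty ACK.
 */
uint32_t min_acked_cycle_ns(uint32_t payload_len, uint32_t bitrate_bps,
			    uint32_t ramp_up_ns)
{
	return 2u * ramp_up_ns +
	       packet_airtime_ns(payload_len, bitrate_bps) +
	       packet_airtime_ns(0u, bitrate_bps);
}
```

Under these assumptions a 10-byte payload at 2 Mbit/s spends about 76.5 µs on air, so most of the time between packets comes from ramp-up, acknowledgement and software overhead rather than from the payload itself.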

  • Hi Afok, 


    You can crank it up even more by enabling use_fast_ramp_up in the ESB configuration.

    This selects the radio's fast ramp-up mode, which shortens the latency. The only drawback is that it is not compatible with older devices running normal ESB. So if you have control over both ends of the communication, you can use this mode.

  • I did experiment with "use_fast_ramp_up" as well (most likely would use it for my project if I got it working, as I do indeed have control over both ends of comms). It did lower the time between packets to 0.25-0.3ms, but only if my PTX "tx_payload" had length 6 bytes or less.

    It would stop working if the payload had more bytes (none of the LEDs on PRX would ever toggle). I still need to check if PTX is even sending them to begin with, and also see if PRX has anything else that makes it unable to serve the radio fast enough. Do you have any thoughts about what else I can try looking into?

  • Hi Afok, 
    I haven't tested this myself, but have you made sure you use the same configuration on both sides, PTX and PRX?
    Also, you may want to send the packets with noack. You would need to set selective_auto_ack = true for noack to take effect.

    There is a TX complete callback on the PTX that you can add to check whether the packet was sent.
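The interaction between noack and selective_auto_ack described above can be modelled in a few lines of C. This is a simplified sketch with hypothetical stand-in structs (link_cfg, packet), not the real ESB types; it only captures the rule that a per-packet noack flag is honored when selective auto-ack is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the two settings discussed. */
struct link_cfg {
	bool selective_auto_ack;
};

struct packet {
	bool noack;
};

/* True when the receiver is expected to acknowledge this packet. */
bool ack_expected(const struct link_cfg *cfg, const struct packet *pkt)
{
	assert(cfg != NULL && pkt != NULL);
	/* noack is honored only when selective auto-ack is enabled. */
	return !(cfg->selective_auto_ack && pkt->noack);
}
```

In this model, with selective_auto_ack set to false as in the earlier configuration list, every packet is still acknowledged even if noack is set on the payload.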
