Wi-Fi driver throws out packets if sent too fast. How can we know when the driver queue is full so we can wait and not overload it?

We need very fast Wi-Fi transfer rates at times.  I have been able to tune the code and Wi-Fi driver settings to get up to 30-35 Mbps, which is great.  But if I don't throttle the transmission, then the faster the packets are sent, the higher the percentage that is thrown out by the Wi-Fi driver.

The application level zsock_send(sock, packet, size, 0) function blocks, so I would think that would automatically throttle how fast the data is sent from the application so that it doesn't overflow any downstream buffers such as in the Wi-Fi driver.  But that is not the case.  If the data is sent too quickly, then a large percentage of packets are lost.  I traced it down to the Wi-Fi driver:

nrf_wifi_if_send() -> nrf_wifi_fmac_start_xmit() -> nrf_wifi_fmac_tx() -> tx_process() -> tx_enqueue()

If the queue is full then it just silently discards any additional packets.  This does not block the application, so it just keeps sending without knowing that 10-25% (depending on the sending speed) of the packets are being ignored and never sent.

So if the queue being full doesn't bubble up and cause the zsock_send() call to block, how do I know when I'm sending too fast and the driver queue is overflowing?  I tried increasing the queue size with CONFIG_NRF700X_MAX_TX_PENDING_QLEN, but that doesn't really help since we have sustained high speeds for a few seconds.  If the queue is being filled faster than it is being emptied, no matter how big it is it will overflow.  So I need some way to automatically slow the sending of the packets to avoid the overflow.
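To put rough numbers on why a larger queue only delays the overflow, here is a minimal sketch in plain C. The function name and the example rates are illustrative assumptions, not measurements and not part of the nRF70 driver.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a queue with `free_slots` free entries overflows when
 * packets arrive at `in_pps` and are drained at `out_pps` (packets per
 * second). Returns -1.0 when the queue never fills (in_pps <= out_pps).
 * Hypothetical helper for illustration only. */
double seconds_until_overflow(uint32_t free_slots, double in_pps, double out_pps)
{
    assert(in_pps >= 0.0 && out_pps >= 0.0);

    if (in_pps <= out_pps) {
        return -1.0; /* drain keeps up, no overflow */
    }
    return (double)free_slots / (in_pps - out_pps);
}
```

With an assumed 3000 packets/s coming in and 2500 packets/s going out, 64 free slots last about 0.13 s and 256 free slots last about 0.51 s. Either way the queue overflows well within a burst that lasts a few seconds.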

  • Is there a function to call to get the low level driver queue state?
  • Is there a way to get a count of how many packets the tx_enqueue or tx_process functions are discarding?  Maybe I could poll that as a way to infer when overflow is occurring.  But I'd rather not have any overflow at all, because that means lost packets and I won't know which ones.
  • Is there maybe a callback or something that can send an event when the queue is getting close to full?
  • Is there a way to make zsock_send() truly block if the downstream queue is full?

The zperf speed test has a loop that generates successive packets and calls zsock_send() to send them.  If the loop time is too short for the target data rate it is trying to get then it will insert delays with k_sleep() to slow things down.  But if the target speed is set too high, the Wi-Fi driver starts throwing out packets.  And if the delay is removed completely so the loop runs as fast as the zsock_send() blocking allows, then even more packets get thrown out.
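The kind of fixed-rate pacing described above can be sketched roughly as follows. This is not the zperf source; the struct and function names are made up for illustration, and the clock is passed in as a plain microsecond counter so the logic stays independent of any RTOS API.

```c
#include <assert.h>
#include <stdint.h>

/* Inter-packet interval in microseconds for a target bit rate. */
uint64_t packet_interval_us(uint32_t packet_bytes, uint64_t rate_bps)
{
    assert(rate_bps > 0);
    return ((uint64_t)packet_bytes * 8u * 1000000u) / rate_bps;
}

/* Hypothetical pacer: remembers when the next packet is due. */
struct pacer {
    uint64_t next_due_us;
    uint64_t interval_us;
};

void pacer_init(struct pacer *p, uint64_t now_us, uint64_t interval_us)
{
    p->next_due_us = now_us;
    p->interval_us = interval_us;
}

/* Returns how long the caller should sleep before sending the next
 * packet, and advances the schedule by one interval. */
uint64_t pacer_delay_us(struct pacer *p, uint64_t now_us)
{
    uint64_t delay = 0;

    if (p->next_due_us > now_us) {
        delay = p->next_due_us - now_us;
    } else {
        /* Running late: restart the schedule from now instead of
         * sending a catch-up burst that could flood the TX queue. */
        p->next_due_us = now_us;
    }
    p->next_due_us += p->interval_us;
    return delay;
}
```

In a send loop, the caller would sleep for the returned delay (for example with k_sleep()) before each zsock_send(). For 1500-byte packets at 30 Mbps the interval works out to 400 microseconds. This is open-loop pacing: it caps the send rate but knows nothing about the actual driver queue depth.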

There has to be a way to send as fast as possible, but wait when the Wi-Fi Tx queue gets full so it can catch up.  How can I do this?

  • Thanks Runar.  I look forward to the responses.  It might help if I simplify or clarify my question a bit:

    I understand that the packet data is passed between different queues and different threads.  But it seems like the most direct way for this to work would be to have the application level zsock_send() block whenever the driver Wi-Fi queue is full if that is what the socket is connected to.  Then we'd be guaranteed to not overflow the Wi-Fi queue since zsock_send() wouldn't execute until there was room for the packet to be sent.

  • Hi 

    I will just forward the information I got from the Wi-Fi team first and then I will do another check regarding your post

    Is there a function to call to get the low level driver queue state?

    No public API, but we have this shell command: wifi_util tx_stats 0. They can either use this command or call the code at https://github.com/nrfconnect/sdk-nrf/blob/main/drivers/wifi/nrf700x/src/wifi_util.c#L265 from their program (please note that this isn't meant to be called publicly, so appropriate locking is missing and needs to be taken care of).

    1. This gives insight into the TX data path in the nRF70 driver.

    Is there a way to get a count of how many packets the tx_enqueue or tx_process functions are discarding?  Maybe I could poll that as a way to infer when overflow is occurring.  But I'd rather not have any overflow at all, because that means lost packets and I won't know which ones.

    These are part of the Wi-Fi statistics, available via both the shell command wifi statistics and the net-mgmt API. See the data structure https://docs.nordicsemi.com/bundle/ncs-latest/page/zephyr/connectivity/networking/api/net_stats.html#c.net_stats_wifi and the shell/API code https://github.com/nrfconnect/sdk-zephyr/blob/main/subsys/net/l2/wifi/wifi_shell.c#L881

    1. errors.tx/rx -> Total dropped packets in nRF70 driver
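One way an application might use such a cumulative drop counter is to poll it periodically and adjust its own send rate: back off when new drops appear, and creep back up otherwise. The sketch below is plain C with hypothetical names. It assumes the counter is a 32-bit cumulative value, and it leaves reading the counter (for example via the statistics API mentioned above) to the caller. It reacts to drops after the fact, so it limits losses rather than preventing them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical additive-increase / multiplicative-decrease controller. */
struct rate_ctl {
    uint64_t rate_bps;       /* current target send rate */
    uint64_t min_bps;        /* never go below this */
    uint64_t max_bps;        /* never go above this */
    uint64_t step_bps;       /* additive increase per clean interval */
    uint32_t last_tx_errors; /* counter value seen at the last poll */
};

void rate_ctl_init(struct rate_ctl *c, uint64_t start_bps, uint64_t min_bps,
                   uint64_t max_bps, uint64_t step_bps, uint32_t tx_errors_now)
{
    assert(min_bps <= start_bps && start_bps <= max_bps);
    c->rate_bps = start_bps;
    c->min_bps = min_bps;
    c->max_bps = max_bps;
    c->step_bps = step_bps;
    c->last_tx_errors = tx_errors_now;
}

/* Call once per polling interval with the latest cumulative TX error
 * count. Returns the new target rate. */
uint64_t rate_ctl_update(struct rate_ctl *c, uint32_t tx_errors_now)
{
    /* Unsigned subtraction stays correct across counter wrap-around. */
    uint32_t new_drops = tx_errors_now - c->last_tx_errors;

    c->last_tx_errors = tx_errors_now;

    if (new_drops > 0) {
        c->rate_bps /= 2; /* drops seen: halve the rate */
        if (c->rate_bps < c->min_bps) {
            c->rate_bps = c->min_bps;
        }
    } else {
        c->rate_bps += c->step_bps; /* clean interval: probe upward */
        if (c->rate_bps > c->max_bps) {
            c->rate_bps = c->max_bps;
        }
    }
    return c->rate_bps;
}
```

The returned rate could then feed whatever inter-packet delay the send loop already uses. The halving factor, the step size, and the polling interval are tuning choices, not values taken from the driver.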

    Is there maybe a callback or something that can send an event when the queue is getting close to full?

    Zephyr doesn't have a way for drivers to indicate overflow on the data queues to the networking stack (and thereby to applications), nor the ability to stop/start the queues (in Linux we have https://docs.kernel.org/networking/multiqueue.html#intro-kernel-support-for-multiqueue-devices, and once a certain watermark is reached the driver stops/starts the queues).
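For readers unfamiliar with the watermark scheme mentioned for Linux, here is a minimal sketch of the idea in plain C. It is not an existing Zephyr or nRF70 driver API; every name below is hypothetical, and it only models the stop/start decision (similar in spirit to Linux's netif_stop_queue() and netif_wake_queue()), not a real packet queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TX gate with high/low watermark hysteresis. */
struct tx_gate {
    uint32_t depth;   /* packets currently queued */
    uint32_t low_wm;  /* resume producers at or below this depth */
    uint32_t high_wm; /* stop producers at or above this depth */
    bool stopped;     /* true while producers must hold off */
};

void tx_gate_init(struct tx_gate *g, uint32_t low_wm, uint32_t high_wm)
{
    assert(low_wm < high_wm);
    g->depth = 0;
    g->low_wm = low_wm;
    g->high_wm = high_wm;
    g->stopped = false;
}

/* Producer side: returns true if the packet was accepted. A false
 * return means the caller should wait and retry instead of dropping. */
bool tx_gate_enqueue(struct tx_gate *g)
{
    if (g->stopped) {
        return false;
    }
    g->depth++;
    if (g->depth >= g->high_wm) {
        g->stopped = true; /* high watermark reached: stop the queue */
    }
    return true;
}

/* Consumer side: call when one queued packet has been transmitted. */
void tx_gate_complete(struct tx_gate *g)
{
    assert(g->depth > 0);
    g->depth--;
    if (g->stopped && g->depth <= g->low_wm) {
        g->stopped = false; /* drained to low watermark: wake the queue */
    }
}
```

The gap between the two watermarks avoids rapid stop/start toggling when the depth hovers around a single threshold. A real implementation would also need locking between the sending thread and the TX-done context, plus a way to block and wake the sender, which is exactly the piece the stack does not currently provide.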

    Is there a way to make zsock_send() truly block if the downstream queue is full?

    There is unfortunately no per-packet feedback, so my understanding is that it is not possible. However, I will do another check with your question just to verify.

    The nRF70 driver does have a pending_q, i.e. a queue used when the nRF70 is busy and doesn't accept any further frames, but it has a build-time limit (for memory control), so if the queue is full, it will start dropping packets.

    I will update again as soon as I have something

    Regards

    Runar

  • Update: It is really just, in short, what I wrote above. We don't have flow control in Zephyr to make sure the queues are not overwhelmed and packets are not dropped.

    Regards

    Runar
