ESB library RX acknowledgements


Hi,

I am working with the ESB (Enhanced ShockBurst) library using the nRF Connect SDK 2.6.1.
The library seems easy to use and has the main things that I need.

However, I want the receiver to acknowledge each transmission with a specific ACK payload. The library should support this, but I am having difficulty getting it to work.

The very simple sample code (esb_ptx and esb_prx) does not address this, since it only acknowledges the first transmission and nothing after that.
The question is how to reliably acknowledge every transmission, and not just the first one or some subset.

My basic use case on the RX side is to flush the ACK TX buffer (esb_flush_tx()) and rewrite it (esb_write_payload(&tx_payload)) before every expected RX (and subsequent ACK TX).
Flushing and rewriting the ACK TX buffer ensures that I always have current data waiting as a response, whether or not the previous transmission succeeded.
But when I do that, the acknowledgement clearly does not happen reliably. Something about the flush prevents it from working.

This old post gives some context on why flushing and rewriting the ACK TX buffer is needed, and the answer there also suggests it.
https://devzone.nordicsemi.com/f/nordic-q-a/22009/modifying-queued-esb-ack-payload

I found old posts about the same or a similar problem. They relate to the old SDK, but the problem is the same: flushing prevents the ACK from being sent.
https://devzone.nordicsemi.com/f/nordic-q-a/74008/problem-with-esb-ack-and-tx-flush/306402
https://devzone.nordicsemi.com/f/nordic-q-a/65016/my-esb-ptx-receives-delayed-duplicate-ack-packets-from-my-prx
https://devzone.nordicsemi.com/f/nordic-q-a/27042/esb-on-nrf52-sdk11-not-sending-ack-payloads

There are many others. The last link above is particularly relevant: it points to a bug in the old SDK, which is acknowledged there, and a fix is even offered.
I tried to compare the old SDK code with the current one, but that is difficult. In any case, I am not convinced that the problem has been fixed.

My question is: Is it possible that this problem is not yet fixed? Is there still some problem with RX acknowledgements related to flushing the TX buffer?
My workaround is the same one discussed in the posts above: initializing the whole ESB library (esb_init(&config)) before every RX, but that seems ridiculous.

  • Hi Petri

    ESB is not designed to allow you to send an ACK payload as an immediate response to a packet from the PTX. Essentially you need to prepare the ACK payload before the packet from the PTX is received, otherwise you will just send an empty ACK without any data. 

    The simplest workaround to this is to send two packets back to back from the PTX. The first packet is used to signal to the PRX that the ACK payload should be prepared, then the PRX can upload the ACK payload, and it will be sent when the second packet from the PTX is received. 

    Another approach is to do something similar to BLE where you keep sending packets from the PTX as long as you receive ACK payloads from the PRX. Then you can ensure that the buffers on the PRX side are not filling up with old packets. 
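
    A rough host-side sketch of that polling idea, in plain C with no radio involved. The names toy_prx, toy_send_packet and toy_drain are made up for illustration and are not part of the ESB library, and the model assumes that no packet is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PRX side: a count of queued ACK payloads. */
struct toy_prx {
    size_t queued_acks;
};

/* Models one PTX packet. Returns true if the ACK carried a payload,
 * false if the PRX had nothing queued and sent an empty ACK. */
static bool toy_send_packet(struct toy_prx *prx)
{
    if (prx->queued_acks == 0) {
        return false;
    }
    prx->queued_acks--;
    return true;
}

/* Keeps sending packets while ACK payloads come back, up to max_packets.
 * Returns the number of packets sent. */
static size_t toy_drain(struct toy_prx *prx, size_t max_packets)
{
    size_t sent = 0;

    assert(max_packets > 0);
    while (sent < max_packets) {
        sent++;
        if (!toy_send_packet(prx)) {
            break; /* empty ACK: the PRX queue is drained */
        }
    }
    return sent;
}
```

    With three payloads queued, toy_drain() sends four packets: three return a payload and the fourth returns an empty ACK, which ends the loop.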

    Best regards
    Torbjørn

  • Hi,

    I think I am already doing what you describe as the correct approach. I understand that the ACK payload needs to be written to the TX buffer before the actual packet from the PTX is received.

    I have a time-slotted system where PTX sends a packet once per second. PRX will respond with an ACK+payload. PRX will prepare the next ACK payload during the idle time before the next transmission is received.

    This works if packets are never missed (RX and TX nodes are close to each other with no obstacles):
    T=0 : PTX sends a packet and PRX receives it responding with an ACK payload
    T<1 : PRX prepares the next ACK packet (esb_write_payload())
    T=1 : PTX sends a packet and PRX receives it responding with an ACK payload
    T<2 : PRX prepares the next ACK packet (esb_write_payload())
    etc.

    The issue is that a PTX packet can be missed, in which case the ACK is never sent. The PRX then needs to prepare the next ACK, but the previous ACK payload is still in the TX buffer, so you want to flush it before writing the new payload. The natural approach is to always flush before writing a new payload. You would assume that flushing an empty TX buffer "just in case" should work, and that is clearly the intent when reading the code.
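
    To illustrate the stale-payload problem, here is a small host-side model in plain C. It is only a sketch: the toy_* names are hypothetical, nothing here calls the ESB library, and it assumes a queued ACK payload leaves the FIFO as soon as it is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FIFO_SIZE 4
#define TOY_NO_ACK (-1)

/* Hypothetical ACK payload FIFO; payloads are plain ints for simplicity. */
struct toy_fifo {
    int slot[TOY_FIFO_SIZE];
    size_t front;
    size_t count;
};

static void toy_flush(struct toy_fifo *f)
{
    f->front = 0;
    f->count = 0;
}

/* Appends a payload; returns false if the FIFO is full. */
static bool toy_write(struct toy_fifo *f, int payload)
{
    if (f->count == TOY_FIFO_SIZE) {
        return false;
    }
    f->slot[(f->front + f->count) % TOY_FIFO_SIZE] = payload;
    f->count++;
    return true;
}

/* Removes and returns the oldest payload. */
static int toy_pop(struct toy_fifo *f)
{
    assert(f->count > 0);
    int payload = f->slot[f->front];
    f->front = (f->front + 1) % TOY_FIFO_SIZE;
    f->count--;
    return payload;
}

/* Runs n time slots. In slot t the PRX queues payload t, and received[t]
 * says whether the PTX packet arrived. acked[t] receives the payload that
 * was sent back in slot t, or TOY_NO_ACK if nothing was sent. */
static void toy_run(const bool *received, int *acked, size_t n, bool flush_first)
{
    struct toy_fifo fifo = { .front = 0, .count = 0 };

    for (size_t t = 0; t < n; t++) {
        if (flush_first) {
            toy_flush(&fifo);
        }
        (void)toy_write(&fifo, (int)t);
        acked[t] = (received[t] && fifo.count > 0) ? toy_pop(&fifo) : TOY_NO_ACK;
    }
}
```

    With packets received in slots 0, 2 and 3 but missed in slot 1, the run without a flush answers slot 2 with the payload prepared for slot 1, and slot 3 with the one prepared for slot 2. With a flush before each write, every received packet gets the payload prepared for its own slot.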

    The pseudo code then becomes:
    T=0 : PTX sends a packet and PRX receives it (or not) responding with an ACK payload
    T<1 : PRX prepares the next ACK packet (esb_flush_tx() + esb_write_payload())
    T=1 : PTX sends a packet and PRX receives it (or not) responding with an ACK payload
    T<2 : PRX prepares the next ACK packet (esb_flush_tx() + esb_write_payload())
    etc.

    But this does not work! Quite often the ACK is not sent even though RX is successful. I think this is the problem discussed in the posts I linked.

    The work-around which works is:
    T=0 : PTX sends a packet and PRX receives it (or not) responding with an ACK payload
    T<1 : PRX prepares the next ACK packet (esb_init() + esb_write_payload())
    T=1 : PTX sends a packet and PRX receives it (or not) responding with an ACK payload
    T<2 : PRX prepares the next ACK packet (esb_init() + esb_write_payload())

    So when I re-initialize the PRX every time with esb_init(), the TX buffer is flushed successfully. But when I use esb_flush_tx() instead, it does not work reliably.
    The workaround does the job for now, but having to call esb_init() between every RX is not good, and it starts wasting power once I optimize the TX/RX sequence for minimum wake time.

    Best regards,
    Petri

  • Hi Petri

    Thanks for the detailed description. Based on it, I agree this sounds like an issue. I am not sure how relevant it is to the older cases though, since the changes introduced in the 'buffer improvement' repo that I linked to in the older case should be included in the nRF Connect SDK version of the ESB library. 

    Approximately how often will the ACK payload get dropped when you receive a packet on the PRX side? Does it happen consistently, or only in rare cases? 

    I am currently travelling, but will set aside some time next week to try to reproduce the issue. If you have any stripped-down code that can be used to reproduce it, feel free to share. 

    Are you testing this on standard nRF5340DK's in both ends, or are you using custom hardware? 

    Best regards
    Torbjørn

  • Thanks for looking into this!

    I have not found a clear pattern for when the ACK fails, but it fails most of the time.
    Typically, the first few ACKs go through. Then there may be a period where every other ACK succeeds. I have at times seen long periods where every other ACK works. It even seemed so systematic that I thought I could use it, but at some point the ACKs stopped altogether. I have not been able to reproduce that systematic long "every other ACK" case in simple stripped-down code. It was part of a sequence of desperate trials with various ESB function call orders.
    The normal pattern is that it works for a brief period after boot, and then the ACK starts failing 90-100% of the time after some 10-20 transmissions.

    I am developing and testing with unmodified nRF5340DKs, and I have also verified the same problem with the nRF52840DK. So it does not seem to be specific to the 5340, which was my first assumption.

    I will include in the next reply a very simple modification of the esb_ptx and esb_prx samples which I have used to reproduce the problem. It is *really* simple so I hope I have not done something incredibly stupid...

    At least in my case this example will produce an ACK payload from PRX for a while (10 seconds) while perhaps missing a few, and then it will start missing all the ACK payloads. Every TX will still work and the PRX node gets every packet, but it will simply not send an ACK payload. I am only including the modification to the actual code to make the RX loop, but I have also added config options to enable RTT logging. I am assuming that RTT logging cannot have anything to do with this problem;-)

    Best regards,

    Petri

  • Reproducing missing ESB ACK payload
    - Take the samples esb_prx and esb_ptx
    - Take esb_ptx as is, and only modify the transmission period from 100ms to 1000ms (=k_sleep(K_MSEC(1000)) in the while-loop in the main-function).
    - Take esb_prx and replace the forever-listen with a looping listen which starts and stops the RX so that each packet is received in its own RX session.
    - The end of the main()-function in my case is:

        LOG_INF("Initialization complete");

        while (1) {
            err = esb_write_payload(&tx_payload);
            if (err) {
                LOG_ERR("Write payload, err %d", err);
                return 0;
            }

            err = esb_start_rx();
            if (err) {
                LOG_ERR("RX setup failed, err %d", err);
                return 0;
            }

            while (!event_happened) {
                k_sleep(K_MSEC(100));
            }

            err = esb_stop_rx();
            if (err) {
                LOG_ERR("RX stop failed, err %d", err);
                return 0;
            }
            event_happened = false;

            err = esb_flush_tx(); // This will not work!
            //err = esb_initialize(); // This WILL work!
            if (err) {
                LOG_ERR("Flush or init failed, err %d", err);
                return 0;
            }
            k_sleep(K_MSEC(100));
        }
        /* return to idle thread */
        return 0;
    - Above, the flag event_happened is added to control when RX has happened. In the main body of main.c, add
    static bool event_happened = false;
    - And then in the event handler (in the beginning before switch) add
    event_happened = true;

    That's all. If you now follow the logs of the PTX node, you will notice that ACK payloads only arrive for a short while after boot, and then they stop completely.
    But if you comment out the esb_flush_tx() line and replace it with esb_initialize(), the ACK payloads keep coming and everything is fine.

  • Hi Petri

    Thanks for the detailed description. I will set aside some time to reproduce this, and get back to you with an update in a day or two. 

    Best regards
    Torbjørn

  • Hi Petri

    I have now had some time to test it, and was easily able to reproduce it. Essentially the esb_flush_tx() function wasn't updated properly when the 'buffer improvement' changes were introduced, and this rendered the flush command more or less unusable in PRX mode. 

    Could you try to replace the implementation of this function in esb.c with the code below? 

    This seems to fix the issue on my side.

    int esb_flush_tx(void)
    {
    	if (!esb_initialized) {
    		return -EACCES;
    	}
    
    	unsigned int key = irq_lock();
    
    	tx_fifo.count = 0;
    	tx_fifo.back = 0;
    	tx_fifo.front = 0;
    
    	if (esb_state == ESB_STATE_PRX) {
    		for (size_t i = 0; i < CONFIG_ESB_TX_FIFO_SIZE; i++) {
    			ack_pl_wrap[i].in_use = false;
    			ack_pl_wrap[i].p_next = 0;
    		}
    
    		memset(rx_pipe_info, 0, sizeof(rx_pipe_info));
    	}
    
    	irq_unlock(key);
    
    	return 0;
    }
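
    For anyone wondering why resetting only the FIFO counters is not enough, below is a toy host-side model in plain C. It is a sketch under assumptions, not the library code: the toy_* names are hypothetical, it assumes the earlier flush reset only the counters, it assumes a write with no free slot is dropped silently, and it ignores the normal path that releases a slot once an ACK has been delivered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4 /* stands in for CONFIG_ESB_TX_FIFO_SIZE */

/* Hypothetical, heavily simplified view of the PRX ACK payload state. */
struct toy_state {
    size_t fifo_count;      /* plays the role of tx_fifo.count */
    bool in_use[TOY_SLOTS]; /* plays the role of ack_pl_wrap[i].in_use */
};

/* Assumed write path: claim the first slot that is not in use.
 * Returns false (payload dropped) if every slot is still marked in use. */
static bool toy_write(struct toy_state *s)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (!s->in_use[i]) {
            s->in_use[i] = true;
            s->fifo_count++;
            return true;
        }
    }
    return false;
}

/* Assumed earlier behaviour: only the counter is reset. */
static void toy_flush_counters_only(struct toy_state *s)
{
    s->fifo_count = 0;
}

/* Modelled on the proposed fix: the slot flags are released as well. */
static void toy_flush_full(struct toy_state *s)
{
    s->fifo_count = 0;
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        s->in_use[i] = false;
    }
}

/* Runs write-then-flush cycles and returns how many writes were queued. */
static size_t toy_cycles(size_t cycles, bool full_flush)
{
    struct toy_state s = { 0 };
    size_t queued = 0;

    for (size_t c = 0; c < cycles; c++) {
        if (toy_write(&s)) {
            queued++;
        }
        if (full_flush) {
            toy_flush_full(&s);
        } else {
            toy_flush_counters_only(&s);
        }
    }
    assert(queued <= cycles);
    return queued;
}
```

    In this model, ten write-and-flush cycles queue only TOY_SLOTS payloads with the counters-only flush, after which every write is dropped. With the full flush, all ten are queued.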

    Best regards
    Torbjørn

  • Hi Petri

    Did you have time to test my proposed fix? 

    If this solves your issue I would like to try to push it into the next SDK release. 

    Best regards
    Torbjørn
