ESB link in PRX mode requires re-initialization after minutes with nRF54L15

We are experiencing ESB link failures on the receiver end of our system, an nRF54L15. After minutes to hours of use, the receiver stops getting data (we know the transmitter is still sending). If we re-initialize the ESB subsystem, everything goes back to normal. Unfortunately, we have no good way to detect that the link is down or to find out why. The driver is still in ESB_MODE_PRX; we simply stop getting any interrupts. We have some timeouts on the configured `event_handler`, but those are too long to be very useful. Any debugging advice would be helpful, but I also have two questions:

1. Is there detailed documentation on how the ESB system/driver works on the nRF54L15? The driver code is difficult to understand on its own: https://github.com/nrfconnect/sdk-nrf/blob/main/subsys/esb/esb.c and the documentation page seems generic and high level (it does not explain much of what the driver does): https://docs.nordicsemi.com/bundle/ncs-2.7.99-cs2/page/nrf/protocols/esb/index.html
2. Is there an ESB status register we could monitor or read to get more information beyond the three interrupt events and the mode that the ESB is in?

Thanks,

Galen

  • 1 - We are already using CONFIG_SOC_NRF_FORCE_CONSTLAT.
    2 - The PRX is always the PRX, but we do send some messages back to the PTX other than ACKs. The problematic PRX receives transmissions at 12 Hz and replies at 1 Hz.

    One of my colleagues looked into migrating to v3.0.0 but had issues getting things to run on the EngA silicon (which is what is on our boards). I went through it and cherry-picked changes that looked like they might have an effect, specifically: https://github.com/nrfconnect/sdk-nrf/commit/be1549932bd0200a7d2da4c61e2c7b8132d514d2 (this didn't really improve anything).

    Lastly, after noticing that the driver was hung in the ACK state, I added the following as a hack to improve things:

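    #if defined(CONFIG_ESB_USE_PRX_ACK_TIMEOUT)
    /* Forward declaration: on_timeout() is defined further down but referenced below */
    static void on_timeout(void);
    #endif
    
    static void on_radio_disabled_rx_ack(void)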
    {
    	esb_fem_for_ack_rx();
    
    	if (IS_ENABLED(CONFIG_ESB_FAST_SWITCHING)) {
    		nrf_radio_shorts_set(NRF_RADIO, radio_shorts_common);
    		nrf_radio_task_trigger(NRF_RADIO, NRF_RADIO_TASK_RXEN);
    	} else {
    		nrf_radio_shorts_set(NRF_RADIO, (radio_shorts_common |
    						 NRF_RADIO_SHORT_DISABLED_TXEN_MASK));
    	}
    
    	update_rf_payload_format(esb_cfg.payload_length);
    
    	nrf_radio_packetptr_set(NRF_RADIO, rx_payload_buffer);
    	on_radio_disabled = on_radio_disabled_rx;
    
    	esb_state = ESB_STATE_PRX;
    
    #if defined(CONFIG_ESB_USE_PRX_ACK_TIMEOUT)
    
    #if IS_EMPTY(CONFIG_ESB_PRX_ACK_TIMEOUT_US)
    #error "No ESB_PRX_ACK_TIMEOUT_US provided but ESB_USE_PRX_ACK_TIMEOUT enabled"
    #endif
    
    	/* Configure timer to produce an ISR after CONFIG_ESB_PRX_ACK_TIMEOUT_US */
    	nrfx_timer_clear(&esb_timer);
    	nrfx_timer_compare(&esb_timer, NRF_TIMER_CC_CHANNEL1,
    		CONFIG_ESB_PRX_ACK_TIMEOUT_US, true);
    
    	/* Configure PPI to start the timer when transmission ends */
    	esb_ppi_for_wait_for_rx_set();
    
    	nrf_timer_event_clear(esb_timer.p_reg, NRF_TIMER_EVENT_COMPARE1);
    
    	on_timer_compare1 = on_timeout;
    
    #endif
    
    }
    
    #if defined(CONFIG_ESB_USE_PRX_ACK_TIMEOUT)
    static void on_timeout(void)
    {
    	LOG_ERR("Timeout function called on_timeout");
    	/* Stop further COMPARE1 interrupts before restarting RX */
    	nrf_timer_int_disable(esb_timer.p_reg,
    			      nrf_timer_compare_int_get(NRF_TIMER_CC_CHANNEL1));
    	esb_ppi_for_wait_for_rx_clear();
    
    	clear_events_restart_rx();
    	LOG_ERR("Timeout function called on_timeout return");
    }
    #endif


    For the most part this seems to work: it adds a timeout to that state which clears the events and restarts RX via `clear_events_restart_rx()`. Any advice on unforeseen implications of this change would be very helpful. I'm also not sure whether there are other states where the system can hang and where I'd need to add something similar, but I'll cross that bridge when I get there. For now this seems like a good stopgap until we can migrate to v3.0.0 and Rev1/2 silicon on our next spin of boards.
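
    Until it is clear which other states can hang, an application-level watchdog could catch a stall no matter where the driver is stuck. Below is a minimal sketch, assuming the expected 12 Hz receive rate and a millisecond uptime source. The `link_watchdog` type and its functions are hypothetical names for illustration and are not part of the ESB API; the 500 ms threshold used in the note after the code is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the ESB driver or its API. */
struct link_watchdog {
	uint32_t last_rx_ms; /* uptime of the most recently received packet */
	uint32_t timeout_ms; /* silence longer than this counts as a stall */
};

static void link_watchdog_init(struct link_watchdog *wd, uint32_t now_ms,
			       uint32_t timeout_ms)
{
	wd->last_rx_ms = now_ms;
	wd->timeout_ms = timeout_ms;
}

/* Call on every received packet, e.g. on ESB_EVENT_RX_RECEIVED. */
static void link_watchdog_feed(struct link_watchdog *wd, uint32_t now_ms)
{
	wd->last_rx_ms = now_ms;
}

/* Call periodically. Unsigned subtraction keeps the comparison
 * correct when the 32-bit millisecond counter wraps around.
 */
static bool link_watchdog_stalled(const struct link_watchdog *wd,
				  uint32_t now_ms)
{
	return (uint32_t)(now_ms - wd->last_rx_ms) > wd->timeout_ms;
}
```

    With packets arriving roughly every 83 ms, a threshold of around 500 ms (about six missed packets) might be a reasonable starting point. The timestamps could come from `k_uptime_get_32()`, and a stall could trigger the same re-initialization we already do by hand. This only detects the hang; it does not explain it.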
