
esb handler crash

I am trying to use ESB and BLE concurrently on an nRF51822 with SDK v11 and the S130 SoftDevice. The BLE part works fine, but the ESB part keeps causing errors at random. I put a breakpoint in app_error_handler and found that the error is related to nrf_esb_event_handler.

I started my project from the example ble_app_uart.ESB_Timeslot. The master device requests data from the slave device over ESB. The request is driven by an app_timer that fires every second.

The code is as below:

void main_timeout_handler(void * p_context)
{
    uint32_t err_code;

    if (m_read_period)
    {
        m_read_period--;
        if (m_read_period == 0)
        {
            /* Period elapsed: reload the counter and queue the request. */
            m_read_period = C_DEFAULT_READ_PERIOD;
            err_code = esb_timeslot_send_str(tx_buf, cmd1_LEN);
            APP_ERROR_CHECK(err_code);
        }
    }
}

Within 1-2 minutes, I get an error that causes a reset. Here is the call stack when the error occurs: (call stack screenshot)

The error appears in the APP_ERROR_CHECK_BOOL after fifo_get_pkt():

void nrf_esb_event_handler(nrf_esb_evt_t const * p_event)
{
    if (p_event->evt_id == NRF_ESB_EVENT_TX_FAILED)
    {
        printf("f");            //Edited (2017/11/23)
        nrf_esb_flush_tx();

        m_tx_attempts += 1;
        m_state        = STATE_RX;
    }

    if (p_event->evt_id == NRF_ESB_EVENT_TX_SUCCESS)
    {
        nrf_esb_payload_t payload;
        uint32_t          payload_len;

        printf("o");            //Edited (2017/11/23)

        /* Successful transmission. Can now remove packet from Tx FIFO. */
        payload_len = sizeof(payload);

        fifo_get_pkt(&m_transmit_fifo, (uint8_t *) &payload, &payload_len);
        APP_ERROR_CHECK_BOOL(payload_len == sizeof(payload));

        m_tx_attempts = 0;
    }

    if (p_event->evt_id & NRF_ESB_EVENT_RX_RECEIVED)
    {
        printf("r");            //Edited (2017/11/23)
        /* Data reception is handled in a lower priority interrupt. */
        /* Call UESB_RX_HANDLE_IRQHandler later. */
        NVIC_SetPendingIRQ(UESB_RX_HANDLE_IRQn);
    }
}

--------Edit (2017/11/23)---------

I added debug messages to see what is going on.

uint32_t esb_timeslot_send_str(uint8_t * p_str, uint32_t length)
{
    static nrf_esb_payload_t tx_payload;
    bool success;

    memset(&tx_payload, 0, sizeof(tx_payload));
    memcpy(tx_payload.data, p_str, length);
    tx_payload.length = length;
    tx_payload.pipe   = 0;
    
    printf("s");

    CRITICAL_REGION_ENTER();
    success = fifo_put_pkt(&m_transmit_fifo, (uint8_t *)&tx_payload, sizeof(tx_payload));
    CRITICAL_REGION_EXIT();
    
    return (success? NRF_SUCCESS: NRF_ERROR_NO_MEM);
}

//Short capture of this handler
void TIMESLOT_BEGIN_IRQHandler(void)
{
    ...
    if (m_tx_attempts < MAX_TX_ATTEMPTS)
    {  
        printf("t");
        fifo_peek_pkt(&m_transmit_fifo, (uint8_t *) &tx_payload, &tx_payload_len);
        APP_ERROR_CHECK_BOOL(tx_payload_len == sizeof(tx_payload));
    }
    ...
}

In the UART output, I can see "sttttto", "sttttttttttttto", and "stttttstttt".

"sttttto": This is the normal case.

"sttttttttttttto": This is pretty strange, since it does not stop when it reaches the maximum of 10 attempts.

"stttttstttt": If a new message is added before the previous one has been sent successfully, the error occurs.

"f" never appears, since the NRF_ESB_EVENT_TX_FAILED branch in nrf_esb_event_handler is never reached.

The problem seems to be that nrf_esb_event_handler is somehow not called, so the transmission is retried endlessly and causes other problems.
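To make the trace patterns above easier to spot in a long UART log, here is a minimal host-side sketch that classifies one trace string. It is not part of the example project: `classify_trace`, `trace_result_t`, and the `max_attempts` parameter are hypothetical names, and the meaning of each character ('s' = packet queued, 't' = TX attempt, 'o' = success, 'f' = failure) is taken from the printf calls shown in this post.

```c
#include <stdint.h>

/* Hypothetical result codes for one UART debug trace. */
typedef enum
{
    TRACE_OK,                /* every queued packet ended with 'o' or 'f' */
    TRACE_TOO_MANY_ATTEMPTS, /* more 't' than max_attempts for one packet */
    TRACE_OVERLAPPED_SEND,   /* a new 's' arrived before 'o' or 'f' */
    TRACE_INCOMPLETE         /* trace ended while a packet was pending */
} trace_result_t;

/* Scans the trace and returns the first anomaly found.
 * Characters other than 's', 't', 'o', 'f' (for example 'r') are ignored. */
trace_result_t classify_trace(const char * trace, uint32_t max_attempts)
{
    int      pending  = 0;  /* 1 while a queued packet has not finished */
    uint32_t attempts = 0;  /* 't' count for the current packet */

    for (const char * p = trace; *p != '\0'; p++)
    {
        switch (*p)
        {
            case 's':
                if (pending)
                {
                    return TRACE_OVERLAPPED_SEND;
                }
                pending  = 1;
                attempts = 0;
                break;

            case 't':
                attempts++;
                if (attempts > max_attempts)
                {
                    return TRACE_TOO_MANY_ATTEMPTS;
                }
                break;

            case 'o':
            case 'f':
                pending = 0;
                break;

            default:
                break;
        }
    }
    return pending ? TRACE_INCOMPLETE : TRACE_OK;
}
```

With `max_attempts` set to 10, a trace such as "stttttstttt" would be reported as an overlapped send, which is the case that leads to the reset.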

//----------Edit end (2017/11/23)--------

Basically, I did not change much from the example. Can anyone help me solve this random reset issue? Any advice?

In addition to the random crash, there is another problem: unknown characters are being sent to the UART. This only happens when ESB is enabled. I am not sure whether this is related to the random crash.

Uart Start!
ƒÖƒÖ
Uart Start!
ƒÖƒÖ
  • Hi

    What is the S130 stack doing when the problem occurs?
    How many links do you have running, and what are the connection intervals?

    Have you made any changes to the ESB event handler, or is it copied directly from the example?

    Best regards

  • What should I do to check the S130 stack?

    I am running a one-to-one connection and the interval is 1 second.

    I copied the example directly from GitHub.

    Yesterday, I tried changing the connection interval to 5 seconds, or making it driven by a button (pressed every 5-10 s).

    It took longer, but the same error still occurred.

    When I added debug messages in the event handler, I found that nrf_esb_event_handler is NOT called, so the transmission is retried endlessly because m_tx_attempts is never incremented.

    The error appears when a new payload is added to the FIFO before the endlessly retried message has been transmitted.

    I will add this finding to the post.

  • Hi

    I was able to reproduce the issue on my end, and it seems to happen pretty consistently as a result of the UART and ESB interrupts happening simultaneously.

    I don't think changing the S130 connection parameters will make a big difference. I tried to increase the advertising interval from 40ms to 400ms, and somehow it just made the issue worse...

    I will have to do some more digging next week to try to get to the bottom of this.

    Best regards

    Thank you for your help.

    The issue seems to be related to the retransmission in nrf_esb.c.

    NRF_ESB_EVENT_TX_FAILED is raised only when all retransmissions have been done and have failed.

    However, when the timeslot ends before all retransmissions are done, the retransmission count is reset, so the retransmissions never reach the limit within one timeslot.

    It seems I can avoid the reset issue by changing the retransmission count to 0 (while still doing re-attempts in esb_timeslot.c).

    Unfortunately, a new problem appears after that. The time one TX attempt takes somehow grows gradually. It ends up taking around 10 seconds per TX attempt, which is not feasible.
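
    The retransmission behaviour described in this reply can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not code from nrf_esb.c: `count_tx_failed_events` and its parameters are hypothetical, every attempt is assumed to go unacknowledged, and the retransmission counter is assumed to restart at the beginning of each timeslot, as described above.

```c
#include <stdint.h>

/* Hypothetical model, not SDK code. Returns how many TX_FAILED events
 * would be raised over a number of timeslots when no attempt is ever
 * acknowledged.
 *   retransmit_count      : retransmissions allowed after the first attempt
 *   attempts_per_timeslot : attempts that fit into one timeslot
 *   timeslots             : number of timeslots to simulate */
uint32_t count_tx_failed_events(uint32_t retransmit_count,
                                uint32_t attempts_per_timeslot,
                                uint32_t timeslots)
{
    uint32_t failed_events = 0;

    for (uint32_t slot = 0; slot < timeslots; slot++)
    {
        /* Assumption from the post: the counter restarts in every timeslot. */
        uint32_t retransmits_left = retransmit_count;

        for (uint32_t attempt = 0; attempt < attempts_per_timeslot; attempt++)
        {
            if (retransmits_left == 0)
            {
                /* The last permitted attempt failed: TX_FAILED is raised. */
                failed_events++;
                break; /* simplification: no further TX in this timeslot */
            }
            retransmits_left--;
        }
        /* If the inner loop ran out of attempts first, the timeslot ended
         * before the limit was reached and no event was raised. */
    }
    return failed_events;
}
```

    In this model, a retransmission count of 3 with room for only 2 attempts per timeslot never produces a TX_FAILED event, while a count of 0 produces one per timeslot, which matches what I observed.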

  • Hi

    Could you try to increase the length of the timeslots on the TX side?
    That seemed to help for me. It has been running for a while now without any resets.

    I simply changed the TS_LEN_US and TX_LEN_EXTENSION_US defines at the top of esb_timeslot.c to 25000.

    This also allowed me to increase the MAX_TX_ATTEMPTS define, so I changed it to 16 to get fewer failed packets.
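
    For reference, the changed defines would look roughly like this. The values are the ones mentioned in this reply; the exact literal form (parentheses, UL suffix) is an assumption and may differ from what is in esb_timeslot.c.

```c
/* Near the top of esb_timeslot.c. Only the values come from this reply;
 * the literal style is assumed. */
#define TS_LEN_US            (25000UL)  /* timeslot length in microseconds */
#define TX_LEN_EXTENSION_US  (25000UL)  /* timeslot extension in microseconds */
#define MAX_TX_ATTEMPTS      16         /* attempts before a packet is dropped */
```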

    Best regards
