This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

BUG: Race condition in SPI driver

Hi,

I noticed that I'd occasionally hit an error when calling into nrf_drv_spi_transfer using the SPIM driver (SPI0_USE_EASY_DMA set). nrf_drv_spi_transfer was returning NRF_ERROR_BUSY:

if (p_cb->transfer_in_progress)
{
    return NRF_ERROR_BUSY;
}

The problem turned out to be a race condition, where transfer_in_progress was set after the SPIM module was started; thus, occasionally it would complete the SPI transaction (and clear transfer_in_progress and call the callback) before the transfer_in_progress flag was set. Thus, transfer_in_progress was essentially getting set when the transaction was complete, thus preventing further transactions from occurring.

I could reproduce this issue very easily by setting the compiler optimization to -O0. It would occasionally happen with -O3.

Attached is a patch which I believe fixes the issue - it sets the flag before initiating the transfer.

If you could, please confirm this is the correct patch - thanks! diff.txt

  • hi,

    Thank you for reporting the bug, we will fix it in next release. Patch looks reasonable.

  • I have also observed this, and tried to apply this patch to nrf_drv_spi.c, in v0.9.1 SDK, and I found that error was still there. Both compiler optimization, -O0 and -O3, produce this error.

    It would be good if someone could verify that I applied the patch correctly. Please observe my code below for nrf_drv_spi_transfer:

    ret_code_t nrf_drv_spi_transfer(nrf_drv_spi_t const * const p_instance,
                                    uint8_t const * p_tx_buffer,
                                    uint8_t         tx_buffer_length,
                                    uint8_t       * p_rx_buffer,
                                    uint8_t         rx_buffer_length)
    {
        spi_control_block_t * p_cb  = &m_cb[p_instance->drv_inst_idx];
        ASSERT(p_cb->state == NRF_DRV_STATE_POWERED_ON);
        ASSERT(p_tx_buffer != NULL || tx_buffer_length == 0);
        ASSERT(p_rx_buffer != NULL || rx_buffer_length == 0);
    
        if (p_cb->transfer_in_progress)
        {
            return NRF_ERROR_BUSY;
        }
    
        // Finish zero-length transfers immediately.
        if (tx_buffer_length == 0 && rx_buffer_length == 0)
        {
            return NRF_SUCCESS;
        }
    
        // Activate Slave Select signal, if it is to be used.
        if (p_cb->ss_pin != NRF_DRV_SPI_PIN_NOT_USED)
        {
            nrf_gpio_pin_clear(p_cb->ss_pin);
        }
    
        CODE_FOR_SPIM
        (
            // EasyDMA requires that transfer buffers are placed in Data RAM region;
            // signal error if they are not.
            if ((p_tx_buffer != NULL && !nrf_drv_is_in_RAM(p_tx_buffer)) ||
                (p_rx_buffer != NULL && !nrf_drv_is_in_RAM(p_rx_buffer)))
            {
                return NRF_ERROR_INVALID_ADDR;
            }
    
            NRF_SPIM_Type * p_spim = p_instance->p_registers;
    
            nrf_spim_tx_buffer_set(p_spim, p_tx_buffer, tx_buffer_length);
            nrf_spim_rx_buffer_set(p_spim, p_rx_buffer, rx_buffer_length);
    
            nrf_spim_event_clear(p_spim, NRF_SPIM_EVENT_ENDTX);
            nrf_spim_event_clear(p_spim, NRF_SPIM_EVENT_ENDRX);
            p_cb->tx_done = false;
            p_cb->rx_done = false;
    				// Patch as per: devzone.nordicsemi.com/.../ 
    //        nrf_spim_event_clear(p_spim, NRF_SPIM_EVENT_STOPPED); // Removed as per patch 
    
    //        nrf_spim_task_trigger(p_spim, NRF_SPIM_TASK_START); // Removed as per patch 
    
            if (p_cb->handler)
            {
                p_cb->transfer_in_progress = true;
            }
       //     else  // Removed as per patch 
    				nrf_spim_event_clear(p_spim, NRF_SPIM_EVENT_STOPPED); // Added as per patch 
    				nrf_spim_task_trigger(p_spim, NRF_SPIM_TASK_START); // Added as per patch 
    				
    				if (!p_cb->handler)// Added as per patch 
            {
                while (!nrf_spim_event_check(p_spim, NRF_SPIM_EVENT_ENDTX) ||
                       !nrf_spim_event_check(p_spim, NRF_SPIM_EVENT_ENDRX)) {}
    
                // Stop the peripheral after transaction is finished.
                nrf_spim_task_trigger(p_spim, NRF_SPIM_TASK_STOP);
                while (!nrf_spim_event_check(p_spim, NRF_SPIM_EVENT_STOPPED)) {}
            }
        )
    ....
    ....
    ....
    ....
    
  • So, as per my comment to the other answer, The patch did not solve this for me, albeit it's possible I did not apply it correctly. (see comment above).

    BUT, I did discover a simpler fix that solved this for me, and that's simply inserting one line of code:

    p_cb->transfer_in_progress = false;
    

    Right before the last line of nrf_drv_spi_transfer as shown below:

    ret_code_t nrf_drv_spi_transfer(nrf_drv_spi_t const * const p_instance,
                                    uint8_t const * p_tx_buffer,
                                    uint8_t         tx_buffer_length,
                                    uint8_t       * p_rx_buffer,
                                    uint8_t         rx_buffer_length)
    
        //SPI stuff...
        //SPI stuff...
        //SPI stuff...
        
        p_cb->transfer_in_progress = false; 
        return NRF_SUCCESS;
    }
    
  • The patch in your first comment looks correct to me. Note that if you're not setting p_cb->transfer_in_progress = true; before triggering the event, the driver and your application will never know when the SPIM transfer is actually in progress, so I don't believe that's a legitimate patch...

  • Your patch looks good for the SPIM code (probably that's what you're using, @mikevoyt). I found that a similar fix is needed for the SPI code as well. This may be the problem you're having, @papyrus.

    Fixed code for SPI:

    CODE_FOR_SPI
    (
        NRF_SPI_Type * p_spi = p_instance->p_registers;
    
        p_cb->p_tx_buffer       = p_tx_buffer;
        p_cb->tx_buffer_length  = tx_buffer_length;
        p_cb->p_rx_buffer       = p_rx_buffer;
        p_cb->rx_buffer_length  = rx_buffer_length;
        p_cb->bytes_transferred = 0;
    
    			//set transfer_in_progress before doing anything to events
    			//fixes a race condition that was causing issues with transfer
    			//completing before flag being set.
    			//see: devzone.nordicsemi.com/.../
        if (p_cb->handler)
        {
            p_cb->transfer_in_progress = true;
        }
    
        nrf_spi_event_clear(p_spi, NRF_SPI_EVENT_READY);
    
        // Start the transfer by writing some byte to the TXD register;
        // if TX buffer is not empty, take the first byte from this buffer,
        // otherwise - use over-run character.
        nrf_spi_txd_set(p_spi,
            (tx_buffer_length > 0 ?  p_tx_buffer[0] : p_cb->orc));
        // TXD register is double buffered, so next byte to be transmitted can
        // be written immediately, if needed, i.e. if TX or RX transfer is to
        // be more that 1 byte long. Again - if there is something more in TX
        // buffer send it, otherwise use over-run character.
        if (tx_buffer_length > 1)
        {
            nrf_spi_txd_set(p_spi, p_tx_buffer[1]);
        }
        else if (rx_buffer_length > 1)
        {
            nrf_spi_txd_set(p_spi, p_cb->orc);
        }
    
        // For blocking mode (user handler not provided) wait here for READY
        // events (indicating that the byte from TXD register was transmitted
        // and a new incoming byte was moved to the RXD register) and continue
        // transaction until all requested bytes are transferred.
        // In non-blocking mode - IRQ service routine will do this stuff.
        if (!p_cb->handler)
        {
            do {
                while (!nrf_spi_event_check(p_spi, NRF_SPI_EVENT_READY)) {}
                nrf_spi_event_clear(p_spi, NRF_SPI_EVENT_READY);
            } while (transfer_byte(p_spi, p_cb));
        }
    )
    
Related