Zephyr SPI on nRF52 - spi_transceive_dt() delay between bytes.

Hi, I am wondering if anyone with SPI expertise can answer this question. I'm using an nRF52840 with spi_transceive_dt().

In my devicetree I've configured SPIM on SPI3 to talk to a sensor at 10MHz.

I am simply doing a register read from a sensor device, so I am calling spi_transceive_dt() with 1 Tx byte and 2 Rx bytes (the sensor requires 1 dummy byte for each read). It responds correctly with 0x00, 0x24 and I get the data fine.

I am wondering why there is a big ~14 µs gap between bytes. From the red line I have highlighted, you can see SCK is just flat during this time, waiting until it starts to clock again; then we see the next byte from the slave (0x00), followed by the same gap until the next byte (0x24).

Is this something to do with the MCU, or some inefficiency somewhere in the stack from the nRFX drivers up to the Zephyr SPI API?

  • Hi John, 
    You are right. When you have multiple arrays/chunks in the buffer set, the SPI driver will perform multiple SPI transactions. After each NRFX_SPIM_EVENT_DONE event it will move on to the next array/chunk. 

    I don't see this as a limitation. If you want a continuous transaction, why not just use one single array of 3 bytes and ignore the first dummy byte? But I assume this is how the bmi270 driver works; you may consider rewriting the driver if you prefer it the other way. I am not so familiar with the BMI270, so you would need to double-check whether the latency between the first dummy read byte and the actual read is needed. 
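
    The single-array suggestion can be sketched standalone like this. The transfer itself is stubbed with a hypothetical fake_transceive() (returning the bytes seen in the scope trace) so the buffer layout can be shown without hardware; on target it would be one spi_transceive_dt() call with a single 3-byte spi_buf on both the tx and rx sets:

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical stand-in for the real SPI transfer. Byte 0 arrives
     * while the address clocks out, byte 1 is the sensor's dummy byte,
     * byte 2 is the register contents. */
    static void fake_transceive(const uint8_t *tx, uint8_t *rx, size_t len)
    {
        (void)tx;
        (void)len;
        rx[0] = 0xFF;
        rx[1] = 0x00;
        rx[2] = 0x24;
    }

    /* One contiguous 3-byte transfer: no chunk boundary, so no gap. */
    static uint8_t read_reg_single_xfer(uint8_t regaddr)
    {
        uint8_t tx[3] = { (uint8_t)(regaddr | 0x80), 0x00, 0x00 };
        uint8_t rx[3] = { 0 };

        fake_transceive(tx, rx, sizeof(tx));

        /* Ignore rx[0] (address phase) and rx[1] (dummy); data is last. */
        return rx[2];
    }

    int main(void)
    {
        printf("reg = 0x%02X\n", read_reg_single_xfer(0x00));
        return 0;
    }
    ```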

  • Hi Hung,

    We are working on a high speed data acquisition project so we are aiming to optimise timings as much as possible.

    We can definitely optimize the way we use spi_transceive_dt in Zephyr.

    We were more just wondering: what causes this delay when the SPIM driver moves to the next array/chunk? Is it simply overhead in setting up the DMA transaction, or is it something that could be written better for our application?

  • Hi again, 
    I would assume that in your data acquisition application you will have more than 2 bytes in each transaction, correct? 
    My suggestion is to use as large a chunk size as possible so that you don't need multiple transactions. 


    You can take a look at the spi_nrfx_spim.c file to see how it's implemented. Basically, when a buffer finishes transferring (the NRFX_SPIM_EVENT_DONE event), the next chunk is prepared and then starts transferring. But I'm not sure what else can be improved to make it faster, other than avoiding multiple transactions.  
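
    The cost of that per-chunk re-arm can be modelled with a toy calculation (the numbers are illustrative, taken roughly from the scope trace; this is not the actual driver code):

    ```c
    #include <stdio.h>

    /* Illustrative figures only: ~14 us gap per chunk from the trace,
     * ~1 us per byte at 10 MHz (0.8 us rounded up). */
    #define REARM_OVERHEAD_US 14u
    #define BYTE_TIME_US       1u

    /* Each chunk pays the re-arm overhead once, then its bytes clock out
     * back to back, so N chunks pay the overhead N times. */
    static unsigned transfer_time_us(unsigned nchunks, unsigned bytes_per_chunk)
    {
        return nchunks * (REARM_OVERHEAD_US + bytes_per_chunk * BYTE_TIME_US);
    }

    int main(void)
    {
        printf("3 x 1-byte chunks: %u us\n", transfer_time_us(3, 1)); /* 45 */
        printf("1 x 3-byte chunk : %u us\n", transfer_time_us(1, 3)); /* 17 */
        return 0;
    }
    ```

    Under these assumptions, collapsing three one-byte chunks into one three-byte chunk pays the setup overhead once instead of three times, which matches the flat-SCK gaps in the trace.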

  • I have a finding and a solution.

    When the tx buffer and rx buffer are not the same size, it still works, but we see the clock stopping.

    no clock stop

    static int bmp388_reg_read_spi(const union bmp388_bus *bus,
                       uint8_t regaddr, uint8_t *buf, int size)
    {
        int ret;
        /* tx and rx buffers deliberately the same size: 1 address byte +
         * 1 dummy byte + payload. Equal sizes keep the clock running. */
        uint8_t addr[size + 2];
        uint8_t rxdata[size + 2];
    
        memset(addr, 0x00, sizeof(addr));
    
        const struct spi_buf tx_buf = {
            .buf = addr,
            .len = sizeof(addr),
        };
        const struct spi_buf_set tx = {
            .buffers = &tx_buf,
            .count = 1
        };
    
        struct spi_buf rx_buf;
        const struct spi_buf_set rx = {
            .buffers = &rx_buf,
            .count = 1,
        };
    
        addr[0] = regaddr | 0x80; /* set the read bit */
        rx_buf.buf = rxdata;
        rx_buf.len = sizeof(rxdata);
    
        ret = spi_transceive_dt(&bus->spi, &tx, &rx);
        if (ret) {
            LOG_DBG("spi_transceive FAIL %d\n", ret);
            return ret;
        }
    
        /* Skip the echoed address byte and the dummy byte. */
        memcpy(buf, &rxdata[2], size);
    
        return 0;
    }
    

    clock stop

    static int bmp388_reg_read_spi(const union bmp388_bus *bus,
                       uint8_t regaddr, uint8_t *buf, int size)
    {
        int ret;
        uint8_t addr[3] = { 0 };
        uint8_t rxdata[size + 2];
    
        /* ... rest identical to the version above ... */

    You can see '0' pushed on MOSI when tx and rx are the same size,
    but a gap and 0xFF when they are not. So something happens under the hood when the core needs to handle the larger receive buffer.
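
    The fix above boils down to padding the tx buffer with zeros up to the rx length so both spi_buf entries match. A standalone sketch of just that padding (the helper name build_tx is illustrative):

    ```c
    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* Build a tx buffer the same length as the rx buffer: address with
     * the read bit set, then zero padding. With equal lengths the SPIM
     * driver runs one continuous transfer and clocks out 0x00 on MOSI
     * instead of pausing (and the line reading back as 0xFF). */
    static void build_tx(uint8_t *tx, size_t rx_len, uint8_t regaddr)
    {
        memset(tx, 0x00, rx_len);
        tx[0] = (uint8_t)(regaddr | 0x80);
    }

    int main(void)
    {
        uint8_t tx[5];

        build_tx(tx, sizeof(tx), 0x04);
        printf("tx[0] = 0x%02X, tx[1] = 0x%02X\n", tx[0], tx[1]);
        return 0;
    }
    ```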

  • Hi Hung,

    Sure, like I said, we can definitely optimize our sensor driver to join SPI transactions together.

    We will also look at the nRFX SPIM driver and see if there is potential for us to write a more optimised way to prepare the next transaction after NRFX_SPIM_EVENT_DONE, if that is at all possible.
