
Clarification on SPI double buffer and EVENTS_READY

First: nRF52840 dongle or nRF52840-Preview_DK (it doesn't matter which; I get the same issue), SDK17.0.2, Segger, Windows 10.  I am trying to dump a FIFO buffer from a sensor as fast as possible.  I thought I had it working, but then I realized that my rx_buffer in the nRF52840 was all 0s.  Then I saw in this post that I might need to use the EVENTS_READY flag.  I have been trying to get this to work from the datasheet so that I don't need to use the interrupt system (too slow), but the code I have doesn't quite do it.  I get 24 clocks, I see the correct data on the MOSI pin, and I'm pretty sure on the MISO pin as well, but I never see CS go high.  I guess this tells me that the last while (!EVENTS_READY) is hanging, but if I remove it I don't get the correct result either.

// First time through, fill the double buffer
NRF_SPI0->TXD = FIFO_DATA_OUT_L | 0x80; // TXD = FIFO, RXD-1 = 0xFF
NRF_SPI0->TXD = 0xFF;  // TXD = 0xFF, TXD+1 = FIFO, RXD = 0xFF, RXD-1 = Byte1

// Read garbage (0xFF) from address TX when Event is ready
while (!NRF_SPI0->EVENTS_READY);
NRF_SPI0->EVENTS_READY = 0;

(void)NRF_SPI0->RXD; // After this line: TXD = 0xFF, TXD+1 = FIFO, RXD = Byte1, RXD-1 = ?

// Now we can push the last TX into the double buffer
NRF_SPI0->TXD = 0xFF;

// Now read data when Event is ready
while (!NRF_SPI0->EVENTS_READY);
NRF_SPI0->EVENTS_READY = 0;

ble_buff[ble_idx++] = (uint8_t)NRF_SPI0->RXD;

// Now read data when Event is ready
// while (!NRF_SPI0->EVENTS_READY);
// NRF_SPI0->EVENTS_READY = 0;

ble_buff[ble_idx++] = (uint8_t)NRF_SPI0->RXD;

nrf_delay_us(3);
NRF_GPIO->OUTSET = 1 << SPIM_CS;

How can I get three bytes from SPI as fast as possible without using the interrupts?  (obviously(?) this will go into a loop to clear out the sensor FIFO)

  • It seems you are trying to use the legacy SPI implementation, where you read and write individual bytes. It is better to use the SPIM module: you prepare two buffers (rx and tx), trigger the start task, and the SPIM module handles the rest until all data is sent/received.

    If you want to read and write the low-level registers of the SPI peripheral, you will need to follow the description in the datasheet carefully. You can also find an old implementation of spi_master_tx_rx() in, for instance, nRF5 SDKv5.2 (\spi_master) from 2015 that you may find useful:
    http://developer.nordicsemi.com/nRF5_SDK/nRF51_SDK_v5.x.x/ 

  • Okay, in an effort to further troubleshoot, I have been through many permutations of possibilities. I am beginning to think either there is something wrong with the nRF52840 or there is a typo somewhere in the datasheet. I have the following code:

    while (ble_idx < (fifo_buffer_size * 2))
    {
        NRF_SPI0->TXD = FIFO_DATA_OUT_L | 0x80;
        while (!NRF_SPI0->EVENTS_READY);
        NRF_SPI0->EVENTS_READY = 0;
        tmp = NRF_SPI0->RXD;
        printf("Finished first read, tmp is %" PRIu8 ", ER is %" PRIu32 "\r\n", tmp, NRF_SPI0->EVENTS_READY);
    
        NRF_SPI0->TXD = 0xFF;
        //while (!NRF_SPI0->EVENTS_READY);
        //NRF_SPI0->EVENTS_READY = 0;
        tmp0 = (uint8_t)NRF_SPI0->RXD;
        printf("Finished second read, tmp0 is %" PRIu8 "\r\n", tmp0);
    
        NRF_SPI0->TXD = 0xFF;
        //while (!NRF_SPI0->EVENTS_READY);
        //NRF_SPI0->EVENTS_READY = 0;
        tmp1 = (uint8_t)NRF_SPI0->RXD;
        printf("Finished third read, tmp1 is %" PRIu8 "\r\n", tmp1);
    
        nrf_delay_us(3);
        NRF_GPIO->OUTSET = 1 << SPIM_CS;
        __NOP();
        __NOP();
        __NOP();
        __NOP();
        NRF_GPIO->OUTCLR = 1 << SPIM_CS;
    }

    When I put in the last two while (!NRF_SPI0->EVENTS_READY) lines, I only get the first print statement (which tells me that tmp = 0 and EVENTS_READY = 0) and I only see two bytes of clocks on the oscilloscope. This pretty clearly indicates that the while loop hangs forever, which the datasheet indicates is not correct behavior.

    When I comment out the last two while (!NRF_SPI0->EVENTS_READY) lines (as in the code sample above) I get the three transmissions on the oscilloscope, but tmp, tmp0, and tmp1 are all 0 even though the MISO line has non-zero data.  I also get a fourth transmit when the outer loop starts over, but then it hangs indefinitely - again indicating that the while (!NRF_SPI0->EVENTS_READY) line is hanging in contradiction to the datasheet.

    Also, for completeness I ran the equivalent code on several other Cortex M4F-based MCUs just to check, and sure enough, they all read it correctly. I know that Nordic products work differently than some other MCUs, but it seems odd.

    Edit: I get the same behavior on two different nRF52840 Dongles as well, which seems to indicate that it is a problem with the nRF52840.  Also, in looking over the old spi_master example from 2015, I fail to see the effective difference between that code and what I have here.  If anyone could point it out (since I guess that would be equivalent to telling me how my code contradicts the datasheet) it would be extremely helpful.

    Edit edit: I have a thought that somehow it could be the MISO pin.  I can't find anywhere that says I can't use 0.24 for MISO - but if not, then that might explain why I always see 0 instead of what I should see (at least I should be seeing 0xFF for the first read).  Then it might follow that since the RXD is never being read, the EVENTS_READY register might not ever update?

    Here is my pin config:

    #define SPI_CS                         NRF_GPIO_PIN_MAP(0, 17) // Connected to P0.17
    #define SPI_SCK                        NRF_GPIO_PIN_MAP(0, 20) // Connected to P0.20
    #define SPI_MOSI                       NRF_GPIO_PIN_MAP(0, 22) // Connected to P0.22
    #define SPI_MISO                       NRF_GPIO_PIN_MAP(0, 24) // Connected to P0.24

    I have verified that the jumpers on the DK are correct and the dongle doesn't require jumpers for pin 0.24.

  • Good idea... I have NRF_SPI0->INTENCLR = (1 << 2) earlier in the code to prevent hitting another SPI interrupt, but maybe that is wrong?  What is the proper way to disable SPI interrupts?  The datasheet doesn't give an example, but the register list has the "A" ID in position 2 for both INTENSET and INTENCLR.  When I read INTENSET I get a 64 (0x40) instead of a 0x00.  So maybe I am doing that wrong?

  • I'm not sure I understand why the SPIM master is better if it is so much slower, therefore leaving SPI enabled for longer - eating up the very tiny battery I have to work with.  Maybe you know how to use it more effectively than the SDK example?  The best I could do was 14us between transactions (I have to have CS pin cycle every three bytes).

    I have used both while (!NRF_SPI0->EVENTS_READY) and while (NRF_SPI0->EVENTS_READY == 0), both to no avail. 

  • 0x40 is the EVENTS_END interrupt bit mask. One source of confusion is often that the SPI/SPIM/TWI/TWIM peripheral is a general purpose peripheral that supports all 4 modes, which means bits have meaning even if not expected. Side-effects are supposed to be benign. EVENTS_READY is EVENTS_RXREADY in TWI but has no listing in SPIM.

    I also mistyped earlier; 0x04 is the interrupt enable/disable mask, READY is probably just 0 or 1 as you are using, it is not defined. I would disable all interrupts though, with 0xFFFFFFFF.
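
    The mask discussion above can be checked on a PC with a small model. This is only a sketch: the macro and function names are made up for illustration (they are not SDK identifiers), and it assumes INTENCLR behaves as write-1-to-disable, with READY at bit 2 (legacy SPI) and END at bit 6 (SPIM) as described in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, not SDK macros. */
#define SPI_INT_READY_MASK (UINT32_C(1) << 2) /* 0x04: READY, legacy SPI */
#define SPIM_INT_END_MASK  (UINT32_C(1) << 6) /* 0x40: END, SPIM         */
#define INT_ALL_MASK       UINT32_C(0xFFFFFFFF)

/* Models a write to INTENCLR: every 1 bit in clr_value disables that
 * interrupt, every 0 bit leaves the current enable state untouched.
 * Returns what a later read of INTENSET would show. */
uint32_t inten_after_clear(uint32_t enabled, uint32_t clr_value)
{
    return enabled & ~clr_value;
}

/* Returns nonzero if the given interrupt mask is still enabled. */
int inten_is_enabled(uint32_t enabled, uint32_t mask)
{
    assert(mask != 0u);
    return (enabled & mask) != 0u;
}
```

    In this model, starting from an enable state of 0x40 and writing (1 << 2) to INTENCLR still reads back 0x40, which matches the observation above; writing 0xFFFFFFFF reads back 0.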

    SPIM using hardware registers is no slower than SPI, and the cpu is sleeping during transfer although of course DMA is running; unless I am missing something? SPIM3 works at 32MHz with cpu asleep; SPIM0 16MHz. 3-byte transfers are supported, but there is a bug with 1-byte transfers.

    Edit: also I would suggest allowing for the bus issues:

        NRF_GPIO->OUTSET = 1 << SPIM_CS;
        // Data Synchronization Barrier: completes when all explicit memory accesses before this instruction complete
        __DSB();
        __NOP();
        __NOP();
        __NOP();
        __NOP();
        NRF_GPIO->OUTCLR = 1 << SPIM_CS;
        // Data Synchronization Barrier: completes when all explicit memory accesses before this instruction complete
        __DSB();

  • Try this, SPIM not using interrupts - not tested:

    uint8_t txBuf[1] = { FIFO_DATA_OUT_L | 0x80 };  // one command byte; the rest is filled from ORC
    static NRF_SPIM_Type* pSPIM = NRF_SPIM0;
    
      // Clear all events
      pSPIM->EVENTS_STARTED = 0;
      pSPIM->EVENTS_STOPPED = 0;
      pSPIM->EVENTS_ENDTX = 0;
      pSPIM->EVENTS_ENDRX = 0;
      pSPIM->EVENTS_END = 0;
    
      // Enable SPI and set up buffers
      pSPIM->ENABLE     = 7;
      pSPIM->TXD.PTR    = (uint32_t)txBuf;
      pSPIM->TXD.MAXCNT = 1;
      pSPIM->ORC        = 0xFF;    // Unused Tx bytes 2nd and 3rd, set to 0xFF
      pSPIM->RXD.PTR    = (uint32_t)ble_buff;
      pSPIM->RXD.MAXCNT = 3;
      // Disable all interrupts
      pSPIM->INTENCLR = 0xFFFFFFFFUL;
    for (;;)  // repeat once per 3-byte read; exit condition omitted
    {
      pSPIM->TASKS_START = 1;
      while(!pSPIM->EVENTS_END) ;
      pSPIM->EVENTS_END = 0;
    //either:
      //ble_idx += 3;
      //pSPIM->RXD.PTR = (uint32_t)&ble_buff[ble_idx];
    //or, faster:
      pSPIM->RXD.PTR += 3;
    // CS stuff ..
    }
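
    The pointer-bumping idea in the loop above can be tried on a PC before touching hardware. This is a host-side sketch only: fake_spim_transfer() is a hypothetical stand-in for one SPIM transaction (the real peripheral writes through EasyDMA), and the bounds check is an addition that the snippet above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BYTES_PER_XFER 3u /* same role as RXD.MAXCNT = 3 */

/* Hypothetical stand-in for one SPIM transaction: writes count bytes
 * at dst, the way the DMA would fill the buffer at RXD.PTR. */
static void fake_spim_transfer(uint8_t *dst, size_t count, uint8_t seed)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = (uint8_t)(seed + i);
    }
}

/* Runs up to `samples` transfers, advancing the receive position by
 * BYTES_PER_XFER each time, and stops before the buffer would overflow.
 * Returns the number of transfers actually performed. */
size_t drain_fifo(uint8_t *buf, size_t buf_len, size_t samples)
{
    size_t offset = 0;
    size_t done = 0;

    assert(buf != NULL);
    while (done < samples && offset + BYTES_PER_XFER <= buf_len) {
        fake_spim_transfer(&buf[offset], BYTES_PER_XFER, (uint8_t)(done * 16u));
        offset += BYTES_PER_XFER; /* equivalent of RXD.PTR += 3 */
        done++;
    }
    return done;
}
```

    With a 10-byte buffer and 5 requested samples, only three transfers fit, and each one lands directly behind the previous one with no copying.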

  • I need to give this a try still, but great idea bumping the rxd.ptr while keeping the rxd.maxcnt at 3.  I'm not sure why, but I didn't think you could do that - so last time I tried using SPIM, I was copying the contents to the buffer every time, which took way too long.

    I believe I owe you a beer.  Well played.

  • Fingers crossed; a pint? - London Pride or Guinness, please :-)

  • Works great!  Getting around 6Mbps, which is impressive for 8MHz clock plus SS toggle.  On to the next problem!

    Glad to supply you with any kind of beer you like, in any quantity, any time!
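
    As a rough sanity check on the figure above: 24 bits at 8 MHz take 3 us on the wire, so an effective 6 Mbps corresponds to about 1 us of per-transaction overhead (CS toggle plus loop code). That 1 us is inferred from the quoted numbers, not measured, and the function name below is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Effective throughput in bits per second for a transaction of `bits`
 * clocked at `sck_hz`, followed by `overhead_ns` of dead time
 * (CS toggling, loop overhead). Integer math, rounds down. */
uint64_t effective_bps(uint32_t bits, uint32_t sck_hz, uint32_t overhead_ns)
{
    assert(bits != 0u);
    assert(sck_hz != 0u);

    const uint64_t ns_per_s = UINT64_C(1000000000);
    uint64_t clocking_ns = ((uint64_t)bits * ns_per_s) / sck_hz;
    uint64_t total_ns = clocking_ns + overhead_ns;

    return ((uint64_t)bits * ns_per_s) / total_ns;
}
```

    For example, effective_bps(24, 8000000, 1000) gives 6000000. If the earlier 14 us gap between transactions was at the same 8 MHz clock, the same formula gives roughly 1.4 Mbps.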

  • 6Mbps? Pretty good .. how about adding the piece de resistance? 

       // Enable cache and hit/miss tracking
       NRF_NVMC->ICACHECNF = 0x101;

    This will remove 2 wait-states per instruction, assuming you don't have tons of interrupts thrashing away in the background.

  • No other interrupts should be firing, just dumping sensor buffer then transmitting to central.  I was unaware of this functionality, or maybe I should say I haven't gotten this far down the rabbit hole yet.  The speed I'm getting hits my requirement, but I can always use a current reduction.  I guess the idea is to monitor hits vs misses and multiply by the number of cycles then use that to determine power consumption, but what about hits?  I'm not seeing figures on the average current for the cache.  Or maybe it's just better to do a hard measurement over a full cycle both ways to determine which is optimal.  Do you happen to have a feel (since you appear to be some sort of wizard) for if the power consumption for cache is consistent across the product, or if there is a large variance between specimens?

  • You can ignore the non-resettable, saturating hits and misses counters; I just use them to check that the hit rate is high enough to be worth using in code which has lots of interrupts. There is a small power cost but a much bigger power saving for loop code like this which doesn't sleep. The cache means this code runs from RAM instead of Flash, so saving 2 wait states for each flash memory access is significant. In this code, dunno; maybe 6MHz bit rate goes up to 6.4 (guess). We'd have to count the flash accesses (look at assembler) then calculate ratio of clocks saved, then allow a bit for AHB bus issues .. nah, just run it and time it :-) Use 0x1 instead of 0x101 if you don't look at hits & misses. It's only turned on once; you can always turn it off after the bulk transfer (set 0x0).
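
    The three ICACHECNF values mentioned above (0x101, 0x1, 0x0) can be built from two flags. A minimal sketch, assuming bit 0 enables the cache and bit 8 enables the hit/miss counters; the macro and function names are local to this example and are not SDK identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local illustrative names for the two bits discussed above. */
#define ICACHECNF_CACHE_ENABLE   (UINT32_C(1) << 0) /* 0x001 */
#define ICACHECNF_PROFILE_ENABLE (UINT32_C(1) << 8) /* 0x100 */

/* Builds the value to write to ICACHECNF. Profiling without the cache
 * enabled is rejected here because there would be nothing to count. */
uint32_t icachecnf_value(bool enable_cache, bool enable_profiling)
{
    uint32_t value = 0u;

    assert(enable_cache || !enable_profiling);
    if (enable_cache) {
        value |= ICACHECNF_CACHE_ENABLE;
    }
    if (enable_profiling) {
        value |= ICACHECNF_PROFILE_ENABLE;
    }
    return value;
}
```

    On the target, the result would be written to NRF_NVMC->ICACHECNF before the bulk transfer, and 0x0 written afterwards to turn the cache off again.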
