Slow SPI performance

Question

I have a 25Q64 SPI flash chip connected to the SPI bus. Unfortunately it is too slow, even when using an interrupt routine via SPI0_TWI0_IRQHandler (everything is done as in the spi_master-pca10028 example). Typical timing diagram:
 
Is this a problem in the SPI hardware implementation or in interrupt routing via the softdevice?
Can you propose any ways to speed up SPI throughput?
Do you have a correct double-buffering example that checks whether the TX buffer is ready for 1 or for 2 bytes?
As a temporary solution I've used a trick with a delay:
 for (int i = 0; i < datalen; i++) {
     MEM_SPI->TXD = data[i];
     nrf_delay_us(1);
 }
 
It works better, but I'm not sure whether this approach is efficient and stable enough for production:

If this approach is OK, I'll use my own nrf_delay_us with fewer NOPs :)

RK · Accepted Answer

I think your expectations are a bit high for an interrupt-driven SPI interface on a 16MHz chip. Let's add it up. That screenshot shows about 35µs for 4 transmissions. Each one looks like about 1µs (so I'm assuming 8Mb/s), which makes the delay between them about 10µs. The softdevice is documented to add 3µs of overhead to an open interrupt, so that leaves 7µs. 7µs is 110 or so clock cycles, or about 100 instructions at an average of roughly 0.9 instructions per cycle on the Cortex M0. If you look at the spi_master code it's pretty easy to believe it takes about 100 instructions, although if it were optimised for speed I'd expect it to be a little faster. Have you tried it with optimisation turned on?
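To make the arithmetic above easy to redo with your own measurements, here is a minimal C sketch. The constants are just the figures quoted in this answer (16MHz core clock, roughly 10µs between bytes, 3µs of softdevice interrupt overhead), and the helper name handler_cycle_budget is made up for illustration.

```c
#include <stdint.h>

/* Figures quoted in the answer above; replace with your own measurements. */
#define CPU_HZ         16000000u  /* nRF51 core clock */
#define GAP_US         10u        /* gap between bytes seen on the scope */
#define SOFTDEVICE_US  3u         /* documented softdevice interrupt overhead */

/* Hypothetical helper: CPU cycles left for the handler in each inter-byte gap. */
static uint32_t handler_cycle_budget(uint32_t cpu_hz, uint32_t gap_us,
                                     uint32_t overhead_us)
{
    uint32_t cycles_per_us = cpu_hz / 1000000u;
    return (gap_us - overhead_us) * cycles_per_us;
}
```

With the values above, handler_cycle_budget(CPU_HZ, GAP_US, SOFTDEVICE_US) comes out at 112 cycles, which is the "110 or so" mentioned in the text.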
Is double-buffering properly implemented? Sort of. The code does populate TXD twice at the start. However, if you look at the interrupt code, it only ever writes one new byte (and reads one byte) before exiting. So after the first two quick bytes, it can only go as fast as the interrupt handler can fire, add one byte and return, at which point it's pretty much guaranteed at the 8Mb/s SPI speed that another interrupt is already waiting. One change you could make in the spi_master interrupt code is to keep looping while EVENTS_READY is set (remembering to set it back to zero each time) and there is still data to send or receive. At that kind of SPI speed, one interrupt would probably end up writing most of the data in the buffer, because the time taken just to work out whether there is a byte to write, and to write it, is already longer than the time the SPI interface takes to clock it out over the wire and ask for the next one.
Your 'solution' isn't a particularly stable one and doesn't really follow the documentation. Yes, it's true that the nRF51 manual doesn't say you can't constantly throw bytes at TXD and have them clocked out while completely ignoring the RXD bytes and the EVENTS_READY flag, but neither does it say you can. The nRF51 doesn't seem to have the concept of overflow on the SPI interface and may indeed continue to send bytes, but that's not how the docs strongly suggest you work. A fixed delay like that isn't very good either. If you really want to send bytes in a tight loop, then something like the following, which constantly checks EVENTS_READY to trigger the next byte and reads RXD each time, would be more in accordance with the docs and also give you the maximum performance. I make that about 10 instructions or so, so it would feed the SPI at just about full speed.
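 // Assumes datalen >= 1. READY is only raised once a byte has been clocked,
 // so the first byte has to be written before the loop starts.
 uint32_t i = 0;
 MEM_SPI->EVENTS_READY = 0;
 MEM_SPI->TXD = data[i++];
 while( true )
 {
     if( MEM_SPI->EVENTS_READY )
     {
         uint8_t dummy = MEM_SPI->RXD;   // read RXD to free the receive buffer
         (void)dummy;
         MEM_SPI->EVENTS_READY = 0;
         if( i >= datalen )
             break;                      // READY for the last byte has been seen
         MEM_SPI->TXD = data[i++];
     }
 }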
 
That sort of code is also what I was suggesting could go in the interrupt handler, to push out as many bytes as possible during one interrupt. You then get a hybrid with the benefits of both an interrupt-based and a tight-loop solution.
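As a rough illustration of that hybrid, here is a sketch of a handler body that keeps draining READY events instead of returning after one byte. It is not the spi_master example's code. The spi_regs_t struct is a stand-in I've made up so the snippet compiles on its own (on the device you would use the register type from the nRF51 header and call this from SPI0_TWI0_IRQHandler), and spi_xfer_t, spi_xfer_start and spi_irq_body are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for the SPI register block (assumption: only the three
 * registers discussed in this thread are modelled). */
typedef struct {
    volatile uint32_t EVENTS_READY;
    volatile uint32_t RXD;
    volatile uint32_t TXD;
} spi_regs_t;

/* Hypothetical transfer state shared between the caller and the handler. */
typedef struct {
    const uint8_t *tx;
    uint8_t       *rx;
    size_t         len;
    size_t         tx_idx;   /* next byte to write to TXD */
    size_t         rx_idx;   /* next slot to fill from RXD */
} spi_xfer_t;

/* Start a transfer: TXD is double-buffered, so queue up to two bytes. */
static void spi_xfer_start(spi_regs_t *spi, spi_xfer_t *x)
{
    spi->EVENTS_READY = 0;
    while (x->tx_idx < x->len && x->tx_idx < 2) {
        spi->TXD = x->tx[x->tx_idx++];
    }
}

/* Handler body: loop for as long as READY is pending and bytes remain,
 * rather than handling one byte per interrupt. */
static void spi_irq_body(spi_regs_t *spi, spi_xfer_t *x)
{
    while (spi->EVENTS_READY && x->rx_idx < x->len) {
        spi->EVENTS_READY = 0;                    /* clear before reading RXD */
        x->rx[x->rx_idx++] = (uint8_t)spi->RXD;   /* one byte received */
        if (x->tx_idx < x->len) {
            spi->TXD = x->tx[x->tx_idx++];        /* refill the TX buffer */
        }
    }
}
```

The transfer is complete once rx_idx reaches len. At 8Mb/s the loop will probably stay inside a single interrupt for most of the buffer, so keep the buffers short if other interrupts at the same or lower priority must not be delayed.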
The basic problem here is that you're trying to use a chip with a 16MHz clock to keep an SPI bus running at 8Mb/s completely full. If you really want to do that, a tight loop is going to be required: at 1µs per byte you only get 16 clock cycles (so at most 16 or so instructions) per byte to get the next one into the buffer, and that's just not very many. You could change the spi_master code a little to get better performance, but if you really want to pump data out at that rate, you want to just loop and write data. The interrupt-based solution doesn't really keep the buffers full at anything above about 1Mb/s.
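The per-byte budget is simple to compute for other SPI rates too. A small sketch, with cycles_per_spi_byte being a name I've invented for this post:

```c
#include <stdint.h>

/* CPU cycles available per SPI byte if the bus is to be kept completely full. */
static uint32_t cycles_per_spi_byte(uint32_t cpu_hz, uint32_t spi_bits_per_s)
{
    /* 8 bits per byte; widen before multiplying so the product cannot overflow. */
    return (uint32_t)(((uint64_t)cpu_hz * 8u) / spi_bits_per_s);
}
```

For a 16MHz core this gives 16 cycles per byte at 8Mb/s and 128 cycles per byte at 1Mb/s. A handler of roughly 100 instructions plus the softdevice overhead is already in that region at 1Mb/s, which is about where the interrupt-based approach stops keeping up.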
