SPIM3 peripheral not reliable

Environment: NCS v2.0.2

Chip: NRF52840_xxAA_REV2

Board: Custom pre-production board

Hi,

I know that there are several existing tickets that revolve around SPIM3, but I have tried all of the suggestions in the 20-some-odd tickets that I've researched so far with no change in results.

What I'm trying to accomplish

We have been using the SPIM2 peripheral with no problems whatsoever. However, as we move towards production we would like to squeeze more speed out of the bus if possible. In the case of our chip, that would mean moving to the SPIM3 peripheral, which offers up to 32 MHz operation. We have two devices on the SPI lines: a WiFi module (25 MHz max) and an SD card. All physical SPI pins are high-speed capable. Due to the WiFi chip's speed limitation we only really need to go to 16 MHz. So far, SPIM3 has not been reliable at any speed.

Symptoms

The bus seems to work fine for a short time. I can write maybe 500 kbit to the SD card, or start to configure the WiFi module, but after a small amount of traffic the bus seems to have a hiccup and stops working. In the case of the SD card I get a file write error, and with the WiFi module it just stops sending the correct next command to continue configuration.

Troubleshooting

I have scoped the lines at 8, 16, and 32 MHz and the waveforms look essentially identical. It does not seem to be related to the raw signal. I know the devices can work at 8 MHz because they work fine when using SPIM2. I tried using SPIM3 at 8 MHz and even down to 4 MHz and get the exact same behavior. So again, that leads me to believe that the issue is not speed or signal related, but that is just an assumption on my part.

I also printed all of the SPI traffic from the WiFi module to see where things start to differ. It always differs at the exact same spot in configuration. It makes a 2-byte transmit and expects a 2-byte response of 0x00 0x58. I get the correct response when using SPIM2, but on SPIM3 it always responds with 0x00 0xb0. It may be a coincidence, but 0x58 shifted left by one bit is equal to 0xb0, so it may be that the RX value is being shifted to the left by one. (0x58 = 01011000; 0xb0 = 10110000).
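The shift relationship is easy to check mechanically against other captured bytes. A minimal sketch in C; the helper name `left_shift_between` is made up for illustration and is not part of any SDK:

```c
#include <stdint.h>

/* Returns the number of bit positions (1..7) by which `expected` must be
 * shifted left, truncated to 8 bits, to give `observed`.
 * Returns -1 if no such shift exists. */
int left_shift_between(uint8_t expected, uint8_t observed)
{
    for (int n = 1; n <= 7; n++) {
        if ((uint8_t)(expected << n) == observed) {
            return n;
        }
    }
    return -1;
}
```

For the bytes above, `left_shift_between(0x58, 0xb0)` returns 1, which matches a response sampled one clock edge away from where it should be.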

I've read the various errata related to SPIM3 such as anomaly 198 and am working through ensuring that all relevant anomalies are being addressed.

Thank you for any advice or help,

Louis

  • In case this is something you have not considered, SPIM3 AHB bus master has the lowest priority of all the peripherals, including SPIM2, a bit silly but then SPIM3 was a later addition. Also as a good general rule, avoid situations where more than one bus master is accessing the same slave.

    Worth a try: The RAM interface is divided into 9 RAM AHB slaves. RAM AHB slave 0-7 is connected to 2x4 kB RAM sections each and RAM AHB slave 8 is connected to 6x32 kB sections, as shown in Memory layout on page 20. Allocate one RAM section each to the SPIM3 receive and SPIM3 transmit buffers; details on how to do this are sprinkled throughout the devzone. Other data can reside in each of these two RAM sections, but it must not be frequently accessed by any higher-priority bus master, which in practice means any other peripheral, since as noted above SPIM3 is at the bottom of the barrel.

    I have run SPIM3 at maximum speed with no problems, but standalone, which would imply this is not a hardware bit error although the error here is hard to explain.
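To see which RAM AHB slave a given buffer lands in, the layout described above can be turned into a small address check. This is only a sketch: the 0x20000000 RAM base address is an assumption taken from the usual nRF52840 memory map, and both function names are hypothetical.

```c
#include <stdint.h>

#define RAM_BASE       0x20000000u                /* assumed start of data RAM */
#define SMALL_SLAVE_SZ 0x2000u                    /* slaves 0-7: 2 x 4 kB each */
#define SLAVE8_BASE    (RAM_BASE + 8u * SMALL_SLAVE_SZ)
#define SLAVE8_SZ      (6u * 0x8000u)             /* slave 8: 6 x 32 kB */

/* Returns the RAM AHB slave index (0..8) for an address, or -1 if the
 * address is outside the RAM range described above. */
int ram_ahb_slave(uint32_t addr)
{
    if (addr < RAM_BASE) {
        return -1;
    }
    if (addr < SLAVE8_BASE) {
        return (int)((addr - RAM_BASE) / SMALL_SLAVE_SZ);
    }
    if (addr < SLAVE8_BASE + SLAVE8_SZ) {
        return 8;
    }
    return -1;
}

/* Returns 1 if the whole buffer sits inside one AHB slave, 0 otherwise. */
int buffer_in_single_slave(uint32_t addr, uint32_t len)
{
    if (len == 0u) {
        return 0;
    }
    int first = ram_ahb_slave(addr);
    int last = ram_ahb_slave(addr + len - 1u);
    return (first >= 0 && first == last) ? 1 : 0;
}
```

Passing the addresses of the SPIM3 TX and RX buffers (cast to `uint32_t`) shows whether they share a slave with each other or straddle a boundary.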

  • Two things to clarify the 'scope traces: 1) Adjust the compensation of each probe to square up the signals, usually a tiny screw on the probe head, and 2) offset the traces slightly so that the signal edges can be better viewed

  • With the screenshots above I see "working" is indeed Mode 0, shift on negative edge and sample on positive edge; however when it's "not working" the shift is occurring on the positive edge which implies either that the 'scope is showing traces from two different devices which have different Modes or that the device was inadvertently changed to use a different Mode.

    If there are two different devices with different Modes, just change the Mode on the master prior to accessing each device.

  • Yeah, I agree with you here. I'll double check to see what the mode for the other device is. I am however checking the NRF_SPIM3->CONFIG register on each transaction and it's always reporting as 0 which I was interpreting as the mode not changing. My understanding is that the mode is set through the spi_config struct by assigning the CPOL and CPHA values.

    I'm doing that in the device driver when I assign the bus to it:

    .spi = SPI_DT_SPEC_INST_GET(index,
    					    SPI_OP_MODE_MASTER |
    					    0 << 1 |                /* CPOL */
    					    0 << 2 |                /* CPHA */
    					    SPI_WORD_SET(8) |
    					    SPI_TRANSFER_MSB,
    					    0U),

    (I know shifting a 0 is pointless, but just for illustration)

    Edit: I have confirmed that both devices are being told to operate in mode 0.

    Thanks for your suggestions!
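For reference, a CONFIG value of 0 can be decoded into an SPI mode number as below. This is a sketch under an assumption: the bit positions (ORDER at bit 0, CPHA at bit 1, CPOL at bit 2) are what I understand the nRF52840 SPIM CONFIG register layout to be, so verify them against the product specification. The macro and function names are local to this example.

```c
#include <stdint.h>

/* Assumed SPIM CONFIG bit positions (check the product specification). */
#define SPIM_CONFIG_ORDER (1u << 0)  /* 0 = MSB first, 1 = LSB first */
#define SPIM_CONFIG_CPHA  (1u << 1)  /* 0 = sample on leading edge */
#define SPIM_CONFIG_CPOL  (1u << 2)  /* 0 = clock idles low (active high) */

/* Converts a raw CONFIG value to the conventional SPI mode number 0..3,
 * where mode = (CPOL << 1) | CPHA. The ORDER bit is ignored. */
unsigned spim_config_to_mode(uint32_t config)
{
    unsigned cpol = (config & SPIM_CONFIG_CPOL) ? 1u : 0u;
    unsigned cpha = (config & SPIM_CONFIG_CPHA) ? 1u : 0u;
    return (cpol << 1) | cpha;
}
```

Note that under this assumption the register puts CPHA below CPOL, the reverse of the operation-word bits in the driver snippet, so the two raw values should not be compared directly.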

  • It's the slave on the "Not Working" device that is not in Mode 0, not the Master. Try changing the Mode on the Master just before accessing that fail case and all should be well.
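    A rough sketch of what changing the Mode per device could look like, using the same bit positions as the driver snippet above (CPOL at bit 1, CPHA at bit 2). The macros and the function here are local stand-ins written for illustration, not the Zephyr definitions themselves:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local names; bit positions copied from the snippet above. */
#define OP_CPOL (1u << 1)
#define OP_CPHA (1u << 2)

/* Converts a conventional SPI mode number (0..3) into operation-word bits
 * that can be OR-ed into a device's SPI operation flags. */
uint16_t spi_mode_to_op_bits(unsigned mode)
{
    assert(mode <= 3u);
    uint16_t bits = 0;
    if (mode & 2u) {
        bits |= OP_CPOL;  /* modes 2 and 3: clock idles high */
    }
    if (mode & 1u) {
        bits |= OP_CPHA;  /* modes 1 and 3: sample on the trailing edge */
    }
    return bits;
}
```

    Each device would then keep its own operation word, for example `spi_mode_to_op_bits(0)` for one and `spi_mode_to_op_bits(1)` for the other, and the master would apply the matching one before each transfer. Which Mode the failing slave actually needs has to come from its datasheet or the 'scope traces.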
