nrf5340 Anomaly 159 Workaround : not a workaround? Breaks QPSI Flash use on NCS 2.8.0

Updating from NCS2.6 to 2.8, the access to my QPSI Flash chip got broken (mx25r64, connected and configured using QPSI exactly as per the nrf5340DK board)

DTS entry:

&qspi {					// external flash
	status = "okay";
	pinctrl-0 = <&qspi_default>;
	pinctrl-1 = <&qspi_sleep>;
	pinctrl-names = "default", "sleep";
	mx25r64: mx25r6435f@0 {
		compatible = "nordic,qspi-nor";
		reg = <0>;
		/* MX25R64 supports only pp and pp4io */
		writeoc = "pp4io";
		/* MX25R64 supports all readoc options */
		readoc = "read4io";
		sck-frequency = <8000000>;
		jedec-id = [c2 28 17];
		sfdp-bfp = [
			e5 20 f1 ff  ff ff ff 03  44 eb 08 6b  08 3b 04 bb
			ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
			10 d8 00 ff  23 72 f5 00  82 ed 04 cc  44 83 68 44
			30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
		];
		size = <67108864>;
		has-dpd;
		t-enter-dpd = <10000>;
		t-exit-dpd = <35000>;
	};
};

When accessing it, I now get these logs:

[00:00:15.432,556] <err> qspi_nor: nRF5340 anomaly 159 conditions detected
[00:00:15.440,032] <err> qspi_nor: Set the CPU clock to 64 MHz before starting QSPI operation

and then any flash access (eg to filesystem access using ELM) fails. On NCS2.6 this all worked fine.

This appears to be a 'fix' for anomly 159 (which I was not aware of experiencing!) which notes that if the HCLK_192M divider is not 0 _and_ the CPU clock is >64MHz then QSPI accesses can fail. 

Digging into this, the log comes from zephyr/drivers/flasqh/nrf_qspi_nor.c (which uses modules/hal/nordic/nrfx/drivers/src/nrf_qpsi.c). The log is actually just when the code translates the underlying error of  NRFX_ERROR_FORBIDDEN into a ECANCELED:

#if NRF53_ERRATA_159_ENABLE_WORKAROUND
    case NRFX_ERROR_FORBIDDEN:
        LOG_ERR("nRF5340 anomaly 159 conditions detected");
        LOG_ERR("Set the CPU clock to 64 MHz before starting QSPI operation");
        return -ECANCELED;
#endif
And the nrf error codes from nrf_qspi.c, which has this code:
 
static bool .qspi_errata_159_conditions_check(void)
{
#if NRF_CLOCK_HAS_HFCLK192M && NRF53_ERRATA_159_ENABLE_WORKAROUND
    if ((nrf_clock_hfclk192m_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_1) ||
      (nrf_clock_hfclk_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_2))
    {
      return true;
    }
    else
#endif
    {
      return false;
    }
}
All the qspi access calls call this, and if it returns false, return a NRF_ERROR_FORBIDDEN!
So this is not a workaround, just a detection of the 'bad' case to fail the qspi operations!
However, the code in nrf_qspiçnor.c already seems to deal with forcing the first condition for the workaround of the anomaly 159 (HCLK_192M divider should be 0), but not the second (CPU clock must be 64HMz ie divider set to 1).
Given that the flash qspi access already does the job of changing the hclk192M divider before and after access, why doesn't it also change the CPU clock to avoid the issue? Instead it just logs that it detected it and fails..
So, I updated zephyr/drivers/flash/nrf_qspi_nor.c to add the change of the CPU clock down to 64MHz and back again:
static int _prev_hclk_div=0;
static inline void qspi_clock_div_change(void)
{
#ifdef CONFIG_SOC_SERIES_NRF53X
	/* Make sure the base clock divider is changed accordingly
	 * before a QSPI transfer is performed.
	 */
	nrf_clock_hfclk192m_div_set(NRF_CLOCK, BASE_CLOCK_DIV);
	k_busy_wait(BASE_CLOCK_SWITCH_DELAY_US);
#ifdef NRF53_ERRATA_159_ENABLE_WORKAROUND
	_prev_hclk_div = nrf_clock_hfclk_div_get(NRF_CLOCK);
	nrf_clock_hfclk_div_set(NRF_CLOCK, NRF_CLOCK_HFCLK_DIV_2);
	//nrfx_clock_divider_set(NRF_CLOCK_DOMAIN_HFCLK, NRF_CLOCK_HFCLK_DIV_1);
#endif
#endif
}

static inline void qspi_clock_div_restore(void)
{
#ifdef CONFIG_SOC_SERIES_NRF53X
	/* Restore the default base clock divider to reduce power
	 * consumption when the QSPI peripheral is idle.
	 */
	nrf_clock_hfclk192m_div_set(NRF_CLOCK, NRF_CLOCK_HFCLK_DIV_4);
#ifdef NRF53_ERRATA_159_ENABLE_WORKAROUND
	// set back to previous cpu clk speed
	nrf_clock_hfclk_div_set(NRF_CLOCK, _prev_hclk_div);
#endif
#endif
}
This seems to work : no more 'anomaly case' detection in nrf_qspi.c, and flash access works...
Can someone from Nordic comment on a) why this wasn't done already, and b) can this be a durable fix (PR to zephyr)? Maybe its just luck that its working right now...
  • BrianW said:

    This is bit I don't understand : what exactly are the consequences of the QSPI driver modifying the CPU clock speed during the QPSI access? Given that the driver can restore it to the previous state after, so the application doesn't see any impact, it seems like there are only consequences if you DONT change the speed!

    Maybe the devs can explain exactly what the risk for the application would be to incorporate the CPU clock changing code?

    Instead of forwarding this to the devs right away:
    This is opinions you have on how the driver should work right. Im not saying they are wrong, I see the logic.

    But I want to instead ask you: Does the explanation from the dev help you fix the errors you explain in the post?

  • But I want to instead ask you: Does the explanation from the dev help you fix the errors you explain in the post?

    No, I'm afraid not. 

    As I said, I need the fix (so don't want to 'disable' the workaround code currently in there).

    However, the QSPI driver is used by multiple sub systems provided by Nordic/Zephyr (FAT-FS disk driver, NVS library, mcuboot...), so adding the CPU clock change at the application level will be a lot of work and always with a risk of missing a case (for example, I would have to modify mcuboot to change the CPU speed when it uses the secondary slot on the QSPI flash!)

    Hence, I would like the QSPI driver provided in NCS to include the change to the CPU clock during accesses to protect all of the QSPI driver users. This is why I asked what the dev feel the risk is in doing this... If they wish to make it selectable with e KConfig, thats fine also.

Related