nrf5340 Anomaly 159 Workaround : not a workaround? Breaks QPSI Flash use on NCS 2.8.0

Updating from NCS2.6 to 2.8, the access to my QPSI Flash chip got broken (mx25r64, connected and configured using QPSI exactly as per the nrf5340DK board)

DTS entry:

&qspi {					// external flash
	status = "okay";
	pinctrl-0 = <&qspi_default>;
	pinctrl-1 = <&qspi_sleep>;
	pinctrl-names = "default", "sleep";
	mx25r64: mx25r6435f@0 {
		compatible = "nordic,qspi-nor";
		reg = <0>;
		/* MX25R64 supports only pp and pp4io */
		writeoc = "pp4io";
		/* MX25R64 supports all readoc options */
		readoc = "read4io";
		sck-frequency = <8000000>;
		jedec-id = [c2 28 17];
		sfdp-bfp = [
			e5 20 f1 ff  ff ff ff 03  44 eb 08 6b  08 3b 04 bb
			ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
			10 d8 00 ff  23 72 f5 00  82 ed 04 cc  44 83 68 44
			30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
		];
		size = <67108864>;
		has-dpd;
		t-enter-dpd = <10000>;
		t-exit-dpd = <35000>;
	};
};

When accessing it, I now get these logs:

[00:00:15.432,556] <err> qspi_nor: nRF5340 anomaly 159 conditions detected
[00:00:15.440,032] <err> qspi_nor: Set the CPU clock to 64 MHz before starting QSPI operation

and then any flash access (eg to filesystem access using ELM) fails. On NCS2.6 this all worked fine.

This appears to be a 'fix' for anomly 159 (which I was not aware of experiencing!) which notes that if the HCLK_192M divider is not 0 _and_ the CPU clock is >64MHz then QSPI accesses can fail. 

Digging into this, the log comes from zephyr/drivers/flasqh/nrf_qspi_nor.c (which uses modules/hal/nordic/nrfx/drivers/src/nrf_qpsi.c). The log is actually just when the code translates the underlying error of  NRFX_ERROR_FORBIDDEN into a ECANCELED:

#if NRF53_ERRATA_159_ENABLE_WORKAROUND
    case NRFX_ERROR_FORBIDDEN:
        LOG_ERR("nRF5340 anomaly 159 conditions detected");
        LOG_ERR("Set the CPU clock to 64 MHz before starting QSPI operation");
        return -ECANCELED;
#endif
And the nrf error codes from nrf_qspi.c, which has this code:
 
static bool .qspi_errata_159_conditions_check(void)
{
#if NRF_CLOCK_HAS_HFCLK192M && NRF53_ERRATA_159_ENABLE_WORKAROUND
    if ((nrf_clock_hfclk192m_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_1) ||
      (nrf_clock_hfclk_div_get(NRF_CLOCK) != NRF_CLOCK_HFCLK_DIV_2))
    {
      return true;
    }
    else
#endif
    {
      return false;
    }
}
All the qspi access calls call this, and if it returns false, return a NRF_ERROR_FORBIDDEN!
So this is not a workaround, just a detection of the 'bad' case to fail the qspi operations!
However, the code in nrf_qspiçnor.c already seems to deal with forcing the first condition for the workaround of the anomaly 159 (HCLK_192M divider should be 0), but not the second (CPU clock must be 64HMz ie divider set to 1).
Given that the flash qspi access already does the job of changing the hclk192M divider before and after access, why doesn't it also change the CPU clock to avoid the issue? Instead it just logs that it detected it and fails..
So, I updated zephyr/drivers/flash/nrf_qspi_nor.c to add the change of the CPU clock down to 64MHz and back again:
static int _prev_hclk_div=0;
static inline void qspi_clock_div_change(void)
{
#ifdef CONFIG_SOC_SERIES_NRF53X
	/* Make sure the base clock divider is changed accordingly
	 * before a QSPI transfer is performed.
	 */
	nrf_clock_hfclk192m_div_set(NRF_CLOCK, BASE_CLOCK_DIV);
	k_busy_wait(BASE_CLOCK_SWITCH_DELAY_US);
#ifdef NRF53_ERRATA_159_ENABLE_WORKAROUND
	_prev_hclk_div = nrf_clock_hfclk_div_get(NRF_CLOCK);
	nrf_clock_hfclk_div_set(NRF_CLOCK, NRF_CLOCK_HFCLK_DIV_2);
	//nrfx_clock_divider_set(NRF_CLOCK_DOMAIN_HFCLK, NRF_CLOCK_HFCLK_DIV_1);
#endif
#endif
}

static inline void qspi_clock_div_restore(void)
{
#ifdef CONFIG_SOC_SERIES_NRF53X
	/* Restore the default base clock divider to reduce power
	 * consumption when the QSPI peripheral is idle.
	 */
	nrf_clock_hfclk192m_div_set(NRF_CLOCK, NRF_CLOCK_HFCLK_DIV_4);
#ifdef NRF53_ERRATA_159_ENABLE_WORKAROUND
	// set back to previous cpu clk speed
	nrf_clock_hfclk_div_set(NRF_CLOCK, _prev_hclk_div);
#endif
#endif
}
This seems to work : no more 'anomaly case' detection in nrf_qspi.c, and flash access works...
Can someone from Nordic comment on a) why this wasn't done already, and b) can this be a durable fix (PR to zephyr)? Maybe its just luck that its working right now...
Related