qspi_nor: Failed to schedule device sleep: -16

Hi

I'm using nrf toolchain/sdk 2.5.2.

Got a strange problem with my QSPI NOR flash interface. When I run flash_erase I get the error message "qspi_nor: Failed to schedule device sleep: -16" pretty much immediately, but the flash does seem to erase correctly.

In the function qspi_erase() in nrf_qspi_nor.c, if I put a breakpoint on line 689, which calls qspi_device_uninit(dev), and wait for 20+ seconds, I do not receive the error. This corresponds roughly with how long a full flash erase takes for my mx25r0835f flash chip. It looks like the QSPI driver isn't waiting for the flash to actually finish erasing before deinit?

qspi_wait_for_completion returns immediately, if that is relevant.

Any idea how I can resolve this issue?

Regards

Robert

Call to flash_erase, where flash_dev points to the dts device below, address = 0 and size = 1048576:

return flash_erase(flash_dev, address, size);

Relevant section of my dts:

&qspi {
    compatible= "nordic,nrf-qspi";
    status = "okay";
    pinctrl-0 = <&qspi_default>;
    pinctrl-1 = <&qspi_sleep>;
    pinctrl-names = "default", "sleep";
    mx25r08: mx25r0835f@0 {
        compatible = "nordic,qspi-nor";
        reg = <0>;
        sck-frequency = <50000000>;
        jedec-id = [ c2 28 14  ];
        size = <0x0800000>; /* flash capacity in bits */
        has-dpd;
        t-enter-dpd = <10000>;
        t-exit-dpd = <35000>;
    };
};
  • Another update: i tried running my code on the NRF52840DK and got the same error. Can you reproduce if you have CONFIG_PM_DEVICE_RUNTIME enabled?  

  • robsrick said:
    I also disabled CONFIG_PM_DEVICE_RUNTIME as a test and it fixed the issue, but my power consumption jumped from 3uA to 1mA, which isn't acceptable for our application.

    Good to know. That narrows things down, and it confirms that you need PM here, so we will figure out what goes wrong.

    robsrick said:
    i tried running my code on the NRF52840DK and got the same error. Can you reproduce if you have CONFIG_PM_DEVICE_RUNTIME enabled? Sigurd Hellesvik 

    I tried, but I had trouble writing code to reproduce it and ran into other errors along the way.
    If you are able to share the code that you used for the DK, that would speed up my testing.

    Let me know if I should convert this ticket to private for that.

  • My source is pretty large, but in essence I'm just calling flash_erase as described in my original post, with the include

    #include <zephyr/drivers/flash.h>

    Is that enough?

  • I do not see the error if I erase at size 0x1000. But that is quite fast either way, no?

    However, if I increase the erase size to 0x10000, I get error -5 instead. I do not get -5 if I step through the code, so that is interesting.

    Alas, this is not the same error code as you get.

    Do I do something you would not expect in my code? Or am I missing anything to reproduce your problem?
    Here is a sample as simple as I can think of to try and reproduce:

    flash_erase_test.zip

  • Hi Sigurd,

    I wanted to revisit this to see if we can come up with a solution.

    I had originally wanted to create a hack for this by adding a sleep to the function qspi_erase in v2.6.0\zephyr\drivers\flash\nrf_qspi_nor.c, but it did not like me using k_sleep() inside that file.

    Since then there have been some updates to the nRF SDK, and I can now modify the function to be as follows:

    /* QSPI erase */
    static int qspi_erase(const struct device *dev, uint32_t addr, uint32_t size)
    {
    	const struct qspi_nor_config *params = dev->config;
    	int rc, rc2;
    
    	rc = qspi_nor_write_protection_set(dev, false);
    	if (rc != 0) {
    		return rc;
    	}
    	while (size > 0) {
    		nrfx_err_t res = !NRFX_SUCCESS;
    		uint32_t adj = 0;
    
    		if (size == params->size) {
    			/* chip erase */
    			res = nrfx_qspi_chip_erase();
    			adj = size;
    		} else if ((size >= QSPI_BLOCK_SIZE) &&
    			   QSPI_IS_BLOCK_ALIGNED(addr)) {
    			/* 64 kB block erase */
    			res = nrfx_qspi_erase(NRF_QSPI_ERASE_LEN_64KB, addr);
    			adj = QSPI_BLOCK_SIZE;
    		} else if ((size >= QSPI_SECTOR_SIZE) &&
    			   QSPI_IS_SECTOR_ALIGNED(addr)) {
    			/* 4kB sector erase */
    			res = nrfx_qspi_erase(NRF_QSPI_ERASE_LEN_4KB, addr);
    			adj = QSPI_SECTOR_SIZE;
    		} else {
    			/* minimal erase size is at least a sector size */
    			LOG_ERR("unsupported at 0x%lx size %zu", (long)addr, size);
    			res = NRFX_ERROR_INVALID_PARAM;
    		}
    
    		k_sleep(K_MSEC(20000));
    		qspi_wait_for_completion(dev, res);
    		if (res == NRFX_SUCCESS) {
    			addr += adj;
    			size -= adj;
    		} else {
    			LOG_ERR("erase error at 0x%lx size %zu", (long)addr, size);
    			rc = qspi_get_zephyr_ret_code(res);
    			break; 
    		}
    	}
    
    	rc2 = qspi_nor_write_protection_set(dev, true);
    
    	return rc != 0 ? rc : rc2;
    }
    

    My change is adding k_sleep(K_MSEC(20000)); right before the line

    qspi_wait_for_completion(dev, res);

    This is fine for a local build, but we release our firmware using GitHub Actions for CI. This uses the nrf docker image found at https://github.com/NordicPlayground/nrf-docker for building, so I can't add my fix because it is in the SDK files themselves.

    So I'm stuck in an annoying position where I have to release a special, locally compiled firmware with the fix, run it on the board once, then flash the proper firmware that our CI workflow generates.

    This makes our production quite complicated and I'd much rather not do it.

    Do you have any idea when this bug will be fixed, and are there any workarounds I can use in the meantime?
