qspi_nor: Failed to schedule device sleep: -16

Hi

I'm using nrf toolchain/sdk 2.5.2.

Got a strange problem with my QSPI NOR flash interface.  when i run flash_erase i get error message "qspi_nor: Failed to schedule device sleep: -16" pretty much immediately. but the flash does seem to erase correctly.

in func qspi_erase() in nrf_qspi_nor.c if i put a breakpoint on ln689 which calls qspi_device_uninit(dev) and wait for 20+ seconds i do not receive the error. This corresponds roughly with whole long a full flash erase takes for my mx25r0835f flash chip. It looks like the qspi drivers aren't waiting for the flash to actually erase before deinit?

qspi_wait_for_completion returns immediately if that is relevant?

Any idea how i can resolve this issue?

Regards

Robert

call to flash_erase, where flash_dev points to the dts device below, address=0 and size = 1048576

return flash_erase(flash_dev, address, size);

Relevant section of my dts:

&qspi {
    compatible= "nordic,nrf-qspi";
    status = "okay";
    pinctrl-0 = <&qspi_default>;
    pinctrl-1 = <&qspi_sleep>;
    pinctrl-names = "default", "sleep";
    mx25r08: mx25r0835f@0 {
        compatible = "nordic,qspi-nor";
        reg = <0>;
        sck-frequency = <50000000>;
        jedec-id = [ c2 28 14  ];
        size = <0x0800000>; /* flash capacity in bits */
        has-dpd;
        t-enter-dpd = <10000>;
        t-exit-dpd = <35000>;
    };
};
Parents
  • Hi,

    You can see this with a simple sample where you only write/erase from external flash, right?

    Do you have a nRF52840DK?
    If so, can you test the same code on the nRF52840DK, to see if you see the same error on that?
    It would be useful to test if the error is related to the specific external flash or the firmware.

    Regards,
    Sigurd Hellesvik

  • Hi,

    Using a DK with the built in flash does not result in this error so it seems to be linked to the flash chip I've got on my custom board.


    The test code i've got set up is simply erasing and writing to the flash in a single thread.

    Regards

    Robert

  • Try to use the JESD216 sample to read out information from your external flash. Then compare that to your configuration to double check that it is correct.

  • HI Sigured

    To check the code runs as expected, I've tested running that code sample on the NRF52840-DK rev 2.0.2 with no changes using toolchain/sdk 2.5.1 and building for board NRF52840DK_NRF52840 in vsc nrf connect plugin.

    i get:

    *** Booting nRF Connect SDK v2.5.1 ***
    mx25r6435f@0: device not ready

    It looks like there might be some problem with my set up  or the sample?

  • It works for my nRF52840DK rev 2.0.2 SDK v2.5.1:

    *** Booting nRF Connect SDK v2.5.0 ***
    mx25r6435f@0: SFDP v 1.6 AP ff with 3 PH
    PH0: ff00 rev 1.6: 16 DW @ 30
    Summary of BFP content:
    DTR Clocking not supported
    Addressing: 3-Byte only
    4-KiBy erase: uniform
    Support QSPI XIP
    Support 1-1-1
    Support 1-1-2: instr 3Bh, 0 mode clocks, 8 waits
    Support 1-1-4: instr 6Bh, 0 mode clocks, 8 waits
    Support 1-2-2: instr BBh, 0 mode clocks, 4 waits
    Support 1-4-4: instr EBh, 2 mode clocks, 4 waits
    Flash density: 8388608 bytes
    ET1: instr 20h for 4096 By; typ 48 ms, max 384 ms
    ET2: instr 52h for 32768 By; typ 240 ms, max 1920 ms
    ET3: instr D8h for 65536 By; typ 480 ms, max 3840 ms
    Chip erase: typ 52000 ms, max 312000 ms
    Byte program: type 32 + 1 * B us, max 192 + 6 * B us
    Page program: typ 896 us, max 5376 us
    Page size: 256 By
    Suspend: B0h ; Resume: 30h
    DPD: Enter B9h, exit ABh ; delay 40000 ns ; poll 0x3d
    HOLD or RESET Disable: unsupported
    QER: 2
    0-4-4 Mode methods: entry 0x9 ; exit 0x2f
    4-4-4 Mode sequences: enable 0x00 ; disable 0x0
    Soft Reset and Rescue Sequence support: 0x10
    Status Register 1 support: 0x70
    size = <67108864>;
    sfdp-bfp = [
    	e5 20 f1 ff  ff ff ff 03  44 eb 08 6b  08 3b 04 bb
    	ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
    	10 d8 00 ff  23 72 f5 00  82 ed 04 cc  44 83 48 44
    	30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
    	];
    PH1: ffc2 rev 1.0: 4 DW @ 110
    sfdp-ffc2 = [
    	00 36 50 16  9d f9 c0 64  fe cf ff ff  ff ff ff ff
    	];
    PH2: ff84 rev 1.0: 2 DW @ c0
    sfdp-ff84 = [
    	00 00 f0 ff  ff ff ff ff
    	];
    jedec-id = [c2 28 17];
    

    So either something changed to rev 2.0.2, something is different  in your SDK by mistake or there is something broken/misconfigured on your DK.

    For rev2.0.2, here is the changelog:

    https://www.nordicsemi.com/Products/Development-hardware/nRF52840-DK/Download?lang=en#infotabs

  • Hi Sigurd,

    it looks like you are using 2.5.0 from the log. I switched to 2.5.0 and it worked on both the dk and my custom board with some mods to enable RTT logging.

    mx25r6435f@0: SFDP v 1.6 AP ff with 3 PH
    PH0: ff00 rev 1.6: 16 DW @ 30
    Summary of BFP content:
    DTR Clocking not supported
    Addressing: 3-Byte only
    4-KiBy erase: uniform
    Support QSPI XIP
    Support 1-1-1
    Support 1-1-2: instr 3Bh, 0 mode clocks, 8 waits
    Support 1-1-4: instr 6Bh, 0 mode clocks, 8 waits
    Support 1-2-2: instr BBh, 0 mode clocks, 4 waits
    Support 1-4-4: instr EBh, 2 mode clocks, 4 waits
    Flash density: 1048576 bytes
    ET1: instr 20h for 4096 By; typ 48 ms, max 384 ms
    ET2: instr 52h for 32768 By; typ 240 ms, max 1920 ms
    ET3: instr D8h for 65536 By; typ 480 ms, max 3840 ms
    Chip erase: typ 6144 ms, max 36864 ms
    Byte program: type 32 + 1 * B us, max 192 + 6 * B us
    Page program: typ 896 us, max 5376 us
    Page size: 256 By
    Suspend: B0h ; Resume: 30h
    DPD: Enter B9h, exit ABh ; delay 40000 ns ; poll 0x3d
    HOLD or RESET Disable: unsupported
    QER: 2
    0-4-4 Mode methods: entry 0x9 ; exit 0x2f
    4-4-4 Mode sequences: enable 0x00 ; disable 0x0
    Soft Reset and Rescue Sequence support: 0x10
    Status Register 1 support: 0x70
    size = <8388608>;
    sfdp-bfp = [
            e5 20 f1 ff  ff ff 7f 00  44 eb 08 6b  08 3b 04 bb
            ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
            10 d8 00 ff  23 72 f5 00  82 ed 04 b7  44 83 38 44
            30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
            ];
    PH1: ffc2 rev 1.0: 4 DW @ 110
    sfdp-ffc2 = [
            00 36 50 16  9d f9 c0 64  fe cf ff ff  ff ff ff ff
            ];
    PH2: ff84 rev 1.0: 2 DW @ c0
    sfdp-ff84 = [
            00 00 f0 ff  ff ff ff ff
            ];
    jedec-id = [c2 28 14];

    looking at a diff of the changes they are:

    so i've updated the dts file to reflect the differences between the two flash chips:

    &qspi {/* dedicated MX25L */
        compatible= "nordic,nrf-qspi";
        status = "okay";
        pinctrl-0 = <&qspi_default>;
        pinctrl-1 = <&qspi_sleep>;
        pinctrl-names = "default", "sleep";
        mx25r08: mx25r0835f@0 {
            compatible = "nordic,qspi-nor";
            reg = <0>;
            /* MX25R64 supports only pp and pp4io */
            writeoc = "pp4io";
            /* MX25R64 supports all readoc options */
            readoc = "read4io";
            sck-frequency = <8000000>;
            jedec-id = [ c2 28 14  ]; // this is neptune
            sfdp-bfp = [
                e5 20 f1 ff  ff ff 7f 00  44 eb 08 6b  08 3b 04 bb
                ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
                10 d8 00 ff  23 72 f5 00  82 ed 04 b7  44 83 38 44
                30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
            ];
            size = <0x0800000>; /* flash capacity in bits */
            has-dpd;
            t-enter-dpd = <10000>;
            t-exit-dpd = <35000>;
        };

    and it still gives me the same error!

    so it looks like communication with the flash chip on my custom board is fine but i still don't understand why it's not waiting for the flash to erase properly.

Reply
  • Hi Sigurd,

    it looks like you are using 2.5.0 from the log. I switched to 2.5.0 and it worked on both the dk and my custom board with some mods to enable RTT logging.

    mx25r6435f@0: SFDP v 1.6 AP ff with 3 PH
    PH0: ff00 rev 1.6: 16 DW @ 30
    Summary of BFP content:
    DTR Clocking not supported
    Addressing: 3-Byte only
    4-KiBy erase: uniform
    Support QSPI XIP
    Support 1-1-1
    Support 1-1-2: instr 3Bh, 0 mode clocks, 8 waits
    Support 1-1-4: instr 6Bh, 0 mode clocks, 8 waits
    Support 1-2-2: instr BBh, 0 mode clocks, 4 waits
    Support 1-4-4: instr EBh, 2 mode clocks, 4 waits
    Flash density: 1048576 bytes
    ET1: instr 20h for 4096 By; typ 48 ms, max 384 ms
    ET2: instr 52h for 32768 By; typ 240 ms, max 1920 ms
    ET3: instr D8h for 65536 By; typ 480 ms, max 3840 ms
    Chip erase: typ 6144 ms, max 36864 ms
    Byte program: type 32 + 1 * B us, max 192 + 6 * B us
    Page program: typ 896 us, max 5376 us
    Page size: 256 By
    Suspend: B0h ; Resume: 30h
    DPD: Enter B9h, exit ABh ; delay 40000 ns ; poll 0x3d
    HOLD or RESET Disable: unsupported
    QER: 2
    0-4-4 Mode methods: entry 0x9 ; exit 0x2f
    4-4-4 Mode sequences: enable 0x00 ; disable 0x0
    Soft Reset and Rescue Sequence support: 0x10
    Status Register 1 support: 0x70
    size = <8388608>;
    sfdp-bfp = [
            e5 20 f1 ff  ff ff 7f 00  44 eb 08 6b  08 3b 04 bb
            ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
            10 d8 00 ff  23 72 f5 00  82 ed 04 b7  44 83 38 44
            30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
            ];
    PH1: ffc2 rev 1.0: 4 DW @ 110
    sfdp-ffc2 = [
            00 36 50 16  9d f9 c0 64  fe cf ff ff  ff ff ff ff
            ];
    PH2: ff84 rev 1.0: 2 DW @ c0
    sfdp-ff84 = [
            00 00 f0 ff  ff ff ff ff
            ];
    jedec-id = [c2 28 14];

    looking at a diff of the changes they are:

    so i've updated the dts file to reflect the differences between the two flash chips:

    &qspi {/* dedicated MX25L */
        compatible= "nordic,nrf-qspi";
        status = "okay";
        pinctrl-0 = <&qspi_default>;
        pinctrl-1 = <&qspi_sleep>;
        pinctrl-names = "default", "sleep";
        mx25r08: mx25r0835f@0 {
            compatible = "nordic,qspi-nor";
            reg = <0>;
            /* MX25R64 supports only pp and pp4io */
            writeoc = "pp4io";
            /* MX25R64 supports all readoc options */
            readoc = "read4io";
            sck-frequency = <8000000>;
            jedec-id = [ c2 28 14  ]; // this is neptune
            sfdp-bfp = [
                e5 20 f1 ff  ff ff 7f 00  44 eb 08 6b  08 3b 04 bb
                ee ff ff ff  ff ff 00 ff  ff ff 00 ff  0c 20 0f 52
                10 d8 00 ff  23 72 f5 00  82 ed 04 b7  44 83 38 44
                30 b0 30 b0  f7 c4 d5 5c  00 be 29 ff  f0 d0 ff ff
            ];
            size = <0x0800000>; /* flash capacity in bits */
            has-dpd;
            t-enter-dpd = <10000>;
            t-exit-dpd = <35000>;
        };

    and it still gives me the same error!

    so it looks like communication with the flash chip on my custom board is fine but i still don't understand why it's not waiting for the flash to erase properly.

Children
  • robsrick said:
    it looks like you are using 2.5.0 from the log

    Right, yes. Had an off by one error I guess.

    Good to see that you now got to read out the configurations. If it did not help with this issue, hopefully it will iron out some other issues that you now will not meet further down the road.

    Here is what I found looking more at the issue you see:

    Searching for your external flash I found this blog. And it does not mention your issue either, so that is interesting.
    There have been some recent changes in our QSPI drivers, so maybe you can try with v2.1.0 just to see if the issue has been introduced recently?

    Looking at our driver. From your good observations, I agree that the issue seems to be that the device does not wait for the erase to complete before uniniting flash. My prime suspect then would be that something goes wrong in qspi_wait_for_completion().  This does different things depending on CONFIG_MULTITHREADING. What is that in your project? And what happens if you try the other option?

    I think it is most logical to use multithreading in an RTOS, so if I assume that it is on:
    The semaphore we wait for is given by qspi_complete, called by qspi_handler(). I don't know if it will give much, but you could maybe check if that function is triggered too early? And maybe it is triggered again after the flash is erased or something?

  • Took a little while to get 2.1.0 running.. but i have results

    running 2.1.0 works correctly! when it gets to qspi_device_uninit(dev) called right at the end of qspi_erase() it runs this bit of code and will loop here until the flash erase has finished (15 secs or so)

    Running 2.5.1 it instead runs some power management specific code and ignores that previous loop that checks if the flash is busy.

    when i step through pm_device_runtime_put(dev) it eventually calls qspi_nor_pm_action with the action PM_DEVICE_ACTION_SUSPEND. this checks the same nrfx_qspi_mem_busy_check() but instead of waiting for it to stop being busy it just returns error EBUSY which i believe is the source of my problems.

    Do you agree? and can you suggest a fix / when this can be fixed in the sdk

    Regards

    Robert

  • That is interesting indeed. I will ask our QSPI developers about this.

  • I heard back from our developers, and gave me some updates to v2.5.1 which could be the cause. However, you also see this on v2.5.0 so its not likely it is those.

    Now, this gives me a dilemma.

    We try to take up as little as possible time from our developers, so they can focus on improving our stuff. Ideally, it would be nice if you could test for some newer NCS versions (binary search between v2.1.0 and v2.5.0), so we can figure out exactly when the error happens, so I can give the developers a more detailed report on the issue.

    On the other hand, you took a while to get v2.1.0 running, so I don't really want to ask you to do all of that work again either. If I could reproduce this on my side I could do the work myself, but alas I do not have the flash you have.

    That leaves me with a third option: Looking at the history from the file we suspect has the issue, and trying to reason my way to what could go wrong.
    From the history, these commits are new since v2.1.0:

    Out of these, here are my main suspects:

    Just now I had another look at your v2.5.1 code, and I see that some of it is grayed out because CONFIG_PM__DEVICE_RUNTIME is set. (there is an #else which I missed before)
    Do you need CONFIG_PM__DEVICE_RUNTIME for your project? Can you try to disable this configuration and see if you still get the same error?

  • Hi Sigurd

    My understanding of the power management library is it's critical for putting devices to sleep when not used. For my use case this is very important because our device has to last for years on a small battery. The whole project is set up to use pinctrl style definitions in the device tree which i think depend on power management? So i don't really want to change this.

    Understand you can't debug this directly because you don't have access to my flash chip, but i wonder could you step through the same code on a nrf52840dk and see why it works where mine doesn't?

    Presumably with the flash chip that is on the DK the pm_device_runtime_put() must do something different? the this would probably give a useful clue.

    Regards

    Robert

Related