FDS from SDK 17.2.0 gets corrupted

We are using FDS from SDK 17.2.0 on nRF52832.

It is working so far, we can store and read back data.

When inserting and removing the battery quickly, we sometimes end up in an endless reset loop. It does not happen every time, but we can usually reproduce it within a few minutes:

[00:00:00.000,000] <info> app: Initializing fds...
[00:00:00.000,000] <error> app: ERROR 34314 [Unknown error code] at :0
PC at: 0x00000000
[00:00:00.000,000] <error> app: End of error report
[00:00:00.000,000] <warning> app: System reset

34314 -> 0x860A -> FDS_ERR_NO_PAGES
The error is raised here:
void flash_init(void) {
    ret_code_t rc;
    /* Register first to receive an event when initialization is complete. */
    (void) fds_register(fds_evt_handler);
    NRF_LOG_INFO("Initializing fds...");
    rc = fds_init();
    APP_ERROR_CHECK(rc);
    /* Wait for fds to initialize. */
    wait_for_fds_ready();
    /* do a garbage collection for a clean start state */
    fds_gc();
}
The failing check is the APP_ERROR_CHECK after fds_init().
I read out the flash with nRF Connect (PC version, via JTAG) and saved it to a hex file.
Then I reproduced the problem (the endless loop), read out the flash again, and saved it to a second hex file.
I then converted both *.hex files to *.bin and then to *.txt. Almost everything is identical, except for the FDS blocks.
Correct flash content:

00075000: DE C0 AD DE FF 01 1E F1 FF FF FF FF FF FF FF FF    ................
00075010: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00075020: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00075030: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00075040: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................

00076000: DE C0 AD DE FE 01 1E F1 00 00 01 00 10 A0 A6 ED    ................
00076010: 03 00 00 00 03 00 00 00 00 A0 01 00 10 A0 93 7B    ...............{
00076020: 04 00 00 00 04 00 00 00 FF FF FF FF FF FF FF FF    ................
00076030: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................

00077000: DE C0 AD DE FE 01 1E F1 FF FF FF FF FF FF FF FF    ................
00077010: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00077020: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
Broken flash content:
00075000: DE C0 AD DE FE 01 1E F1 06 A0 01 00 10 A0 CD 5D    ...............]
00075010: 0E 00 00 00 01 00 00 00 08 A0 01 00 10 A0 22 54    .............."T
00075020: 18 00 00 00 45 0B 00 00 00 00 01 00 10 A0 97 BA    ....E...........
00075030: 25 00 00 00 20 00 00 00 00 A0 01 00 10 A0 56 04    %... .........V.
00075040: 26 00 00 00 21 00 00 00 FF FF FF FF FF FF FF FF    &...!...........
00075050: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00075060: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00075070: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00076000: DE C0 AD DE FE 01 1E F1 FF FF FF FF FF FF FF FF    ................
00076010: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00076020: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00077000: DE C0 AD DE FE 01 1E F1 FF FF FF FF FF FF FF FF    ................
00077010: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00077020: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
00077030: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ................
It looks like the broken flash has 3 data storage blocks, instead of 1 garbage collection block and 2 data storage blocks.
Any idea what it could be? Or is this perhaps a known problem?
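
The comparison above can be made mechanical by decoding the first two words of each page. The following is a minimal host-side sketch, not code from the SDK: the three tag values are read directly from the dumps above (little-endian), and the names PAGE_TAG_*, page_kind_t, classify_page and count_swap_pages are hypothetical helpers chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag words as they appear in the dumps above (stored little-endian). */
#define PAGE_TAG_MAGIC 0xDEADC0DEu /* bytes DE C0 AD DE */
#define PAGE_TAG_SWAP  0xF11E01FFu /* bytes FF 01 1E F1 */
#define PAGE_TAG_DATA  0xF11E01FEu /* bytes FE 01 1E F1 */

typedef enum { PAGE_UNKNOWN, PAGE_SWAP, PAGE_DATA } page_kind_t;

/* Assemble a 32-bit little-endian word from four dump bytes. */
static uint32_t le32(const uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Classify a page from its first 8 bytes (magic word + tag word). */
static page_kind_t classify_page(const uint8_t *first8)
{
    if (le32(first8) != PAGE_TAG_MAGIC) {
        return PAGE_UNKNOWN;
    }
    uint32_t tag = le32(first8 + 4);
    if (tag == PAGE_TAG_SWAP) {
        return PAGE_SWAP;
    }
    if (tag == PAGE_TAG_DATA) {
        return PAGE_DATA;
    }
    return PAGE_UNKNOWN;
}

/* Count how many of the n page headers carry the swap tag. */
static size_t count_swap_pages(const uint8_t headers[][8], size_t n)
{
    size_t swaps = 0;
    for (size_t i = 0; i < n; i++) {
        if (classify_page(headers[i]) == PAGE_SWAP) {
            swaps++;
        }
    }
    return swaps;
}
```

Fed with the first 8 bytes of 0x75000, 0x76000 and 0x77000, this counts one swap page in the correct dump and none in the broken dump, which matches the FDS_ERR_NO_PAGES result.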
  • Hello,

    We see this from time to time, but we are not able to reproduce it in a stable manner. Do you have a way to reproduce this starting off with a clean flash? (clean FDS pages, all 0xFF).

    If so, please let me know.

    However, in most, if not all, of these cases, the application that runs into this issue calls fds_gc() during startup. Is your application doing this? I don't recommend it, for a couple of reasons:

    1: It wears out the flash, using up one flash erase cycle on every power cycle. This means that the device is only guaranteed 10 000 power cycles, which isn't a whole lot.

    2: It will at some point trigger this bug in some devices. We are not sure exactly what is happening, which is why I am interested in whether you have a way of reproducing it, but it is related to the battery running out and the application being reset during fds_gc().

    When the battery is running low, the nRF will shut down because the supply voltage is not sufficient. When it shuts down and stops drawing current, the battery voltage recovers, because no current is flowing through its internal resistance, and the nRF powers back up. These power cycles get shorter and shorter until the voltage is too low and the device doesn't turn on anymore.

    This is why calling fds_gc() on startup is not a good idea: at some point, the device will power cycle during fds_gc() several times in a row, which may lead to the state you are seeing. The issue persists even after the battery is changed or charged, because the FDS pages are tagged incorrectly.

    Looking at your code snippet now :)

        /* do a garbage collection for a clean start state */
        fds_gc();

    Remove this.

    Now, there is a tiny chance that this may happen even though fds_gc() is not called on startup, because you may run out of battery at the moment the flash is actually full and needs garbage collection, but the risk of this is incredibly much smaller than what you are facing now.
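
The "flash is actually full" case is the one where garbage collection is really needed, so one option is to request it only then. Below is a minimal sketch of that decision as a pure function. It is an illustration under assumptions, not SDK code: flash_usage_t and gc_worthwhile are hypothetical names, and the two fields are meant to mirror largest_contig and freeable_words as reported by fds_stat(), so check them against your SDK's fds.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of file system usage, all sizes in 32-bit words.
 * Assumed to be filled from the values fds_stat() reports. */
typedef struct {
    uint16_t largest_contig; /* largest free contiguous run */
    uint16_t freeable_words; /* words garbage collection could reclaim */
} flash_usage_t;

/* Decide whether a garbage collection is worth requesting before writing
 * a record of words_needed words (the caller includes header overhead). */
static bool gc_worthwhile(const flash_usage_t *usage, uint16_t words_needed)
{
    if (usage->largest_contig >= words_needed) {
        return false; /* the record fits, leave the flash alone */
    }
    /* No room: collecting only helps if there is something to reclaim. */
    return usage->freeable_words > 0;
}
```

On the target, fds_gc() only queues the operation, so the pending write would have to be retried after the FDS_EVT_GC event arrives rather than immediately after the call.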

    You can implement a workaround, which isn't too hard really. You need to find a flash page that has the data tag but doesn't contain any data. There may be several, but one is sufficient. In your case, the candidates are 0x76000 and 0x77000. Erase one of those pages and reset the device, and FDS should fix it during the next startup.

    There are functions in fds.c for reading page tags and erasing pages that you can use for this workaround.
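
The page selection for this workaround can be sketched as follows. This is a host-testable illustration under assumptions, not the fds.c implementation: the tag values are taken from the flash dumps in the question, and has_tag, is_empty_data_page and find_page_to_erase are hypothetical helper names. The erase itself is left to whatever flash API the project already uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values as seen in the flash dumps (read as little-endian words). */
#define PAGE_TAG_MAGIC 0xDEADC0DEu
#define PAGE_TAG_SWAP  0xF11E01FFu
#define PAGE_TAG_DATA  0xF11E01FEu
#define ERASED_WORD    0xFFFFFFFFu
#define TAG_WORDS      2u /* magic word + tag word */

static bool has_tag(const uint32_t *page, uint32_t tag)
{
    return page[0] == PAGE_TAG_MAGIC && page[1] == tag;
}

/* A data page is empty if every word after the tag is still erased. */
static bool is_empty_data_page(const uint32_t *page, size_t words_per_page)
{
    if (!has_tag(page, PAGE_TAG_DATA)) {
        return false;
    }
    for (size_t i = TAG_WORDS; i < words_per_page; i++) {
        if (page[i] != ERASED_WORD) {
            return false;
        }
    }
    return true;
}

/* Return the index of the first empty data page to erase, or -1 if a swap
 * page is already present or no empty data page exists. */
static int find_page_to_erase(const uint32_t *const pages[], size_t n_pages,
                              size_t words_per_page)
{
    int candidate = -1;
    for (size_t i = 0; i < n_pages; i++) {
        if (has_tag(pages[i], PAGE_TAG_SWAP)) {
            return -1; /* swap page exists, nothing to repair */
        }
        if (candidate < 0 && is_empty_data_page(pages[i], words_per_page)) {
            candidate = (int)i;
        }
    }
    return candidate;
}
```

For the broken dump in the question (pages at 0x75000, 0x76000, 0x77000), this would select index 1, the page at 0x76000. Such a check would have to run before fds_init(), since fds_init() is the call that fails.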

    Best regards,

    Edvin
