Fstorage randomly erasing

Our team has been using Fstorage for a long time to store customer data on the device for customization. At random times, a very small number of units end up completely erased after Fstorage is executed. It looks as if the erase is processed but the write is not: all the registers read 0xFF.

When the device is reprogrammed, it functions normally, and to date the problem has never recurred on the same unit.

Can you offer any insight as to what is happening? The device runs on a coin cell battery, and the problem occurs even when the battery is new. Since it is a very rare occurrence, we cannot readily reproduce it to debug.

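  • void update_data(uint8_t *device_data)
    {
        ret_code_t rc;

        // Erase one flash page starting at 0x4c000, where the customer data is stored.
        rc = nrf_fstorage_erase(&fstorage, 0x4c000, 1, NULL);
        APP_ERROR_CHECK(rc);

        // Write the new data to the start of the freshly erased page.
        rc = nrf_fstorage_write(&fstorage, 0x4c000, device_data, data_size, NULL);
        APP_ERROR_CHECK(rc);
    }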

    Here is how we are writing to the flash.

  • I have a "hunch" that you are possibly doing an erase early in main(). If so, and if there is some ripple or bounce on VDD when the coin cell battery is inserted, the MCU may well have enough time to start an erase before the bouncing battery voltage falls below the minimum operating conditions. What happens in that case is somewhat undefined, and it can cause the erase operation to "malfunction". Can you add a 100 ms delay at the start of main() to ensure that the battery is stable before the rest of the code runs (in particular any code related to flash operations)? If you have a bootloader, make sure to add this delay at the start of the bootloader.

    Kenneth

  • Hi Kenneth,

    Thank you for the suggestion. Though we do not have an erase in main, we obviously initialize the memory location with

    rc = nrf_fstorage_init(&fstorage, p_fs_api, NULL);

    We also have a read in main. 

    rc = nrf_fstorage_read(&fstorage, 0x4c000, device_data, data_size);

    Could it be happening at initialization of the memory or with a read? We may just put a delay before the init just to be safe. 

    An erase occurs (rarely) only after the user has paired with the unit and the unit has been on for a while. The only time we change the memory is via user input, which requires pairing via ANT+ or BLE, which means the code has been running for a bit. When the issue does occur, it occurs only when the user provides an input and changes the contents of the memory.

    We would appreciate any other suggestions you could think of.

  • I'm slightly running out of ideas. Typically an erase operation is a time-consuming task that blocks all other MCU execution. Do you have any indication that you may (in a corner case) be drawing a lot of current while the erase operation is ongoing? For example, an erase operation can take up to 100 ms; is it possible that you have LEDs or other circuitry that may draw excessive current during this period, enough for the coin cell to drop below operating conditions?

    Kenneth

  • That is a good idea. We don't have LEDs or other high-current devices. The coin cell voltage to the MCU is regulated, so even if the coin cell drops, the regulator holds the supply at 3 V. We are at a loss too, since the problem is rare and not repeatable.

  • I would suggest that the flash erase works fine and it is the flash write that fails, which is why the flash appears to get erased unexpectedly. Without delving into the flash code (which should monitor VDD before a write), I would recommend monitoring both VDD and the coin cell voltage before issuing the write. The erase may leave the supply still recovering from the power surge it caused, which lowers the coin cell voltage at the regulator input (so less headroom and less available energy) and may even lower the regulator output. A simple workaround is to never follow a flash erase with an immediate flash write without allowing a lithium recovery time: delay at least 200 ms, more if possible. The voltage dip on the coin cell will be visible on a 'scope. Why no error code? I doubt that the write function realises the flash chip blocked the write due to falling voltage; a read-back is required.
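
The erase, recovery delay, write and read-back sequence described above could look roughly like the sketch below. It is a host-side illustration only, not the SDK's implementation: a RAM array stands in for the flash page, and `sim_page`, `sim_erase`, `sim_write`, `sim_read`, `sim_drop_write`, `settle_delay_ms`, `RECORD_MAX` and `update_data_verified` are all hypothetical names. On the target the stubs would be replaced by the `nrf_fstorage_erase`, `nrf_fstorage_write` and `nrf_fstorage_read` calls already used in this thread plus a real delay. Depending on the fstorage backend, those calls may complete asynchronously, so a real version would also need to wait for each operation to finish before reading back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RECORD_MAX 64u  /* assumed upper bound on the stored record size */

/* Hypothetical RAM stand-in for the flash page, so the sketch runs on a host. */
static uint8_t sim_page[RECORD_MAX];

/* Fault injection: when true, writes are silently dropped, which models the
 * suspected failure (erase succeeds, write does not, no error is reported). */
static bool sim_drop_write;

/* Erased flash reads back as 0xFF. */
static void sim_erase(void) { memset(sim_page, 0xFF, sizeof sim_page); }

/* Flash programming can only clear bits, hence the AND. */
static void sim_write(const uint8_t *src, size_t len)
{
    if (sim_drop_write) {
        return;
    }
    for (size_t i = 0; i < len; i++) {
        sim_page[i] &= src[i];
    }
}

static void sim_read(uint8_t *dst, size_t len) { memcpy(dst, sim_page, len); }

/* Placeholder: on the target this would be a real blocking delay. */
static void settle_delay_ms(uint32_t ms) { (void)ms; }

/* Erase, let the supply recover, write, then read back and compare.
 * Returns true only if the stored bytes match the requested data. */
bool update_data_verified(const uint8_t *data, size_t len)
{
    uint8_t check[RECORD_MAX];

    assert(data != NULL && len <= RECORD_MAX);

    sim_erase();
    settle_delay_ms(200u);   /* recovery time suggested in the post above */
    sim_write(data, len);

    sim_read(check, len);
    return memcmp(check, data, len) == 0;
}
```

A caller can retry or flag an error when `update_data_verified` returns false, instead of discovering a blank page on the next boot.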

  • That is a great suggestion. We did just that experiment and VDD seems to be holding fairly well. The regulator is a boost with ultra-low dropout. I believe you are correct that the erase is being executed and the write is not. We are now thinking that the erase/write sequence is being interrupted between the two operations by another call to the same erase/write function. The erase/write is inside a function which can be called from an interrupt. Could this be the problem?
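
If re-entry from an interrupt is the suspect, one common way to rule it out is to keep flash operations out of interrupt context entirely: the interrupt only records the request, and the main loop performs the erase/write. The sketch below illustrates that pattern on a host. It is an assumption-laden illustration, not code from this project: `DATA_SIZE`, `request_update_from_isr`, `process_pending_update`, `flash_update`, `last_written` and `flash_updates` are hypothetical names, and `flash_update` merely stands in for the erase/write sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_SIZE 16u  /* assumed record size, for illustration only */

static volatile bool update_pending;      /* set in interrupt, cleared in main loop */
static uint8_t pending_data[DATA_SIZE];   /* latest data requested by the interrupt */
static uint8_t last_written[DATA_SIZE];   /* what the stand-in "flash" last stored */
static unsigned flash_updates;            /* number of erase/write sequences run */

/* Hypothetical stand-in for the erase/write sequence. */
static void flash_update(const uint8_t *data)
{
    memcpy(last_written, data, DATA_SIZE);
    flash_updates++;
}

/* Safe to call from interrupt context: it never touches the flash,
 * it only stores the newest data and raises a flag. */
void request_update_from_isr(const uint8_t *data)
{
    assert(data != NULL);
    memcpy(pending_data, data, DATA_SIZE);
    update_pending = true;
}

/* Call from the main loop only. Because this is the single place where
 * the flash is modified, an erase/write sequence cannot be re-entered.
 * Returns true if an update was performed. */
bool process_pending_update(void)
{
    uint8_t snapshot[DATA_SIZE];

    if (!update_pending) {
        return false;
    }

    /* On the target, interrupts should be masked around the next two
     * lines so the interrupt cannot modify pending_data mid-copy. */
    update_pending = false;
    memcpy(snapshot, pending_data, DATA_SIZE);

    flash_update(snapshot);
    return true;
}
```

With this arrangement, several requests arriving before the main loop runs collapse into a single erase/write that uses the most recent data.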
