Flash Data Storage (FDS) fds_record_update() -> Reset -> Two valid records possible?

  • SDK: 15.2
  • SoftDevice: s140_nrf52_6.1.0
  • FreeRTOS
  • Hardware: nRF52840 (Custom boards)

Note: I have ported all fixes for FDS from SDK 17.1.0 to my SDK (15.2).

Hello,

Rarely, I have seen an issue where my device will load an old value from NVM (fds_record_open()) after having previously updated the record with fds_record_update(). From my understanding, this can occur if an error occurred during the write or FDS_ERR_OPERATION_TIMEOUT occurred due to excessive BLE activity.

Question 1: When fds_record_update() updates a record and invalidates the previous entry, is it possible that a device reset (NVIC_SystemReset(), watchdog, etc) or brownout could result in two valid records in flash?

From the Nordic InfoCenter: "When you update a record, FDS actually creates a new record and invalidates the old one. This scheme ensures that data is not lost if there is a power loss in the middle of the operation."

Source: https://infocenter.nordicsemi.com/topic/sdk_nrf5_v17.1.0/lib_fds_functionality.html?cp=9_1_3_16_1_1_1#lib_fds_functionality_update 

I.e., what happens if the device resets after the new record is written but before the old record is invalidated? It seems like you would have two valid records on reset.

Question 2: If the answer to question 1 is "yes", then how does FDS handle this scenario? How should an application handle this scenario?

Edit:

I have a few more questions:

Question 3: Can FDS_CRC_CHECK_ON_READ and FDS_CRC_CHECK_ON_WRITE be enabled without erasing existing records?

For example, if I want to turn this feature on in a future OTA firmware update to a device already in the field?

Question 4: How should an application handle FDS_ERR_OPERATION_TIMEOUT when updating or writing a record? How does the peer manager handle this?

The only solution I know is to always keep your data structures/elements in RAM and write them to flash. If the write fails, at least the RAM copy will be up to date and you can asynchronously queue another write attempt until it succeeds. Looking at the peer manager code, it looks like it only checks for FDS_ERR_OPERATION_TIMEOUT when garbage collection fails (PM_EVT_FLASH_GARBAGE_COLLECTION_FAILED). Why doesn't it check for this when writing bonding information to flash? Maybe it does and I am just missing it.

Thanks!

Derek

  • Hi,

    Question 1: When fds_record_update() updates a record and invalidates the previous entry, is it possible that a device reset (NVIC_SystemReset(), watchdog, etc) or brownout could result in two valid records in flash?
    I.e., what happens if the device resets after the new record is written but before the old record is invalidated? It seems like you would have two valid records on reset.

    Yes. You are correct that a reset at the wrong time, or failure (after a number of retries) to mark the old record for deletion, may leave you with two records of the same Record key + File ID combination.

    Question 2: If the answer to question 1 is "yes", then how does FDS handle this scenario? How should an application handle this scenario?

    FDS allows multiple records with the same Record key, multiple records with the same File ID, and multiple records of the same combination of the two. How to use those two tags, as well as what to do in the case of multiple records, is up to the application.

    Please note that you cannot use search order or compare addresses to decide which record is the old one and which is the updated one. Record IDs, on the other hand, always increase, so the updated version of the record is the one with the highest Record ID. In the case of multiple copies (due to an update failing at the wrong time) it should be safe to delete the copy with the lowest Record ID.
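The "highest Record ID wins" rule can be sketched in plain C. This is only an illustration: `newest_record_index` is a hypothetical helper, not part of FDS, and it assumes the application has already collected the Record IDs of all matching records (for example by calling fds_record_find() repeatedly with the same find token and reading each ID with fds_record_id_from_desc()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of FDS): given the Record IDs of every record
 * that matched one File ID + Record key pair, return the index of the newest
 * record, i.e. the one with the highest Record ID. */
size_t newest_record_index(const uint32_t *ids, size_t count)
{
    assert(ids != NULL && count > 0);

    size_t newest = 0;
    for (size_t i = 1; i < count; i++) {
        if (ids[i] > ids[newest]) {
            newest = i;
        }
    }
    return newest;
}
```

Every index other than the returned one is then a stale copy and a candidate for fds_record_delete().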

    Question 3: Can FDS_CRC_CHECK_ON_READ and FDS_CRC_CHECK_ON_WRITE be enabled without erasing existing records?

    For example, if I want to turn this feature on in a future OTA firmware update to a device already in the field?

    Enabling CRC checking will invalidate existing records, since CRC values are only computed if CRC checking is enabled. This means old records (written before CRC checking was enabled) will contain a CRC field of 0 instead of the correct CRC value.
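To illustrate why a record stored with a CRC field of 0 fails a later check, here is a minimal sketch. It assumes a CRC-16/CCITT-FALSE checksum (polynomial 0x1021, initial value 0xFFFF) purely for illustration; the thread does not state which CRC FDS uses or which bytes it covers, and `demo_record_t` and `demo_record_crc_ok` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF).
 * Assumption: chosen for illustration only, not taken from the FDS source. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Hypothetical record: a payload plus the CRC value stored next to it. */
typedef struct {
    const uint8_t *payload;
    size_t         len;
    uint16_t       stored_crc;
} demo_record_t;

/* A read-time check compares the stored CRC with a freshly computed one. */
bool demo_record_crc_ok(const demo_record_t *rec)
{
    assert(rec != NULL);
    return crc16_ccitt(rec->payload, rec->len) == rec->stored_crc;
}
```

A record written while CRC was disabled would carry `stored_crc == 0`, so the comparison fails unless the payload's CRC happens to be 0.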

    Question 4: How should an application handle FDS_ERR_OPERATION_TIMEOUT when updating or writing a record? How does the peer manager handle this?

    You can have a look at the peer manager implementation for reference. Basically, it retries a number of times before giving up, keeping the data in RAM so that it is available for retries, as you suggest.
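A bounded retry scheme of this kind could look roughly like the sketch below. It is not the peer manager's code: all names (`store_job_t`, `store_result_t`, and so on) are hypothetical, and it assumes the application feeds in the result of each write attempt, for instance from its FDS event handler, while the RAM copy stays authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes standing in for FDS results such as
 * success or FDS_ERR_OPERATION_TIMEOUT. */
typedef enum { STORE_OK, STORE_TIMEOUT, STORE_FATAL } store_result_t;

typedef enum { JOB_PENDING, JOB_STORED, JOB_FAILED } job_state_t;

/* Tracks one outstanding write of the RAM copy. */
typedef struct {
    job_state_t state;
    unsigned    retries_left;
} store_job_t;

void store_job_start(store_job_t *job, unsigned max_retries)
{
    assert(job != NULL);
    job->state        = JOB_PENDING;
    job->retries_left = max_retries;
}

/* Feed in the result of one write attempt. Returns true if the caller
 * should queue another attempt from the RAM copy. */
bool store_job_on_result(store_job_t *job, store_result_t result)
{
    assert(job != NULL && job->state == JOB_PENDING);

    if (result == STORE_OK) {
        job->state = JOB_STORED;
        return false;
    }
    if (result == STORE_TIMEOUT && job->retries_left > 0) {
        job->retries_left--;
        return true;   /* transient: try again later */
    }
    job->state = JOB_FAILED;   /* out of retries or non-transient error */
    return false;
}
```

Only timeouts are retried here; treating other errors as final is a design choice of this sketch, and an application may reasonably decide otherwise.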

    Regards,
    Terje

  • Thanks for the info!

    A few more follow-up questions and I should be set.

    1. To catch the scenario where two records with the same file ID and record key could exist, do I need to always call fds_record_find() twice, take the record with the highest ID, and delete the other?

    2. If there are two records with the same file ID and record key, will calling fds_record_update() invalidate BOTH of the old records at the same time? This would save me from having to explicitly call delete() on the duplicate record.

    3. If FDS_CRC_CHECK_ON_READ and FDS_CRC_CHECK_ON_WRITE are used, what is the performance impact on FDS operations from a power consumption standpoint? I assume this is negligible but just double checking.

    Thanks!

    Derek

  • Hi,

    droberson said:
    1. To catch the scenario where two records with the same file ID and record key could exist, do I need to always call fds_record_find() twice, take the record with the highest ID, and delete the other?

    That should do it, yes. You may be able to work out a way to accurately predict when this extra check is needed, but the naive (and more fail-safe) approach is to always search for more records and use the one with the highest Record ID.

    droberson said:
    2. If there are two records with the same file ID and record key, will calling fds_record_update() invalidate BOTH of the old records at the same time? This would save me from having to explicitly call delete() on the duplicate record.

    No, the update functionality updates only one entry (the one whose descriptor you pass in), not every record with the same tags.

    droberson said:
    If FDS_CRC_CHECK_ON_READ and FDS_CRC_CHECK_ON_WRITE are used, what is the performance impact on FDS operations from a power consumption standpoint? I assume this is negligible but just double checking.

    For most use cases I agree this should be negligible, yes. There are a few lines of code calculating and reading/writing the CRC value, but from what I can tell the CRC is part of words that are already written to and read from flash anyway, so disabling it will not reduce the number of flash operations (which are what consume time and power). The main overhead will then be the CRC calculation itself. The best way to know would be to measure for your exact use case, which you should do anyway if power consumption is crucial, but I suspect there are other areas better suited for power optimization than the CRC setting for FDS records. You still have the option to choose, for the cases where the trade-off between CRC checks and power consumption matters.

    Regards,
    Terje
