Unexpected NRF_EVT_FLASH_OPERATION_SUCCESS with nRF5_SDK v17.1.0 & S140 v7.3.0

We're using sd_flash_write() and sd_flash_page_erase() to store some data to internal NOR Flash at runtime. These calls are integrated with our RTOS in a HAL module in our code. Specifically, after making one of these calls, we make a blocking RTOS call to wait on an RTOS event that is set by our NRF_SDH_SOC_OBSERVER when either NRF_EVT_FLASH_OPERATION_SUCCESS or NRF_EVT_FLASH_OPERATION_ERROR occurs. This allows the calling thread to block and the RTOS to schedule other threads or idle while we're waiting for the Flash operation to complete.

In general, this has worked great, but in a recent round of testing we encountered an intermittent data corruption issue when erasing and then writing out several KB of data in 256-byte chunks; when we read the data back to verify it, it was wrong. We were able to identify that this data corruption was caused by our 256-byte buffer getting overwritten with the next chunk of data before the previous chunk had been fully written out. That in turn was caused by our getting an extra NRF_EVT_FLASH_OPERATION_SUCCESS event, which left our RTOS event already set before the next sd_flash_write() call was made, so the wait after that call returned immediately instead of blocking until the write had completed. Furthermore, we typically see this happen following the first sd_flash_write() after a sd_flash_page_erase() call. The order we see when this occurs is this:

  • We call sd_flash_page_erase()
  • We get NRF_EVT_FLASH_OPERATION_SUCCESS
  • We call sd_flash_write()
  • We get NRF_EVT_FLASH_OPERATION_SUCCESS
  • We get NRF_EVT_FLASH_OPERATION_SUCCESS, again!!
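
To make the failure mode concrete, here is a minimal host-side model of that sequence. It is only a sketch and not our actual HAL code: it assumes the wait is backed by a latched binary event flag, and every name in it (event_flag, ops_pending, replay_sequence(), and so on) is a hypothetical stand-in rather than an SDK or RTOS API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not SDK or RTOS APIs:
 * event_flag  models the latched RTOS event set by the SoC observer,
 * ops_pending models flash operations that have not really finished. */
static bool event_flag;
static int  ops_pending;

/* Models calling sd_flash_page_erase() or sd_flash_write(). */
static void start_flash_op(void) { ops_pending++; }

/* Models the observer seeing a success event. A genuine event retires one
 * pending operation; the extra event retires nothing but still sets the flag. */
static void on_success_event(bool genuine)
{
    if (genuine) {
        assert(ops_pending > 0);
        ops_pending--;
    }
    event_flag = true;
}

/* Models the blocking wait: reports whether the caller would resume at once
 * because the flag was already set, and consumes the flag either way. */
static bool wait_returns_immediately(void)
{
    bool was_set = event_flag;
    event_flag = false;
    return was_set;
}

/* Replays erase, write 1, (optional extra event), write 2. Returns true if
 * the thread resumes while write 2 is still pending, i.e. the buffer could
 * be refilled too early. */
bool replay_sequence(bool inject_extra_event)
{
    event_flag  = false;
    ops_pending = 0;

    start_flash_op();                 /* erase */
    on_success_event(true);
    (void)wait_returns_immediately();

    start_flash_op();                 /* write 1 */
    on_success_event(true);
    (void)wait_returns_immediately();

    if (inject_extra_event) {
        on_success_event(false);      /* the unexpected second success */
    }

    start_flash_op();                 /* write 2, not yet completed */
    bool resumed = wait_returns_immediately();
    return resumed && ops_pending > 0;
}
```

In this model, replay_sequence(true) reports an early resume and replay_sequence(false) does not, which matches the corruption we observe only when the extra event is present.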

But again, the problem is very intermittent. There's also an additional factor, which we identified by creating some test code that repeatedly performs erases and writes to Flash using our HAL module. When we run this code with the SoftDevice enabled but BLE not active (no advertising or connected devices), we never see the problem. But it shows up quickly after we enable BLE advertising while running this test code. Clearly something about having BLE active, which has a noticeable effect on the timing of our test, is causing this extra NRF_EVT_FLASH_OPERATION_SUCCESS event.

I should also mention that we don't have any other code in our application firmware — which includes the nRF5 SDK — that erases or writes to internal NOR Flash, which would seem to eliminate the most likely suspect for an extra NRF_EVT_FLASH_OPERATION_SUCCESS event being set. Specifically, we replaced the standard BLE peer_data_storage component with a custom version that stores peer data elsewhere, and this allowed us to strip out the whole FDS subsystem, so there is literally no code in our application that is calling sd_flash_write() other than the one instance in our HAL module, which is also protected by a mutex.

We also don't think the SoftDevice itself should be writing anything to internal NOR Flash. While the SoftDevice does take over the NVMC peripheral controller on the nRF52840, our understanding was that this was done in order to prevent BLE operations from interfering with Flash operations, not because the SoftDevice actually writes any data to internal NOR Flash itself. For instance, it's not clear where it would write such data other than the UICR registers, plus we're seeing this occur even if we just turn on BLE advertising.

Are we correct that this could not be caused by an internal write from the SoftDevice that did not come from our firmware calling sd_flash_write()?

While we have implemented a workaround for this extra NRF_EVT_FLASH_OPERATION_SUCCESS event, the workaround is still theoretically subject to potential race conditions, and so we're trying to understand why we're seeing this extra NRF_EVT_FLASH_OPERATION_SUCCESS event. Can you explain why we might be seeing this occur?
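
For illustration, one possible shape for such a workaround is sketched below as a host-side model. This is an assumption made for discussion, not necessarily what our HAL does: it clears the event right before starting an operation and ignores completion events that arrive while no operation is in flight. All names (op_in_flight, begin_flash_op(), and so on) are hypothetical, and real firmware would additionally need to guard the flag updates against the context in which the observer runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not SDK or RTOS APIs. */
static bool     event_flag;      /* latched RTOS event */
static bool     op_in_flight;    /* set between starting an op and its event */
static unsigned ignored_events;  /* completion events seen while idle */

static void reset_model(void)
{
    event_flag     = false;
    op_in_flight   = false;
    ignored_events = 0;
}

/* Models the HAL starting an erase or write. */
static void begin_flash_op(void)
{
    assert(!op_in_flight);  /* the HAL mutex allows one operation at a time */
    event_flag   = false;   /* discard any stale completion */
    op_in_flight = true;
    /* real code would call sd_flash_write() or sd_flash_page_erase() here */
}

/* Models the SoC observer receiving a success event. */
static void on_completion_event(void)
{
    if (!op_in_flight) {
        ignored_events++;   /* nothing was started, so drop the event */
        return;
    }
    op_in_flight = false;
    event_flag   = true;
}

/* Reports whether a wait would return at once; consumes the flag. */
static bool wait_returns_immediately(void)
{
    bool was_set = event_flag;
    event_flag = false;
    return was_set;
}

/* Extra event arrives while idle: it is dropped and the next wait blocks. */
bool early_extra_event_is_filtered(void)
{
    reset_model();
    begin_flash_op();
    on_completion_event();
    (void)wait_returns_immediately();
    on_completion_event();            /* extra event, no op in flight */
    begin_flash_op();
    return ignored_events == 1 && !wait_returns_immediately();
}

/* Extra event arrives after the next op has started: the model cannot tell
 * it apart from the genuine completion, so the wait returns too early. */
bool late_extra_event_is_misattributed(void)
{
    reset_model();
    begin_flash_op();
    on_completion_event();
    (void)wait_returns_immediately();
    begin_flash_op();
    on_completion_event();            /* extra event taken as the completion */
    return wait_returns_immediately();
}
```

The second function shows the remaining race: if the extra event is delivered after the next operation has been started, filtering of this kind cannot distinguish it from the real completion, which is why we want to understand the root cause.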
