Beware that this post is related to an SDK in maintenance mode
More Info: Consider nRF Connect SDK for new designs

nRF5 SDK for Mesh v4.1.0 Flash Manager Corrupt Page Due to Power Cycle

Hi,

I have been trying to chase down an issue where the flash manager occasionally raises an assertion in mesh_config_backend_init() after a power cycle of the nRF52840 we are using in our product. It happens occasionally when we re-provision the radio and then power cycle the nRF52840 from the external host processor; details are below. We have deployed past versions of the nRF5 SDK for Mesh (~2.0.0 I believe) and never saw this issue, but we are worried it is a latent problem there that we simply haven't encountered. It is certainly a critical issue for the current product, as it effectively bricks the nRF52840 in the field. We are definitely re-provisioning more often in the current product, which may make the failure more likely to occur.

The flash manager's approach to writing and sealing entries seems fairly reasonable, so what I'm seeing should never be possible given my understanding of the flash manager implementation. See memory dump for the flash page in question below. Based on the values which still exist in flash, this page is most likely being used to store the subnet key(s). This page has had its first header replaced with all zeroes. A zero handle is an invalid handle, but because the size stored in the header is also zero, an assertion is raised in flash_manager_internal.h :: get_next_entry(). Even without the assertion, the code would end up stuck in an infinite loop: adding zero bytes to the current pointer means the caller, flash_manager.c :: get_invalid_bytes(), would get the same invalid entry over and over without ever making progress.
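To make the failure mode concrete, here is a minimal sketch of an entry walk that treats a zero-length header as corruption instead of advancing by zero. It is not the SDK's code: the header layout (a 16-bit length in words followed by a 16-bit handle), the 0xFFFF "blank" marker, and every name prefixed with demo_ are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header layout, for illustration only: length in 32-bit words
 * (header included), followed by the handle. */
typedef struct {
    uint16_t len_words;
    uint16_t handle;
} demo_header_t;

#define DEMO_HANDLE_INVALID 0x0000u /* entry was invalidated */
#define DEMO_BLANK          0xFFFFu /* erased flash, nothing written yet */

typedef enum { DEMO_WALK_OK, DEMO_WALK_CORRUPT } demo_walk_result_t;

/* Build one header word from its two fields (endianness independent). */
uint32_t demo_make_header(uint16_t len_words, uint16_t handle)
{
    demo_header_t hdr = { len_words, handle };
    uint32_t word;
    memcpy(&word, &hdr, sizeof word);
    return word;
}

/* Walk the entries stored in `page` and count those with a valid handle.
 * A zero length, or a length that runs past the end of the buffer, is
 * reported as corruption rather than followed. */
demo_walk_result_t demo_walk_entries(const uint32_t *page, size_t total_words,
                                     size_t *valid_count)
{
    size_t offset = 0;
    *valid_count = 0;

    while (offset < total_words) {
        demo_header_t hdr;
        memcpy(&hdr, &page[offset], sizeof hdr);

        if (hdr.len_words == DEMO_BLANK && hdr.handle == DEMO_BLANK) {
            return DEMO_WALK_OK; /* reached unwritten flash */
        }
        if (hdr.len_words == 0 || hdr.len_words > total_words - offset) {
            return DEMO_WALK_CORRUPT; /* advancing by zero would never end */
        }
        if (hdr.handle != DEMO_HANDLE_INVALID) {
            (*valid_count)++;
        }
        offset += hdr.len_words;
    }
    return DEMO_WALK_OK;
}
```

With a check like this, a page whose first word is 0x00000000 is reported as corrupt, and the caller can decide how to recover (for example by erasing the page) instead of asserting or looping. Whether that recovery is acceptable for a page holding keys is a separate question.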

I understand that the flash manager can invalidate previous entries by writing 0x0000 to their handle, so that the next instance of the handle is considered the valid one. However, writing zero to the size stored alongside the handle clearly shouldn't be happening. My only guess is that there's some hardware limitation where a flash write in progress at power off causes the entire word to be written as zero rather than only a half word. Is this a known issue? Is there a workaround/fix, such as changing the entry headers to be 8 bytes in size so that invalidating the handle cannot invalidate the size?
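For reference, this is how a handle invalidation is expected to behave when flash programming can only clear bits (1 to 0). The sketch below simulates a word write as a bitwise AND and assumes the length sits in the low half-word and the handle in the high half-word; that layout and all demo_ names are assumptions, not SDK definitions.

```c
#include <stdint.h>

/* Simulated flash programming: bits can only go from 1 to 0, so the
 * result is the AND of the old contents and the value written. */
uint32_t demo_flash_program_word(uint32_t current, uint32_t value)
{
    return current & value;
}

/* Assumed layout: length in the low half-word, handle in the high one. */
uint32_t demo_header_pack(uint16_t len_words, uint16_t handle)
{
    return (uint32_t)len_words | ((uint32_t)handle << 16);
}

uint16_t demo_header_len(uint32_t header)
{
    return (uint16_t)(header & 0xFFFFu);
}

uint16_t demo_header_handle(uint32_t header)
{
    return (uint16_t)(header >> 16);
}

/* Invalidate an entry: clear the handle bits and write ones over the
 * length bits so that they are left unchanged. */
uint32_t demo_invalidate_handle(uint32_t current)
{
    return demo_flash_program_word(current, 0x0000FFFFu);
}
```

Under this model, a completed invalidation of a header with length 5 leaves the length at 5 and only the handle at 0x0000. A header that reads back as 0x00000000 therefore cannot come from this write completing normally, which is why an interrupted write or some other writer is the suspect.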
