nRF52840
running nRF5_SDK_15.0.0_a53641a but have also reviewed FDS changes up to nRF5_SDK_17.0.2_d674dde for ideas
#define FDS_VIRTUAL_PAGES 5
I found a scenario where if a Garbage Collection routine is interrupted by device reset such that the old Swap page still needs to be promoted AND an erased page still needs to be tagged as the new Swap page, upon execution of fds_init() during device initialization, the return of pages_init() is NO_SWAP, therefore FDS is corrupt.
We are hitting the FDS_OP_INIT_PROMOTE_SWAP case within init_execute(), but that is where the trouble begins.
case FDS_OP_INIT_PROMOTE_SWAP: { p_op->init.step = FDS_OP_INIT_TAG_SWAP;
// When promoting the swap, keep the write_offset set by pages_init(). ret = page_tag_write_data(m_swap_page.p_addr);
uint16_t const gc = m_gc.cur_page; uint32_t const * const p_old_swap = m_swap_page.p_addr;
// Execute the swap. m_swap_page.p_addr = m_pages[gc].p_addr; m_pages[gc].p_addr = p_old_swap;
// Copy the offset from the swap to the new page. m_pages[gc].write_offset = m_swap_page.write_offset; m_swap_page.write_offset = FDS_PAGE_TAG_SIZE;
m_pages[gc].page_type = FDS_PAGE_DATA; } break;
Upon execution of that page_tag_write_data() call, a queue_process() call is made, which results in a recursive of init_execute(), where the p_op->init.step has been updated to the next operation, but the m_swap_page.p_addr has not been changed. So the recursed call is trying to execute FDS_OP_INIT_TAG_SWAP, but upon the same page as the FDS_OP_INIT_PROMOTE_SWAP routine further up the callstack. The promote swap executes fine, but what should designate the erased page as the new swap instead attempts to tag the newly promoted data page as the swap, which it cannot do without erasing. The entire routine finishes off by tagging all remaining erased pages to data via FDS_OP_INIT_TAG_DATA, which in this case includes the one that was supposed to become the swap page.
I have experimented with the following firmware change to the FDS_OP_INIT_PROMOTE_SWAP case block:
case FDS_OP_INIT_PROMOTE_SWAP:{p_op->init.step = FDS_OP_INIT_TAG_SWAP;
// When promoting the swap, keep the write_offset set by pages_init().
uint16_t const gc = m_gc.cur_page;uint32_t const * const p_old_swap = m_swap_page.p_addr;
// Execute the swap.m_swap_page.p_addr = m_pages[gc].p_addr;m_pages[gc].p_addr = p_old_swap;
// Copy the offset from the swap to the new page.m_pages[gc].write_offset = m_swap_page.write_offset;m_swap_page.write_offset = FDS_PAGE_TAG_SIZE;
m_pages[gc].page_type = FDS_PAGE_DATA;
ret = page_tag_write_data(p_old_swap);} break;
This seems to alleviate the issue as far as I can verify through testing, but I am concerned that this block was written as it was on purpose, and there is some bit of ripple that I do not yet see. Could someone please review my proposal and comment on possible risk? Thanks in advance!