
FDS - missing swap page

 Hi,

One of our devices is bricked because fds_init returns FDS_ERR_NO_PAGES. After analyzing the flash, I found that all pages are tagged as FDS_PAGE_DATA (there is no FDS_PAGE_SWAP page). One of these pages is otherwise erased: it contains only the FDS_PAGE_TAG_MAGIC and FDS_PAGE_DATA header.

FDS is heavily used in our application, and we perform many power cycles during testing. I do not have a clear reproduction path because we have seen this only once (more than 200 devices have been online for a few months).
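
For anyone who wants to repeat the dump analysis, here is a minimal sketch of the check, assuming each FDS page starts with a two-word tag (magic word, then page-type word). The tag values are what I believe fds_internal_defs.h defines in this SDK generation, so please verify them against your copy. The names dump_page_identify and dump_has_swap_page are hypothetical helpers, not SDK functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values from fds_internal_defs.h; verify against your SDK copy. */
#define PAGE_TAG_MAGIC 0xDEADC0DEu
#define PAGE_TAG_SWAP  0xF11E01FFu
#define PAGE_TAG_DATA  0xF11E01FEu

#define PAGE_TAG_WORDS 2u /* magic word followed by the page-type word */

typedef enum
{
    DUMP_PAGE_UNDEFINED, /* no magic word or unknown type word */
    DUMP_PAGE_DATA,
    DUMP_PAGE_SWAP
} dump_page_type_t;

/* Classify one page from its two tag words (hypothetical helper). */
static dump_page_type_t dump_page_identify(const uint32_t *tag)
{
    if (tag[0] != PAGE_TAG_MAGIC)
    {
        return DUMP_PAGE_UNDEFINED;
    }

    switch (tag[1])
    {
        case PAGE_TAG_SWAP: return DUMP_PAGE_SWAP;
        case PAGE_TAG_DATA: return DUMP_PAGE_DATA;
        default:            return DUMP_PAGE_UNDEFINED;
    }
}

/* Return true if any page in the dump is tagged as swap.
 * `tags` holds PAGE_TAG_WORDS words per page, one page after another. */
static bool dump_has_swap_page(const uint32_t *tags, size_t page_count)
{
    for (size_t i = 0; i < page_count; i++)
    {
        if (dump_page_identify(&tags[i * PAGE_TAG_WORDS]) == DUMP_PAGE_SWAP)
        {
            return true;
        }
    }
    return false;
}
```

On the bricked device, a check like dump_has_swap_page returns false for the FDS region, which matches the FDS_ERR_NO_PAGES result from fds_init.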

We are using SDK 14.2.0 with nRF52832 (custom board designs).

FDS flash dump can be found here: https://drive.google.com/drive/folders/1y_KOyIhVw9d-ZAAIc8SSa8k9vUOobDVz?usp=sharing

  • Hi,

    Good news:) I have a way to reproduce it.

    Given we have interrupted the GC procedure (power off) in the following state:

    | page_address | page_type  |
    | 0xEF000      | erased     |
    | 0xF0000      | data       |
    | 0xF1000      | swap_dirty |
    | 0xF2000      | data       |
    | 0xF3000      | data       |

    When FDS is initialized again, it performs the following actions (PROMOTE_SWAP_INST):

    • tag 0xF1000 as the data
    • tag 0xF0000 as the swap (this fails because a flash write can only clear bits, so the existing data tag cannot be turned into the swap tag)
    • tag 0xEF000 as the data
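
To illustrate why the second step cannot succeed, here is a small sketch that models a flash word write as a bitwise AND with the current contents (bits can go from 1 to 0 without an erase, but not back). The tag values are what I believe fds_internal_defs.h defines, so treat them as an assumption and check your SDK copy. flash_word_write and retag_succeeds are hypothetical names used only for this model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values from fds_internal_defs.h; verify against your SDK copy. */
#define PAGE_TAG_SWAP 0xF11E01FFu
#define PAGE_TAG_DATA 0xF11E01FEu

/* Model of writing one word to flash without erasing first:
 * a write can only clear bits, so the result is old AND new. */
static uint32_t flash_word_write(uint32_t current, uint32_t value)
{
    return current & value;
}

/* A retag works only if the word reads back as the requested tag. */
static bool retag_succeeds(uint32_t current_tag, uint32_t new_tag)
{
    return flash_word_write(current_tag, new_tag) == new_tag;
}
```

Under these assumptions, retag_succeeds(PAGE_TAG_SWAP, PAGE_TAG_DATA) is true (swap to data only clears bit 0), while retag_succeeds(PAGE_TAG_DATA, PAGE_TAG_SWAP) is false: the word still reads back as the data tag. That is consistent with the dump, where every page ends up tagged as data.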

    When the erased page is initialized before the swap page, m_gc.cur_page is not initialized correctly. This is my proposed fix:

                case FDS_PAGE_SWAP:
                {
                    if (swap_set_but_not_found)
                    {
                        m_pages[page].page_type    = FDS_PAGE_ERASED;
                        m_pages[page].p_addr       = m_swap_page.p_addr;
                        m_pages[page].write_offset = FDS_PAGE_TAG_SIZE;
    
                        m_gc.cur_page = page; // FIX
                        page++;
                    }
    
                    m_swap_page.p_addr = p_page_addr;
                    // If the swap is promoted, this offset should be kept, otherwise,
                    // it should be set to FDS_PAGE_TAG_SIZE.
                    page_scan(p_page_addr, &m_swap_page.write_offset, NULL);
    
                    ret |= (m_swap_page.write_offset == FDS_PAGE_TAG_SIZE) ?
                            PAGE_SWAP_CLEAN : PAGE_SWAP_DIRTY;
                } break;

    This fixes the case above, but I am not sure whether it breaks anything else. Could you please review it?
