How to fully erase app when using bootloader and DFU?

We are currently working on a production board where debugging is not easy. The board seems to have gotten into a state where FDS reports no available pages. I think it may simply be that garbage collection was not being run, but unfortunately storage cannot even be initialized to get to that point. I also believe FDS may be incorrectly identifying unerased pages as program data.

Is there a way to fully reset (erase) everything beyond the bootloader so we can start from a known good state without cracking the case open? We are using USB serial DFU.

We have not been able to reproduce this issue on our dev boards so it is extremely hard to debug. 

  • Hi,

    Have you changed the number of pages allocated to FDS at some point? The only way for this to happen (that I know of) is if you update the app on the device with a new version that has fewer pages allocated to FDS (FDS_VIRTUAL_PAGES in sdk_config.h), and the SWAP page happens to be outside the new region at the time of the update. SWAP is moved to a different page every time GC is run, which might explain why it's difficult to replicate.

    You can dump the memory to confirm whether this is why you get the "NO PAGES" error. I provided instructions on how to do that in this thread: https://devzone.nordicsemi.com/f/nordic-q-a/42676/fds_err_no_pages-returned-when-fds_init-is-called/166492#166492. EDIT: Never mind, it sounds like you can't attach a debugger to the device. Also, I forgot to answer your question. It is possible to erase the app data region to avoid the "NO PAGES" error. You can modify the application code to erase the app data region if FDS initialization fails, or make a new app with more pages allocated to FDS.
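
To illustrate the "erase the app data region if FDS initialization fails" suggestion, here is a minimal host-side sketch of the control flow. It does not use the SDK: `sim_flash`, `sim_storage_init()` and `sim_region_erase()` are hypothetical stand-ins for the real flash, for `fds_init()` and for a page erase through `nrf_fstorage_erase()`, and the three-page region is an assumption. On hardware the erase is asynchronous and the region bounds would have to come from FDS itself, so treat this only as an outline.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS   1024u        /* one 4 KiB flash page as 32-bit words */
#define REGION_PAGES 3u           /* hypothetical number of FDS pages */
#define TAG_MAGIC    0xDEADC0DEu  /* assumed first word of an FDS page tag */

/* RAM stand-in for the flash region reserved for FDS (hypothetical). */
static uint32_t sim_flash[REGION_PAGES][PAGE_WORDS];

/* A page counts as erased when every word reads 0xFFFFFFFF. */
static bool page_is_erased(const uint32_t *page)
{
    for (uint32_t i = 0; i < PAGE_WORDS; i++) {
        if (page[i] != 0xFFFFFFFFu) {
            return false;
        }
    }
    return true;
}

/* Stand-in for fds_init(): fails when no page is erased or tagged. */
static bool sim_storage_init(void)
{
    for (uint32_t p = 0; p < REGION_PAGES; p++) {
        if (page_is_erased(sim_flash[p]) || sim_flash[p][0] == TAG_MAGIC) {
            return true;
        }
    }
    return false;
}

/* Stand-in for erasing the region; a flash erase sets every bit to 1. */
static void sim_region_erase(void)
{
    memset(sim_flash, 0xFF, sizeof(sim_flash));
}

/* Try to initialize; on failure wipe the region and retry once.
 * Returns true if storage is usable, and reports via *wiped whether
 * the region had to be erased (all stored records are lost then). */
static bool storage_init_with_recovery(bool *wiped)
{
    *wiped = false;
    if (sim_storage_init()) {
        return true;
    }
    sim_region_erase();
    *wiped = true;
    return sim_storage_init();
}
```

Note that this recovery path throws away every stored record, so it only fits devices where losing the FDS contents is acceptable.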

  • I believe what triggered it was switching to release mode but I was confused as to why that would cause it to lose track of the swap. It’s possible I did something else but I don’t remember what it was. 

    Allocating more pages to FDS didn’t solve the problem (I should point out that we are pushing the limits of the nRF52840 and using a lot of that flash).

    I actually ended up cracking open the case and resetting everything, but one question that might help me debug in the future is whether DFU erases the existing app before writing the new one. That is, when going from debug to release, where the program size was cut in half, will FDS get to reclaim that space, or will it detect those pages as unknown and assume they can’t be used?

  • Yeah, that was my thinking too. Unfortunately after spending quite a bit of time on it I don’t think there’s a way for the application to erase those pages. The issue is that the pages aren’t tagged so the application has no more idea of which ones are valid than does the FDS subsystem. I tried various schemes to work around that but to no avail.

    I may look at adding something to the bootloader but maybe you guys could consider it as a possible future enhancement.

    Thanks

  • I will report this as a feature request internally. I was a bit surprised to see that FDS wouldn't reclaim pages used during DFU. It means that, in the current implementation, the area reserved by NRF_DFU_APP_DATA_AREA_SIZE must match the flash area covered by FDS_VIRTUAL_PAGES.

    You may have seen it already, but flash_bounds_set() in fds.c can be used to find the flash boundaries for FDS. The function is defined as static, so the module needs to be modified to expose it to the app. It becomes a bit more complicated if you need to retain valid data records instead of just deleting everything to recover the device, though. Parsing the FDS pages to find which ones can be freed (either erased or containing only deleted records) could be an option.
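
As a rough illustration of the boundary arithmetic, the sketch below computes an FDS region the way I understand flash_bounds_set() to do it: the region ends at the bootloader start address (or at the end of flash when no bootloader is present) and extends downward by the number of FDS pages. The `fds_bounds_t` type and both function names are hypothetical, and the sketch assumes one virtual page equals one 4 KiB physical page with no reserved pages; check fds.c in your SDK version before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* nRF52 flash page size */

/* Hypothetical container for the computed region. */
typedef struct {
    uint32_t start_addr;  /* first byte of the FDS region */
    uint32_t end_addr;    /* one past the last byte of the FDS region */
} fds_bounds_t;

/* bootloader_addr == 0xFFFFFFFF means "no bootloader present". */
static fds_bounds_t fds_bounds(uint32_t bootloader_addr,
                               uint32_t flash_size_bytes,
                               uint32_t virtual_pages)
{
    fds_bounds_t b;
    b.end_addr   = (bootloader_addr != 0xFFFFFFFFu) ? bootloader_addr
                                                    : flash_size_bytes;
    b.start_addr = b.end_addr - virtual_pages * PAGE_SIZE_BYTES;
    return b;
}

/* True when addr lies inside the half-open range [start, end). */
static bool addr_in_bounds(fds_bounds_t b, uint32_t addr)
{
    return addr >= b.start_addr && addr < b.end_addr;
}
```

With a hypothetical bootloader start of 0xF8000, 8 pages give a region starting at 0xF0000 and 4 pages give one starting at 0xF4000. A swap page that sat at 0xF1000 under the old configuration is then outside the new region, which is the situation described earlier in the thread.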

  • I think the issue in the case where the application has “receded” between updates is that those pages aren’t erased and therefore will still appear to FDS to be application pages (FDS_PAGE_UNDEFINED) and FDS will skip over them.  But it’s entirely possible I’m missing something. 
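
A small sketch of how a page might end up classified as undefined, assuming FDS recognizes its own pages by a two-word tag at the start of each page. The tag values below are as I remember them from fds_internal_defs.h and should be verified against your SDK version; `page_type_t` and `page_classify()` are hypothetical names, not SDK symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_WORDS 1024u        /* one 4 KiB flash page as 32-bit words */
#define TAG_MAGIC  0xDEADC0DEu  /* assumed value of tag word 0 */
#define TAG_SWAP   0xF11E01FFu  /* assumed value of tag word 1, swap page */
#define TAG_DATA   0xF11E01FEu  /* assumed value of tag word 1, data page */

typedef enum {
    PAGE_ERASED,     /* every word is 0xFFFFFFFF, free to use */
    PAGE_DATA,       /* tagged as an FDS data page */
    PAGE_SWAP,       /* tagged as the FDS swap page */
    PAGE_UNDEFINED   /* written, but not tagged by FDS */
} page_type_t;

static page_type_t page_classify(const uint32_t *page)
{
    size_t i;

    /* A fully erased page carries no tag at all. */
    for (i = 0; i < PAGE_WORDS; i++) {
        if (page[i] != 0xFFFFFFFFu) {
            break;
        }
    }
    if (i == PAGE_WORDS) {
        return PAGE_ERASED;
    }

    /* Anything written without the magic word is left alone. */
    if (page[0] != TAG_MAGIC) {
        return PAGE_UNDEFINED;
    }
    if (page[1] == TAG_SWAP) {
        return PAGE_SWAP;
    }
    if (page[1] == TAG_DATA) {
        return PAGE_DATA;
    }
    return PAGE_UNDEFINED;
}
```

Under these assumptions, a page still holding leftover code from a larger, older application image would be neither erased nor tagged, so it would come back as PAGE_UNDEFINED and be skipped.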

  • I think the problem is that the bootloader has written data into the app data region. If you are able to replicate this again, I suggest you read out the entire application region with nrfjprog --memrd <address> --n <bootloader start - app start> > application_region.txt, then compare the memory content with that of a working device.

  • I definitely agree. My concern is more about preventing this in the future and recovering if it does happen. I’m going to play around with reserving app data area as I wasn’t aware that was an option. That being said, I still think the ability to do a “full” erase from the bootloader (but while preserving the bootloader) would be helpful. 
