Several problems arising with OTA DFU.

Hi, I am currently using SDK 15.3.0 with SD 6.1.1 and am playing around with the buttonless DFU.

I noticed that after uploading the new firmware via nRF Connect, the MCU often just hardfaults. For example, it happened after I set NRF_LOG_ENABLED to 0 in the sdk_config.h file for the new firmware (it was still enabled in the old version). This led to an internal error in a BLE advertising function, which doesn't make sense at all. I also get a hardfault or run into an error when I only comment out NRF_LOG_INIT() in a newer version. Why doesn't this just work? Which memory addresses does the logging module use, such that a DFU reliably ends in broken firmware?

So now I turn off logging in the initial version and update the device with new software, and now apparently a timer is not working properly. Something generally seems wrong with the DFU process, but I can't debug it and don't know which parameters might be wrong.

  • Hi,

    Have you confirmed that you get an actual hardfault, or something else? What have you found out by debugging? How exactly did this happen? (There are very many things that can go wrong, so we need to narrow things down as much as possible as early as possible).

    Did the issue appear after adding buttonless DFU to your application, or does it happen after a DFU update?

    The application would be identical after a DFU update as if it was flashed directly (assuming you made no changes), but there could be changes in other parts of the flash in case of a dual bank update. Normally that would not cause problems, but it could potentially trigger a bug in your application.

    In short, please explain in more detail what you did and what exactly fails and how. If you go back to when this issue did not happen, can you isolate at which point things start to fail and what exactly you do in that case? Then, what can you find from debugging (for instance logging and inspecting after an error handler is run or something else, depending on what actually happened)?

  • Ok so here's what I did:

    flash initial software to MCU -> test it, works

    make new software (which has basically no changes in sdk_config, mainly business logic) -> test it, works

    DFU (new software replaces old software) -> sensor crashes

    this is the call stack after attaching a debugger after the DFU

    so I reset the sensor in debug mode, then this crash report appears:

    then I reattached the debugger and this crash report appears

    NRF_LOG is enabled and RTT is enabled on both. 

    does this help?

    I can't see what's going on in that advertising function, it crashes at this handler:

  • Hi,

    This makes sense and is good to hear. I don't know your exact memory layout(?), but if you have gotten an overview of that and see that the page starting at 0xec000 does not overlap with anything else, is not used by FDS, and is within the reserved area of the bootloader (that it will never touch), then this is OK.

  • can you give me a hint on how to view my memory layout? That would probably help us both in the assessment. 

  • It is not so much a matter of viewing it, but more of drawing it up on a piece of paper or similar, with the addresses you have for the various parts of your project. Some things are given though.

    First, look at the memory layout from the bootloader documentation. That has numbers for various ICs, so you can use it as a starting point, and you can also reuse the figure.

    Then add the following:

    • first page (0x0): MBR
    • Starting at the second page (0x1000): SoftDevice. This will end at the size of the SoftDevice, which you can see from the SoftDevice specification (0x26000 for S140 6.1.1 if that is what you are using).
    • The application starts right after the SoftDevice (so at 0x26000 if using S140 6.1.1). Where it ends depends on how large it is. Look at the build output or check the application hex in nRF Connect Programmer or some other tool.
    • The bootloader starts wherever the bootloader project you are using is defined to start in the linker configuration. Let's say the bootloader starts at 0xf8000 for now as an example (this is a typical value for SDK 15.3 based BLE bootloaders in release mode).
    • There must be two available flash pages at the end though, one for MBR params (second to last) and one for bootloader settings (the last).

    These are all the things that are easy to locate, and based on this you can say more. If you use FDS, the FDS pages are right below the bootloader (at lower addresses). The number of pages is configured in the application's sdk_config.h.

    If you don't use FDS at all in your application, you can use a page right below the bootloader for data you store directly to flash. If you do use FDS, pick a page right below the FDS pages instead. Put all this in a figure so that you see your full flash memory layout.

    And lastly, remember to update NRF_DFU_APP_DATA_AREA_SIZE in the bootloader's sdk_config.h so that the bootloader stays clear of any flash you use right below it (this needs to cover both FDS and anything else). The value must be a multiple of 0x1000, which is one flash page, and it counts downwards from the bootloader start address. See the figure linked to earlier in this post.
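    The steps above can also be written down as a small script instead of on paper. This is only a sketch for sanity-checking addresses on a PC, not part of the SDK: the function name `flash_layout` is made up for illustration, and the flash end of 0x100000 assumes a 1 MB device such as the nRF52840. Use the values from your own linker configuration and sdk_config.h.

```python
PAGE = 0x1000  # one flash page, as described above


def flash_layout(flash_end, sd_end, bl_start, fds_pages, extra_pages):
    """Return (regions, app_data_area_size).

    regions is a list of (name, start, end) tuples ordered from low to
    high address; end is exclusive. fds_pages and extra_pages are the
    pages kept right below the bootloader (hypothetical helper).
    """
    fds_start = bl_start - fds_pages * PAGE
    extra_start = fds_start - extra_pages * PAGE
    regions = [
        ("MBR", 0x0, PAGE),
        ("SoftDevice", PAGE, sd_end),
        ("Application + free space", sd_end, extra_start),
    ]
    if extra_pages:
        regions.append(("Raw app data", extra_start, fds_start))
    if fds_pages:
        regions.append(("FDS", fds_start, bl_start))
    regions += [
        ("Bootloader", bl_start, flash_end - 2 * PAGE),
        ("MBR params", flash_end - 2 * PAGE, flash_end - PAGE),
        ("Bootloader settings", flash_end - PAGE, flash_end),
    ]
    # The bootloader must leave everything between extra_start and
    # bl_start alone, so this is the value for NRF_DFU_APP_DATA_AREA_SIZE.
    return regions, bl_start - extra_start


if __name__ == "__main__":
    # Example values from this thread; 0x100000 is an assumption (1 MB flash).
    regions, size = flash_layout(0x100000, 0x26000, 0xF8000, 3, 1)
    for name, start, end in regions:
        print(f"0x{start:06X} - 0x{end - 1:06X}  {name}")
    print(f"NRF_DFU_APP_DATA_AREA_SIZE = 0x{size:X}")
```

    With 3 FDS pages and 1 extra page below a bootloader at 0xF8000, the protected area comes out as 4 pages, so 0x4000.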

  • thanks for your help, I have one more question because I am confused with this:

    NRF_DFU_APP_DATA_AREA_SIZE is 0x3000 in my bootloader (which actually shouldn't be enough by my calculations, because if my storage starts at 0xec000 that's far below the 0xf8000 bootloader start address and would actually require more than 0x9000).

    How does NRF_DFU_APP_DATA_AREA_SIZE connect with FDS_VIRTUAL_PAGES_RESERVED and FDS_VIRTUAL_PAGES?

    If I set FDS_VIRTUAL_PAGES_RESERVED to 3, this should have an impact on my bootloader application, right? These 3 defines don't really make sense to me, as in I don't know which one to change and whether all 3 need to be changed at once.

  • FDS virtual page size is normally the same as a physical page size, so 0x1000 (4 kB). Regarding the value of FDS_VIRTUAL_PAGES_RESERVED, the number of pages here depends on how much data you store in FDS. If you don't use FDS at all, you can disable FDS by setting FDS_ENABLED to 0 and it will not matter. However, remember that, for instance, the peer manager library that handles bonding in BLE uses FDS, so if that is used you will need FDS. If FDS is enabled, the lowest possible number of pages is 2, as that leaves 1 for swap and 1 for data.

    The FDS pages are always right below the bootloader. So if your bootloader starts at 0xEC000 (you can just check the bootloader project to verify) and you have 3 FDS pages, that means that FDS starts at 0xEC000 - 3 * 0x1000 = 0xE9000. So I would suggest that you use the page right below that for your data that is outside of FDS, so starting at 0xE8000. If you do, that means NRF_DFU_APP_DATA_AREA_SIZE should be 0x4000, as the bootloader needs to know that it should never touch the 4 pages right below it.
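    To double check this kind of arithmetic, a few lines of Python on a PC are enough. This is a rough sketch only; the helper names `fds_start` and `required_app_data_area_size` are made up here and are not SDK functions.

```python
PAGE_SIZE = 0x1000  # flash page size, also the usual FDS virtual page size


def fds_start(bootloader_start, fds_pages):
    """First address used by FDS when its pages sit right below the bootloader."""
    return bootloader_start - fds_pages * PAGE_SIZE


def required_app_data_area_size(bootloader_start, lowest_protected_addr):
    """Size the bootloader must keep away from, counted down from its start.

    lowest_protected_addr is the lowest flash address the application
    stores data at (FDS or raw); it is rounded down to a page boundary.
    """
    page_base = lowest_protected_addr - (lowest_protected_addr % PAGE_SIZE)
    if page_base >= bootloader_start:
        raise ValueError("protected address must be below the bootloader")
    return bootloader_start - page_base


if __name__ == "__main__":
    # The example from this reply: bootloader at 0xEC000, 3 FDS pages,
    # plus one raw data page at 0xE8000.
    print(hex(fds_start(0xEC000, 3)))
    print(hex(required_app_data_area_size(0xEC000, 0xE8000)))
    # The case from the question: raw data at 0xEC000, bootloader at 0xF8000.
    print(hex(required_app_data_area_size(0xF8000, 0xEC000)))
```

    For the case in the question (data at 0xEC000 and the bootloader at 0xF8000), the same calculation gives 0xC000, so a value of 0x3000 would not cover that page.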

    A last point is that FDS can be used for multiple things, so you could also consider using FDS for your other data. That is a common thing to do. FDS is a simple file system that allows easy updating of data etc., and it handles the complexities of flash memory for you (things like only being able to flip bits from 1 to 0 except after a page erase, etc.).
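    To illustrate the point about flash only allowing bits to go from 1 to 0, here is a toy model in Python. It is purely illustrative and assumes nothing about how FDS or the flash driver is implemented; the class `FlashPage` is invented for this example.

```python
ERASED = 0xFF  # an erased flash byte has all bits set to 1


class FlashPage:
    """Toy model of one flash page (not an SDK API)."""

    def __init__(self, size=0x1000):
        self.data = bytearray([ERASED]) * size

    def write(self, offset, payload):
        # A write can only clear bits, so the stored value is the
        # bitwise AND of the old content and the new data.
        for i, byte in enumerate(payload):
            self.data[offset + i] &= byte

    def erase(self):
        # Only a full page erase brings bits back to 1.
        self.data = bytearray([ERASED]) * len(self.data)


if __name__ == "__main__":
    page = FlashPage()
    page.write(0, b"\x0F")
    page.write(0, b"\xF0")   # overwriting without erase does not give 0xF0
    print(hex(page.data[0]))
    page.erase()
    print(hex(page.data[0]))
```

    This is why updating a record in place does not work on raw flash, and why a layer such as FDS writes a new copy and reclaims old ones during garbage collection.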
