Several problems arising with OTA DFU.

Hi, I am currently using SDK 15.3.0 with SD 6.1.1 and am playing around with the buttonless DFU.

I noticed that after uploading new firmware via nRF Connect, the MCU often just hardfaults. For example, it happened after I set NRF_LOG_ENABLED to 0 in sdk_config.h for the new firmware (it was still enabled in the old version). This led to an internal error in a BLE advertising function, which doesn't make sense to me at all. I also get a hardfault or some other error when I only comment out NRF_LOG_INIT() in a newer version. Why doesn't this just work? Which memory addresses does the logging module use such that a DFU ends up with broken firmware?

So now I turn off logging in the initial version and then update the device with new firmware, and now a timer apparently is not working properly. Something seems generally wrong with the DFU process, but I can't debug it and don't know which parameters might be wrong.

  • Hi,

    Have you confirmed that you get an actual hardfault, or something else? What have you found out by debugging? How exactly did this happen? (There are very many things that can go wrong, so we need to narrow things down as much as possible as early as possible).

    Did the issue appear after adding buttonless DFU to your application, or does it happen after a DFU update?

    The application would be identical after a DFU update as if it was flashed directly (assuming you made no changes), but there could be changes in other parts of the flash in case of a dual bank update. Normally that would not cause problems, but it could potentially trigger a bug in your application.

    In short, please explain in more detail what you did, what exactly fails, and how. If you go back to when this issue did not happen, can you isolate the point at which things start to fail and what exactly you do in that case? Then, what can you find from debugging (for instance logging and inspecting after an error handler is run, or something else, depending on what actually happened)?

  • okay let me try to narrow it down.

    It happened while performing a DFU. I did that using the nRF Connect app, and afterwards the MCU didn't show up again (no advertising), so I attached a debugger and noticed that the firmware had crashed somewhere in the logging code. But this was only one of several issues: once, advertising also failed with an internal error (0x3).
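
    For reference, 0x3 matches NRF_ERROR_INTERNAL among the SDK's global error codes (nrf_error.h). Below is a minimal sketch of a lookup helper for reading such codes out of a log. `nrf_error_name` is a hypothetical helper, not an SDK function, and the table is only a partial local copy of the values so that the sketch compiles without the SDK.

    ```c
    #include <stdint.h>

    /* Hypothetical helper (not part of the SDK): maps a few global error
     * codes from nrf_error.h to their names. The table is partial. */
    static const char *nrf_error_name(uint32_t err_code)
    {
        switch (err_code) {
        case 0x0: return "NRF_SUCCESS";
        case 0x1: return "NRF_ERROR_SVC_HANDLER_MISSING";
        case 0x2: return "NRF_ERROR_SOFTDEVICE_NOT_ENABLED";
        case 0x3: return "NRF_ERROR_INTERNAL";
        case 0x4: return "NRF_ERROR_NO_MEM";
        case 0x5: return "NRF_ERROR_NOT_FOUND";
        case 0x6: return "NRF_ERROR_NOT_SUPPORTED";
        case 0x7: return "NRF_ERROR_INVALID_PARAM";
        case 0x8: return "NRF_ERROR_INVALID_STATE";
        default:  return "unknown (not in this partial table)";
        }
    }
    ```

    The name alone does not say why advertising failed; it only narrows down which return value the error handler saw.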

    > but there could be changes in other parts of the flash in case of a dual bank update. Normally that would not cause problems, but it could potentially trigger a bug in your application.

    Yep, that's what I thought as well. I am using the internal flash storage module, writing to addresses 0x7C000 to 0x7FFFF; maybe this helps? How can I see which banks are written during the DFU process, so that I can always keep those clear? Could it also be a problem if the application is too big?
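
    One way to reason about which flash a dual-bank update touches is to compute the staging range by hand. The sketch below assumes that the new image is staged at the first page boundary after the running application and must end below the reserved app data area; that is an assumption about the bootloader's behaviour and should be checked against the bootloader source. All function names and all addresses in the usage note are hypothetical.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    #define FLASH_PAGE_SIZE 0x1000u /* nRF52 flash page size: 4096 bytes */

    /* Half-open flash range [start, end). */
    typedef struct {
        uint32_t start;
        uint32_t end;
    } flash_range_t;

    /* Round an address up to the next flash page boundary. */
    static uint32_t page_align_up(uint32_t addr)
    {
        return (addr + FLASH_PAGE_SIZE - 1u) & ~(FLASH_PAGE_SIZE - 1u);
    }

    /* ASSUMPTION: the new image is staged right after the current
     * application, starting on a page boundary. */
    static flash_range_t bank1_range(uint32_t app_start, uint32_t app_size,
                                     uint32_t new_image_size)
    {
        flash_range_t r;
        r.start = page_align_up(app_start + app_size);
        r.end   = r.start + new_image_size;
        return r;
    }

    /* True if the staged image ends at or below the start of the reserved
     * app data area, i.e. a dual-bank update has room. */
    static bool dual_bank_fits(flash_range_t bank1, uint32_t bootloader_start,
                               uint32_t app_data_size)
    {
        return bank1.end <= bootloader_start - app_data_size;
    }

    /* True if two half-open ranges share at least one byte. */
    static bool ranges_overlap(flash_range_t a, flash_range_t b)
    {
        return a.start < b.end && b.start < a.end;
    }
    ```

    With made-up values (application at 0x26000, 0x20000 bytes long, new image of 0x21000 bytes), `bank1_range` gives [0x46000, 0x67000). Any application data inside that range would be overwritten during the transfer. If the image does not fit, the bootloader may fall back to a single-bank update or reject the image, depending on its configuration.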

    I will inspect the issue in more detail and come back with more explanation, but that's it for now.

  • Ok so here's what I did:

    flash initial software to MCU -> test it, works

    make new software (which has basically no changes in sdk_config, mainly business logic) -> test it, works

    DFU (new software replaces old software) -> sensor crashes

    this is the call stack after attaching a debugger after the DFU

    so I reset the sensor in debug mode, then this crash report appears:

    then I reattached the debugger and this crash report appears

    NRF_LOG is enabled and RTT is enabled on both.

    does this help?

    I can't see what's going on in that advertising function; it crashes at this handler:

  • Juliusc said:
    I can't see what's going on in that advertising function; it crashes at this handler:

    When you write "crashes", what exactly does that mean? Can you elaborate? What did you find out from debugging about the state of the device and what happened? And what do you see from the RTT log?

    Juliusc said:
    0x7C000 to 0x7FFFF,

    As long as that flash is not used by anything else, it should be OK. However, even if nothing else uses it while the application is running, it could get overwritten during a DFU. Generally, you should use the flash right below the bootloader (but take into account that the FDS pages are there as well if you use FDS; in that case, use the flash right below the FDS pages). See memory layout. This is because you can configure the bootloader to stay away from this region by adjusting NRF_DFU_APP_DATA_AREA_SIZE in the bootloader's sdk_config.h.
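
    The rule above can be sketched as a simple range check: the preserved region is taken to be the NRF_DFU_APP_DATA_AREA_SIZE bytes that end where the bootloader begins. `in_preserved_area` is a hypothetical helper, and the bootloader start address has to come from your own bootloader's linker script; the value in the usage note is made up.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helper: true if [addr, addr + len) lies entirely inside
     * [bootloader_start - app_data_size, bootloader_start), the region the
     * bootloader is configured to leave alone during a DFU. */
    static bool in_preserved_area(uint32_t addr, uint32_t len,
                                  uint32_t bootloader_start,
                                  uint32_t app_data_size)
    {
        if (len == 0u || app_data_size > bootloader_start) {
            return false;
        }
        uint32_t area_start = bootloader_start - app_data_size;
        if (addr < area_start || addr >= bootloader_start) {
            return false;
        }
        /* The range must also end at or before the bootloader. */
        return len <= bootloader_start - addr;
    }
    ```

    For example, with a bootloader assumed to start at 0x78000 and a 0x3000-byte data area, the preserved region is [0x75000, 0x78000). A write range starting at 0x74000 would fail the check, and so would any range at or above the bootloader start.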

    I mean it just doesn't work any more, and if I attach a debugger I see that it hangs in that particular state (as shown in the screenshots).

    NRF_DFU_APP_DATA_AREA_SIZE was set to 12288 in the bootloader's sdk_config.h; that should be enough as far as I can see?
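
    The sizes can be compared with a quick page count. This sketch assumes the 4096-byte flash page size of the nRF52 series and reads the range quoted earlier in the thread as 0x7C000 through 0x7FFFF inclusive; both helper names are hypothetical.

    ```c
    #include <stdint.h>

    #define FLASH_PAGE_SIZE 4096u /* nRF52 flash page size in bytes */

    /* Number of flash pages touched by the inclusive range [first, last]. */
    static uint32_t pages_needed(uint32_t first_addr, uint32_t last_addr)
    {
        uint32_t bytes = last_addr - first_addr + 1u;
        return (bytes + FLASH_PAGE_SIZE - 1u) / FLASH_PAGE_SIZE;
    }

    /* Number of whole pages covered by a given app data area size. */
    static uint32_t pages_reserved(uint32_t app_data_area_size)
    {
        return app_data_area_size / FLASH_PAGE_SIZE;
    }
    ```

    Under those assumptions, `pages_reserved(12288)` is 3, while `pages_needed(0x7C000, 0x7FFFF)` is 4 (16384 bytes). If that reading of the addresses is right, the configured area is one page short of the range in use, independent of where the range sits relative to the bootloader.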
