DFU settings page "randomly" gets zeroed --> Application cannot start

We have a product on which we program a bootloader that is loosely based on the ble_app_buttonless_dfu example. The bootloader is not actually used in the end product: the application never activates it unless instructed to do so via UART (which should not happen in the production version of the product), and the application does not activate or use the SoftDevice. Instead, we use the application solely for proprietary RF communication. The project uses SDK 15.3.0 with SoftDevice 6.1.1.

Nonetheless, we observed that once in a while, at some point in the product life cycle, a device gets stuck in the bootloader. On the one device I have available for examination, I single-stepped through the bootloader with a debugger and noticed that it started the bootloader because the check (s_dfu_settings.bank_0.bank_code != NRF_DFU_BANK_VALID_APP) in nrf_bootloader.c succeeds. On further investigation, I noticed that the entire DFU settings page and settings backup page consist of zeroes (0x00), except for the CRC (s_dfu_settings.crc, 0xAC871054), the version (s_dfu_settings.version, 0x00000001), the app validation type (s_dfu_settings.boot_validation_app.type, 0x01) and, curiously, (adv_name.name[1], 0x01).

Looking at nrf_dfu_settings_init(), line 262 in nrf_dfu_settings.c, it looks like the settings pages get zeroed if neither the settings page nor the settings backup page is valid, and when the settings page gets written, the above values should be generated and written, too. Now the question is: how and why do the settings become invalid in the first place?
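
To make the fallback order described above concrete, here is a minimal sketch of how I read it. It is not the SDK code: the struct layout, the toy checksum and all demo_* names are assumptions made up for illustration, and the real implementation uses a CRC32 over the settings struct.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified settings layout (not the SDK struct). */
typedef struct {
    uint32_t crc;
    uint32_t settings_version;
    uint32_t bank_code;
} demo_settings_t;

/* Toy checksum standing in for the real CRC32 (assumption for this sketch). */
static uint32_t demo_crc(const demo_settings_t *s)
{
    return (s->settings_version * 31u) ^ s->bank_code ^ 0xA5A5A5A5u;
}

/* A page counts as valid if it is not erased and its checksum matches. */
static bool demo_page_valid(const demo_settings_t *s)
{
    return s->crc != 0xFFFFFFFFu && s->crc == demo_crc(s);
}

typedef enum { DEMO_FROM_PRIMARY, DEMO_FROM_BACKUP, DEMO_ZEROED } demo_source_t;

/* Prefer the settings page, fall back to the backup page, else start from zeroes. */
static demo_source_t demo_settings_init(const demo_settings_t *primary,
                                        const demo_settings_t *backup,
                                        demo_settings_t *ram)
{
    if (demo_page_valid(primary)) {
        *ram = *primary;
        return DEMO_FROM_PRIMARY;
    }
    if (demo_page_valid(backup)) {
        *ram = *backup;
        return DEMO_FROM_BACKUP;
    }
    /* Neither page is usable: the RAM copy is zeroed, so bank_code is no longer "valid app". */
    memset(ram, 0, sizeof(*ram));
    return DEMO_ZEROED;
}
```

With an erased (all 0xFF) backup page, a single corrupted settings page is enough to end up in the zeroed case.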

I was able to get the production logs from the EMS who flashed the device and the device seems to have been properly flashed, containing the correct, valid settings pages. From my understanding, the settings pages should not be touched unless the bootloader is actually used, so I can't imagine under what circumstances the bootloader settings pages may get invalidated... Additionally it looks like this particular device never actually booted into the application, because if it had booted into the application, the first thing done there is activating Access Point protection, which would prevent me from being able to connect using the debugger.

I dumped the flash of the device exhibiting this behaviour and compared it to a freshly flashed device (that has not yet reset and thus did not have access point protection yet) and the differences are:

  • On the working device, the page at 0x7E000 (presumably the settings backup page) has not been written (all 0xFFFFFFFF).
  • On the faulty device, that page holds a copy of the faulty settings described above.
  • The working device has the presumably correct bootloader settings at 0x7F000.
  • The faulty device has the faulty settings described above in that page.

So finally my questions:

  • So with the setup as described, under what circumstances could the bootloader settings page become invalid?
  • Under what circumstances do the bootloader settings and backup pages get erased / written at all?
  • (When) are the bootloader settings pages protected by the bootloader using the BPROT registers?

  • Upon further examination, I noticed that the bootloader executes the following lines of code in nrf_dfu_settings.c:

    if (NRF_DFU_SETTINGS_COMPATIBILITY_MODE && !NRF_DFU_IN_APP && (s_dfu_settings.settings_version == 1))
    {
        NRF_LOG_INFO("Old settings page detected. Upgrading info.");
    
        // Old version. Translate.
        memcpy(&s_dfu_settings.peer_data, (uint8_t *)&s_dfu_settings + DFU_SETTINGS_BOND_DATA_OFFSET_V1, NRF_DFU_PEER_DATA_LEN);
        memcpy(&s_dfu_settings.adv_name,  (uint8_t *)&s_dfu_settings + DFU_SETTINGS_ADV_NAME_OFFSET_V1,  NRF_DFU_ADV_NAME_LEN);
    
        // Initialize with defaults.
        s_dfu_settings.boot_validation_softdevice.type = NO_VALIDATION;
        s_dfu_settings.boot_validation_app.type        = VALIDATE_CRC;
        s_dfu_settings.boot_validation_bootloader.type = NO_VALIDATION;
        memcpy(s_dfu_settings.boot_validation_app.bytes, &s_dfu_settings.bank_0.image_crc, sizeof(uint32_t));
    
        s_dfu_settings.settings_version = NRF_DFU_SETTINGS_VERSION;
    }

    However, the value of NRF_DFU_SETTINGS_VERSION is actually 1, so the block ends by writing settings_version = 1 again. The check therefore seems to be true on every boot, and it looks like the bootloader settings page gets rewritten every time the device boots! Since no bootloader settings backup page is available, if power is lost during the erase/write procedure, the application won't boot again.
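
    A small sketch of why the compiled-in version matters. It only models the settings_version == 1 part of the condition and assumes NRF_DFU_SETTINGS_COMPATIBILITY_MODE is set and NRF_DFU_IN_APP is not; the demo_* functions are hypothetical and not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the "old settings page" upgrade branch would run for this stored version. */
static bool demo_upgrade_runs(uint32_t stored_version)
{
    return stored_version == 1u;
}

/* Simulates a number of boots and returns how often the upgrade branch ran.
 * After the branch runs, the page is stamped with the compiled-in version. */
static unsigned demo_count_upgrades(uint32_t compiled_version,
                                    uint32_t stored_version,
                                    unsigned boots)
{
    unsigned runs = 0;
    for (unsigned i = 0; i < boots; ++i) {
        if (demo_upgrade_runs(stored_version)) {
            stored_version = compiled_version; /* models settings_version = NRF_DFU_SETTINGS_VERSION */
            ++runs;
        }
    }
    return runs;
}
```

    With a compiled-in version of 1, demo_count_upgrades(1, 1, 5) yields 5, i.e. the branch runs on every boot. With a compiled-in version of 2, a version 1 page is translated exactly once.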

    I use nrfutil 3.5.1 to generate the settings page.

    I assume that I have to:

    • use a newer nrfutil
    • actually generate a backup page in the original image
    • set NRF_DFU_SETTINGS_VERSION to the value that the bootloader should actually use (certainly not 1!)

    EDIT:

    Additionally, I found out that the check in settings_write() in nrf_dfu_settings.c, which should prevent the bootloader settings page from being rewritten when not necessary, does not help: the page gets rewritten nonetheless because, for whatever reason, some values change between reboots.

    The following screenshot shows on the left side the bootloader settings page as it is in flash and on the right side the s_dfu_settings struct as it is in RAM when settings_write() is executed. There was only one reboot in between and no DFU activity whatsoever. As you can see, peer_data.crc and peer_data.irk as well as the adv_name values changed. Why? To me, it looks like there is some strange memory corruption bug here, but I can't pinpoint where and why it happens.
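
    For reference, this is roughly the kind of "skip if unchanged" guard I mean. It is a sketch under assumptions, not the SDK's settings_write(): the page size, the demo_flash_page_t type and the erase counter are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 64u /* hypothetical; a real flash page is far larger */

typedef struct {
    uint8_t  bytes[DEMO_PAGE_SIZE];
    unsigned erase_count; /* counts erase cycles for the demonstration */
} demo_flash_page_t;

/* Writes the RAM copy to flash only if it differs from what is already stored.
 * Returns true if an erase + write was performed. */
static bool demo_settings_write(demo_flash_page_t *flash, const uint8_t *ram)
{
    if (memcmp(flash->bytes, ram, DEMO_PAGE_SIZE) == 0) {
        return false; /* identical content: flash is left untouched */
    }
    memset(flash->bytes, 0xFF, DEMO_PAGE_SIZE); /* models the page erase */
    flash->erase_count++;
    memcpy(flash->bytes, ram, DEMO_PAGE_SIZE);  /* models the page write */
    return true;
}
```

    A single differing byte in the RAM copy is enough to defeat the guard, which matches what I see: any field that changes between reboots forces a full erase and rewrite.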

  • Hi Michael, 

    Could you let me know which SDK version you are using? If you are using SDK v15.3 or above, NRF_DFU_SETTINGS_VERSION should be set to 2. By default it is defined in the preprocessor symbols in the project settings of the bootloader project.

    You should use a newer nrfutil; the current version is v6.0.

    The bootloader flash protection is performed via the call ret_val = nrf_bootloader_flash_protect(BOOTLOADER_START_ADDR, BOOTLOADER_SIZE, false); in main.c.

  • Hi Hung Bui,

    Thanks for the response. As mentioned in the original post, we are using SDK 15.3. We upgraded at some point and in the process did not update the compiler's preprocessor defines, where NRF_DFU_SETTINGS_VERSION is set.

    I don't see why NRF_DFU_SETTINGS_VERSION is not hardcoded in the DFU settings C file. As the implementation does not support any version other than 2, there is no reason to have it "hidden" in the project settings!

    The wrong NRF_DFU_SETTINGS_VERSION does indeed lead to these lines changing the bootloader settings upon every reboot, requiring the page to be erased and rewritten.

    memcpy(&s_dfu_settings.peer_data, (uint8_t *)&s_dfu_settings + DFU_SETTINGS_BOND_DATA_OFFSET_V1, NRF_DFU_PEER_DATA_LEN);
    memcpy(&s_dfu_settings.adv_name,  (uint8_t *)&s_dfu_settings + DFU_SETTINGS_ADV_NAME_OFFSET_V1,  NRF_DFU_ADV_NAME_LEN);
    

    Without the backup page, this leads to irrecoverable failure when the device loses power or gets reset between the erase and the completion of the write.
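
    The failure window can be illustrated with a toy model. Everything here is hypothetical: each page is reduced to a single word, DEMO_MAGIC stands in for "passes the CRC check", and the demo_* names are not SDK functions.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAGIC  0x600DC0DEu /* stands in for a page that passes validation */
#define DEMO_ERASED 0xFFFFFFFFu /* erased flash reads as all ones */

typedef enum { DEMO_NO_FAIL, DEMO_FAIL_AFTER_ERASE } demo_fail_t;

/* Erase, then write one page. Power may be lost between the two steps.
 * Returns false if the update was interrupted. */
static bool demo_update_page(uint32_t *page, uint32_t value, demo_fail_t fail)
{
    *page = DEMO_ERASED;
    if (fail == DEMO_FAIL_AFTER_ERASE) {
        return false; /* reset or power loss: page stays erased */
    }
    *page = value;
    return true;
}

/* The application can only be started if at least one page is still valid. */
static bool demo_can_boot_app(uint32_t primary, uint32_t backup)
{
    return primary == DEMO_MAGIC || backup == DEMO_MAGIC;
}
```

    If the backup page was never written, one interrupted update leaves no valid page at all. With a valid backup page in the original image, the same interruption would be recoverable.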

    Regards,

    -mike
