What is the expected success rate of a DFU update?

We have native iOS and Android apps that handle DFU updates of the BLE firmware on our hardware devices. We are currently seeing a dip in success rate on Android (according to our Firebase metrics), with an increased number of "GATT ERROR" failures during updates (no further information is available).

On Android we used to have a success rate above 95%, and now we're down to 66% for no apparent reason (i.e. no changes to the app code handling the update; only the binary file has changed). iOS fares better and stays around 90-95% as before.

Is there some form of expected success rate on Android and on iOS for this chip that we can use as a baseline? This is particularly interesting on Android, given the vast number of hardware variants.

  • Hi,

    Hijacking this thread to add information about the issue. I'm working closely with the thread creator, on the embedded side.

    We are using a Laird module BL652-SA with nRF52832.

    Most phones can handle the DFU upgrade flow from Android. We can currently reproduce the issue on a Google Pixel 4, and it is also reproducible on several other brands, e.g. Samsung Galaxy S10.
    The issue does not always appear: the upgrade sometimes fails and sometimes succeeds. When the DFU upgrade succeeds, the pairing information on the phone is always removed afterwards, while the nRF chip, now running the new firmware, still has the bonding information.
    So the issue seems to be on the phone application side.
    If we run the DFU upgrade from the nRF Connect debug application instead, the upgrade always works as expected.

    The issue never appears on iPhone. Our iPhone application does not use the nRF DFU SDK, but our Android application does. We use the buttonless DFU feature to put the device into DFU mode.

    If I scan for Bluetooth devices while upgrading, I can see the unit change from our normal advertising name ABCD_[s/n] to HDFU_[s/n], which our bootloader sets when the device enters the DFU upgrade state. So the device restarts and enters DFU mode as expected.

    Worth mentioning:
    When I scan, I find two devices with the same MAC address: one named (unknown) and one with our desired name.
    The same happens when the device is in DFU mode.

    We set the device name with: sd_ble_gap_device_name_set

    I can't figure out where the (unknown) UUID is coming from. My theory is that the DFU gets confused and mixes up the two when trying to upgrade, but it's a long shot.

    To relate this back to the thread's issue:

    The GATT error seems to occur after upgrading from our Android application because the pairing information has been lost.

    Do you have any ideas on how to proceed with debugging? Are there any known bugs in the nRF DFU SDK, or could our (unknown) UUID have any effect?

    Thanks for your help
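
One possible explanation for the "(unknown)" scan entry, offered as an assumption rather than something confirmed in this thread: a scanner may list a device as unnamed when the packet it parsed carries no Local Name field, for example when the name is only present in the scan response. The sketch below is a minimal, self-contained C parser for the standard advertising-data layout (length, type, payload) that looks for the Shortened (0x08) or Complete (0x09) Local Name field. `adv_find_name` is a hypothetical helper, not part of the nRF5 SDK or the Android DFU library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AD_TYPE_SHORT_LOCAL_NAME    0x08
#define AD_TYPE_COMPLETE_LOCAL_NAME 0x09

/* Hypothetical helper (not an SDK function).
 * Walks the AD structures in an advertising or scan-response payload and
 * copies the first Local Name field into `out` as a NUL-terminated string.
 * Returns the number of name bytes copied, or 0 if no name field is present. */
size_t adv_find_name(const uint8_t *adv, size_t adv_len,
                     char *out, size_t out_size)
{
    size_t i = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    while (i < adv_len) {
        uint8_t field_len = adv[i];      /* covers the type byte + payload */

        if (field_len == 0) {
            break;                       /* zero length marks padding/end */
        }
        if (i + 1 + (size_t)field_len > adv_len) {
            break;                       /* truncated field, stop parsing */
        }

        uint8_t type = adv[i + 1];
        if (type == AD_TYPE_COMPLETE_LOCAL_NAME ||
            type == AD_TYPE_SHORT_LOCAL_NAME) {
            size_t n = (size_t)field_len - 1;
            if (n >= out_size) {
                n = out_size - 1;        /* clip to the caller's buffer */
            }
            memcpy(out, &adv[i + 2], n);
            out[n] = '\0';
            return n;
        }

        i += 1 + (size_t)field_len;      /* skip to the next AD structure */
    }
    return 0;
}
```

Feeding this a payload that contains only a Flags field returns 0, which is the case a scanner might display as "(unknown)". Capturing the raw advertising and scan-response bytes for both entries would show whether that is what happens here.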

  • the pairing information on the phone is always removed after completed DFU upgrade. The nRF chip is upgraded with new firmware and has still the bonding information.

    Have you tried this: https://github.com/NordicSemiconductor/Android-DFU-Library/blob/913dbc128095b211e85e476fb10d497379e9be33/lib_dfu/src/main/java/no/nordicsemi/android/dfu/DfuServiceInitiator.java#L175 and setting to true?

    By default it will remove bond info, as this feature was added later in the SDK.

  • Set setRestoreBond to false; otherwise it will bond again instead of reusing the same bond information: github.com/.../DfuServiceInitiator.java

  • Also, if you're using Legacy DFU you may remove "setPrepareObjectDelay" and "setUnsafe...Enabled", as unused.

  • Thanks, this solved the issue with the device being unpaired. A bit non-intuitive naming imho, I interpreted "restore bond" as in "after update is complete, restore its previous bond", but I understand now.

    The strange thing is that setting this to true has been in the code for 1.5 years, and it wasn't until this latest update that this (unwanted) behaviour appeared...?

    I will send it to test and see what happens.

    The main issue with "GATT ERROR" and connection failure still remains, though.

  • Worth mentioning about the "GATT ERROR": once this error occurs, we can no longer connect at all.

    But if Bluetooth is turned off on the phone and then turned on again, the device can connect again.

  • Hi,

    FredrikL said:
    We are using a Laird module BL652-SA with nRF52832.

    Do you have an external LF crystal on your board?
    In sdk_config.h, what are these currently configured to?

    NRF_SDH_CLOCK_LF_SRC
    NRF_SDH_CLOCK_LF_RC_CTIV
    NRF_SDH_CLOCK_LF_RC_TEMP_CTIV
    NRF_SDH_CLOCK_LF_ACCURACY

    Check both application project sdk_config.h and the bootloader sdk_config.h

    As a test, could you try setting NRF_SDH_CLOCK_LF_ACCURACY to 1 (i.e. 500 ppm)?

    If it's already set to 1 (500 ppm), try setting NRF_SDH_CLOCK_LF_SRC to 2 (NRF_CLOCK_LF_SRC_SYNTH) as well.

  • Hi Sigurd,

    Thank you for your response.

    We are indeed using an external crystal.

    Settings in the bootloader are:
    #define NRF_SDH_CLOCK_LF_SRC 0
    #define NRF_SDH_CLOCK_LF_RC_CTIV 4
    #define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 0
    #define NRF_SDH_CLOCK_LF_ACCURACY 1

    Settings in the application are:
    #define NRF_SDH_CLOCK_LF_SRC 0
    #define NRF_SDH_CLOCK_LF_RC_CTIV 4
    #define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 0
    #define NRF_SDH_CLOCK_LF_ACCURACY 1

    If I change NRF_SDH_CLOCK_LF_SRC to NRF_CLOCK_LF_SRC_SYNTH,
    the application hangs at:
    err_code = nrf_sdh_enable_request();
  • FredrikL said:
    If I change NRF_SDH_CLOCK_LF_SRC to NRF_CLOCK_LF_SRC_SYNTH,
    the application hangs at:
    err_code = nrf_sdh_enable_request();

    Set NRF_SDH_CLOCK_LF_RC_CTIV to 0 when doing this test with NRF_CLOCK_LF_SRC_SYNTH
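
The advice above can be sketched as a small sanity check on the clock settings. This is a minimal illustration, not SDK code: `lf_src_name` and `lf_cfg_is_consistent` are hypothetical helpers, and the rule they encode (both RC calibration intervals must be 0 unless the source is the RC oscillator) is an assumption based on the reply above and on my reading of the SoftDevice clock configuration documentation. The value 2 for SYNTH is stated in the thread; 0 for RC and 1 for XTAL are taken from the SoftDevice header definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values used for NRF_SDH_CLOCK_LF_SRC. 2 = SYNTH is stated in the thread;
 * 0 = RC and 1 = XTAL are assumed from the SoftDevice header. */
enum lf_src {
    LF_SRC_RC    = 0,
    LF_SRC_XTAL  = 1,
    LF_SRC_SYNTH = 2
};

/* Hypothetical helper: readable name for a NRF_SDH_CLOCK_LF_SRC value. */
const char *lf_src_name(uint8_t src)
{
    switch (src) {
    case LF_SRC_RC:    return "RC";
    case LF_SRC_XTAL:  return "XTAL";
    case LF_SRC_SYNTH: return "SYNTH";
    default:           return "invalid";
    }
}

/* Hypothetical helper: checks the assumed rule that the RC calibration
 * intervals only apply to the RC source and must be 0 for XTAL and SYNTH.
 * Valid ranges for the RC source itself are not checked here. */
bool lf_cfg_is_consistent(uint8_t src, uint8_t rc_ctiv, uint8_t rc_temp_ctiv)
{
    if (src > LF_SRC_SYNTH) {
        return false;                 /* unknown clock source */
    }
    if (src == LF_SRC_RC) {
        return true;                  /* calibration intervals are allowed */
    }
    return rc_ctiv == 0 && rc_temp_ctiv == 0;
}
```

With the values posted in the thread, `lf_cfg_is_consistent(0, 4, 0)` passes, while changing only the source to SYNTH, `lf_cfg_is_consistent(2, 4, 0)`, fails until NRF_SDH_CLOCK_LF_RC_CTIV is also set to 0. If the mapping above is right, the posted value 0 selects the RC oscillator rather than a crystal, which may be worth double-checking against the board design.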
