Issues with completing DFU after updating to NCS V2.6.0

We have some custom hardware that utilises an nRF52832.  Up until recently, all our firmware was built on NCS V2.2.0, which I believe used Zephyr V3.2.99.  We have implemented DFU for firmware upgrades, and haven't had any issues with this until now.

The latest version of our firmware has been built around NCS V2.6.0, which I think uses Zephyr V3.5.99.  There were a bunch of CONFIG setting changes that I needed to make in the transition from NCS V2.2.0 to NCS V2.6.0, associated with setting up the DFU functionality.

We have a custom App that we can use for customers to complete a firmware upgrade.  That initiates the transfer of the new image file into the spare slot, resets the device at the completion, and then awaits a response from the SMP service (the App has subscribed to the SMP Service notifications) to tell it that the new image has been accepted and the device has booted up using the new image.

For whatever reason, that notification from the device to the App via the SMP service is no longer occurring, so our App sits there endlessly waiting.

When we try and test the same functionality using Nordic's Device Manager App, we see the following:

  • New image uploads into Slot 1 OK
  • READ - this confirms the old image in Slot 0 and the new, pending image, in Slot 1
  • TEST - this confirms the new image in Slot 1 as the one to boot up from next time the device starts up
  • RESET - this works as expected
  • CONFIRM - this fails, with either an error message that says "Insufficient handle" or something about insufficient authentication
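
The two messages above read like app-side renderings of ATT error responses. As a reference sketch, and assuming that "Insufficient handle" corresponds to the ATT "Invalid Handle" error (the thread does not confirm this), the relevant codes from the Bluetooth Core Specification could be named as below. `att_error_name` is a hypothetical helper, not part of any SDK.

```c
#include <stdint.h>

/* Subset of the ATT error codes defined in the Bluetooth Core
 * Specification (Attribute Protocol, Error Response). */
enum att_error {
    ATT_ERR_INVALID_HANDLE              = 0x01,
    ATT_ERR_INSUFFICIENT_AUTHENTICATION = 0x05,
    ATT_ERR_INSUFFICIENT_AUTHORIZATION  = 0x08,
    ATT_ERR_INSUFFICIENT_ENCRYPTION     = 0x0F,
};

/* Hypothetical helper: map an ATT error code to a readable name. */
const char *att_error_name(uint8_t code)
{
    switch (code) {
    case ATT_ERR_INVALID_HANDLE:
        return "Invalid Handle";
    case ATT_ERR_INSUFFICIENT_AUTHENTICATION:
        return "Insufficient Authentication";
    case ATT_ERR_INSUFFICIENT_AUTHORIZATION:
        return "Insufficient Authorization";
    case ATT_ERR_INSUFFICIENT_ENCRYPTION:
        return "Insufficient Encryption";
    default:
        return "Unknown";
    }
}
```

Logging the raw code next to the name in a custom App may make it easier to tell a security failure (0x05, 0x0F) apart from a handle problem (0x01).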

Basically, it looks like pairing/bonding information is getting wiped in the DFU process.  But this authentication either wasn't needed when we were doing DFU with V2.2.0, or the pairing/bonding info wasn't getting erased with the new image upload.

I suspect there is some CONFIG setting that is now being set that previously wasn't.  I've attached my proj.conf file so you can see how I currently have things configured.

# Operational CONFIG settings

# General
CONFIG_REBOOT=y
CONFIG_GPIO=y
CONFIG_BOARD_ENABLE_DCDC=n
CONFIG_ADC=y

# Logging
CONFIG_LOG=y
CONFIG_LOG_MODE_MINIMAL=n
CONFIG_LOG_DEFAULT_LEVEL=0

# Boot
CONFIG_BOOT_BANNER=n

#Device Name - will be visible to clients scanning
CONFIG_BT_DEVICE_NAME="My Device"

# Bluetooth
CONFIG_BT=y
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_SMP=y
CONFIG_BT_GATT_CLIENT=y
CONFIG_BT_GATT_DM=y
CONFIG_BT_MAX_PAIRED=10
CONFIG_BT_ID_MAX=10
CONFIG_BT_ID_UNPAIR_MATCHING_BONDS=y
CONFIG_BT_KEYS_OVERWRITE_OLDEST=y
CONFIG_BT_SMP_SC_PAIR_ONLY=y
#CONFIG_BT_CTS_CLIENT=y

#PHY update needed for updating PHY request
CONFIG_BT_USER_PHY_UPDATE=y
#For data length update
CONFIG_BT_USER_DATA_LEN_UPDATE=y
CONFIG_BT_BUF_ACL_TX_COUNT=10
CONFIG_BT_ATT_PREPARE_COUNT=2

# Enable CTS client
CONFIG_BT_CTS_CLIENT=y

# Below is setup to let DIS information be read from settings
CONFIG_BT_SETTINGS=y
CONFIG_SETTINGS_RUNTIME=y
CONFIG_SETTINGS=y

#Enable MCUBOOT bootloader build in the application
CONFIG_BOOTLOADER_MCUBOOT=y
#Include MCUMGR and the dependencies in the build
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU=y
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU_SPEEDUP=y
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU_VALIDATION=y

# Set stack and heap sizes
CONFIG_MAIN_STACK_SIZE=2048
CONFIG_HEAP_MEM_POOL_SIZE=2048

# Allow flash
CONFIG_FLASH=y
CONFIG_FLASH_PAGE_LAYOUT=y
CONFIG_FLASH_MAP=y
CONFIG_NVS=y
CONFIG_MPU_ALLOW_FLASH_WRITE=y

# Enable Power Management
CONFIG_POWEROFF=y

# Required to disable default behavior of deep sleep on timeout
CONFIG_PM_DEVICE=y

# Enable I2C
CONFIG_I2C=y
CONFIG_COUNTER=y
CONFIG_PCF85063A=y

#cJSON
CONFIG_CJSON_LIB=y
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y

# Sensors
CONFIG_SENSOR=y

# CONFIG below here set up to minimise flash usage

# Disable features not needed
CONFIG_TIMESLICING=n
CONFIG_ASSERT=n

# Disable Bluetooth features not needed
CONFIG_BT_DEBUG_NONE=y
CONFIG_BT_ASSERT=n
CONFIG_BT_GATT_CACHING=n
CONFIG_BT_SETTINGS_CCC_LAZY_LOADING=y
CONFIG_BT_HCI_VS_EXT=n

# Disable Bluetooth controller features not needed
CONFIG_BT_CTLR_PRIVACY=n
CONFIG_BT_CTLR_PHY_2M=n

# Reduce Bluetooth buffers
CONFIG_BT_BUF_EVT_DISCARDABLE_COUNT=1
CONFIG_BT_BUF_EVT_RX_COUNT=2

# Drivers and peripherals
CONFIG_WATCHDOG=n
CONFIG_SPI=n

# Interrupts
CONFIG_DYNAMIC_INTERRUPTS=n

# Memory protection
CONFIG_THREAD_STACK_INFO=n
CONFIG_THREAD_CUSTOM_DATA=n
CONFIG_FPU=n

# Console
CONFIG_EARLY_CONSOLE=n

# Build
CONFIG_SIZE_OPTIMIZATIONS=y

Can anyone assist me in resolving this issue?  We need to roll out the updated version of firmware, but can't until we can ensure a smooth upgrade process through our App.

Regards,

Mike

  • Hi Mike,

    The confirm action should not require authentication (i.e., bonding) unless the CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN symbol is selected, which does not seem to be the case based on the configuration you posted. Could you please try performing the update from the device manager using the "Confirm only" method from the "basic" menu to see if that enables you to complete the update? This will help narrow down the problem.

    Additionally, are you using the same static partition layout for both versions, and what is the flash utilization of the app you are uploading? You can see this from the build log.

    Best regards,

    Vidar

  • Hi Vidar,

    Yep - using the same static partition layout for both versions.  Flash utilisation is around 91%

    Memory region         Used Size  Region Size  %age Used
               FLASH:       26002 B        48 KB     52.90%
                 RAM:       23680 B        64 KB     36.13%
            IDT_LIST:          0 GB        32 KB      0.00%
    [34/42] Linking C executable zephyr\zephyr.elf
    Memory region         Used Size  Region Size  %age Used
               FLASH:      206412 B     224768 B     91.83%
                 RAM:       62408 B        64 KB     95.23%
            IDT_LIST:          0 GB        32 KB      0.00%
    [37/42] Generating ../../zephyr/app_update.bin
    image.py: sign the payload
    [38/42] Generating ../../zephyr/app_signed.hex
    image.py: sign the payload
    [40/42] Generating ../../zephyr/app_test_update.hex
    image.py: sign the payload
    [42/42] Generating zephyr/merged.hex

    OK, here are the details of what I did to do the upgrade:

    • Using Confirm Only from the Basic Menu, Upgrade from 1.7 (which was built on V2.2.0) to 1.8 (which was built on V2.6.0) which completes;
    • Switch to Advanced and UNSUCCESSFULLY Read the Image data from the device.  The error returned is ‘The handle is invalid’
    • Go to iPhone Bluetooth Settings and ‘Forget the Device’
    • Go back to Device Manager Advanced and Successfully read the Image data from the device

    I took a bunch of screen shots of the upgrade process as per above.

    The selected firmware prior to starting the Confirm Only upgrade (image1.png)

    The Upload Complete state after the upgrade and Reset of the device (image 2.png)

    Attempting to Read the Image data immediately after the Upgrade (image 3.png)

    Successfully reading the Image data after ‘Forgetting the Device’ in phone Bluetooth Settings (image4.png)

    So in summary, the upgrade is completing, but the Pairing/Bonding information appears to be corrupted/lost. 

    Alternatively, with the changes to Zephyr and the SDK, is the SMP Characteristic no longer being set to require Authentication??

    Regards,

    Mike

  • Hi Mike,

    Is it possible to enable logging in the new version to see if there are any errors reported, e.g., if the link fails to be secured? You can also use the bt_foreach_bond() function to find out how many bonds are stored, to check that bonds are not being erased during the DFU process.
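
A minimal sketch of the bond count suggested above. On target it relies on Zephyr's bt_foreach_bond() and BT_ID_DEFAULT; the `#else` branch contains host-only stand-ins (not Zephyr's real definitions) that pretend two bonds exist, purely so the sketch can be compiled and tried off-target. The names `count_bond_cb` and `count_bonds` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __ZEPHYR__
#include <zephyr/bluetooth/bluetooth.h>
#else
/* Host-only stand-ins so the sketch builds off-target.
 * These are NOT Zephyr's real definitions. */
struct bt_bond_info {
    uint8_t addr[6];
};
#define BT_ID_DEFAULT 0

static void bt_foreach_bond(uint8_t id,
                            void (*func)(const struct bt_bond_info *info,
                                         void *user_data),
                            void *user_data)
{
    /* Pretend that two bonds are stored. */
    static const struct bt_bond_info fake[2] = { { {0} }, { {0} } };

    (void)id;
    for (size_t i = 0; i < 2; i++) {
        func(&fake[i], user_data);
    }
}
#endif

/* Callback invoked once per stored bond; increments the counter. */
static void count_bond_cb(const struct bt_bond_info *info, void *user_data)
{
    (void)info;
    (*(int *)user_data)++;
}

/* Returns the number of bonds stored for the given local identity. */
int count_bonds(uint8_t id)
{
    int count = 0;

    bt_foreach_bond(id, count_bond_cb, &count);
    return count;
}
```

On the device, calling `count_bonds(BT_ID_DEFAULT)` after bt_enable() and settings_load(), once before and once after the DFU, and logging both results should show whether the bond survived the update.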

    Regards,

    Vidar

  • Sorry for not noticing this earlier, but this setting will erase the settings partition when you do DFU:

    Do you experience the same if you turn it off? I have requested that we make this disabled by default

  • Hi Vidar,

    We tried turning that setting off, but it had no effect on our ability to do a DFU OTA.

    We did some logging from within the nRF Connect App to see if we could get some further information.  We currently have 3 versions of firmware we are using for testing:

    V1.6 - this is what is currently in field deployed devices, and is built on NCS v2.2.0

    V1.7 - this is an upgrade that we were planning to release, but when we tested it noticed it had a few bugs we hadn't uncovered in V1.6.  This is also built on NCS V2.2.0

    V1.8 - this is the newest version, that fixes the bugs in V1.7, but is built around NCS V2.6.0

    We tried going from V1.6 -> V1.7 -> V1.8. In each case we had nRF Connect set to "Test and Confirm".

    Upgrade from V1.6 (NCS V2.2.0) to V1.7 (NCS V2.2.0)

    Found valid Firmware in file:///private/var/mobile/Containers/Data/Application/C16C782E-248A-43B7-98F8-B64C510541BD/Documents/6_update_1.7.bin for Device DFU McuMgr.
    
    Upgrade started with 1 image(s) using 'Test and Confirm' mode
    
    Firmware Upgrade Started.
    
    State changed from none to requestMcuMgrParameters
    
    Peripheral connected
    
    Device ready
    
    Mcu Manager parameters received (4 x 2475)
    
    State changed from requestMcuMgrParameters to bootloaderInfo
    
    Bootloader info not supported
    
    State changed from bootloaderInfo to validate
    
    Image List response: Header: {"version": "0", "op": "1", "flags": 0, "length": 134, "group": 1, "seqNum": 130, "commandId": 0}, Payload: {"images" : {{"bootable" : true, "version" : "0.0.0", "hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "pending" : false, "active" : true, "permanent" : false, "confirmed" : true, "slot" : 0}}, "splitStatus" : 0}
    
    Scheduling upload (hash: 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079) for image 0 (slot: 1)
    
    State changed from validate to upload
    
    No remaining chunks to be sent? chunkOffset: 203620, imageData: 203620.
    
    No remaining chunks to be sent? chunkOffset: 203620, imageData: 203620.
    
    Upload finished (1 of 1)
    
    State changed from upload to test
    
    Image Test response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 215, "commandId": 0}, Payload: {"images" : {{"hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "confirmed" : true, "active" : true, "slot" : 0, "pending" : false, "version" : "0.0.0", "permanent" : false, "bootable" : true}, {"confirmed" : false, "version" : "0.0.0", "bootable" : true, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "pending" : true, "slot" : 1, "active" : false, "permanent" : false}}, "splitStatus" : 0}
    
    State changed from test to reset
    
    Reset request confirmed
    
    Disconnected.
    
    Peripheral disconnected
    
    Device has disconnected
    
    Waiting 10 seconds reconnecting...
    
    Reconnecting...
    
    Reconnect deferred
    
    State changed from reset to confirm
    
    Peripheral connected
    
    Device ready
    
    Image Confirm response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 216, "commandId": 0}, Payload: {"splitStatus" : 0, "images" : {{"hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "bootable" : true, "pending" : false, "permanent" : false, "active" : true, "version" : "0.0.0", "slot" : 0, "confirmed" : true}, {"hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "bootable" : true, "permanent" : false, "slot" : 1, "confirmed" : false, "pending" : false, "version" : "0.0.0", "active" : false}}}
    
    Upgrade complete
    
    State changed from confirm to success
    
    Success!
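
For a custom App that waits for the post-reset confirmation, the check implied by the "Image Confirm response" above can be sketched as follows. The struct is a hypothetical mirror of the fields shown in the log ("slot", "hash", "active", "confirmed", "pending"), not an SDK type, and the function name is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMG_HASH_LEN 32 /* the hashes in the log are 32 bytes (64 hex digits) */

/* Hypothetical mirror of one entry of the "images" array in the response. */
struct image_slot_state {
    int slot;
    uint8_t hash[IMG_HASH_LEN];
    bool active;
    bool confirmed;
    bool pending;
};

/* Returns true when the uploaded image (identified by new_hash) is the one
 * in slot 0 and is reported as both active and confirmed. */
bool upgrade_confirmed(const struct image_slot_state *images, size_t count,
                       const uint8_t new_hash[IMG_HASH_LEN])
{
    for (size_t i = 0; i < count; i++) {
        if (images[i].slot == 0 &&
            memcmp(images[i].hash, new_hash, IMG_HASH_LEN) == 0) {
            return images[i].active && images[i].confirmed;
        }
    }
    return false;
}
```

In the successful V1.6 to V1.7 run, slot 0 reports the uploaded hash with "active" and "confirmed" both true, which is the condition this sketch tests for.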
    

    Upgrade from V1.7 (NCS V2.2.0) to V1.8 (NCS V2.6.0)

    Upgrade started with 1 image(s) using 'Test and Confirm' mode
    
    Firmware Upgrade Started.
    
    State changed from none to requestMcuMgrParameters
    
    Peripheral connected
    
    Device ready
    
    Mcu Manager parameters received (4 x 2475)
    
    State changed from requestMcuMgrParameters to bootloaderInfo
    
    Bootloader info not supported
    
    State changed from bootloaderInfo to validate
    
    Image List response: Header: {"version": "0", "op": "1", "flags": 0, "length": 244, "group": 1, "seqNum": 116, "commandId": 0}, Payload: {"images" : {{"permanent" : false, "confirmed" : true, "bootable" : true, "pending" : false, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "version" : "0.0.0", "active" : true, "slot" : 0}, {"pending" : false, "slot" : 1, "bootable" : true, "version" : "0.0.0", "active" : false, "permanent" : false, "confirmed" : false, "hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45}}, "splitStatus" : 0}
    
    Secondary slot of image 0 will be overwritten
    
    Scheduling upload (hash: 0x79834FF4F72A0934C370B232E754CD1FF56971FCCB565A7593A4B3A88EE487CA) for image 0 (slot: 1)
    
    State changed from validate to upload
    
    Retry 1 for seq: 159
    
    Retry 1 for seq: 181
    
    No remaining chunks to be sent? chunkOffset: 207260, imageData: 207260.
    
    No remaining chunks to be sent? chunkOffset: 207260, imageData: 207260.
    
    Upload finished (1 of 1)
    
    State changed from upload to test
    
    Image Test response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 208, "commandId": 0}, Payload: {"splitStatus" : 0, "images" : {{"pending" : false, "version" : "0.0.0", "confirmed" : true, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "slot" : 0, "permanent" : false, "active" : true, "bootable" : true}, {"confirmed" : false, "hash" : 0x79834FF4F72A0934C370B232E754CD1FF56971FCCB565A7593A4B3A88EE487CA, "bootable" : true, "slot" : 1, "active" : false, "version" : "0.0.0", "pending" : true, "permanent" : false}}}
    
    State changed from test to reset
    
    Reset request confirmed
    
    Peripheral disconnected
    
    Device has disconnected
    
    Waiting 10 seconds reconnecting...
    
    Disconnected.
    
    Reconnecting...
    
    Reconnect deferred
    
    State changed from reset to confirm
    
    Peripheral connected
    
    The handle is invalid.
    
    Request (SMPv1, group: image, seq: 209, command: state) failed: The handle is invalid.
    
    The handle is invalid.
    
    DFU failed: The handle is invalid.
    
    DFU Failed with Error: The handle is invalid.
    

    A colleague stumbled across this ticket, which seems to indicate that with CONFIG_MCUMGR_SMP_BT_AUTHEN=y you need both encryption and authentication.  Interestingly, in our factory code, we have this set to "n", but in our operational code (in both v1.6 and v1.7) we have this set to "y", yet the upgrade from v1.6 to v1.7 works as long as the device is paired, which sort of doesn't make sense.

    (Note - we don't have any display options for passkeys on our device, so can't authenticate)

    In NCS v2.6.0, the config setting CONFIG_MCUMGR_SMP_BT_AUTHEN doesn't seem to be valid, and I can't find anywhere in the corresponding smp_bt.c file where the SMP service is declared, so I can't see how it's configured in terms of encryption/authentication.

    As we're sending an unencrypted file, it seems we need the encrypted connection but as we don't have any means of entering or displaying passkeys, we can't have the requirement for authentication.

    Is there a way to have BT_GATT_PERM_WRITE_ENCRYPT & BT_GATT_PERM_READ_ENCRYPT permission setting on the SMP service within v2.2.0 and v2.6.0 without needing to "hack" the Zephyr files?

    Regards,

    Mike

  • Hi Mike,

    The 'erase settings' option will cause the bonding information to be deleted, so I think this may explain some of the problems you've been experiencing, at least. 

    CONFIG_MCUMGR_SMP_BT_AUTHEN was renamed to CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN in Zephyr v3.3 (https://github.com/nrfconnect/sdk-zephyr/blob/main/doc/releases/release-notes-3.3.rst), and is used in the service declaration here: https://github.com/nrfconnect/sdk-zephyr/blob/db34adf9c3d856b15f9662e034e44727aa16de47/subsys/mgmt/mcumgr/transport/src/smp_bt.c#L355 

    Mike Austin (LPI) said:
    Interestingly, in our factory code, we have this set to "n", but in our operational code (in both v1.6 and v1.7) we have this set to "y", yet the upgrade from v1.6 to v1.7

    That is strange. Did you verify that the symbol ended up being selected in the generated .config file?
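
One way to make that check repeatable is to scan the text of the generated .config (typically build/zephyr/.config) for the symbol. The sketch below is a host-side helper under that assumption; `kconfig_symbol_enabled` is a hypothetical name, and it only recognises lines of the exact form `SYMBOL=y`, which is how enabled bool symbols appear in a generated .config.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns true if config_text contains a line that is exactly "<symbol>=y".
 * Disabled symbols appear as "# <symbol> is not set" and do not match. */
bool kconfig_symbol_enabled(const char *config_text, const char *symbol)
{
    char needle[128];
    int n = snprintf(needle, sizeof(needle), "%s=y", symbol);

    if (n < 0 || (size_t)n >= sizeof(needle)) {
        return false; /* symbol name too long for this sketch */
    }

    const char *line = config_text;

    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len > 0 && line[len - 1] == '\r') {
            len--; /* tolerate CRLF line endings */
        }
        if (len == (size_t)n && strncmp(line, needle, len) == 0) {
            return true;
        }
        line = end ? end + 1 : NULL;
    }
    return false;
}
```

Running it over the .config of each firmware version, with CONFIG_MCUMGR_SMP_BT_AUTHEN for the NCS v2.2.0 builds and CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN for the v2.6.0 build, would show whether the value in the project configuration actually took effect.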

    Mike Austin (LPI) said:

    As we're sending an unencrypted file, it seems we need the encrypted connection but as we don't have any means of entering or displaying passkeys, we can't have the requirement for authentication.

    Is there a way to have BT_GATT_PERM_WRITE_ENCRYPT & BT_GATT_PERM_READ_ENCRYPT permission setting on the SMP service within v2.2.0 and v2.6.0 without needing to "hack" the Zephyr files?

    Authentication limits who can initiate DFU (typically those who have physical access to the device), but I don't think encryption itself offers much added security as the update binary is not encrypted in the app. 

    You can change the security level to require encryption only, but this requires modifying the SMP service, as there is no Kconfig symbol that enables encryption without authentication.

    https://github.com/nrfconnect/sdk-zephyr/blob/db34adf9c3d856b15f9662e034e44727aa16de47/subsys/mgmt/mcumgr/transport/src/smp_bt.c#L355 

    Regards,

    Vidar

  • Thanks Vidar.

    We have something working now - "hacked" the smp_bt.c file so that with CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN=n, we set the permissions on the SMP service to BT_GATT_PERM_WRITE_ENCRYPT & BT_GATT_PERM_READ_ENCRYPT.  If it's set to "y", then it defaults to BT_GATT_PERM_WRITE_AUTHEN & BT_GATT_PERM_READ_AUTHEN.
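
The effect of that change can be modelled off-target with a small sketch. Everything here is a local stand-in written for illustration (the enums and function names are hypothetical and are not Zephyr's types or the actual smp_bt.c code); it only captures the rule described above and the fact that pairing without a passkey ("Just Works") yields an encrypted but unauthenticated link.

```c
#include <stdbool.h>

/* Local stand-ins for the link security reached after pairing. */
enum link_security {
    SEC_NONE = 1,          /* no encryption */
    SEC_ENCRYPTED = 2,     /* encrypted, unauthenticated (e.g. Just Works) */
    SEC_AUTHENTICATED = 3, /* encrypted and authenticated (e.g. passkey) */
};

/* Local stand-ins for the permission class applied to the SMP service. */
enum smp_perm {
    SMP_PERM_ENCRYPT, /* models BT_GATT_PERM_{READ,WRITE}_ENCRYPT */
    SMP_PERM_AUTHEN,  /* models BT_GATT_PERM_{READ,WRITE}_AUTHEN */
};

/* Models the modification described in the post:
 * AUTHEN symbol "n" selects encryption-only, "y" selects authentication. */
enum smp_perm smp_perm_for_config(bool authen_enabled)
{
    return authen_enabled ? SMP_PERM_AUTHEN : SMP_PERM_ENCRYPT;
}

/* Returns true if a link at the given security level satisfies the
 * permission class required by the SMP characteristic. */
bool smp_access_allowed(enum smp_perm perm, enum link_security sec)
{
    switch (perm) {
    case SMP_PERM_AUTHEN:
        return sec >= SEC_AUTHENTICATED;
    case SMP_PERM_ENCRYPT:
        return sec >= SEC_ENCRYPTED;
    }
    return false;
}
```

Under this model, a device with no means of displaying or entering a passkey can only reach `SEC_ENCRYPTED`, so it passes the check when the symbol is "n" and fails it when the symbol is "y".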

    We're still not sure how we managed to get the upgrade from v1.6 to v1.7 of our firmware to work, as it looks like we had set CONFIG_MCUMGR_SMP_BT_AUTHEN=y in v1.6.  For the product we have out in the field, we're trying another "hack" in our App that detects a device that has firmware v1.6 and approaches the OTA process slightly differently.

    Appreciate all your help in getting a resolution to this one!  Certainly one of the more challenging tickets I've had up on DevZone

    Regards,

    Mike
