Issues with completing DFU after updating to NCS V2.6.0

We have some custom hardware that utilises an nRF52832.  Up until recently, all our firmware was built on NCS V2.2.0, which I believe used Zephyr V3.2.99.  We have implemented DFU for firmware upgrades, and hadn't had any issues with this until now.

The latest version of our firmware has been built around NCS V2.6.0, which I think uses Zephyr V3.5.99.  There were a bunch of CONFIG setting changes that I needed to make in the transition from NCS V2.2.0 to NCS V2.6.0 associated with setting up the DFU functionality.

We have a custom App that we can use for customers to complete a firmware upgrade.  That initiates the transfer of the new image file into the spare slot, resets the device at the completion, and then awaits a response from the SMP service (the App has subscribed to the SMP Service notifications) to tell it that the new image has been accepted and the device has booted up using the new image.

For whatever reason, that notification from the device to the App via the SMP service is no longer occurring, so our App sits there endlessly waiting.

When we try and test the same functionality using Nordic's Device Manager App, we see the following:

  • New image uploads into Slot 1 OK
  • READ - this confirms the old image in Slot 0 and the new, pending image, in Slot 1
  • TEST - this confirms the new image in Slot 1 as the one to boot up from next time the device starts up
  • RESET - this works as expected
  • CONFIRM - this fails, with either an error message that says "Insufficient handle" or something about insufficient authentication

Basically, it looks like pairing/bonding information is getting wiped in the DFU process.  But this authentication either wasn't needed when we were doing DFU with V2.2.0, or the pairing/bonding info wasn't getting erased with the new image upload.

I suspect there is some CONFIG setting that is now being set that previously wasn't.  I've included my proj.conf file below so you can see how I currently have things configured:

# Operational CONFIG settings

# General
CONFIG_REBOOT=y
CONFIG_GPIO=y
CONFIG_BOARD_ENABLE_DCDC=n
CONFIG_ADC=y

# Logging
CONFIG_LOG=y
CONFIG_LOG_MODE_MINIMAL=n
CONFIG_LOG_DEFAULT_LEVEL=0

# Boot
CONFIG_BOOT_BANNER=n

#Device Name - will be visible to clients scanning
CONFIG_BT_DEVICE_NAME="My Device"

# Bluetooth
CONFIG_BT=y
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_SMP=y
CONFIG_BT_GATT_CLIENT=y
CONFIG_BT_GATT_DM=y
CONFIG_BT_MAX_PAIRED=10
CONFIG_BT_ID_MAX=10
CONFIG_BT_ID_UNPAIR_MATCHING_BONDS=y
CONFIG_BT_KEYS_OVERWRITE_OLDEST=y
CONFIG_BT_SMP_SC_PAIR_ONLY=y
#CONFIG_BT_CTS_CLIENT=y

#PHY update needed for updating PHY request
CONFIG_BT_USER_PHY_UPDATE=y
#For data length update
CONFIG_BT_USER_DATA_LEN_UPDATE=y
CONFIG_BT_BUF_ACL_TX_COUNT=10
CONFIG_BT_ATT_PREPARE_COUNT=2

# Enable CTS client
CONFIG_BT_CTS_CLIENT=y

# Below is setup to let DIS information be read from settings
CONFIG_BT_SETTINGS=y
CONFIG_SETTINGS_RUNTIME=y
CONFIG_SETTINGS=y

#Enable MCUBOOT bootloader build in the application
CONFIG_BOOTLOADER_MCUBOOT=y
#Include MCUMGR and the dependencies in the build
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU=y
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU_SPEEDUP=y
CONFIG_NCS_SAMPLE_MCUMGR_BT_OTA_DFU_VALIDATION=y

# Set stack and heap sizes
CONFIG_MAIN_STACK_SIZE=2048
CONFIG_HEAP_MEM_POOL_SIZE=2048

# Allow flash
CONFIG_FLASH=y
CONFIG_FLASH_PAGE_LAYOUT=y
CONFIG_FLASH_MAP=y
CONFIG_NVS=y
CONFIG_MPU_ALLOW_FLASH_WRITE=y

# Enable Power Management
CONFIG_POWEROFF=y

# Required to disable default behavior of deep sleep on timeout
CONFIG_PM_DEVICE=y

# Enable I2C
CONFIG_I2C=y
CONFIG_COUNTER=y
CONFIG_PCF85063A=y

#cJSON
CONFIG_CJSON_LIB=y
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y

# Sensors
CONFIG_SENSOR=y

# CONFIG below here set up to minimise flash usage

# Disable features not needed
CONFIG_TIMESLICING=n
CONFIG_ASSERT=n

# Disable Bluetooth features not needed
CONFIG_BT_DEBUG_NONE=y
CONFIG_BT_ASSERT=n
CONFIG_BT_GATT_CACHING=n
CONFIG_BT_SETTINGS_CCC_LAZY_LOADING=y
CONFIG_BT_HCI_VS_EXT=n

# Disable Bluetooth controller features not needed
CONFIG_BT_CTLR_PRIVACY=n
CONFIG_BT_CTLR_PHY_2M=n

# Reduce Bluetooth buffers
CONFIG_BT_BUF_EVT_DISCARDABLE_COUNT=1
CONFIG_BT_BUF_EVT_RX_COUNT=2

# Drivers and peripherals
CONFIG_WATCHDOG=n
CONFIG_SPI=n

# Interrupts
CONFIG_DYNAMIC_INTERRUPTS=n

# Memory protection
CONFIG_THREAD_STACK_INFO=n
CONFIG_THREAD_CUSTOM_DATA=n
CONFIG_FPU=n

# Console
CONFIG_EARLY_CONSOLE=n

# Build
CONFIG_SIZE_OPTIMIZATIONS=y

Can anyone assist me in resolving this issue?  We need to roll out the updated version of firmware, but can't until we can ensure a smooth upgrade process through our App.

Regards,

Mike

  • Hi Vidar,

    OK, checked the Attributes Table order between the two versions of firmware built around the two versions of NCS, and they are exactly the same.

    So, something seems to have changed in how the SMP service is set up.

    To try and get around this in the short term (we have product that customers are unable to update the firmware for, so I'm trying to get something that will work), I did the following:

    1. Went back to NCS v2.2.0/Zephyr v3.2.99
    2. Took the version of firmware that used to work, which was v1.7.  This successfully upgrades from v1.6 via our custom app.
    3. Made the very minor changes that I need to make to sort out the bug we have uncovered
    4. Set that as v1.9 (v1.8 is the one built around NCS v2.6.0 that we have been having issues with) and did a pristine build, using NCS v2.2.0
    5. Attempted to do the upgrade via the Device Manager

    Testing using Device Manager, we are now seeing something similar to before, even though we aren't changing the version of NCS/Zephyr now!  The steps are the same, and the failure occurs at the same point - the CONFIRM step - but the error is now different and is unrecoverable without rolling back to the previous version of firmware on the device.

    These are the details:

    1. At the CONFIRM step with our latest firmware version, attempting to Read the Image data returns the error "Encryption is insufficient".
    2. Attempting to Read the Image data again changes the error to "Connection failed".
    3. After ‘Forgetting the device’, turning Bluetooth Off and On again on the phone, and attempting to Read the Image data again, you get prompted to Pair to the device.
    4. The error returned at this stage is "Authentication is insufficient".
    5. Device Manager thinks it is disconnected, but the phone thinks it is connected.
    6. After turning Bluetooth Off and On on the phone and attempting to Read the Image data again, the error is back to "Encryption is insufficient" and then "Connection failed".


    Basically at this point it is not possible to connect to the SMP characteristic to CONFIRM the image and on the next device reboot it rolls back to the previous firmware version.

    For all previous firmware, and for this firmware also, the SMP Characteristic has been set to have Encryption without Authentication.

    We're a bit stuck at this stage, as we basically can't release product because we can't upgrade the firmware.

    Regards,

    Mike

  • Hi Mike,

    Is it possible to enable logging in the new version to see if any errors are reported, e.g., if the link fails to be secured? You can also use the bt_foreach_bond() function to find out how many bonds are stored, to check that bonds are not being erased during the DFU process.
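
    A minimal sketch of a logging setup for this check, assuming the symbol names used by NCS v2.6.0 (Zephyr v3.5.99). CONFIG_BT_SMP_LOG_LEVEL_DBG in particular is an assumption worth verifying against the generated .config. Note that the posted configuration has CONFIG_LOG_DEFAULT_LEVEL=0, which turns log output off by default.

    ```conf
    # Keep the logging subsystem enabled
    CONFIG_LOG=y
    # 0 = off, 1 = error, 2 = warning, 3 = info, 4 = debug
    CONFIG_LOG_DEFAULT_LEVEL=3
    # Assumed symbol: debug-level output from the Bluetooth Security Manager (pairing/encryption)
    CONFIG_BT_SMP_LOG_LEVEL_DBG=y
    ```

    Debug logging increases flash and RAM use, so on a size-constrained build it may only fit in a test image.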

    Regards,

    Vidar

  • Sorry for not noticing this earlier, but this setting will erase the settings partition when you do DFU:

    Do you experience the same if you turn it off? I have requested that we make this disabled by default.

  • Hi Vidar,

    We tried turning that setting off, but it had no effect on our ability to do a DFU OTA.

    We did some logging from within the nRF Connect App to see if we could get some further information.  We currently have 3 versions of firmware we are using for testing:

    V1.6 - this is what is currently in field deployed devices, and is built on NCS v2.2.0

    V1.7 - this is an upgrade that we were planning to release, but when we tested it noticed it had a few bugs we hadn't uncovered in V1.6.  This is also built on NCS V2.2.0

    V1.8 - this is the newest version, that fixes the bugs in V1.7, but is built around NCS V2.6.0

    We tried going from V1.6 -> V1.7 -> V1.8. In each case we had nRF Connect set to "Test and Confirm".

    Upgrade from V1.6 (NCS V2.2.0) to V1.7 (NCS V2.2.0)

    Found valid Firmware in file:///private/var/mobile/Containers/Data/Application/C16C782E-248A-43B7-98F8-B64C510541BD/Documents/6_update_1.7.bin for Device DFU McuMgr.
    
    Upgrade started with 1 image(s) using 'Test and Confirm' mode
    
    Firmware Upgrade Started.
    
    State changed from none to requestMcuMgrParameters
    
    Peripheral connected
    
    Device ready
    
    Mcu Manager parameters received (4 x 2475)
    
    State changed from requestMcuMgrParameters to bootloaderInfo
    
    Bootloader info not supported
    
    State changed from bootloaderInfo to validate
    
    Image List response: Header: {"version": "0", "op": "1", "flags": 0, "length": 134, "group": 1, "seqNum": 130, "commandId": 0}, Payload: {"images" : {{"bootable" : true, "version" : "0.0.0", "hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "pending" : false, "active" : true, "permanent" : false, "confirmed" : true, "slot" : 0}}, "splitStatus" : 0}
    
    Scheduling upload (hash: 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079) for image 0 (slot: 1)
    
    State changed from validate to upload
    
    No remaining chunks to be sent? chunkOffset: 203620, imageData: 203620.
    
    No remaining chunks to be sent? chunkOffset: 203620, imageData: 203620.
    
    Upload finished (1 of 1)
    
    State changed from upload to test
    
    Image Test response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 215, "commandId": 0}, Payload: {"images" : {{"hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "confirmed" : true, "active" : true, "slot" : 0, "pending" : false, "version" : "0.0.0", "permanent" : false, "bootable" : true}, {"confirmed" : false, "version" : "0.0.0", "bootable" : true, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "pending" : true, "slot" : 1, "active" : false, "permanent" : false}}, "splitStatus" : 0}
    
    State changed from test to reset
    
    Reset request confirmed
    
    Disconnected.
    
    Peripheral disconnected
    
    Device has disconnected
    
    Waiting 10 seconds reconnecting...
    
    Reconnecting...
    
    Reconnect deferred
    
    State changed from reset to confirm
    
    Peripheral connected
    
    Device ready
    
    Image Confirm response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 216, "commandId": 0}, Payload: {"splitStatus" : 0, "images" : {{"hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "bootable" : true, "pending" : false, "permanent" : false, "active" : true, "version" : "0.0.0", "slot" : 0, "confirmed" : true}, {"hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45, "bootable" : true, "permanent" : false, "slot" : 1, "confirmed" : false, "pending" : false, "version" : "0.0.0", "active" : false}}}
    
    Upgrade complete
    
    State changed from confirm to success
    
    Success!
    

    Upgrade from V1.7 (NCS V2.2.0) to V1.8 (NCS V2.6.0)

    Upgrade started with 1 image(s) using 'Test and Confirm' mode
    
    Firmware Upgrade Started.
    
    State changed from none to requestMcuMgrParameters
    
    Peripheral connected
    
    Device ready
    
    Mcu Manager parameters received (4 x 2475)
    
    State changed from requestMcuMgrParameters to bootloaderInfo
    
    Bootloader info not supported
    
    State changed from bootloaderInfo to validate
    
    Image List response: Header: {"version": "0", "op": "1", "flags": 0, "length": 244, "group": 1, "seqNum": 116, "commandId": 0}, Payload: {"images" : {{"permanent" : false, "confirmed" : true, "bootable" : true, "pending" : false, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "version" : "0.0.0", "active" : true, "slot" : 0}, {"pending" : false, "slot" : 1, "bootable" : true, "version" : "0.0.0", "active" : false, "permanent" : false, "confirmed" : false, "hash" : 0xC25B88AA20BEC2CC7E5BE0415A93344EF27AC17E641194893AAC3D88BFDE0A45}}, "splitStatus" : 0}
    
    Secondary slot of image 0 will be overwritten
    
    Scheduling upload (hash: 0x79834FF4F72A0934C370B232E754CD1FF56971FCCB565A7593A4B3A88EE487CA) for image 0 (slot: 1)
    
    State changed from validate to upload
    
    Retry 1 for seq: 159
    
    Retry 1 for seq: 181
    
    No remaining chunks to be sent? chunkOffset: 207260, imageData: 207260.
    
    No remaining chunks to be sent? chunkOffset: 207260, imageData: 207260.
    
    Upload finished (1 of 1)
    
    State changed from upload to test
    
    Image Test response: Header: {"version": "0", "op": "3", "flags": 0, "length": 244, "group": 1, "seqNum": 208, "commandId": 0}, Payload: {"splitStatus" : 0, "images" : {{"pending" : false, "version" : "0.0.0", "confirmed" : true, "hash" : 0x2A78A535D1BAB0A61ED01DEC2872F2CB60C6B15DE7A26AF0C2E3DD2212FB4079, "slot" : 0, "permanent" : false, "active" : true, "bootable" : true}, {"confirmed" : false, "hash" : 0x79834FF4F72A0934C370B232E754CD1FF56971FCCB565A7593A4B3A88EE487CA, "bootable" : true, "slot" : 1, "active" : false, "version" : "0.0.0", "pending" : true, "permanent" : false}}}
    
    State changed from test to reset
    
    Reset request confirmed
    
    Peripheral disconnected
    
    Device has disconnected
    
    Waiting 10 seconds reconnecting...
    
    Disconnected.
    
    Reconnecting...
    
    Reconnect deferred
    
    State changed from reset to confirm
    
    Peripheral connected
    
    The handle is invalid.
    
    Request (SMPv1, group: image, seq: 209, command: state) failed: The handle is invalid.
    
    The handle is invalid.
    
    DFU failed: The handle is invalid.
    
    DFU Failed with Error: The handle is invalid.
    

    A colleague stumbled across this ticket, which seems to indicate that with CONFIG_MCUMGR_SMP_BT_AUTHEN=y you need both encryption and authentication.  Interestingly, in our factory code, we have this set to "n", but in our operational code (in both v1.6 and v1.7) we have this set to "y", yet the upgrade from v1.6 to v1.7 works as long as the device is paired, which sort of doesn't make sense.

    (Note - we don't have any display options for passkeys on our device, so can't authenticate)

    In NCS v2.6.0, the config setting CONFIG_MCUMGR_SMP_BT_AUTHEN doesn't seem to be valid, and I can't find anywhere in the corresponding smp_bt.c file where the SMP service is declared, so I can't see how it's configured in terms of encryption/authentication.

    As we're sending an unencrypted file, it seems we need the encrypted connection but as we don't have any means of entering or displaying passkeys, we can't have the requirement for authentication.

    Is there a way to have BT_GATT_PERM_WRITE_ENCRYPT & BT_GATT_PERM_READ_ENCRYPT permission setting on the SMP service within v2.2.0 and v2.6.0 without needing to "hack" the Zephyr files?

    Regards,

    Mike

  • Hi Mike,

    The 'erase settings' option will cause the bonding information to be deleted, so I think this may explain some of the problems you've been experiencing, at least. 

    CONFIG_MCUMGR_SMP_BT_AUTHEN was renamed to CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN in Zephyr v3.3 (https://github.com/nrfconnect/sdk-zephyr/blob/main/doc/releases/release-notes-3.3.rst), and is used in the service declaration here: https://github.com/nrfconnect/sdk-zephyr/blob/db34adf9c3d856b15f9662e034e44727aa16de47/subsys/mgmt/mcumgr/transport/src/smp_bt.c#L355
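
    As a sketch of the rename in prj.conf terms, each build carries the spelling its Zephyr version understands. Setting the option to n here is only an illustration for a device with no passkey display or entry; it removes the authentication requirement rather than selecting an encryption-only level.

    ```conf
    # NCS v2.2.0 (Zephyr v3.2.99) build: old symbol name
    CONFIG_MCUMGR_SMP_BT_AUTHEN=n

    # NCS v2.6.0 (Zephyr v3.5.99) build: renamed in Zephyr v3.3
    CONFIG_MCUMGR_TRANSPORT_BT_AUTHEN=n
    ```

    Only one of the two lines belongs in a given build; the generated .config shows which value actually took effect.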

    Mike Austin (LPI) said:
    Interestingly, in our factory code, we have this set to "n", but in our operational code (in both v1.6 and v1.7) we have this set to "y", yet the upgrade from v1.6 to v1.7

    That is strange. Did you verify that the symbol ended up being selected in the generated .config file?

    Mike Austin (LPI) said:

    As we're sending an unencrypted file, it seems we need the encrypted connection but as we don't have any means of entering or displaying passkeys, we can't have the requirement for authentication.

    Is there a way to have BT_GATT_PERM_WRITE_ENCRYPT & BT_GATT_PERM_READ_ENCRYPT permission setting on the SMP service within v2.2.0 and v2.6.0 without needing to "hack" the Zephyr files?

    Authentication limits who can initiate DFU (typically those who have physical access to the device), but I don't think encryption itself offers much added security as the update binary is not encrypted in the app.

    You can change the security level to require encryption only, but this requires the SMP service to be modified, as there is no Kconfig symbol that enables encryption without authentication.

    https://github.com/nrfconnect/sdk-zephyr/blob/db34adf9c3d856b15f9662e034e44727aa16de47/subsys/mgmt/mcumgr/transport/src/smp_bt.c#L355 

    Regards,

    Vidar
