DFU issue - bt_mesh_cli: dropping

Dear,

I’m experiencing an issue with performing DFU over BLE Mesh. I have a setup with one node acting as both provisioner and distributor. This node has a new image loaded (with a modified advertising name and an updated imgtool_sign_version). Additionally, there is one target node (node 3) added to the BLE Mesh network.

I’ve uploaded the image to the distributor, added a slot, and registered the receiver. However, when I start the distribution and check the status, I receive a response indicating phase 10 and status 0.

This setup used to work occasionally when using SDK v2.6.2. However, after migrating to SDK v2.7.0, the issue consistently appears.

Do you have any idea what might be causing this?

Here is the SEGGER RTT output:

And commands from app:

I have also tried with image index 1, but with no success.

Any ideas or input?

Thank you 

Best regards,
Matej

  • Hi Amanda,

    Thanks for the suggestion. Upgrading to NCS v2.9.1 isn't a trivial task for us at this point. We've already deployed and validated our current setup (NCS v2.7.0) across over 100 nodes in various locations. Repeating the entire validation process would be quite resource-intensive.

    Is there any workaround or patch that could address the DFU issue within our existing NCS v2.7.0 setup? If not, any guidance on minimizing the impact of migrating to a newer SDK version would be appreciated.

    Best regards,
    Matej

  • Phase 10 with status 0 means the target node didn’t respond to the Firmware Update Start message, which prevents the DFU from proceeding. The error codes provided suggest there is an internal error on the target node. Could you provide logs from the target side to provide more detail?

    Other things to try: extend timeout_base, and make sure that the imgtool_sign_version is strictly higher than the one already present on the target. Lastly, make sure the target is correctly provisioned and configured for DFU.
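
    The "strictly higher" version check can be sketched in plain C as follows. This is only an illustration, not SDK code: the struct and the img_version_* helpers are hypothetical names, the string format assumed is the "major.minor.revision+build" form passed to imgtool, and treating the build number as the final tiebreaker is an assumption that may not match how your bootloader or metadata check compares versions.

    ```c
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical helper type mirroring "major.minor.revision+build". */
    struct img_version {
        uint8_t major;
        uint8_t minor;
        uint16_t revision;
        uint32_t build;
    };

    /* Parse "major.minor.revision" with an optional "+build" suffix.
     * Returns 0 on success, -1 if fewer than three fields are present
     * or a field is out of range. Trailing characters are not checked.
     */
    static int img_version_parse(const char *s, struct img_version *out)
    {
        unsigned int major, minor, revision, build = 0;
        int fields = sscanf(s, "%u.%u.%u+%u", &major, &minor, &revision, &build);

        if (fields < 3 || major > UINT8_MAX || minor > UINT8_MAX ||
            revision > UINT16_MAX) {
            return -1;
        }

        out->major = (uint8_t)major;
        out->minor = (uint8_t)minor;
        out->revision = (uint16_t)revision;
        out->build = build;
        return 0;
    }

    /* Returns a negative value, 0, or a positive value if a is older than,
     * equal to, or newer than b. The build number is compared last
     * (an assumption, see the note above).
     */
    static int img_version_cmp(const struct img_version *a,
                               const struct img_version *b)
    {
        if (a->major != b->major) {
            return a->major < b->major ? -1 : 1;
        }
        if (a->minor != b->minor) {
            return a->minor < b->minor ? -1 : 1;
        }
        if (a->revision != b->revision) {
            return a->revision < b->revision ? -1 : 1;
        }
        if (a->build != b->build) {
            return a->build < b->build ? -1 : 1;
        }
        return 0;
    }

    /* True only when the candidate is strictly newer than the installed image. */
    static bool img_version_is_newer(const struct img_version *candidate,
                                     const struct img_version *installed)
    {
        return img_version_cmp(candidate, installed) > 0;
    }
    ```

    For example, parsing the version of the new image and the version running on the target, then calling img_version_is_newer(&new_image, &running), tells you whether an equal or lower version could be the reason for a rejection.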

  • Hi,

    I have added additional logging and noticed that I'm receiving the warning <wrn> bt_mesh_dfu_srv: Wrong state4

    void bt_mesh_dfu_srv_applied(struct bt_mesh_dfu_srv *srv)
    {
    	if (srv->update.phase != BT_MESH_DFU_PHASE_APPLYING) {
    		LOG_WRN("Wrong state4");
    		return;
    	}
    
    	LOG_DBG("");
    
    	srv->update.phase = BT_MESH_DFU_PHASE_IDLE;
    	store_state(srv);
    }

    I have prepared a separate command to initialize DFU. To ensure that the mesh is already initialized, I wait until a few messages have been sent, and only then send the command for DFU initialization.

    The function above is called from dfu_target_image_confirm, as the last step of the DFU initialization routine.

    Any idea?

    Thanks,

    BR,
    Matej

  • It would be helpful if you could provide the actual debug logs. The bt_mesh_dfu_srv_applied function is called on every boot, so this warning does not necessarily indicate anything is wrong.

    In your original screenshot, the reason is indicated as "9" (BT_MESH_BLOB_ERR_INTERNAL) by the BLOB client. This reason is reported directly by the BLOB server. So, we should check the BLOB server logs as well.  

    The transfer is orchestrated by the FU server. If something is wrong with that process, the BLOB server can return this error. It looks like the transfer is breaking in the very first phase.
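
    To make numeric reasons such as "9" easier to read in logs, a small lookup helper can be used. The sketch below is standalone C, not an SDK function: blob_status_str is a hypothetical name, and apart from 9 (internal error, as stated above) the table is based on my reading of the BLOB Transfer model status codes, so please verify it against the BLOB header in your SDK version.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Status names indexed by the numeric code seen in the logs.
     * Assumption: the order follows the BLOB Transfer model status codes;
     * only code 9 is confirmed by this thread.
     */
    static const char *const blob_status_names[] = {
        "Success",                   /* 0 */
        "Invalid Block Number",      /* 1 */
        "Invalid Block Size",        /* 2 */
        "Invalid Chunk Size",        /* 3 */
        "Wrong Phase",               /* 4 */
        "Invalid Parameter",         /* 5 */
        "Wrong BLOB ID",             /* 6 */
        "BLOB Too Large",            /* 7 */
        "Unsupported Transfer Mode", /* 8 */
        "Internal Error",            /* 9 */
        "Information Unavailable",   /* 10 */
    };

    /* Hypothetical helper: map a numeric BLOB status to a readable string. */
    static const char *blob_status_str(uint8_t status)
    {
        size_t count = sizeof(blob_status_names) / sizeof(blob_status_names[0]);

        if (status >= count) {
            return "Unknown";
        }
        return blob_status_names[status];
    }
    ```

    Printing blob_status_str(reason) next to the raw number in your own log statements makes it quicker to tell an internal error apart from, for instance, a wrong-phase rejection.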

    I would suggest: 

    1. Add "CONFIG_BT_MESH_MODEL_LOG_LEVEL_DBG=y" and "CONFIG_LOG_BUFFER_SIZE=2048" to both the distributor and the target firmware. Start the process, collect logs on both sides, and share them with us along with the ".config" file for each.
    2. I strongly recommend trying this with the latest SDK revision first, following the instructions from the target and distributor samples, and then attempting the same DFU procedure on your existing nodes.
    3. If anything goes wrong during the transfer, you must issue "cancel" to all participating nodes to bring them back to the same state, or reboot the devices.

  • Hi Amanda,

    Thank you for your inputs. I've successfully migrated to NCS 2.9.1, and most functionalities appear to be working as expected. However, I'm still encountering issues with DFU.

    I tested the sample included in NCS without any code modifications, and I was able to perform a firmware update through the Device Manager without any problems.

    In my custom project, I have enabled the suggested logging and am seeing the following output; unfortunately, it contains nothing particularly useful or concrete. Apologies for not sharing the full logs and .config file at this stage, as I’d need to filter out a significant amount of content related to customer

    00> Starting the firmware distribution.
    00> Slot:
    00>   Size:     430487 bytes
    00>   FWID:     0304050001000000
    00>   Metadata: 030405000100000097910601844419620200
    00> D: Distribution Start: slot: 0, appidx: 0, tb: 0, addr: 0000, ttl: 255, apply: 1
    00> Distribution phase changed to Transfer Active
    00> D: 
    00> D: 1 targets
    00> D: 5
    00> D: 4
    00> D: 3
    00> D: 2
    00> D: 1
    00> D: Transfer timed out.
    00> D: 
    00> W: Dropping 0x0003: 9
    00> E: Target 0x0003 failed: 3
    00> D: continuing
    00> D: 
    00> D: 3
    00> D: reason: 3, phase: 1, apply: 1
    00> Distribution phase changed to Failed

    Meanwhile, on the target node, nothing DFU-related appears in the RTT terminal.

    We're exploring the possibility of using the same firmware image for both the distributor and target nodes to simplify field deployment. I suspect this approach might be contributing to the issue.

    Is it actually supported to have both DFU client and server roles enabled in a single firmware image? If so, do you have an example project or configuration that demonstrates this setup?

    Also, could you clarify what the minimal configuration and required function calls are for both the distributor and target nodes?

    Thank you again for your support.

    Best regards,
    Matej

  • Matej said:
    Is it actually supported to have both DFU client and server roles enabled in a single firmware image? If so, do you have an example project or configuration that demonstrates this setup?

    Yes, please refer to Bluetooth Mesh: Device Firmware Update (DFU) distributor and see the Self-update section.

    Matej said:
    could you clarify what the minimal configuration and required function calls are for both the distributor and target nodes?

    You need to enable the required models in your project configuration:

    For the Distributor (DFU client):

    CONFIG_BT_MESH_DFD_SRV (Firmware Distribution Server)
    CONFIG_BT_MESH_DFU_CLI (Firmware Update Client)
    CONFIG_BT_MESH_BLOB_SRV (BLOB Transfer Server)
    CONFIG_BT_MESH_BLOB_CLI (BLOB Transfer Client)

    For the Target (DFU server):

    CONFIG_BT_MESH_DFU_SRV (Firmware Update Server)
    CONFIG_BT_MESH_BLOB_SRV (BLOB Transfer Server)

    The required function calls are demonstrated in the Bluetooth Mesh: Device Firmware Update (DFU) distributor sample.
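
    As a rough sketch, a single image that carries both roles would combine the two lists above in its prj.conf. This fragment only covers the model options named in this reply and is not a complete configuration: it assumes the rest of the mesh, settings, and bootloader configuration is already in place as in the samples, and the sample's own prj.conf remains the reference.

    ```
    # Distributor role
    CONFIG_BT_MESH_DFD_SRV=y
    CONFIG_BT_MESH_DFU_CLI=y
    CONFIG_BT_MESH_BLOB_SRV=y
    CONFIG_BT_MESH_BLOB_CLI=y

    # Target role (the BLOB Transfer Server is already enabled above)
    CONFIG_BT_MESH_DFU_SRV=y
    ```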
