Mesh DFU, Size of the network, Location of the nodes, Timing.

Hi,

I have been working on DFU over Mesh for quite a while now. I have run a few tests, but never on a large network of 100+ nodes, so I have some questions about DFU over large networks.

Assume this situation:

Case 1: I have 100 DFU-capable nodes inside one room. Using my PC, I perform a DFU, which takes about 80 minutes to push the FW image to the node.

Question 1: What happens if the node that the PC pushes the FW image to is disconnected after all the packets have been pushed onto the mesh, but before all the nodes have been updated?

                  Do all the nodes still get updated, or does the DFU fail immediately because that node was disconnected before all the nodes were updated?

Case 2: I have 100 DFU-capable nodes spread over multiple rooms, floors, etc. Using my PC, I perform a DFU.

Question 2: Does the time taken to push FW image to the node change? If it changes, will it increase or decrease?

                   Does the distance between the node to which the FW image is pushed and the node far from it have any effect on the time required to push the DFU?

Note: Every "node" shown in bold in this post refers to the same node, the one to which the FW image is being pushed.

Thank you.

  • Hi, 

    The way mesh DFU works is that the image is split into multiple segments, and each segment is propagated throughout the network. When a node detects that the segment index it has is behind the current one, it can send DFU data request packets. Other nodes that have the segment then retransmit it (DFU data response). Please read more about the process here.

    The DFU master moves to the next segment every 500 ms. The process continues until the full image has been propagated. Then the DFU master triggers the command to switch image.
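
    The request/response recovery described above can be sketched as a toy model. This is only an illustration, not the SDK's implementation: the `Node` class, the `recover` helper, the 0-based segment indices and the payloads are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a mesh node receiving DFU segments (hypothetical, not SDK code)."""
    name: str
    segments: dict = field(default_factory=dict)  # segment index -> payload

    def missing_before(self, index):
        """Indices below `index` that this node has not stored yet."""
        return [i for i in range(index) if i not in self.segments]

    def on_data(self, index, payload):
        """Store a segment and return the indices to request from other nodes."""
        self.segments[index] = payload
        return self.missing_before(index)

    def on_request(self, index):
        """Answer a data request if this node holds the segment, else None."""
        return self.segments.get(index)


def recover(node, neighbours, requests):
    """Resolve each data request from the first neighbour that holds the segment."""
    for index in requests:
        for other in neighbours:
            payload = other.on_request(index)
            if payload is not None:
                node.segments[index] = payload
                break


relay = Node("relay", {0: b"aa", 1: b"bb", 2: b"cc"})
late = Node("late", {0: b"aa"})

# The late node hears segment 2 and notices that segment 1 is missing.
wanted = late.on_data(2, b"cc")
recover(late, [relay], wanted)
```

    In this model the missing segment is supplied by an ordinary node that already holds it, so the DFU master plays no part in the recovery.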

    Regarding your questions:

    1. If the node resets and rejoins the network after the image has been transferred and the DFU has finished, it won't receive the new image and will have to stay on the old one. You can trigger a new DFU from the DFU master; nodes that already have the latest image will not store it again but will still relay it.

    If the node resets and rejoins the network in the middle of the image transfer, it still can't resume the process, as we don't have a mechanism to resume DFU yet. But I believe it should be possible to modify the code to add this feature.

    If the node doesn't reset and simply gets disconnected from the network, it should be able to resume the DFU, provided it gets back on the network before the DFU process has finished.

    So, basically, if a node resets in the middle of a DFU transfer, it won't get the image.

    2. The DFU master doesn't wait until all the nodes have received the image or segment. Instead, it relies on redundant retransmission and the long interval per segment (500 ms) before moving to the next segment. So the time it takes to push an image is defined by the size of the image and the interval per segment. Your job is to choose that interval so that all nodes in the network receive the image. Please read more about transfer rate here.

    The retransmission of a missing packet to a node is handled by relay nodes (normal nodes); the DFU master isn't involved in this.
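
    As a rough worked example of that timing rule, the sketch below multiplies the number of segments by the interval per segment. The 16-byte segment size and the 150 KiB image size are assumptions made for illustration (check the SDK documentation for the real segment size), and `estimate_transfer_seconds` is a hypothetical helper.

```python
import math


def estimate_transfer_seconds(image_bytes, segment_bytes=16, interval_s=0.5):
    """Estimate the push time as (number of segments) x (interval per segment).

    segment_bytes=16 is an assumed segment size; interval_s=0.5 is the
    500 ms interval mentioned in the reply above.
    """
    segments = math.ceil(image_bytes / segment_bytes)
    return segments * interval_s


# Hypothetical 150 KiB image, chosen only to illustrate the arithmetic.
seconds = estimate_transfer_seconds(150 * 1024)
minutes = seconds / 60
```

    Under these assumptions the estimate comes to 9600 segments and 80 minutes. The number of nodes does not appear in the formula, because the master does not wait for individual nodes.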

  • Hi,

    So, the DFU master can disconnect after transmitting the new FW image and triggering the command to switch image without waiting for every node in the network to update. Is that right?

    I am asking this question with reference to what was discussed in this post.

    The idea put forward there was to do a mesh DFU through the phone, which isn't implemented yet. I am curious whether it will be implemented in the future.

    It was said that, in theory the phone can get disconnected after the image is pushed. Does it apply to the DFU master also? 

    If I disconnect my DFU master after transmitting the new FW image and triggering the command to switch image, does it affect the nodes that are requesting and receiving the missing data packets?

    Do the nodes accept the command to switch image only after all the packets have been received, or do they hold on to the command (if they get it beforehand) until they have received all the packets and then execute it?

  • Note that most of the time in a DFU update is spent propagating the image's segments through the network, not switching the image.
    So disconnecting the phone/DFU master after the image has been transferred doesn't gain much, because switching the image doesn't take long.

    Anyway, updating a mesh image from a phone is not supported yet. We currently only have a UART transport layer. But it is of course possible to modify the code to receive the image over BLE.

    What you can do to free the phone/DFU master from being occupied for a long time is to write firmware for a node that can receive the image directly (in a short period) and then start propagating the image like a DFU master. This way, the phone/DFU master only needs to be present for less than a minute to send the image and can then leave. We currently don't have firmware for this.

    There will be a DFU standard from the Bluetooth SIG, so I don't think we will make any major changes to our DFU protocol until that new standard is available.

    A node won't accept the command to switch image if it hasn't received the whole image (and verified it with the signature).
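
    That acceptance rule can be sketched as a simple gate. This is a minimal illustration under stated assumptions, not the bootloader's code: `accept_switch` and `image_is_complete` are hypothetical names, and a SHA-256 digest comparison stands in for the real signature verification.

```python
import hashlib


def image_is_complete(segments, total_segments):
    """True only if every segment index from 0 to total_segments - 1 is stored."""
    return all(i in segments for i in range(total_segments))


def accept_switch(segments, total_segments, expected_digest):
    """Accept the switch command only for a complete image that passes the check.

    The digest comparison is a stand-in for signature verification.
    """
    if not image_is_complete(segments, total_segments):
        return False
    image = b"".join(segments[i] for i in range(total_segments))
    return hashlib.sha256(image).hexdigest() == expected_digest


digest = hashlib.sha256(b"aabbcc").hexdigest()
full = {0: b"aa", 1: b"bb", 2: b"cc"}
partial = {0: b"aa", 2: b"cc"}          # segment 1 never arrived
corrupt = {0: b"aa", 1: b"xx", 2: b"cc"}  # complete, but fails the check

results = (
    accept_switch(full, 3, digest),
    accept_switch(partial, 3, digest),
    accept_switch(corrupt, 3, digest),
)
```

    Only the complete, verified image passes the gate; the partial and corrupted images are both rejected.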
