
Mesh Bootloader Data Request Handling Issue

I've been testing Mesh DFU and have noticed some unexpected behavior in how missing data segments are handled.

As I understand the Mesh DFU protocol, if a data segment is found to be missing, the nRF chip is supposed to repeatedly broadcast a Data Request for that segment. Upon receiving the corresponding Data Response, the system should stop broadcasting the previous Data Request and start broadcasting a Data Request for the next missing segment, continuing until all missing segments have been received.

However, when the Data Response is handled, the system doesn't broadcast the next Data Request. I tracked down why and found the function that sets up the Data Requests. It has a conditional that won't send out a Data Request unless the requested segment is older than the received segment (i.e. requested segment number < received segment number). See: github.com/.../dfu_mesh.c. This, of course, poses a problem when handling Data Responses. Missing data segments are requested in ascending order, so the segment number received in a Data Response will always be lower than the next segment to be requested, which fails the conditional. So whenever a segment arrives via a Data Response, the system won't immediately request the next missing segment.
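To make the failure case concrete, here is a minimal model of the conditional as I understand it. The function name and parameters are made up for illustration and are not copied from dfu_mesh.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the check described above, NOT the SDK code:
 * a Data Request goes out only if the segment we want to request is
 * older (lower-numbered) than the segment we just received. */
bool request_allowed(uint16_t requested_segment, uint16_t received_segment)
{
    return requested_segment < received_segment;
}
```

With this model, a regular Data packet for segment 10 lets a request for missing segment 7 through (7 < 10). After the Data Response for segment 7 arrives, the next missing segment, say 9, is blocked because 9 < 7 is false.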

This behavior doesn't seem intuitive to me. Is it intentional? Would there be any issues if the system moved directly on to the next Data Request after handling a Data Response?

I've found this on nRF5-SDK-for-Mesh v3.2.0 and v3.1.0. I haven't checked any other versions.

PS: It's worth mentioning that the next missing segment does get requested once a new regular Data packet is handled. But if the system still has a few missing packets after receiving the last segment, it will never request them, since the only incoming packets at that point are Data Responses. This effectively causes the Mesh DFU to stall and eventually time out. That is pretty painful, especially since Mesh DFU is slow and retries cost a lot of time.

  • Hi. 

    Some information from our developers: 
    You can find the changes for the behavior here: 
    https://github.com/NordicPlayground/nRF51-ble-bcast-mesh/commit/ed93bfb2619d1e272fc3866dbe15fda03d8fbe67 

    This change was made a long time ago, but one of the reasons could be to reduce flooding in the network. One suggestion is to experiment with REQ_RX_COUNT_RETRY and see if the DFU still stalls.

    Another possible issue is that the DFU stalls at the end of the process: once the target has received the last segment, it will never receive a segment with a higher number than the missing segment it last requested. We will investigate further to check whether this actually happens so that we can provide a patch for it.
    You could also check this yourself, for example by testing whether the target has received the last segment (whose number is equal to m_transaction.segment_count) and, if so, unconditionally requesting all the remaining missing packets.

    Best regards, 
    Joakim Jakobsen
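
The check suggested in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: transfer_state_t and on_segment_received are hypothetical names, and only segment_count mirrors a field mentioned in the reply (m_transaction.segment_count). It is not the SDK implementation or an official patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bootloader's transfer state. */
typedef struct
{
    uint16_t segment_count;         /* number of the last segment in the transfer */
    bool     last_segment_received; /* set once the last segment has been seen */
} transfer_state_t;

/* Returns true if a Data Request for next_missing_segment should be sent
 * after received_segment has been handled. */
bool on_segment_received(transfer_state_t *state,
                         uint16_t received_segment,
                         uint16_t next_missing_segment)
{
    if (received_segment == state->segment_count)
    {
        state->last_segment_received = true;
    }

    /* After the last segment, no higher-numbered Data packet will arrive,
     * so request the remaining missing segments unconditionally. */
    if (state->last_segment_received)
    {
        return true;
    }

    /* Otherwise keep the original rule: only request older segments. */
    return next_missing_segment < received_segment;
}
```

The original rule stays in force for most of the transfer, which should preserve whatever flooding reduction it was meant to provide, and it is relaxed only once the last segment has arrived.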
