
Mesh Bootloader Data Request Handling Issue

I've been doing some testing with Mesh DFU and I've noticed some unexpected behavior around handling missing data segments.

As I understand the Mesh DFU protocol, when a data segment is found to be missing, the nRF chip is supposed to broadcast a Data Request for that segment continuously. Upon receiving the corresponding Data Response, it should stop that broadcast and start broadcasting a Data Request for the next missing segment, repeating until all missing segments have been received.
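To make the expected behavior concrete, here is a minimal sketch of the request loop as I understand it. The names (`first_missing`, `on_data_response`, `NO_SEGMENT`) and the 0-based bookkeeping array are hypothetical and are not taken from the SDK.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "nothing left to request". */
#define NO_SEGMENT ((size_t)-1)

/* Return the lowest-numbered segment that has not been received yet,
 * or NO_SEGMENT when the transfer is complete. */
static size_t first_missing(const bool *received, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (!received[i]) {
            return i;
        }
    }
    return NO_SEGMENT;
}

/* Expected handling of a Data Response: record the segment, then report
 * which segment the next Data Request should ask for. */
static size_t on_data_response(bool *received, size_t count, size_t segment)
{
    if (segment < count) {
        received[segment] = true;
    }
    return first_missing(received, count);
}
```

With segments 1 and 3 missing out of five, handling the response for segment 1 should immediately yield segment 3 as the next request, and handling segment 3 should yield `NO_SEGMENT`.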

However, when the Data Response is handled, the system doesn't broadcast the next Data Request. I tracked down why and found the function that sets up the Data Requests. It contains a conditional that won't send out a Data Request unless the requested segment is older than the received segment (i.e. requested segment number < received segment number). See: github.com/.../dfu_mesh.c. This, of course, poses a problem when handling Data Responses. Missing data segments are requested in ascending order, so the segment number received in a Data Response will always be lower than the next segment to be requested, and the conditional fails. So whenever a segment arrives via a Data Response, the system won't immediately request the next missing segment.
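The check can be modeled in isolation like this. It is a simplified stand-in for the conditional described above, not the actual SDK code, and `request_allowed` is a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the gate: a Data Request is only sent when the
 * requested segment number is lower than the segment number of the
 * packet that was just received. */
static bool request_allowed(uint16_t requested_segment, uint16_t received_segment)
{
    return requested_segment < received_segment;
}
```

A regular Data packet for segment 10 arriving while segment 7 is missing passes the check (7 < 10). A Data Response for segment 7 arriving while segment 9 is the next missing one does not (9 < 7 is false), so no request goes out.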

This behavior doesn't seem intuitive to me. Is it intentional? Would there be any issues if the system moved directly on to the next Data Request after handling a Data Response?

I've found this on nRF5-SDK-for-Mesh v3.2.0 and v3.1.0. I haven't checked any other versions.

PS: It's worth mentioning that the next missing segment does get requested when the latest regular Data packet is handled. But if the system still has a few missing segments after receiving the last segment, it will never request them, since the only incoming packets from then on are Data Responses. This effectively causes the Mesh DFU to stall and eventually time out. That is pretty painful, especially since Mesh DFU is slow and retries cost a lot of time.
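The stall can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: every Data Request that is sent gets answered, no further regular Data packets arrive after the last segment, and `simulate_recovery` is a hypothetical helper rather than anything from the SDK.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count how many missing segments get recovered after the final Data
 * packet (segment number last_received) has been handled.
 * gated == true applies the "requested < received" check before each
 * request; gated == false always requests the next missing segment. */
static size_t simulate_recovery(bool *received, size_t count,
                                size_t last_received, bool gated)
{
    size_t recovered = 0;
    size_t trigger = last_received; /* segment of the packet being handled */

    for (;;) {
        /* Find the lowest-numbered missing segment. */
        size_t next = count;
        for (size_t i = 0; i < count; ++i) {
            if (!received[i]) {
                next = i;
                break;
            }
        }
        if (next == count) {
            break; /* nothing missing, transfer complete */
        }
        if (gated && !(next < trigger)) {
            break; /* check fails and no other packets arrive: stall */
        }
        received[next] = true; /* assume the request is answered */
        ++recovered;
        trigger = next; /* the Data Response is the next packet handled */
    }
    return recovered;
}
```

With segments 2 and 4 missing out of six and segment 5 as the last Data packet, the gated variant recovers only segment 2 and then stops, while the ungated variant recovers both.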
