Matter OTA: BDX image transfer timeout during DFU

Dear Nordic,

We’re working on a Smart Door Lock project built on custom hardware that includes an nRF5340 and several other controllers. For firmware updates, we’ve adopted Nordic’s Matter OTA mechanism, with some customizations. Specifically, we modified the DFU target library and the CBOR-based multi-image bundling process to package all controller firmware — including the nRF5340’s application and network core images — into a single combined Matter OTA binary.

This unified OTA image is then passed to the Matter OTA provider (using chip-tool) for delivery to the OTA requestor — the smart lock device running on our Nordic-based hardware.

Our setup includes:

  • OpenThread Border Router running on a Raspberry Pi 5

  • nRF52840 dongle configured as an RCP using RCP 1.4 firmware

  • Nordic SDK version: nRF Connect SDK v2.7.0

- The OTA provider app runs in one terminal session on the Raspberry Pi 5, started with the command below.

Command: ./apps/chip-ota-provider-app -f ./apps/H3P_BNDL_DBG_OTA1_EF_FF.01.9C.1B.bin

- In a terminal session on the Raspberry Pi 5, we ran the commands below to create the OTBR network and pair the nRF5340 lock device into the OTBR Matter network.
1. ./certification-tool/backend/test_collections/matter/scripts/OTBR/otbr_start.sh

2. ./apps/chip-tool pairing ble-thread 87 hex:0e08000000000001000035060004001fffe00708fdd61ac57c913a890410835c1dae12630654ed9a14d8fc4a7a5a0c0402a0f7f800030000150102567802083333333344444444030444454d4f051000112233445566778899aabbccddeeff 88855613 3492 --paa-trust-store-path /var/paa-root-certs/

3. Pair the OTA Provider and Matter lock on the same network:
./apps/chip-tool pairing onnetwork 24 20202021

4. Write the OTA provider details:
./apps/chip-tool otasoftwareupdaterequestor write default-otaproviders '[{"fabricIndex": 1, "providerNodeID": 24, "endpoint": 0}]' 87 0

5. Set access control permissions:
./apps/chip-tool accesscontrol write acl '[{"fabricIndex": 1, "privilege": 5, "authMode": 2, "subjects": [112233], "targets": null}, {"fabricIndex": 1, "privilege": 3, "authMode": 2, "subjects": null, "targets": null}]' 24 0

6. Announce the OTA provider:
./apps/chip-tool otasoftwareupdaterequestor announce-otaprovider 24 0 0 0 87 0
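For repeated test cycles, steps 3 to 6 above could be collected into one script. This is only a sketch: the node IDs, setup code, and JSON payloads are copied from the commands above, while the `run` and `ota_cycle` helpers and the `DRY_RUN`, `CHIP_TOOL`, `PROVIDER_NODE`, and `LOCK_NODE` variables are hypothetical names introduced for illustration. `DRY_RUN` defaults to 1, so the script only prints the commands unless it is explicitly set to 0.

```shell
#!/bin/sh
# Sketch: steps 3-6 from the post as one script.
# Variable and function names are illustrative, not part of chip-tool.
CHIP_TOOL="${CHIP_TOOL:-./apps/chip-tool}"
PROVIDER_NODE="${PROVIDER_NODE:-24}"   # OTA provider node ID used in the post
LOCK_NODE="${LOCK_NODE:-87}"           # lock (OTA requestor) node ID used in the post
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run steps 3-6 in order and stop at the first failing step.
ota_cycle() {
    run "$CHIP_TOOL" pairing onnetwork "$PROVIDER_NODE" 20202021 &&
    run "$CHIP_TOOL" otasoftwareupdaterequestor write default-otaproviders \
        "[{\"fabricIndex\": 1, \"providerNodeID\": $PROVIDER_NODE, \"endpoint\": 0}]" \
        "$LOCK_NODE" 0 &&
    run "$CHIP_TOOL" accesscontrol write acl \
        '[{"fabricIndex": 1, "privilege": 5, "authMode": 2, "subjects": [112233], "targets": null}, {"fabricIndex": 1, "privilege": 3, "authMode": 2, "subjects": null, "targets": null}]' \
        "$PROVIDER_NODE" 0 &&
    run "$CHIP_TOOL" otasoftwareupdaterequestor announce-otaprovider \
        "$PROVIDER_NODE" 0 0 0 "$LOCK_NODE" 0
}

ota_cycle
```

Because the steps are chained with `&&`, a failing step stops the sequence, which makes it easier to see where a cycle broke. The provider pairing in step 3 is only needed once per commissioning, so it would be dropped for later cycles.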

Using the setup and commands described above, we’ve successfully transferred the complete Matter OTA image from the OTA provider to the OTA requestor device. All controllers, including the Nordic cores and external controllers, were able to update and switch to the newly transferred image without issues — everything worked well in most cases.

However, we have frequently encountered BDX transfer timeout issues. Our observations indicate that these timeouts tend to occur consistently on certain days, suggesting a potential pattern or environmental influence. For reference, we’ve attached debug logs from both the chip-tool and the Nordic device. As illustrated in the logs, the initial OTA cycle completes successfully, whereas the subsequent cycle fails due to a timeout during the image transfer phase.

Are there any recommended strategies to mitigate BDX timeouts in such scenarios? What approaches can be employed to enhance the reliability and robustness of the Matter OTA process over Thread?

Nordic_debuglogs.log

chip-tool-logs.txt

Thanks & Regards,

Kaushik Parsana

Reply
  • Hi,

    Thanks for the logs. What the team sees in the failed cases is that the Controller/OTBR fails to deliver the BDX block despite Block Query retransmissions. The BR does transmit one 6LoWPAN fragment of the block but then suddenly stops for an unknown reason. In the last run, the BR sends two fragments, then stops for 5 seconds and tries again from fragment 1.

    As I previously asked, it would be ideal to get OpenThread logs from the OpenThread Border Router, which would show exactly why the Thread Border Router does not deliver packets and stops sending fragments - e.g. due to CSMA/CA failures or some other reason (the OTBR should print information on every packet transmission).
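One possible way to collect those border router logs is sketched below. It assumes a standard OTBR installation where the agent runs as the `otbr-agent` systemd service and `ot-ctl` is available; the helper names `capture_otbr_log` and `count_tx_errors` are hypothetical, and `ChannelAccessFailure` and `NoAck` are OpenThread error names used only as example search patterns, since the exact log line format depends on the OpenThread version.

```shell
#!/bin/sh
# Sketch only: service name, log level, and search patterns are assumptions.

# Raise the OpenThread log level on the border router, then follow the agent log
# into a file. Not called automatically: run it on the OTBR host while reproducing
# the timeout, and stop it with Ctrl-C afterwards.
capture_otbr_log() {
    sudo ot-ctl log level 5 &&
    sudo journalctl -u otbr-agent -f > "${1:-otbr.log}"
}

# Count lines in a saved log that mention common OpenThread transmit errors.
count_tx_errors() {
    grep -c -E 'ChannelAccessFailure|NoAck' "$1"
}
```

After a failed OTA cycle, `count_tx_errors otbr.log` gives a quick indication of whether transmit errors were logged at all; the surrounding lines in the log are what matter for the actual diagnosis.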

    If you have NVM space to enable OpenThread logs on the device itself, that would also help, so we can see both sides:

    CONFIG_OPENTHREAD_SOURCES=y
    CONFIG_OPENTHREAD_DEBUG=y
    CONFIG_OPENTHREAD_LOG_LEVEL_INFO=y

     

    All in all, based on the logs and PCAP, it looks more like an issue on the OTBR/RCP side than on the accessory side. We also have extensive internal BDX tests where we don't see such an issue, though we use a Linux PC instead of an RPi 5.

    -Amanda H.
