nRF5 SDK for Thread: Secure DFU OTA stopped working all of a sudden

Hello, 

I am using the nRF5 SDK for Thread to develop my Thread-based product. The product is already developed; I just have to integrate DFU into it. A few months back, when I integrated DFU with my MQTT-SN example, everything worked smoothly. I was able to update the MQTT-SN firmware in both unicast and multicast mode.

But now, all of a sudden, the same steps for DFU no longer work. I have tried again with the DFU client example as given in the docs, and there too I am only able to perform DFU once.

Problems faced so far:

- After one DFU of the DFU client example, I have to manually reset the board; otherwise it does not download the firmware packets.

In the client example logs, I can see that after one DFU process, when I try to update the firmware with another package generated by nrfutil with an incremented application version, the DFU application reports an invalid init packet error, although the packet was generated by nrfutil. If I reset the board and try again, it accepts the same DFU init packet without any error.

- I started again from scratch with the simple LED DFU client example on 2 DK boards, but the DFU completes successfully only once. I have performed all the steps given in the docs, from generating new keys to generating public keys. Sometimes I get a parsing error, sometimes an error about too big a gap between firmware packets, and others. I have also tried decreasing the rate to 3 packets per second, but that did not help.

- The only difference between my current DFU implementation and the previously working one is that back then there were only 2-3 nodes in the network, whereas now there are 4-5.

So could someone please help me with the correct implementation of DFU, as the firmware update currently fails 99 percent of the time? Or please point out the possible issues and solutions.

  • Hi,

    Can you post the logs from the device being updated, and the commands you use for generating the firmware packets and starting the DFU from nrfutil?

    If you can also post a sniffer trace of the on-air traffic, both when the DFU process completes and when it fails, that may help determine the issue.

    Best regards,
    Jørgen

  • Hello,

    The example I have used is the simple DFU client example as given in the SDK. The docs I have referred to are at https://infocenter.nordicsemi.com/index.jsp?topic=%2Fsdk_tz_v4.1.0%2Fthread_ot_secure_dfu.html&cp=8_3_2_8

    MULTICAST MODE

    Issue No 1:

    If a DFU client's block requests time out for 50-60 packets, it stops working. Even if I start the DFU process again with nrfutil, no logs are printed on the DFU client side. Even with an updated application version, I don't see any update on the DFU client side. The DFU client gets stuck and requires a reset.

    Sometimes I get this error on the nrfutil side during the DFU process, which causes the client block requests to time out.

    nrfutil dfu thread -f -pkg app_dfu.zip -p /dev/ttyACM0 --channel 11 --panid 43981 -r 4 -rs 5000 -a FF03::1
    Using connectivity board at serial port: /dev/ttyACM0
    Flashing connectivity firmware...
    Connectivity firmware flashed.
    Waiting for NCP to promote to a router...
    Thread DFU server is running... Press <Ctrl + D> to stop.
    Waiting 20s before starting multicast DFU procedure
    ff03::1: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  2.59it/s]
    ff03::1: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  2.75it/sException in thread Upload thread:           | 2089/6513 [09:23<19:07,  3.85it/s]
    Traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/dfu_server.py", line 293, in _upload
        payload)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/dfu_server.py", line 267, in _send_block
        self.protocol.request(request)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 570, in request
        return self._transaction_layer.send_request(request, response_callback, response_callback_args, response_callback_kw)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 453, in send_request
        self._message_layer.send_message(request)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 227, in send_message
        self._transport.send(raw_message, message.remote)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/tncp.py", line 184, in send
        src_addr = ipaddress.ip_address(self._ml_prefix + '\x00\x00\x00\xff\xfe\x00' + struct.pack('>H', rloc16))
    error: cannot convert argument to integer
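
A note on the traceback above: the failing line builds the sender's RLOC address by appending the fixed interface identifier 0000:00ff:fe00 and the packed RLOC16 to the mesh-local prefix. `struct.pack('>H', ...)` raises `struct.error` when its argument is not an integer, so one plausible reading (an assumption, not confirmed from the nrfutil source) is that `rloc16` was not a valid integer at that moment, for example because the NCP had no RLOC16 to report. The sketch below reproduces the address construction in Python 3 with a hypothetical helper `rloc_address` and a made-up mesh-local prefix; it is not nrfutil's code.

```python
import ipaddress
import struct

# Hypothetical 8-byte mesh-local prefix, chosen only for illustration.
ML_PREFIX = bytes.fromhex("fddead00beef0000")

# Fixed part of the interface identifier used by Thread RLOC addresses.
RLOC_IID_PREFIX = b"\x00\x00\x00\xff\xfe\x00"


def rloc_address(ml_prefix, rloc16):
    """Build an RLOC IPv6 address from a mesh-local prefix and an RLOC16.

    Hypothetical helper: it checks the RLOC16 type up front so that a
    missing value fails with a clear message instead of a struct.error.
    """
    if not isinstance(rloc16, int):
        raise ValueError("rloc16 must be an integer, got %r" % (rloc16,))
    packed = ml_prefix + RLOC_IID_PREFIX + struct.pack(">H", rloc16)
    return ipaddress.ip_address(packed)


# A valid RLOC16 yields a 16-byte packed address.
print(rloc_address(ML_PREFIX, 0x4400))

# A non-integer RLOC16 (here None) makes struct.pack itself fail,
# which is the same class of error as in the traceback.
try:
    struct.pack(">H", None)
except struct.error as exc:
    print("struct.error:", exc)
```

If this reading is right, the exception kills the upload thread on the server side, which would be consistent with the client's block requests timing out afterwards.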

    For this error, I have attached a log file (dfu_block_timeout.log) from the DFU client's RTT Viewer, where you can see that the DFU client sends block requests continuously from block 2090 up to block 2156, and all of them time out (background_dfu Block timeout).
    If I then start the above DFU process again with the same application version, I don't see anything happening on the DFU client side.

    dfu_block_timeout.log


    After resetting the client device, the DFU client receives the DFU packets (it ignores already stored packets) and starts downloading the blocks that are not stored yet (see the file dfu_gap_encountered.log). After storing some packets, it starts showing the Gap Encountered error and eventually aborts the DFU.

    dfu_gap_encountered.log

    So why did the DFU client get stuck in the first place? Shouldn't it reset itself instead of requiring a manual reset?

    How can I solve this gap encountered issue?

    The commands I have used are the same as given in the docs.

    UNICAST MODE

    I am able to perform OTA DFU without any issues in unicast mode when my Thread device is a Full Thread Device (FTD).

    But when I convert the same firmware from FTD to SED (MQTT-SN sleepy example + DFU client), I am not able to perform OTA DFU again once the client has been updated with the sleepy firmware + DFU client firmware. The only difference between my FTD and SED firmware is that in the SED the parent polling period is kept at around 4-5 seconds to reduce power consumption.

    So can you tell me what the issue could be here?
