nRF5 SDK for Thread: Secure DFU OTA suddenly stopped working

Hello, 

I am using the nRF5 SDK for Thread to develop my Thread-based product. The product is already developed; I just have to integrate DFU into it. A few months ago, when I integrated DFU with my MQTT-SN example, everything worked smoothly. I was able to update the MQTT-SN firmware in both unicast and multicast mode.

But now, when I perform the same steps for DFU, it doesn't work. I have also tried the DFU client example as given in the docs, and there too I am able to perform DFU only once.

Problems faced so far:

- After one DFU of the DFU client example, I have to manually reset the board; otherwise it does not download the firmware.

In the client example logs, I can see that after one DFU process, when I try to update the firmware with another package generated by nrfutil with an incremented application version, the DFU application reports an invalid init packet error, although the packet was generated by nrfutil. If I reset the board and try again, it accepts the same DFU init packet without any error.

- I started again from scratch with the simple LED DFU client example on 2 DK boards, but DFU completes successfully only once. I have performed all the steps given in the docs, from generating new keys to generating public keys. Sometimes I get a parsing error, sometimes an error about too big a gap between firmware packets, and others besides. I have also tried decreasing the rate to 3 packets per second, but it did not help.

- The only difference between my current DFU implementation and the previously working one is that at that time there were only 2-3 nodes in the network, whereas now there are 4-5 nodes.

So could someone please help me implement DFU correctly, as firmware updates currently fail 99 percent of the time? Or please point out the possible issues and solutions.

  • Hello,

    The example I have used is the simple DFU client example as given in the SDK. The docs I have referred to are at https://infocenter.nordicsemi.com/index.jsp?topic=%2Fsdk_tz_v4.1.0%2Fthread_ot_secure_dfu.html&cp=8_3_2_8

    MULTICAST MODE

    Issue No 1:

    If a DFU client block request times out for 50-60 packets, the client stops working. Even if I start the DFU process again with nrfutil, no logs are printed on the DFU client side. Even with an updated application version, I don't see any update on the DFU client side. The DFU client gets stuck and requires a reset.

    Sometimes I get this error on the nrfutil side during the DFU process, which causes the client block requests to time out.

    nrfutil dfu thread -f -pkg app_dfu.zip -p /dev/ttyACM0 --channel 11 --panid 43981 -r 4 -rs 5000 -a FF03::1
    Using connectivity board at serial port: /dev/ttyACM0
    Flashing connectivity firmware...
    Connectivity firmware flashed.
    Waiting for NCP to promote to a router...
    Thread DFU server is running... Press <Ctrl + D> to stop.
    Waiting 20s before starting multicast DFU procedure
    ff03::1: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  2.59it/s]
    ff03::1: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  2.75it/sException in thread Upload thread:           | 2089/6513 [09:23<19:07,  3.85it/s]
    Traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/dfu_server.py", line 293, in _upload
        payload)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/dfu_server.py", line 267, in _send_block
        self.protocol.request(request)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 570, in request
        return self._transaction_layer.send_request(request, response_callback, response_callback_args, response_callback_kw)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 453, in send_request
        self._message_layer.send_message(request)
      File "/home/dnk121/.local/lib/python2.7/site-packages/piccata/core.py", line 227, in send_message
        self._transport.send(raw_message, message.remote)
      File "/home/dnk121/.local/lib/python2.7/site-packages/nordicsemi/thread/tncp.py", line 184, in send
        src_addr = ipaddress.ip_address(self._ml_prefix + '\x00\x00\x00\xff\xfe\x00' + struct.pack('>H', rloc16))
    error: cannot convert argument to integer
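
    For reference, the failing line in the traceback builds an RLOC-style IPv6 address from the mesh-local prefix, the fixed bytes 00 00 00 ff fe 00, and the 16-bit RLOC16. In Python 2.7, struct.pack('>H', ...) raises "cannot convert argument to integer" when its argument is not an integer, which suggests (this is an assumption, not confirmed from the nrfutil source) that rloc16 was not a valid integer at that moment, for example None. The sketch below is not the nrfutil code; build_rloc_address is a hypothetical helper, written for Python 3, that shows the same address construction with an explicit check so the failure is reported clearly.

```python
import ipaddress
import struct

# Fixed bytes that precede the RLOC16 in the interface identifier,
# as seen in the traceback line from tncp.py.
RLOC_IID_PREFIX = b"\x00\x00\x00\xff\xfe\x00"


def build_rloc_address(ml_prefix, rloc16):
    """Build an RLOC IPv6 address (hypothetical helper, not part of nrfutil).

    ml_prefix: the 8-byte mesh-local prefix as bytes.
    rloc16:    the 16-bit RLOC16 as an int.
    """
    if len(ml_prefix) != 8:
        raise ValueError("mesh-local prefix must be 8 bytes, got %d" % len(ml_prefix))
    # struct.pack would fail on None or other non-integers; check first so
    # the caller gets a clear message instead of a struct.error.
    if not isinstance(rloc16, int) or not 0 <= rloc16 <= 0xFFFF:
        raise ValueError("RLOC16 is not a valid 16-bit integer: %r" % (rloc16,))
    raw = ml_prefix + RLOC_IID_PREFIX + struct.pack(">H", rloc16)
    return ipaddress.IPv6Address(raw)


if __name__ == "__main__":
    # Example values only; the prefix and RLOC16 here are made up.
    prefix = bytes.fromhex("fddead00beef0000")
    print(build_rloc_address(prefix, 0x4400))
```

    If the RLOC16 can be missing while the NCP is changing role or the serial link is disturbed, a check like this would at least make the server-side failure visible instead of silently killing the upload thread.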

    For this error, I have also attached a log file (dfu_block_timeout.log) from the DFU client's RTT Viewer, where you can see the DFU client sending block requests continuously from block 2090 up to block 2156, all of which time out (background_dfu Block timeout).
    At this point, if I start the above DFU process again with the same application version, I don't see anything happening on the DFU client side.

    dfu_block_timeout.log


    After resetting the client device, the DFU client receives the DFU packets (it ignores already stored packets) and starts downloading the blocks that aren't stored yet (see the file dfu_gap_encountered.log). After storing some packets, it starts showing the Gap Encountered error and eventually aborts the DFU.

    dfu_gap_encountered.log

    So why did the DFU client get stuck in the first place? Shouldn't it reset itself instead of needing a manual reset?

    How can I solve this Gap Encountered issue?

    The commands I have used are the same as given in the docs.

    UNICAST MODE

    I am able to perform OTA DFU without any issues in unicast mode when my Thread device is a Full Thread Device (FTD).

    But when I convert the same firmware from FTD to SED (MQTT-SN sleepy example + DFU client), I am not able to perform OTA DFU again once the client has been updated with the sleepy firmware + DFU client firmware. The only difference between my FTD and SED firmware is that in the SED the parent polling period is kept at around 4-5 seconds to save power.

    So can you tell me what the issue could be here?

  • Hello, any update?

    Update:

    I am able to perform DFU OTA easily on an FTD device. But for SED, these are the results:

    1. FTD firmware -----> SED firmware (DFU successful)

    2. FTD firmware -----> FTD firmware (DFU successful)

    3. SED firmware -----> SED firmware (not successful)

    In case 3, if I perform a manual reset then I am able to perform SED DFU one time; after that I am unable to perform the DFU process, as my Thread device isn't asking for the CoAP firmware packets. If I do a manual reset again, I am again able to do DFU one time.

  • Hi,

    A sniffer trace would be very useful to determine what is failing in the DFU process. Did the nrfutil application crash with the error you posted above when the DFU started failing?

    If you can also increase the log level in the application by setting the following configs to "Debug", this may provide some more details:

    // <o> NRF_LOG_DEFAULT_LEVEL  - Default Severity level
     
    // <0=> Off 
    // <1=> Error 
    // <2=> Warning 
    // <3=> Info 
    // <4=> Debug 
    
    #ifndef NRF_LOG_DEFAULT_LEVEL
    #define NRF_LOG_DEFAULT_LEVEL 4
    #endif
    
    // <o> BACKGROUND_DFU_CONFIG_LOG_LEVEL  - Default Severity level
     
    // <0=> Off 
    // <1=> Error 
    // <2=> Warning 
    // <3=> Info 
    // <4=> Debug 
    
    #ifndef BACKGROUND_DFU_CONFIG_LOG_LEVEL
    #define BACKGROUND_DFU_CONFIG_LOG_LEVEL 4
    #endif
    
    // <e> IOT_COAP_CONFIG_LOG_ENABLED - Enables logging in the module.
    //==========================================================
    #ifndef IOT_COAP_CONFIG_LOG_ENABLED
    #define IOT_COAP_CONFIG_LOG_ENABLED 1
    #endif
    // <o> IOT_COAP_CONFIG_LOG_LEVEL  - Default Severity level
     
    // <0=> Off 
    // <1=> Error 
    // <2=> Warning 
    // <3=> Info 
    // <4=> Debug 
    
    #ifndef IOT_COAP_CONFIG_LOG_LEVEL
    #define IOT_COAP_CONFIG_LOG_LEVEL 4
    #endif

    Enabling higher Log levels in OpenThread would also be useful, but this requires you to rebuild the OpenThread libraries.

    Best regards,
    Jørgen

  • Did the nrfutil application crash with the error you posted above when the DFU started failing?

    No, it's the opposite: the DFU client tries to receive the successive blocks but can't, as the server side shows a serial problem. So eventually DFU fails.

    I think that after one DFU, the client example sends the CoAP DFU trigger packet only once at startup, so if I start my DFU 2-3 minutes later, my client is unable to respond. I think this was the issue, because if I send repeated DFU trigger packets (once every 30 seconds), I am able to complete DFU multiple times without any issues.

    Also, the DFU process is slow due to my parent polling period.
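
    To put a rough number on the slowdown, here is a back-of-the-envelope estimate. It uses the 6513 blocks and the 3.85 blocks per second shown in the nrfutil progress bar earlier in this thread, and it assumes (a simplification, not measured) that a SED receives at most one block per parent poll, with a poll period of 4.5 seconds. The function name is made up for illustration.

```python
def estimate_dfu_seconds(total_blocks, seconds_per_block):
    """Estimated transfer time if every block takes seconds_per_block."""
    return total_blocks * seconds_per_block


TOTAL_BLOCKS = 6513          # from the nrfutil progress bar in this thread
FTD_BLOCKS_PER_SECOND = 3.85 # observed rate in the same progress bar
SED_POLL_PERIOD_S = 4.5      # assumed: middle of the 4-5 s poll period

# FTD: blocks arrive at the server's send rate.
ftd_seconds = estimate_dfu_seconds(TOTAL_BLOCKS, 1 / FTD_BLOCKS_PER_SECOND)

# SED: assumed worst case of one block delivered per parent poll.
sed_seconds = estimate_dfu_seconds(TOTAL_BLOCKS, SED_POLL_PERIOD_S)

if __name__ == "__main__":
    print("FTD estimate: %.1f minutes" % (ftd_seconds / 60))
    print("SED estimate: %.1f hours" % (sed_seconds / 3600))
```

    Under these assumptions the FTD transfer takes about half an hour, which matches the progress bar, while the SED transfer would take on the order of eight hours. Shortening the poll period while a DFU is in progress would be the obvious thing to try, but I have not verified that here.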
