Mesh DFU Relay Problem

Hi,

I have tested two boards (PCA10040) with the same application ID and a higher application version, and that works well. Then I wanted to test the mesh DFU relay function. However, the target board does not upgrade.

Preparation: I named the two boards. Board A is connected to the PC and acts as the source; Board B is the target to be upgraded over the mesh.

This is what I did:

(1) Generate a device page with config "application_id": 1, "application_version": 1, and use it to program Board A as described in the mesh DFU guide.

(2) Generate a device page with config "application_id": 2, "application_version": 1, and use it to program Board B as described in the mesh DFU guide.

(3) Generate the DFU file with the command nrfutil dfu genpkg --application ... using config "application_id": 2, "application_version": 2.

(4) Connect Board A to the PC over USB serial. Board B is only supplied with power. Use the command nrfutil --verbose dfu serial -pkg ... to start the mesh DFU.
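
To make the intended roles in the setup above concrete, here is a minimal Python sketch. The rule it uses (same application ID and strictly higher version means the board is a target, anything else means it only relays) is an assumption for illustration, and `AppVersion` and `role_for_transfer` are hypothetical names; the real decision is made inside the bootloader and is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppVersion:
    """Application identity as written in the device page config."""
    application_id: int
    application_version: int


def role_for_transfer(installed: AppVersion, offered: AppVersion) -> str:
    """Return 'target' if the offered image would replace the installed one,
    otherwise 'relay'. Assumed rule: same ID and strictly higher version."""
    same_app = installed.application_id == offered.application_id
    newer = offered.application_version > installed.application_version
    return "target" if same_app and newer else "relay"


# Values taken from steps (1) to (3) above.
board_a = AppVersion(application_id=1, application_version=1)
board_b = AppVersion(application_id=2, application_version=1)
dfu_package = AppVersion(application_id=2, application_version=2)

roles = {
    "Board A": role_for_transfer(board_a, dfu_package),
    "Board B": role_for_transfer(board_b, dfu_package),
}
print(roles)
```

Under this assumed rule, Board A only relays the package and Board B is the only node that stores it.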

The result:

For the first few seconds, both boards go into DFU with LED 0 and LED 2 lit. A few minutes later, Board B ends DFU with LED 0 and LED 1 lit, while Board A keeps going, with the DOS window showing the progress. About an hour later, the DFU transfer for Board A is done. Even after I waited nearly another hour, Board B was still not upgraded.

I checked the end of Board B's RTT log:

......

0> <t: 16666007>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
0> <t: 16666010>, nrf_mesh_dfu.c, 333, Flash idle.
0> <t: 16688463>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 6, count 3, interval: exponential, handle: FFFC
0> <t: 16688479>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
0> <t: 16688483>, nrf_mesh_dfu.c, 333, Flash idle.
0> <t: 16700264>, nrf_mesh_dfu.c, 383, Abort event. Reason: 0x3
0> <t: 16700267>, main.c, 175, Mesh DFU End !

Abort reason 0x3 is defined in dfu_types_mesh.h as DFU_END_ERROR_PACKET_LOSS, so the relay DFU failure may be caused by packet loss over the radio. How can I fix this? Can anyone give me some advice? Thank you.
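
For long RTT captures, the abort line can be pulled out with a short script instead of reading the log by eye. This is only a sketch: the regular expression is written against the log format shown above, `find_aborts` is a hypothetical helper, and the lookup table contains only the one code confirmed in this thread (0x3); other values from dfu_types_mesh.h are deliberately not guessed.

```python
import re

# Only 0x3 is confirmed by this thread; other codes are reported as "unknown".
KNOWN_END_REASONS = {0x3: "DFU_END_ERROR_PACKET_LOSS"}

ABORT_RE = re.compile(
    r"<t:\s*(?P<tick>\d+)>,\s*(?P<file>[\w.]+),\s*(?P<line>\d+),\s*"
    r"Abort event\. Reason: (?P<reason>0x[0-9A-Fa-f]+)"
)


def find_aborts(log_text: str) -> list[dict]:
    """Return one record per 'Abort event' line found in an RTT log dump."""
    aborts = []
    for match in ABORT_RE.finditer(log_text):
        code = int(match["reason"], 16)
        aborts.append({
            "tick": int(match["tick"]),
            "source": f'{match["file"]}:{match["line"]}',
            "code": code,
            "name": KNOWN_END_REASONS.get(code, "unknown"),
        })
    return aborts


SAMPLE_LOG = """\
0> <t: 16688483>, nrf_mesh_dfu.c, 333, Flash idle.
0> <t: 16700264>, nrf_mesh_dfu.c, 383, Abort event. Reason: 0x3
0> <t: 16700267>, main.c, 175, Mesh DFU End !
"""

aborts = find_aborts(SAMPLE_LOG)
print(aborts)
```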

Best Regards,

Panda.

  • Hi Panda, 

    Could you post the full log from device B? Also, please add a breakpoint in the firmware and check whether fw_updated_event_is_for_me() returns true.

    Please test with a smaller application, the blinky application for example, so the testing time is shorter.

    I don't see why sending an application with a different ID from the one on the device connected to the PC would cause this issue. DFU_END_ERROR_PACKET_LOSS occurs when more than 64 segments are missing.
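
The 64-segment limit can be pictured with a toy receiver model in Python. This is a sketch under assumptions, not the bootloader's code: `SegmentTracker` is a hypothetical class, segments are numbered from 1, and "missing" is taken to mean any gap below the highest segment seen so far. Only the threshold of 64 comes from the statement above.

```python
MAX_MISSING_SEGMENTS = 64  # threshold quoted in this thread


class SegmentTracker:
    """Toy receiver-side bookkeeping; not the bootloader's real data structure."""

    def __init__(self, total_segments: int, max_missing: int = MAX_MISSING_SEGMENTS):
        self.total_segments = total_segments
        self.max_missing = max_missing
        self.received: set[int] = set()
        self.highest_seen = 0  # segments are numbered from 1 in this sketch

    def missing(self) -> list[int]:
        """Gaps below the highest segment index received so far."""
        return [i for i in range(1, self.highest_seen + 1) if i not in self.received]

    def on_segment(self, index: int) -> str:
        """Record one segment and report the resulting transfer state."""
        if not 1 <= index <= self.total_segments:
            raise ValueError(f"segment index {index} out of range")
        self.received.add(index)
        self.highest_seen = max(self.highest_seen, index)
        if len(self.missing()) > self.max_missing:
            return "abort: packet loss"
        if len(self.received) == self.total_segments:
            return "complete"
        return "in progress"


tracker = SegmentTracker(total_segments=200)
for index in range(1, 11):
    tracker.on_segment(index)
status_at_64 = tracker.on_segment(75)  # segments 11..74 missing: exactly 64
status_at_65 = tracker.on_segment(77)  # segment 76 now missing too: 65
print(status_at_64, "/", status_at_65)
```

In this model the transfer survives exactly 64 gaps and aborts on the 65th, which matches the ">64" wording.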

    We need to know whether the issue can be reproduced every time or only happens from time to time, and whether it happens with every image or only with some particular images.

  • Hi Hung,

    Below is Device B's full log:

    0> <t: 903695>, nrf_mesh_dfu.c, 390, New firmware!
    0> <t: 903698>, main.c, 154, Requesting DFU transfer with bank at 0x0004C000
    0> <t: 903701>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 0, count 255, interval: periodic, handle: FFFD
    0> <t: 903705>, nrf_mesh_dfu.c, 534, Killing a TX slot prematurely (repeats done: 3).
    0> <t: 903709>, nrf_mesh_dfu.c, 561, SERIAL TX!
    0> <t: 968745>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 0, count 255, interval: periodic, handle: FFFD
    0> <t: 968749>, nrf_mesh_dfu.c, 534, Killing a TX slot prematurely (repeats done: 0).
    0> <t: 968753>, nrf_mesh_dfu.c, 561, SERIAL TX!
    0> <t: 1035468>, nrf_mesh_dfu.c, 430, DFU start
    0> <t: 1035471>, main.c, 170, Mesh DFU Start...
    0> <t: 1035474>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 1, count 6, interval: exponential, handle: FFFC
    0> <t: 1041587>, nrf_mesh_dfu.c, 329, Erase complete (0x4C000)
    0> <t: 1041590>, nrf_mesh_dfu.c, 333, Flash idle.
    0> <t: 1101978>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 2, count 3, interval: exponential, handle: FFFC
    0> <t: 1101995>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
    0> <t: 1101998>, nrf_mesh_dfu.c, 333, Flash idle.

    ......

    0> <t: 6693194>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 7, count 3, interval: exponential, handle: FFFC
    0> <t: 6693211>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
    0> <t: 6693214>, nrf_mesh_dfu.c, 333, Flash idle.
    0> <t: 6710839>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 1, count 3, interval: exponential, handle: FFFC
    0> <t: 6710856>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
    0> <t: 6710859>, nrf_mesh_dfu.c, 333, Flash idle.
    0> <t: 6726942>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 2, count 3, interval: exponential, handle: FFFC
    0> <t: 6726958>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
    0> <t: 6726962>, nrf_mesh_dfu.c, 333, Flash idle.
    0> <t: 6743185>, nrf_mesh_dfu.c, 383, Abort event. Reason: 0x3
    0> <t: 6743189>, main.c, 175, Mesh DFU End !

    From the log, Device B receives the DFU packets. The application size is 147372 bytes. Device A, connected to the PC, can relay through the whole process.
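
To check how far a transfer got before it aborted, the log can be condensed into event counts. The Python sketch below assumes the line format seen in the log above; `summarize` is a hypothetical helper, and the unit of the `t:` timestamps is not stated in this thread, so the span is reported in raw ticks only.

```python
import re
from collections import Counter

LINE_RE = re.compile(r"<t:\s*(\d+)>,\s*([\w.]+),\s*(\d+),\s*(.*)")


def summarize(log_text: str) -> dict:
    """Count log events by their leading words and report the tick range."""
    ticks = []
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip elisions such as "......" and blank lines
        ticks.append(int(match.group(1)))
        message = match.group(4).strip()
        # Keep only the text before the first '!', '.', '(', ',' or ':'.
        kind = re.split(r"[!.(,:]", message, maxsplit=1)[0].strip()
        counts[kind] += 1
    first = ticks[0] if ticks else None
    last = ticks[-1] if ticks else None
    span = last - first if ticks else None
    return {"first_tick": first, "last_tick": last, "span_ticks": span, "counts": dict(counts)}


SAMPLE_LOG = """\
0> <t: 6726942>, nrf_mesh_dfu.c, 528, RADIO TX! SLOT 2, count 3, interval: exponential, handle: FFFC
0> <t: 6726958>, nrf_mesh_dfu.c, 324, Write complete (0x2000FE90)
0> <t: 6726962>, nrf_mesh_dfu.c, 333, Flash idle.
0> <t: 6743185>, nrf_mesh_dfu.c, 383, Abort event. Reason: 0x3
0> <t: 6743189>, main.c, 175, Mesh DFU End !
"""

summary = summarize(SAMPLE_LOG)
print(summary)
```

Run against the complete capture, the "Write complete" count gives a rough idea of how many flash writes happened before the abort.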

    The issue on Device B happens every time.

    My project uses nRF SDK 15.3.0 and Mesh SDK 3.2.0.

    If you have any advice on how to fix this issue, please tell me.

    Best Regards,

    Panda

      

      

  • Hi Panda, 

    I know that you need to use the dimming feature, but we would like to try the default example to see whether integrating the DFU code into the dimming example is what causes the issue.

    Which bootloader did you use? Was it the precompiled one? Could you state which .hex file you used?

  • Hi Panda, 

    Just a quick update: we have reproduced the issue here, with error 0x03. I will let you know when we have any findings.

  • Hi Panda, 

    Just an update: we found that the issue occurs when testing with just one relay node (device A, connected to the PC) and one mesh node (device B). When device B misses any segment, device A won't resend it (as it is only relaying), and this results in error 3.

    This is not the case when device A also receives the image, or when multiple nodes receive the image as destinations rather than relays. In that case the other node(s) can resend the missing segment when B requests it.

    We are working on a fix for this.

    You can test this theory by having 2 or 3 nodes as devices B; they should be able to serve each other if one node misses a packet.
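
The difference between one target and several can be shown with a toy model in Python. It is a sketch of the explanation above, not of the real mesh DFU protocol: `recover_missing` is a hypothetical function, each target is reduced to the set of segment numbers it holds, and a single round of requests is assumed. The `source_resends` flag models a source that does answer requests.

```python
def recover_missing(
    targets: dict[str, set[int]],
    total_segments: int,
    source_resends: bool,
) -> dict[str, set[int]]:
    """Return the segments each target still lacks after one round of requests."""
    everything = set(range(1, total_segments + 1))
    still_missing = {}
    for name, have in targets.items():
        wanted = everything - have
        # Other targets keep the segments they received and can serve them.
        available = set()
        for peer, peer_have in targets.items():
            if peer != name:
                available |= peer_have
        # A relay-only source holds nothing to resend; a resending source has it all.
        if source_resends:
            available |= everything
        still_missing[name] = wanted - available
    return still_missing


# One target behind a relay-only source: segment 3 is lost for good.
one_target = recover_missing({"B": {1, 2, 4, 5}}, 5, source_resends=False)

# Three targets that lost different segments can serve each other.
three_targets = recover_missing(
    {"B1": {1, 2, 4, 5}, "B2": {1, 2, 3, 4}, "B3": {1, 2, 3, 4, 5}},
    5,
    source_resends=False,
)

# One target, but the source answers requests for missing segments.
resending_source = recover_missing({"B": {1, 2, 4, 5}}, 5, source_resends=True)

print(one_target, three_targets, resending_source)
```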

  • Hi Panda,
     
    Good news, we have a workaround prototype that you can test. We are still testing it, but it seems to work.

    As explained earlier, the issue was that the serial node doesn't retransmit the missing segment. The fix involves modifications to nrfutil, the bootloader and the serial DFU app.
    Please make sure you rebuild nrfutil after applying the patch.

    Attached are the .patch file and the patched file that I used in the test. Please let me know the result on your side.

    patch.zip
