DFU bootloader gets stuck in SoftDevice at 0x0000C100 - 0x0000C0FC, then jumps to 0x00016348 - 0x0001635E and loops forever in these code segments

We are using an nRF52833 with the secure DFU bootloader, and we have implemented the DFU master/host over BLE on an ESP32. During OTA updates, the DFU bootloader frequently gets stuck at the SoftDevice locations listed above.

SDK used: nRF5_SDK_17.0.0_9d13099
SoftDevice used with version: s140_nrf52_7.0.1_softdevice

Some more details:

1. The nRF52833 runs in central mode in our application code; DFU runs in peripheral mode (obviously).
2. The DFU host is implemented on an ESP32 that sits on the same board as the nRF52833, so BLE range is not an issue.
3. The nRF52833 always gets stuck somewhere in the last steps, where it erases/writes something in flash, such as the bootloader settings.
4. Although it gets stuck at the end, after a power cycle we find that the OTA was successful: the new application code is loaded, and the bootloader settings have been updated to make the new application active. It got stuck somewhere after that.
5. On a notification from the ESP32 over the serial port, the nRF52833 application jumps to the bootloader by writing the GPREGRET register. We use the sd_ wrapper for this because the SoftDevice is in use.
6. The OTA host on the ESP32 works perfectly every time and never gets stuck in any state: the ESP32 completes the OTA file transfer and restarts itself, but the nRF52833 sometimes gets stuck.
7. From our study of the DFU protocol, once the new application file has been transferred, the nRF52833 does not perform any handshake after the final flash writes are complete (the RTT debug messages from the DFU bootloader show the same).
8. From the debugging done so far, we suspect something related to the flash APIs, because every time it gets stuck we have seen Peer Manager log messages saying that it updated something in the Peer Manager data. We looked for a way to disable Peer Manager but found none, so we tried sd_nvic_SystemReset, which is supposed to reset the SoftDevice as well before entering DFU bootloader mode, but no luck.
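
For reference, the GPREGRET hand-off in item 5 can be modelled on a host machine as below. This is only a sketch: `model_gpregret` and the `model_*` helpers are hypothetical stand-ins for the retention register and the sd_ wrappers, and the two constant values are quoted from memory of the nRF5 SDK's nrf_bootloader_info.h, so they should be checked against the SDK copy in use.

```c
#include <stdint.h>

/* Assumed values; verify against nrf_bootloader_info.h in your SDK. */
#define BOOTLOADER_DFU_GPREGRET       0xB0u
#define BOOTLOADER_DFU_START_BIT_MASK 0x01u
#define BOOTLOADER_DFU_START \
    (BOOTLOADER_DFU_GPREGRET | BOOTLOADER_DFU_START_BIT_MASK)

/* Hypothetical host-side stand-in for the retained GPREGRET register. */
static uint32_t model_gpregret;

/* Stand-in for sd_power_gpregret_clr(0, mask): clears the masked bits. */
static void model_gpregret_clr(uint32_t mask)
{
    model_gpregret &= ~mask;
}

/* Stand-in for sd_power_gpregret_set(0, mask): sets the masked bits. */
static void model_gpregret_set(uint32_t mask)
{
    model_gpregret |= mask;
}

/* Clear-then-set sequence an application would run before resetting.
 * On the target, sd_nvic_SystemReset() would follow; GPREGRET survives
 * that reset, so the bootloader can read the request. */
uint32_t model_request_dfu(uint32_t stale_value)
{
    model_gpregret = stale_value;        /* whatever was left in the register */
    model_gpregret_clr(0xFFFFFFFFu);     /* drop stale bits first */
    model_gpregret_set(BOOTLOADER_DFU_START);
    return model_gpregret;
}
```

Clearing before setting matters because the set call only ORs bits in, so leftover bits from an earlier request would otherwise survive.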

If you can tell us which SoftDevice APIs are located at the addresses above, we can try adding something to the application code before jumping to the bootloader to avoid getting stuck.
Also, if you can provide a .o of the SoftDevice that we can disassemble with objdump, that would be extremely helpful, so that we can see in which APIs and instructions it is getting stuck.

Below is the J-Link step log from when it gets stuck:

J-Link>h
PC = 0000C0F8, CycleCnt = 04C5F1A4
R0 = 00000000, R1 = 00000000, R2 = 80000000, R3 = 4001F014
R4 = 20000040, R5 = 40000000, R6 = 00000000, R7 = 20006FC9
R8 = 00000000, R9 = 0007C318, R10= 20010000, R11= 00000000
R12= 00070E35
SP(R13)= 2001FE38, MSP= 2001FE38, PSP= 00000000, R14(LR) = 0000C101
XPSR = 2100000B: APSR = nzCvq, EPSR = 01000000, IPSR = 00B (DebugMonitor)
CFBP = 00000001, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 01

FPS0 = 00F93AFB, FPS1 = DFDE7575, FPS2 = 46847A75, FPS3 = C2B9E2FE
FPS4 = FA78CE71, FPS5 = 3CBA2E2E, FPS6 = 40A70424, FPS7 = AB566C9C
FPS8 = C9DCB219, FPS9 = 490E4A11, FPS10= 4B6332CF, FPS11= 2FBBB72C
FPS12= 963BEA63, FPS13= 22B5A828, FPS14= 2D7FC343, FPS15= 2001FD80
FPS16= 00000000, FPS17= 00000000, FPS18= 00000000, FPS19= 00000000
FPS20= 00000000, FPS21= 00000000, FPS22= 00000000, FPS23= 00000000
FPS24= 00000000, FPS25= 00000000, FPS26= 00000000, FPS27= 00000000
FPS28= 00000000, FPS29= 00000000, FPS30= 00000000, FPS31= 00000000
FPSCR= 00000000
J-Link>s
0000C0F8: 20 B9 CBNZ R0, #+0x08
J-Link>s
0000C0FA: 00 20 MOVS R0, #0
J-Link>s
0000C0FC: 0A F0 24 F9 BL #+0xA248
J-Link>s
00016348: 07 49 LDR R1, [PC, #+0x1C]
J-Link>s
0001634A: 10 31 ADDS R1, #16
J-Link>s
0001634C: 0A 68 LDR R2, [R1]
J-Link>s
0001634E: D2 03 LSLS R2, R2, #15
J-Link>s
00016350: 06 D5 BPL #+0x0C
J-Link>s
00016352: 09 68 LDR R1, [R1]
J-Link>s
00016354: 01 F0 03 01 AND R1, R1, #0x03
J-Link>s
00016358: 81 42 CMP R1, R0
J-Link>s
0001635A: 01 D1 BNE #+0x02
J-Link>s
0001635C: 01 20 MOVS R0, #1
J-Link>s
0001635E: 70 47 BX LR
J-Link>s
0000C100: 00 28 CMP R0, #0
J-Link>s
0000C102: F7 D1 BNE #-0x12
J-Link>s
0000C0F4: D5 F8 0C 01 LDR R0, [R5, #+0x10C]
J-Link>s
0000C0F8: 20 B9 CBNZ R0, #+0x08
J-Link>s
0000C0FA: 00 20 MOVS R0, #0
J-Link>s
0000C0FC: 0A F0 24 F9 BL #+0xA248
J-Link>s
00016348: 07 49 LDR R1, [PC, #+0x1C]
J-Link>s
0001634A: 10 31 ADDS R1, #16
J-Link>s
0001634C: 0A 68 LDR R2, [R1]
J-Link>s
0001634E: D2 03 LSLS R2, R2, #15
J-Link>s
00016350: 06 D5 BPL #+0x0C
J-Link>s
00016352: 09 68 LDR R1, [R1]
J-Link>s
00016354: 01 F0 03 01 AND R1, R1, #0x03
J-Link>s
00016358: 81 42 CMP R1, R0
J-Link>s
0001635A: 01 D1 BNE #+0x02
J-Link>s
0001635C: 01 20 MOVS R0, #1
J-Link>s
0001635E: 70 47 BX LR
J-Link>s
0000C100: 00 28 CMP R0, #0
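
Read back as C, the stepped instructions look like a polling loop over two registers. The sketch below is a host-side model of that reading, not SoftDevice source. R5 holds 0x40000000 in the register dump, and offset 0x10C from that base is EVENTS_DONE in the nRF52833 CLOCK register map. The second register's address comes from a literal that the log does not show; its bit usage (bit 16, bits 1:0) matches the layout of LFCLKSTAT, which is an assumption. The return value on the branch target 0x00016360 is also not in the log and is assumed to be 0. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-in for the two words the loop reads. */
typedef struct {
    uint32_t events_done; /* word at 0x40000000 + 0x10C, loaded at 0x0000C0F4 */
    uint32_t status;      /* word loaded at 0x0001634C; address not in the log */
} clock_model_t;

typedef enum {
    LOOP_EXIT_EVENT_SET,      /* CBNZ at 0x0000C0F8 taken */
    LOOP_EXIT_STATUS_CHANGED, /* BNE at 0x0000C102 not taken */
    LOOP_STILL_SPINNING       /* host-only poll cap reached */
} loop_result_t;

/* Mirrors 0x00016348..0x0001635E: true when bit 16 is set and the
 * two low bits equal `src`. The false path is an assumption. */
bool status_running_from(const clock_model_t *regs, uint32_t src)
{
    uint32_t status = regs->status;
    if ((status & (1u << 16)) == 0u) {   /* LSLS #15 then BPL */
        return false;
    }
    return (status & 0x3u) == src;       /* AND #0x03 then CMP R1, R0 */
}

/* Mirrors 0x0000C0F4..0x0000C102. `max_polls` exists only so the model
 * terminates on a host; the stepped code shows no such bound. */
loop_result_t poll_loop(const clock_model_t *regs, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; ++i) {
        if (regs->events_done != 0u) {
            return LOOP_EXIT_EVENT_SET;
        }
        if (!status_running_from(regs, 0u)) { /* R0 = 0 before the BL */
            return LOOP_EXIT_STATUS_CHANGED;
        }
    }
    return LOOP_STILL_SPINNING;
}
```

With `events_done` at 0 and `status` at 0x00010000 (running, source 0), which is consistent with the path taken in the log, the model never leaves the loop on its own; it exits only when the event word becomes non-zero or the status word changes.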

  • Hi,

    Is it possible to test DFU with nRF Connect on Android or iOS as well? It would be interesting to know whether it leads to the same problem.

    Thanks,

    Vidar

  • Update: the SoftDevice is getting stuck in a loop waiting for the LF clock calibration timer to stop. We have observed this in some corner cases when the SoftDevice is re-enabled after being disabled with sd_softdevice_disable().

    More follow-up questions: do you also call NVIC_SystemReset() after DFU is complete, before booting the app, as the nRF5 SDK bootloader does, or do you disable the SoftDevice and boot the application directly?

  • Hello,

    Thank you for your reply. We have already tested DFU with the nRF Connect app on Android, and it leads to the same problem. How often the problem occurs varies, but it is completely random.

    Regards,

    Alankar

  • Hello Alankar,

    Thanks for confirming. But what about the boot sequence after a successful DFU: does it involve disabling and re-enabling the SoftDevice?

    Thanks,

    Vidar

  • Sorry for the late reply, Vidar; we have festive holidays this week...

    Since you said it is getting stuck in a loop waiting for the LF clock calibration timer to stop, I should mention that our board has no external crystal and uses the internal RC oscillator. We made the corresponding changes in sdk_config.h for both the bootloader and the app. The changes are below; they follow the recommended settings for the internal RC:

    // </h>
    //==========================================================

    // <h> Clock - SoftDevice clock configuration

    //==========================================================
    // <o> NRF_SDH_CLOCK_LF_SRC - SoftDevice clock source.

    // <0=> NRF_CLOCK_LF_SRC_RC
    // <1=> NRF_CLOCK_LF_SRC_XTAL
    // <2=> NRF_CLOCK_LF_SRC_SYNTH

    #ifndef NRF_SDH_CLOCK_LF_SRC
    #define NRF_SDH_CLOCK_LF_SRC 0
    #endif

    // <o> NRF_SDH_CLOCK_LF_RC_CTIV - SoftDevice calibration timer interval.
    #ifndef NRF_SDH_CLOCK_LF_RC_CTIV
    #define NRF_SDH_CLOCK_LF_RC_CTIV 16
    #endif

    // <o> NRF_SDH_CLOCK_LF_RC_TEMP_CTIV - SoftDevice calibration timer interval under constant temperature.
    // <i> How often (in number of calibration intervals) the RC oscillator shall be calibrated
    // <i> if the temperature has not changed.

    #ifndef NRF_SDH_CLOCK_LF_RC_TEMP_CTIV
    #define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 2
    #endif
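
    To make the two calibration values above concrete, here is a small host-side sketch of the arithmetic. It assumes the SoftDevice header's documented units: the calibration timer interval counts quarter seconds, and a temperature interval of 0 means "always calibrate". The function names are hypothetical.

    ```c
    #include <stdint.h>

    /* Values from the sdk_config.h excerpt above. */
    #define NRF_SDH_CLOCK_LF_RC_CTIV      16u
    #define NRF_SDH_CLOCK_LF_RC_TEMP_CTIV 2u

    /* Calibration timer interval in milliseconds (CTIV is in 0.25 s units). */
    uint32_t calibration_interval_ms(uint32_t rc_ctiv)
    {
        return rc_ctiv * 250u;
    }

    /* Longest time between calibrations when the temperature is unchanged.
     * rc_temp_ctiv == 0 means calibrate on every interval. */
    uint32_t max_calibration_gap_ms(uint32_t rc_ctiv, uint32_t rc_temp_ctiv)
    {
        uint32_t intervals = (rc_temp_ctiv == 0u) ? 1u : rc_temp_ctiv;
        return calibration_interval_ms(rc_ctiv) * intervals;
    }
    ```

    With the values above, the calibration timer fires every 4 s, and a calibration runs at least every 8 s even if the temperature has not changed.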

    Here are the answers to your questions:

    1. Do you also call NVIC_SystemReset() after DFU is complete before booting the app like in the nRF5 SDK bootloader

    Ans: We have not modified any bootloader code other than the changes above to support the internal RC, plus one more change to the DFU advertising name. The rest of the bootloader code is the same as in the SDK, so it must be calling NVIC_SystemReset() after DFU is complete.

    2. do you disable the Softdevice and boot the application directly?

    Ans: Same answer as above: we have not modified the bootloader code, so the behaviour must be the same as the SDK bootloader. We are also not disabling the SoftDevice anywhere in our app code, not even before jumping to the bootloader in DFU mode. We did try disabling the SoftDevice before booting into bootloader mode, but it did not help and caused other issues, such as BLE sometimes going off after entering DFU mode, so I removed it from the app code. In short, the SoftDevice is never disabled in our code, and the logical flow of the bootloader is unchanged.

    I have a few questions about your comments, though. If the SoftDevice is getting stuck in a loop waiting for clock calibration, is there any timeout implemented for this? Why does it get stuck forever, and what could be the solution?
