Hang with nRF5 SDK 17.1.0 Bootloader and nRF Connect SDK 2.1.0 application

I have an existing product that is using the nRF5 SDK 17.1.0 Bootloader.  I have updated the application to use nRF Connect SDK 2.1.0.  The new application also has an implementation of the buttonless DFU service, so that the nRF Connect app on iOS can be used to do the updates.

Most of this is working.  The application runs fine when built to run without a boot loader.  When built to run with the boot loader (CONFIG_FLASH_LOAD_OFFSET=0x27000), there is an issue that I'm having trouble debugging.  This is what is happening:

  1. I erase the device and program in the nRF5 SDK 17.1.0 Bootloader.  After reset the boot loader starts up normally.
  2. I use the nRF Connect app on iOS to transfer over the nRF Connect SDK 2.1.0 application.  I get a success message from the nRF Connect app.
  3. The application does not start BLE advertising and the device appears to be hung.
  4. If I attach the VS Code debugger, it usually halts inside the boot loader startup sequence, where it is checking the CRC.
  5. Looking at the RESETREAS register I see the expected SREQ, but I also see LOCKUP.
  6. If I continue code execution then the application starts up and works fine.

The hang happens every time I run the above sequence.  It happens with or without a debug probe attached.  None of the nrfjprog resets (--reset, --debugreset, --pinreset) have any effect - the device remains hung.  If I attach the debugger before step 2 above then the halt doesn't happen.  The application starts up fine without any hang.

Anyone have any ideas on what might be happening?  Any tips on how to debug this type of issue where simply having the debugger attached avoids the issue?

  • Hello,

    I like the idea of keeping the old bootloader+SoftDevice and only updating the application. I had not thought of this as a possibility before. My approach has been to replace the SoftDevice with the Zephyr-based application and the nRF5 bootloader with MCUboot.

    Using nRF5 SDK bootloader to upgrade to FW based on the nRF connect SDK

    I actually performed this exercise last week using the same SDK versions as you (17.1.0 & 2.1.0), and it seemed to work fine. The memory layout ended up looking like this in the end:

    And here are the projects I used for this test if it is of interest:

    dfu_from_nrf5_SDK_to_NCS_demo.zip

    I tried to run the 'peripheral_lbs' sample with the nRF5 Bootloader and SoftDevice present to reproduce the problem you described. At first I encountered a fault exception (bus fault), but it turned out that this was because I had forgotten to relocate the storage partition (similar to the FDS area in the nRF5 SDK) from its default location @0xf8000 to an area that did not overlap with the bootloader. This fault happened when my device was in debug interface mode as well.

    Maybe you can try my project below and see if you get the same result?

    The only explanation I can think of as to why it only worked in debug mode in your case is that your debugger starts execution from 0x27000 instead of 0x0, but I was not able to replicate this scenario with the VS Code debugger here, so I'm not sure whether this is a likely explanation.

    2273.peripheral_lbs.zip

    ncs_app.hex

    The application does not start BLE advertising and the device appears to be hung.

    I would suggest running "nrfjprog --readregs" at this step to try to determine whether it hangs at a fixed location or the device is going into a boot loop (the LOCKUP bit could be an indication of the latter).

    Best regards,

    Vidar
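
The storage partition overlap described in the reply above can be sanity-checked on the host before flashing. This is only a sketch under assumptions: the bootloader start (0xF8000) is inferred from the addresses in this thread, the flash size (1 MB) and storage partition size (0x8000) are assumed, and the relocated address 0xF0000 is purely hypothetical. Substitute the values from your own memory layout.

```python
# Host-side check for flash region overlap. All sizes and the relocated
# address are assumptions for illustration, not values confirmed in the thread.

def overlaps(a_start, a_size, b_start, b_size):
    """Return True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size)."""
    return a_start < b_start + b_size and b_start < a_start + a_size

FLASH_END = 0x100000                              # assumption: 1 MB of flash
BOOTLOADER_START = 0xF8000                        # assumption: inferred from the thread
BOOTLOADER_SIZE = FLASH_END - BOOTLOADER_START    # treat everything up to flash end as reserved

STORAGE_SIZE = 0x8000                             # assumption
DEFAULT_STORAGE = 0xF8000                         # default location mentioned in the reply
RELOCATED_STORAGE = 0xF0000                       # hypothetical relocated address

for name, start in (("default", DEFAULT_STORAGE), ("relocated", RELOCATED_STORAGE)):
    clash = overlaps(start, STORAGE_SIZE, BOOTLOADER_START, BOOTLOADER_SIZE)
    status = "overlaps bootloader region" if clash else "no overlap"
    print(f"{name} storage @ {start:#x}: {status}")
```

The check uses half-open ranges, so a partition that ends exactly where the bootloader begins is not reported as a clash.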

  • Thanks for the readregs tip.  I ran that quite a few times and the output is always the same:

    % nrfjprog --readregs
    R0: 0x00050B3C
    R1: 0x0008C8D0
    R2: 0x00000020
    R3: 0x3442AE31
    R4: 0x00000000
    R5: 0xEDB88320
    R6: 0x10001000
    R7: 0x00000000
    R8: 0x00000000
    R9: 0x00000000
    R10: 0x20030000
    R11: 0x00000000
    R12: 0x00000000
    SP: 0x2003FFCC
    LR: 0x000FB423 (nrf_dfu_validation_boot_validate)
    PC: 0x000F8A3A (crc32_compute)
    xPSR: 0x21000000
    MSP: 0x2003FFCC
    PSP: 0x00000000

    So it does seem to be stuck there.

    This is also what I see if I attach the VS Code debugger.
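
When comparing several register dumps like the one above, it can help to extract the PC and see which flash region it falls in. The sketch below parses text in the format shown above; the helper names are hypothetical, the sample is a shortened copy of the dump, and the bootloader start address (0xF8000) is an assumption inferred from this thread rather than a confirmed value.

```python
import re

# Shortened copy of the dump posted above, used as sample input.
SAMPLE = """\
LR: 0x000FB423 (nrf_dfu_validation_boot_validate)
PC: 0x000F8A3A (crc32_compute)
MSP: 0x2003FFCC
"""

# Matches lines such as "PC: 0x000F8A3A (crc32_compute)".
REG_LINE = re.compile(r"^\s*(\w+):\s*(0x[0-9A-Fa-f]+)", re.MULTILINE)

APP_START = 0x27000         # CONFIG_FLASH_LOAD_OFFSET from the original post
BOOTLOADER_START = 0xF8000  # assumption: inferred from the thread

def parse_readregs(text):
    """Return a dict mapping register name to integer value."""
    return {name: int(value, 16) for name, value in REG_LINE.findall(text)}

def classify(address):
    """Name the flash region an address falls in (hypothetical helper)."""
    if address >= BOOTLOADER_START:
        return "bootloader"
    if address >= APP_START:
        return "application"
    return "below application (MBR/SoftDevice)"

regs = parse_readregs(SAMPLE)
print(f"PC = {regs['PC']:#x} -> {classify(regs['PC'])}")
```

Running this over dumps taken after separate resets shows whether the PC always lands in the same region, which is one way to tell a fixed hang from a reset loop.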

  • I forgot to add that nrfjprog --readregs will halt the CPU so execution won't continue after you have run this command. Does the PC value stay the same if you reset the board in-between each read?

    Edit: I don't see how the program could get stuck in the crc32_compute() function. Do you have logging over UART enabled in the NCS application?
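
To illustrate why a hang inside the CRC routine would be surprising: a bitwise CRC-32 is a loop bounded by the input length, with no waits on hardware. The value in R5 in the dump above, 0xEDB88320, is the reflected CRC-32 polynomial, which fits a routine of this kind. The following is a generic sketch of that algorithm, not the bootloader's actual source, and the function name is hypothetical.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # same constant as seen in R5 in the dump

def crc32_bitwise(data, crc=0):
    """Bitwise CRC-32 (reflected). Runs exactly 8 * len(data) inner iterations."""
    crc = ~crc & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; XOR in the polynomial when the bit shifted out was 1.
            crc = (crc >> 1) ^ (CRC32_POLY_REFLECTED if crc & 1 else 0)
    return ~crc & 0xFFFFFFFF

# Cross-check against the standard library implementation.
message = b"123456789"
print(hex(crc32_bitwise(message)), hex(zlib.crc32(message)))
```

Because the loop always terminates, a PC that is repeatedly sampled inside it is more consistent with the CPU being caught mid-computation after a reset than with the routine itself being stuck.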

  • When I reset the board in between runs, it is in the CRC check each time, but not in the exact same place.  Seems like similar behavior to attaching from VS Code.  Like it is halted until the debugger is attached and then it starts running.  If I read the RESETREAS register, it still shows the LOCKUP bit set.

  • The LOCKUP bit means that a hard fault has occurred. But RESETREAS can be "misleading" sometimes, as it is retained through soft resets. Can you try a power-on reset and see if it still gets set?

    I think your board must be going into a reset loop. The question is whether the fault exception occurs during the CRC computation or shortly after. It can be hard to capture with this approach.

    Vidar Berg said:
    Do you have logging over UART enabled in the NCS application?

    If enabled, please check if it prints anything.
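
Since RESETREAS accumulates reset causes until its fields are cleared (by writing 1 to them), decoding a raw value makes it easier to see which causes are present. The sketch below assumes the nRF52840 bit layout for POWER->RESETREAS at 0x40000400 and a raw value obtained separately, for example with nrfjprog --memrd; the function name is hypothetical.

```python
RESETREAS_ADDR = 0x40000400  # POWER->RESETREAS (nRF52840 layout assumed)

# Bit position -> field name, per the assumed nRF52840 register layout.
RESETREAS_BITS = {
    0: "RESETPIN",
    1: "DOG",
    2: "SREQ",
    3: "LOCKUP",
    16: "OFF",
    17: "LPCOMP",
    18: "DIF",
    19: "NFC",
    20: "VBUS",
}

def decode_resetreas(value):
    """Return the names of all reset causes set in a raw RESETREAS value."""
    return [name for bit, name in sorted(RESETREAS_BITS.items()) if value & (1 << bit)]

# SREQ and LOCKUP both set, as reported in the original post.
print(decode_resetreas(0x0000000C))
```

An empty list after a power-on reset corresponds to no recorded reset cause, so clearing the register and then reproducing the issue shows whether LOCKUP is set again or was left over from an earlier reset.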
