nrfjprog times out with error flashing Thingy:91 - Did worker process die?

I'm getting timeout errors when trying to program a Thingy:91 with a J-Link Pro PoE on macOS on an ARM Apple M3 Max.

Here's the command and error message:

❯ west flash -i 1196000003
-- west flash: rebuilding
[0/26] Performing build step for 'tfm'
ninja: no work to do.
[1/7] Performing build step for 'mcuboot_subimage'
ninja: no work to do.
[2/7] Performing install step for 'tfm'
-- Install configuration: "MinSizeRel"
[6/6] Completed 'mcuboot_subimage'
-- west flash: using runner nrfjprog
-- runners.nrfjprog: Flashing file: /Users/chris/cgnd/clients/golioth/thingy91-golioth-workspace/app/build/zephyr/merged.hex
Timed out waiting for progress updates.sing non-volatile memory - block 2 of 2                                         
Did worker process die?
[ #################### ]  14.809s | Erase file - Done erasing                                                          
ERROR: An internal error has occurred, please try again.
NOTE: For additional output, try running again with logging enabled (--log).
NOTE: Any generated log error messages will be displayed.
FATAL ERROR: command exited with status 63: nrfjprog --program /Users/chris/cgnd/clients/golioth/thingy91-golioth-workspace/app/build/zephyr/merged.hex --sectorerase --verify -f NRF91 --snr 1196000003

I get the same error when running nrfjprog manually (I've attached the log file):

❯ nrfjprog --program /Users/chris/cgnd/clients/golioth/thingy91-golioth-workspace/app/build/zephyr/merged.hex --sectorerase --verify -f NRF91 --snr 1196000003 --log
Timed out waiting for progress updates.sing non-volatile memory - block 2 of 2                                         
Did worker process die?
[ #################### ]  16.435s | Erase file - Done erasing                                                          
ERROR: An internal error has occurred, please try again.

However, running nrfjprog --recover appears to fix whatever is causing the errors:

❯ nrfjprog --recover
Recovering device. This operation might take 30s.
Erasing user code and UICR flash areas.

❯ nrfjprog --log --program /Users/chris/cgnd/clients/golioth/thingy91-golioth-workspace/app/build/zephyr/merged.hex --sectorerase --verify -f NRF91 --snr 1196000003
[ #################### ]  13.059s | Erase file - Done erasing                                                          
[ #################### ]   2.836s | Program file - Done programming                                                    
[ #################### ]   2.820s | Verify file - Done verifying

Is there a way to get a better understanding of what's going on here? Is there something that my application code is doing that is preventing the J-Link from being able to flash the device reliably? What is nrfjprog --recover doing to "fix" this issue?

3173.log.log

Parents
  • BTW, I only get these timeout errors when trying to flash a board that is running an existing firmware image. Flashing a board with empty flash works.

    Also, if I first open the nRF Connect for Desktop Programmer GUI software and do "Erase all", and THEN run west flash, it works every time:

    ❯ west flash -i 1196000003
    -- west flash: rebuilding
    [0/26] Performing build step for 'tfm'
    ninja: no work to do.
    [1/7] Performing build step for 'mcuboot_subimage'
    ninja: no work to do.
    [2/7] Performing install step for 'tfm'
    -- Install configuration: "MinSizeRel"
    [6/6] Completed 'mcuboot_subimage'
    -- west flash: using runner nrfjprog
    -- runners.nrfjprog: Flashing file: /Users/chris/cgnd/clients/golioth/thingy91-golioth-workspace/app/build/zephyr/merged.hex
    [ #################### ]  13.381s | Erase file - Done erasing                                                          
    [ #################### ]   2.829s | Program file - Done programming                                                    
    [ #################### ]   2.810s | Verify file - Done verifying                                                       
    Applying system reset.
    Run.
    -- runners.nrfjprog: Board with serial number 1196000003 flashed successfully.

  • Hello,

    Are you using APPROTECT on this board?

    Best regards,

    Michal

  • I think in this case recover does chip erase instead of sector erase, which is indeed faster.

    If you hover over the "Flash" button in the extension, there should be a small button that does "Erase and Flash" to the right of the main "Flash" button. Does it behave the same by chance as your hack?

    Best regards,

    Michal

  • I think that's something else again. The "recover" seems to be what happens when you have a flash failure and you're prompted with a message about the flash being "protected" - you're given the option to "recover". I often wish I could just go straight to that, instead of hoping it fails and I get the prompt. Now I can.

  • I understand, thank you for the information.

    Best regards,

    Michal

  • I'm seeing the same "Did worker process die?" issue constantly.  Sometimes a series of recovers allows it to complete.  Flashing does take a long time.  My colleagues and I are all using Mac silicon and have large applications.  Note that we are using an nRF5340.  This seemed to start happening a few weeks ago, possibly when the application reached a certain size.  Looks like this was reported to Nordic months ago.  Has Nordic reproduced the issue in house? Is there a fix or workaround available?

  • Hello,

    Thank you for reporting it. Could you make a new ticket about it with logs and refer this one?

    There has been some much back and forth under this one with several different issues that it will be much easier for us to continue under a new ticket.

    Best regards,

    Michal

Reply Children
Related