Nordic not reflashable suddenly

Nordic,

We've been using the nRF52832 for many years now.  We have a custom board that uses SWCLK and SWDIO to flash the nRF52.  We're getting set for production, and I am working on our automated manufacturing rig.  We have a Python script that flashes the Nordic with a SoftDevice, a bootloader, and then the application.  Some of the GPIOs go to LEDs, so I can see what state the chip is in after programming.  It had been working great, then I started getting this problem where the LEDs didn't show up in the right state.  I'd go to reflash the device, and SWD wouldn't connect, and I could not erase the chip.  I can hit our reset button, which is tied to the RESET pin on the IC, and nothing happens.  Normally this button does a hard reset, where the LED goes off and (I believe) the chip loses power.  On release of the reset, the board should initialize into the bootloader, and the LED blinks red to mark that.  But on these 'bricked' boards, the LED stays on throughout the reset button press.  These two things, not being able to reset the chip and not being able to reflash it, are quite alarming.

One piece of good news is that the current drawn seems to be about 1 mA, as it is when the Nordic is in a low-power state.

I've spent most all day going through devzone forums trying things, hoping to recover or at least explain my freeze-up.  

When I go back into J-Flash Lite, I can no longer ERASE the chip.  Now it says:

Connecting to J-Link...
Connecting to target...
ERROR: Could not connect to target.
Done.

Fine, I can try from the command utility.  I use the jlink.exe tool, and go through the regular steps to connect and then try to erase.  This also doesn't work:

J-Link>connect
Device "NRF52832_XXAA" selected.


Connecting to target via JTAG
InitTarget() start
InitTarget() end
TotalIRLen = ?, IRPrint = 0x..000000000000000000000000
InitTarget() start
InitTarget() end
TotalIRLen = ?, IRPrint = 0x..000000000000000000000000
InitTarget() start
InitTarget() end
TotalIRLen = ?, IRPrint = 0x..000000000000000000000000
InitTarget() start
InitTarget() end
TotalIRLen = ?, IRPrint = 0x..000000000000000000000000
Cannot connect to target.

Ok, now I try to use the nrfjprog tool.  For reference I look up the version as 

nrfjprog version: 10.15.1 external
JLinkARM.dll version: 7.58b

Then I do all kinds of commands to try and talk to my board.  Here are some

>nrfjprog --recover --log
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: [ nRFXX] - Device does not have an ARM debug port.
ERROR: [SeggerBackend] - JLinkARM.dll reported "-1", "An unknown error.".
ERROR: nrfjprog could not identify the target device. This may be due to an
ERROR: invalid family argument, a problem with your device, or nrfjprog may
ERROR: not yet support your device.
ERROR: Please check the family argument passed, or upgrade nrfjprog to a more
ERROR: recent version.

>nrfjprog -e
ERROR: nrfjprog could not identify the target device. This may be due to an
ERROR: invalid family argument, a problem with your device, or nrfjprog may
ERROR: not yet support your device.
ERROR: Please check the family argument passed, or upgrade nrfjprog to a more
ERROR: recent version.
NOTE: For additional output, try running again with logging enabled (--log).
NOTE: Any generated log error messages will be displayed.

>nrfjprog -d --family nrf52 --log
ERROR: Unable to connect to a debugger.
ERROR: [ nRF52] - Debug probe is not connected to an NRF52 series device.
ERROR: The --family option given with the command (or the default from
ERROR: nrfjprog.ini) does not match the device connected.

I know the Segger, the cables, and everything else are correct; I HAD been able to program this and other boards.  But during development of the Python script, I've 'bricked' 5 of my boards.  I only have 2 left.  So I need to get to the bottom of this.

I don't understand how something could ever make the SWCLK SWDIO not able to erase the chip.  What am I missing, please help.

environment:  windows 10

  • Jonathan,

    Here are some more updates.  We probed the reset, because that is such a unique failure detail.  The P0.21/Reset pin is grounded, always.  This is wrong: on a good board, this pin sits at 3V.  On the good board, when the reset hw button is pressed, it grounds the IO.  This got us thinking maybe we should put 3V on the P0.21 pin.  When we do that from an external supply, the nordic IC can be fully erased!  Which confirms the chip is somewhat alive. We even reflashed the soft device, bootloader and program onto the chip.  We can watch the serial output of the initialization. 

    But then the problem is the board seems to do a self reset every couple of seconds.  This is despite us keeping 3V applied to the P0.21 pin.  

    We get the feeling there is a pull-down inside the Nordic P0.21.  

    I saw this forum post that has this same sort of discussion, where there is a reset internal problem.  https://devzone.nordicsemi.com/f/nordic-q-a/8342/nrf52832-has-the-reset-pin-an-internal-pull-down

    Now I'm just as confused.  How could we have possibly changed the reset pin setup? Furthermore, how could a complete erase and reflashing not address whatever bad setup state we got into? 

    Hmmmm 
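
One way to narrow down a periodic self-reset like this is to print the reset reason at startup. The sketch below is not from the thread: `describe_resetreas` is a hypothetical helper, and the bit positions are the ones I understand the nRF52832 product specification to list for the POWER peripheral's RESETREAS register, so check them against your copy. It runs on a PC against a raw register value; a value of 0 is taken to mean a power-on or brownout reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RESETREAS bit positions (assumed from the nRF52832 product specification). */
static const struct {
    uint32_t    mask;
    const char *name;
} reset_sources[] = {
    { UINT32_C(1) << 0,  "RESETPIN" }, /* pin reset */
    { UINT32_C(1) << 1,  "DOG"      }, /* watchdog */
    { UINT32_C(1) << 2,  "SREQ"     }, /* soft reset request */
    { UINT32_C(1) << 3,  "LOCKUP"   }, /* CPU lock-up */
    { UINT32_C(1) << 16, "OFF"      }, /* wake from System OFF via GPIO */
    { UINT32_C(1) << 17, "LPCOMP"   },
    { UINT32_C(1) << 18, "DIF"      }, /* debug interface */
    { UINT32_C(1) << 19, "NFC"      },
};

/* Hypothetical helper: writes the names of all flagged sources into out,
   separated by spaces, and returns how many were flagged. */
size_t describe_resetreas(uint32_t resetreas, char *out, size_t out_len)
{
    size_t count = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof reset_sources / sizeof reset_sources[0]; i++) {
        if ((resetreas & reset_sources[i].mask) == 0) {
            continue;
        }
        if (count > 0) {
            strncat(out, " ", out_len - strlen(out) - 1);
        }
        strncat(out, reset_sources[i].name, out_len - strlen(out) - 1);
        count++;
    }
    if (count == 0) {
        /* No flag set: reset came from the on-chip reset generator. */
        strncat(out, "POR/BOR", out_len - strlen(out) - 1);
    }
    return count;
}
```

On the device the word would come from NRF_POWER->RESETREAS (or from sd_power_reset_reason_get() once the SoftDevice is enabled), and the register is cumulative until cleared, so clear it after printing. If RESETPIN shows up on every restart, the pin is still being seen low despite the external 3V.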

  • More updates.  
    The self-resetting has gone away - I'm not sure why.  On three boards we get seemingly full functionality of the board as long as the external power supply gives 3V to the P0.21 reset pin.  But whenever we turn off that supply, the chip is bricked.  

    We still haven't explained how this happens, nor how to fix it.  What could we be doing to disrupt the pull-up inside the reset pin?

  • What have you set these two registers to:

      __IOM uint32_t  PSELRESET[2]; /*!< (@ 0x00000200) Description collection[0]                                  */
      __IOM uint32_t  APPROTECT;    /*!< (@ 0x00000208) Access Port protection                                     */

    In system_nrf52.c the reset pin is mapped the first time the firmware runs, without enabling the optional pull-up, which is always available to the pin when it is not in reset:

        /* Configure GPIO pads as Pin Reset pin if Pin Reset capabilities desired. If CONFIG_GPIO_AS_PINRESET is not
          defined, pin reset will not be available. One GPIO (see Product Specification to see which one) will then be
          reserved for PinReset and not available as normal GPIO. */
        #if defined (CONFIG_GPIO_AS_PINRESET)
            if (((NRF_UICR->PSELRESET[0] & UICR_PSELRESET_CONNECT_Msk) != (UICR_PSELRESET_CONNECT_Connected << UICR_PSELRESET_CONNECT_Pos)) ||
                ((NRF_UICR->PSELRESET[1] & UICR_PSELRESET_CONNECT_Msk) != (UICR_PSELRESET_CONNECT_Connected << UICR_PSELRESET_CONNECT_Pos))){
                NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Wen << NVMC_CONFIG_WEN_Pos; // Write Enable
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_UICR->PSELRESET[0] = 21;
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_UICR->PSELRESET[1] = 21;
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Ren << NVMC_CONFIG_WEN_Pos; // Read-only Enable
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                // UICR changes require a reset to be effective
                NVIC_SystemReset();
            }
        #endif
    

    Maybe as a test, enable the pin pull-up before the code that runs on first power-up, where the pin gets mapped to the reset function and the chip resets itself; at least the pin will then be high at the moment of reset. Separately, on build codes Gxx and later, access port protection has to be disabled on the first power-up following an erase all: UICR.APPROTECT = 0x5A, and DISABLE (0x558) = 0x5A.

  • We just confirmed with nrfjprog --memrd 0x10001200 that both PSELRESET registers are set to 0x15, which is 21.  That is what we want, we think.  We do have CONFIG_GPIO_AS_PINRESET in the makefile, which again is what we want.  As one more check, we print those two registers at startup, and we see them as 0x15,0x15.

    But still when we remove the 3V from the pin, the board bricks.

    How do we do your test?  When and what would we write to force this pin to be a pull-up? 

    Also, how do we know what build code our setup is? 

    Thanks for your help!
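
For reading those --memrd values, here is a minimal sketch of how a PSELRESET word breaks down. It assumes the nRF52832 layout as I understand it (pin number in bits 5..0, CONNECT in bit 31 with 0 meaning connected) and that both words must hold the same value for pin reset to be active; `decode_pselreset` and `pin_reset_enabled` are hypothetical names, and the code runs on a PC against raw values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PSELRESET_PIN_MASK    UINT32_C(0x3F)       /* bits 5..0: pin number */
#define PSELRESET_CONNECT_BIT (UINT32_C(1) << 31)  /* 0 = connected, 1 = disconnected */

typedef struct {
    bool     connected;
    unsigned pin;
} pselreset_t;

/* Split one raw UICR PSELRESET word into its two fields. */
pselreset_t decode_pselreset(uint32_t word)
{
    pselreset_t r;
    r.connected = (word & PSELRESET_CONNECT_BIT) == 0;
    r.pin       = (unsigned)(word & PSELRESET_PIN_MASK);
    return r;
}

/* Pin reset is treated as enabled only when both words are connected
   and name the same pin (assumption stated in the lead-in). */
bool pin_reset_enabled(uint32_t word0, uint32_t word1, unsigned *pin_out)
{
    pselreset_t a = decode_pselreset(word0);
    pselreset_t b = decode_pselreset(word1);

    assert(pin_out != NULL);
    if (!a.connected || !b.connected || a.pin != b.pin) {
        return false;
    }
    *pin_out = a.pin;
    return true;
}
```

With the values reported above, 0x00000015 in both words decodes to pin 21, connected. An erased UICR reads 0xFFFFFFFF, which decodes as disconnected.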

  • Abbreviation   Definition and implemented codes
    N52/nRF52      nRF52 Series product
    832            Part code
    <PP>           Package variant code
    <VV>           Function variant code
    <H><P><F>      Build code
                   H - Hardware version code
                   P - Production configuration code (production site, etc.)

    The build code is in the part number. (Old notes:) I'm not sure whether it is readable from the MCU; I have 'F':

        ARCHITECTURE bits[19:16] Reads as 0xF, see About the CPUID scheme on page B4-644

    Always run the following code on power-up, but outside the #if block that configures the reset pin; worth a try:

        // Pull field of PIN_CNF (bits 3..2)
        #define PULLNONE   (0UL <<  2) // Bit 3..2: Pin Pull Disabled
        #define PULLDOWN   (1UL <<  2) // Bit 3..2: Pin Pulldown
        #define PULL_NA    (2UL <<  2) // Bit 3..2: Pin Pull n/a
        #define PULLUP     (3UL <<  2) // Bit 3..2: Pin Pullup

        // If the code gets here, enable the Reset pin pull-up to hold the pin inactive for a brief time on reset
        NRF_P0->PIN_CNF[21] = (PULLUP);
        // Software Disable APPROTECT - only use during development
        // AAAA 0x41414141 AAAA
        // AAAC 0x41414143 AAAC
        // AABA 0x41414241 AABA
        // AABB 0x41414242 AABB
        // AAB0 0x41414230 AAB0
        // ABB0 0x41424230 ABB0
        // AAE0 0x41414530 AAE0
        // ABE0 0x41424530 ABE0
        // AAGB 0x41414742 AAGB <== HwDisable Access Port protection
        // ABGB 0x41424742 ABGB <== HwDisable Access Port protection
        // AAG0 0x41414730 AAG0 <== HwDisable Access Port protection
        // ABG0 0x41424730 ABG0 <== HwDisable Access Port protection
        if ((NRF_UICR->APPROTECT & 0xFF) == 0xFF) // && (NRF_FICR->INFO.VARIANT == 0xnnGn))
        {
            // UICR lives in flash, so the NVMC must be write-enabled around the store
            NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Wen << NVMC_CONFIG_WEN_Pos; // Write Enable
            while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
            NRF_UICR->APPROTECT = 0x5A;   // HwDisable Access Port protection  - write once only
            while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
            NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Ren << NVMC_CONFIG_WEN_Pos; // Read-only Enable
            while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
        }
     
        #if defined (CONFIG_GPIO_AS_PINRESET)
            if (((NRF_UICR->PSELRESET[0] & UICR_PSELRESET_CONNECT_Msk) != (UICR_PSELRESET_CONNECT_Connected << UICR_PSELRESET_CONNECT_Pos)) ||
                ((NRF_UICR->PSELRESET[1] & UICR_PSELRESET_CONNECT_Msk) != (UICR_PSELRESET_CONNECT_Connected << UICR_PSELRESET_CONNECT_Pos))){
                NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Wen << NVMC_CONFIG_WEN_Pos; // Write Enable
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_UICR->PSELRESET[0] = 21;
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_UICR->PSELRESET[1] = 21;
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                NRF_NVMC->CONFIG = NVMC_CONFIG_WEN_Ren << NVMC_CONFIG_WEN_Pos; // Read-only Enable
                while (NRF_NVMC->READY == NVMC_READY_READY_Busy){}
                // UICR changes require a reset to be effective
                NVIC_SystemReset();
            }
        #endif
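
To make the PIN_CNF write above less opaque, here is a small sketch of how the word is assembled. It covers only the DIR, INPUT and PULL fields, with bit positions assumed from the nRF52832 GPIO chapter (DIR bit 0, INPUT bit 1 where 1 means the input buffer is disconnected, PULL bits 3..2); `pin_cnf_encode` and `pin_cnf_pull` are hypothetical helpers and the code runs on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    PULL_DISABLED = 0,
    PULL_DOWN     = 1,
    PULL_UP       = 3   /* value 2 is not a defined pull setting */
} pin_pull_t;

/* Build a PIN_CNF word from direction, input-buffer state and pull setting.
   Drive strength and sense are left at 0 (standard drive, sense disabled). */
uint32_t pin_cnf_encode(bool output, bool input_connected, pin_pull_t pull)
{
    uint32_t word = 0;

    assert(pull == PULL_DISABLED || pull == PULL_DOWN || pull == PULL_UP);
    if (output) {
        word |= UINT32_C(1) << 0;        /* DIR = output */
    }
    if (!input_connected) {
        word |= UINT32_C(1) << 1;        /* INPUT = disconnected */
    }
    word |= (uint32_t)pull << 2;         /* PULL field */
    return word;
}

/* Extract the pull setting back out of a PIN_CNF word. */
pin_pull_t pin_cnf_pull(uint32_t word)
{
    return (pin_pull_t)((word >> 2) & 0x3u);
}
```

The value written in the reply, PULLUP alone, equals pin_cnf_encode(false, true, PULL_UP): an input with its buffer connected and the pull-up enabled. Reading PIN_CNF[21] back and passing it through pin_cnf_pull would show whether something else has since reconfigured the pull.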

  • Ah, found an update:

    Table 137: Function variant codes
     <VV>    Flash (kB)   RAM (kB)   Access port protection
     AA      512          64         Controlled by hardware
     AB      256          32         Controlled by hardware
     AA-G    512          64         Controlled by hardware and software
     AB-G    256          32         Controlled by hardware and software

    Maybe you now have the new 'AAG0' device where before you used the AAE0; the AAG0 has the different mechanism I describe above.
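
As a rough sketch of that distinction, the check below takes a four-character variant string such as "AAE0" or "AAG0" and reports whether it falls in the "Gxx and later" group mentioned earlier in the thread. Treating every hardware version letter from 'G' upward as hardware-and-software controlled is an assumption drawn from that remark, not from the table, and `approtect_is_hw_and_sw` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* variant: two-letter function variant followed by the two build-code
   characters read from the chip, e.g. "AAB0" or "AAG0".
   Returns true when the hardware version letter (third character) is 'G'
   or later, which this sketch assumes means access port protection is
   controlled by both hardware and software. */
bool approtect_is_hw_and_sw(const char *variant)
{
    assert(variant != NULL);
    if (strlen(variant) != 4) {
        return false;   /* not a four-character variant code */
    }
    return variant[2] >= 'G' && variant[2] <= 'Z';
}
```

A part that returns false here would use the older, hardware-only mechanism, so the APPROTECT write shown above would not be the explanation for a locked-out debugger.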

  • I'm at my house now, so I can't try any new code.  But I did look them up and found our parts are all NRF52832-QFAA-R7.  So it should all be hardware, right?

    This still doesn't address how this happened and how to confirm it won't happen again.  

    Thanks for your help!

  • Can't tell from that code; might have an AAGB or AAG0 - try:

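       #include <ctype.h>   // isprint
       #include <stdio.h>   // snprintf

       // Decode FICR INFO.VARIANT into four ASCII characters, most significant byte first.
       // mBuffer is a char buffer declared elsewhere.
       char ch, VariantStr[4+1] = "----";
       if (isprint(ch = ((NRF_FICR->INFO.VARIANT >>  0u) & 0xFF))) VariantStr[3] = ch;
       if (isprint(ch = ((NRF_FICR->INFO.VARIANT >>  8u) & 0xFF))) VariantStr[2] = ch;
       if (isprint(ch = ((NRF_FICR->INFO.VARIANT >> 16u) & 0xFF))) VariantStr[1] = ch;
       if (isprint(ch = ((NRF_FICR->INFO.VARIANT >> 24u) & 0xFF))) VariantStr[0] = ch;
       snprintf(mBuffer, sizeof(mBuffer), "Var %lX (%s)", (unsigned long)NRF_FICR->INFO.VARIANT, VariantStr);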
    

    Well, I have AAE1 which isn't even listed! It's in a module .. hmm see 0x41414531-not-listed-in-datasheet
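
The same decoding can be tried off-target on the raw word, for example on a value obtained with nrfjprog --memrd. This is a sketch of the byte-to-character mapping only; `variant_to_string` is a hypothetical helper, and non-printable bytes are shown as '-' as in the snippet above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a raw INFO.VARIANT word into a 4-character string.
   The most significant byte becomes the first character. */
void variant_to_string(uint32_t variant, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        unsigned char ch = (unsigned char)((variant >> (24 - 8 * i)) & 0xFFu);
        out[i] = isprint(ch) ? (char)ch : '-';
    }
    out[4] = '\0';
}
```

For instance, 0x41414230 maps to "AAB0" and 0x41414531 to "AAE1", matching the values quoted in this thread.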

  • It would be good to confirm that this is a HW issue.

    If you are using the --recover command to reset the device, then the RESET pin functionality should not be enabled. In this state the device should not need to have pin P0.21 pulled high.
    (To test, you could flash an empty program or a simple blinky in which reset is not enabled.)


    Could it be that there are some HW differences between the boards you have and the others that work? Has there been a HW revision or update?

    Can the issue be with the reset switch? Is it mounted correctly? Maybe try to remove the switch on one of the “defective” boards and see if that fixes the issue.

    Regards,
    Jonathan

  • I tried the INFO.VARIANT suggestion and we have AAB0 0x41414230 AAB0, which seems fine and is not a Gx style.

  • That simplifies things; it looks like the reset simply isn't functional. If it were, the LED would go out as you surmised, even if a broken reset switch was stuck low.

    for these 'bricked' boards, the LED stays on throughout the reset button press

    If I were testing this I would now try the recover steps you used with a VCC of 2.4 volts and again at 3.4 volts.

  • Jonathan,

    We are not using the --recover command to reset the board.  I did try that and it erased user code and UICR flash.  No improvement.

    I do not think it is a hardware issue. It might be, but this has happened to boards from this round and the previous build cycle.  

    I also tried removing the reset button, with no success.  You can see from that part of the circuit that it shouldn't help: the only thing that pulls the pin down is the switch when it is closed.

    The confusing thing is, even after all erasing methods, the chip still needs the 3V on the reset pin to be connectable.  As in, we give it 3V in order to connect and do --recover, or -e, or 'erase chip' in J-Flash Lite; all of these will fail unless we have the 3V external power applied to P0.21.  Then after we do the erase, if the reset pin setting were truly removed from memory, I'd think I could turn off the external supply.  Wrong, the chip goes to brick mode if that 3V is removed.  The board still can't function without the external 3V.

    I have confirmed the reset pin UICR is unset, by running:

    nrfjprog --memrd 0x10001204 --family nrf52
    0x10001204: FFFFFFFF                              |....|
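
Note that 0x10001204 is only the second of the two PSELRESET words. A quick sketch of the address arithmetic, assuming the UICR base of 0x10001000 and the register offsets quoted earlier in the thread (0x200 for PSELRESET[0], 0x208 for APPROTECT); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UICR_BASE              UINT32_C(0x10001000)  /* assumed nRF52832 UICR base */
#define UICR_PSELRESET_OFFSET  UINT32_C(0x200)       /* PSELRESET[0] */
#define UICR_APPROTECT_OFFSET  UINT32_C(0x208)

/* Absolute address of PSELRESET[index], index 0 or 1 (32-bit words). */
uint32_t uicr_pselreset_addr(unsigned index)
{
    assert(index < 2);
    return UICR_BASE + UICR_PSELRESET_OFFSET + UINT32_C(4) * index;
}

/* Absolute address of UICR.APPROTECT. */
uint32_t uicr_approtect_addr(void)
{
    return UICR_BASE + UICR_APPROTECT_OFFSET;
}
```

Reading 12 bytes starting at 0x10001200 (nrfjprog --memrd takes a byte count through --n, as far as I know) would show PSELRESET[0], PSELRESET[1] and APPROTECT together in a single readout.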

    It sure feels like there's something broken inside the chip.  

    My main concern is figuring out HOW I broke the chip and ensuring it never happens again.  I'm ok if we lose 5 chips, but I can't go to production with an unknown bricking bug.
