Failure in ble_stack_init during startup

We are trying to bring up a new run of an existing PCB design. No changes in the area of the processor, but a different assembly house. The first call inside ble_stack_init, a call to nrf_sdh_enable_request(), fails and returns an error code of 8. No reference anywhere gives a usable explanation.

Strangely, if started by "copying" the hex file onto the Jlink "drive" (the nRF52832 DevKit), it succeeds.

What should we be looking for?

  • Hello,

    It seems like the issue might be with how the firmware is programmed since it appears to work when you use drag&drop programming. How are you programming the device when it fails? Is the same error also returned after a reset? Note that the "SoftDevice enable" function may return NRF_ERROR_INVALID_STATE (8) if the debugger forces execution to start from the application start address instead of address 0x0. This prevents the softdevice's reset handler from running on startup.

    Best regards,

    Vidar

  • I was starting it from SES (Segger) just as I have on dozens of previous nRF52832 projects, and indeed successfully on previous runs of this board. So there's something different in the hardware, and I'm asking where to look. What sort of "invalid state" is likely?

  • Probing the idea of the LF clock further. On a board that starts up fine from F5 or power up, I sample LFCLKSTAT early in the startup - all zeroes. I attempt to start it with

    NRF_CLOCK -> TASKS_LFCLKSTOP;
    NRF_CLOCK -> LFCLKSRC = 1;
    NRF_CLOCK -> TASKS_LFCLKSTART;

    And then test LFCLKSTAT up to 1 million times, always zero. On the good board, proceeding thru the BLE initialization and then testing LFCLKSTAT yields the expected 00010001. So I seem to have added another mystery: Why can't I start the clock? And the fact that it hasn't started prior to calling ble_stack_init() doesn't seem to be a problem. But if ble_stack_init() can't get it started, that could be the source of the hangups / resets on the new rev board.

    Am I at all close here?
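
    For reading that value: going by the nRF52832 CLOCK register layout, bits 1..0 of LFCLKSTAT report the source (1 = crystal) and bit 16 reports whether the clock is running. So 00010001 would mean "running from the crystal" and all zeroes would mean "not running". Below is a minimal host-side sketch of that decoding. The helper, type, and macro names are made up for illustration and are not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit positions are assumed from the nRF52832 CLOCK.LFCLKSTAT layout. */
#define LFCLKSTAT_SRC_MASK   0x3u        /* bits 1..0: 0 = RC, 1 = Xtal, 2 = Synth */
#define LFCLKSTAT_STATE_BIT  (1u << 16)  /* bit 16: 1 = LFCLK running */

typedef struct {
    unsigned src;     /* clock source field */
    bool     running; /* true when the STATE bit is set */
} lfclk_status_t;

/* Split a raw LFCLKSTAT value into its two fields. */
static lfclk_status_t decode_lfclkstat(uint32_t raw)
{
    lfclk_status_t s;
    s.src     = (unsigned)(raw & LFCLKSTAT_SRC_MASK);
    s.running = (raw & LFCLKSTAT_STATE_BIT) != 0u;
    return s;
}
```

    On target, the argument would be NRF_CLOCK->LFCLKSTAT read before the SoftDevice is enabled.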

  • F5 has always been assigned to "Debug->Go" on my setups (linux, macOS, and Windows with SES v5.x), hence my assumption that you were starting a debug session. "Debug->Go" is equivalent to "Build -> Build&Debug". Either way, thanks for clearing this up.

    I would say the expected failure mode for an LF clock issue is either that the program hangs forever inside sd_softdevice_enable() because the oscillator fails to start, or BLE connectivity issues due to frequency drift. What it cannot do is cause the invalid state error you are seeing.

    I'm still convinced that the issue is that the softdevice's reset handler was skipped during boot. Again, this is expected and a known limitation when using build&run. I do not know why you have not experienced this before with other boards, but the problem can easily get masked as I mentioned earlier. If you want to test this hypothesis, you can temporarily add the code below to your project.

    #define RESET_HANDLER_ADDR (*(volatile uint32_t *) 0x4)
    #define SOFTDEVICE_RESET_HANDLER_EXECUTED 0xAA
    
    typedef void (*reset_handler_t)(void);
    
    static void run_reset_handler_once(void)
    {
        if (NRF_POWER->GPREGRET == SOFTDEVICE_RESET_HANDLER_EXECUTED)
        {
            /* Already ran on a previous reset, so skip */
            NRF_POWER->GPREGRET = 0xFF;
            return;
        }
    
        /* Mark it before jumping, in case reset handler doesn't return */
        NRF_POWER->GPREGRET = SOFTDEVICE_RESET_HANDLER_EXECUTED;
    
        ((reset_handler_t)(RESET_HANDLER_ADDR))();
    }
    
    /**@brief Function for application main entry.
     */
    int main(void)
    {
        /* For debugging purposes - manually invoke the MBR->Softdevice's reset handler */
        run_reset_handler_once();
        

  • I'm not sure that would cause the continuous resets I'm seeing, but that's just the code I needed to test for a proper reset. Will run and get back to you shortly.

  • Confirming again, even with your added "run once" reset code: If started from a drag&drop, it runs exactly as expected. If started from F5 (default is Build&Run in SES, which is how I left it), or from a power cycle, it gets to the call to sd_softdevice_enable() in the CRITICAL_REGION in nrf_sdh_enable_request() and resets somewhere before it returns from that call. I know this can't be, but somewhere in this house of cards hides a joker....

  • I have tried to start up the LF clock with the following routines, before any calls into the softdevice. It does not start up on either board, but after a correct startup (any method on the old rev, or via drag&drop on the new boards), it always shows running. I speculate that there's some problem in the softdevice starting the clock, but how to find it?

    void CheckLfClock (void)
    {
        char Str [80];
        sprintf (Str, "LFCLKSTAT %08X", NRF_CLOCK -> LFCLKSTAT);
        LogToJournal (Str);
    }

    void StartLfClock (void)
    {
        LogToJournal ("Starting LF Clock");
        NRF_CLOCK -> TASKS_LFCLKSTOP;
        NRF_CLOCK -> LFCLKSRC = 1;
        NRF_CLOCK -> TASKS_LFCLKSTART;
    }

    void AwaitLfClock (void)
    {
        int i;
        char Str [80];
        for (i = 1000000; i; --i) if (NRF_CLOCK -> LFCLKSTAT) break;
        sprintf (Str, "%d tries remaining", i);
        LogToJournal (Str);
    }

    I also tried clearing SOFTDEVICE_RESET_HANDLER_EXECUTED after recording its value and before the call to the softdevice, thinking I could at least trap any reset occurring there. No joy.
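
    One thing worth checking in the routines above: a statement such as NRF_CLOCK -> TASKS_LFCLKSTART; only reads the register. On nRF52 peripherals a task is triggered by writing 1 to its task register, so a start attempt written this way would be expected to leave LFCLKSTAT at zero on any board. Whether this has any bearing on the resets is a separate question. Below is a minimal sketch of the difference, using a hypothetical stand-in struct (mock_clock_t) so that it compiles on a host. The function names are also made up, and on target the real NRF_CLOCK from the device header would be used instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few CLOCK registers touched here.
 * This is NOT the real register map; it only lets the sketch build on a host. */
typedef struct {
    volatile uint32_t TASKS_LFCLKSTART;
    volatile uint32_t TASKS_LFCLKSTOP;
    volatile uint32_t EVENTS_LFCLKSTARTED;
    volatile uint32_t LFCLKSRC;
} mock_clock_t;

/* Mirrors the form used in the post: the task register is read and the value discarded. */
static void lfclk_start_read_only(mock_clock_t *clk)
{
    clk->LFCLKSRC = 1;
    (void)clk->TASKS_LFCLKSTART;   /* bare read: nothing is written, no task is triggered */
}

/* Writes 1 to the task register, which is how a task is triggered on the nRF52. */
static void lfclk_start_with_write(mock_clock_t *clk)
{
    clk->EVENTS_LFCLKSTARTED = 0;  /* clear any stale "started" event first */
    clk->LFCLKSRC = 1;             /* 1 = external 32.768 kHz crystal */
    clk->TASKS_LFCLKSTART = 1;     /* the write is what starts the oscillator */
}
```

    On hardware, the write form would typically be followed by polling EVENTS_LFCLKSTARTED or the state bit of LFCLKSTAT, and all of this has to happen before the SoftDevice is enabled, since it takes ownership of the clock afterwards.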
