Failure in ble_stack_init during startup

We are trying to bring up a new run of an existing PCB design. No changes in the area of the processor, but a different assembly house. The first call inside ble_stack_init, a call to nrf_sdh_enable_request(), fails and returns an error code of 8. No reference anywhere gives a usable explanation.

Strangely, if started by "copying" the hex file onto the Jlink "drive" (the nRF52832 DevKit), it succeeds.

What should we be looking for?

  • Hello,

    It seems like the issue might be with how the firmware is programmed since it appears to work when you use drag&drop programming. How are you programming the device when it fails? Is the same error also returned after a reset? Note that the "SoftDevice enable" function may return NRF_ERROR_INVALID_STATE (8) if the debugger forces execution to start from the application start address instead of address 0x0. This prevents the softdevice's reset handler from running on startup.

    Best regards,

    Vidar
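
For reference, the numeric code can be mapped back to its name with a small lookup. This is only a sketch: the table is transcribed from the global error codes in nrf_error.h (base 0) and should be checked against the installed SDK version, and nrf_error_name() is a hypothetical helper, not an SDK function.

```c
#include <stdint.h>
#include <stddef.h>

/* Global nRF error code names, indexed by value (assumed from nrf_error.h). */
static const char *const nrf_error_names[] = {
    "NRF_SUCCESS",                      /* 0  */
    "NRF_ERROR_SVC_HANDLER_MISSING",    /* 1  */
    "NRF_ERROR_SOFTDEVICE_NOT_ENABLED", /* 2  */
    "NRF_ERROR_INTERNAL",               /* 3  */
    "NRF_ERROR_NO_MEM",                 /* 4  */
    "NRF_ERROR_NOT_FOUND",              /* 5  */
    "NRF_ERROR_NOT_SUPPORTED",          /* 6  */
    "NRF_ERROR_INVALID_PARAM",          /* 7  */
    "NRF_ERROR_INVALID_STATE",          /* 8  */
    "NRF_ERROR_INVALID_LENGTH",         /* 9  */
    "NRF_ERROR_INVALID_FLAGS",          /* 10 */
    "NRF_ERROR_INVALID_DATA",           /* 11 */
    "NRF_ERROR_DATA_SIZE",              /* 12 */
    "NRF_ERROR_TIMEOUT",                /* 13 */
    "NRF_ERROR_NULL",                   /* 14 */
    "NRF_ERROR_FORBIDDEN",              /* 15 */
    "NRF_ERROR_INVALID_ADDR",           /* 16 */
    "NRF_ERROR_BUSY",                   /* 17 */
};

/* Hypothetical helper: returns the name for a global error code,
   or "UNKNOWN" for values outside the table above. */
const char *nrf_error_name(uint32_t code)
{
    size_t count = sizeof nrf_error_names / sizeof nrf_error_names[0];
    return (code < count) ? nrf_error_names[code] : "UNKNOWN";
}
```

Calling nrf_error_name(8) yields "NRF_ERROR_INVALID_STATE", the code reported for nrf_sdh_enable_request() in this thread.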

  • I was starting it from SES (Segger) just as I have on dozens of previous nRF52832 projects, and indeed successfully on previous runs of this board. So there's something different in the hardware, and I'm asking where to look. What sort of "invalid state" is likely?

  • I did Build and Debug in SES. The program started at the beginning of main(). I then set a breakpoint at the first instruction in app_error_fault_handler(), and did Debug - Go. Program runs normally, no errors, breakpoint never hit.

    Next?

  • Thanks for confirming. What happens if you power cycle the board after this, does it continue to work? If not, try adding the line below at the beginning of main(). This will prevent the chip from entering low power mode and will mimic being in debug interface mode. It would be helpful for troubleshooting to know if it has the same effect.

    int main(void)
    {
        NRF_POWER->TASKS_CONSTLAT = 1;
        ...

  • Interesting thought. There's no reason in my code that it should sleep within a few minutes of startup. I tried your suggestion; no change. If run with a power cycle, or with F5 from within SES, it still goes into a continuous reset loop. If run with Build and Debug, it runs fine. I have code at the beginning of main() that flashes an LED via direct NRF_GPIO access, so no calls to anything in the Nordic code. I also tried adding the line you suggested above both before and after that code; no change.

    I have nonvolatile memory external to the nRF52832, wherein I can record a journal of actions. One thing I record is RESETREAS after each startup. It appears to have a value of 0x00000004 on a successful startup, and 0 when stuck in a loop. I will investigate that further.

    Suggestions based on the above?

  • Here's something more interesting. I recently added logging to my journal for app_error_fault_handler(). I have a breakpoint at the first line of that function, which never hit. However, DEBUG is apparently defined when doing Build and Debug, so my logging was tripped even though the code never (?) went thru the breakpoint. It recorded the function parameters: id - 00004001, pc - 00000000, info - 2000FF6C. Interpretation?
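
One way to read those three values, sketched under assumptions: in the nRF5 SDK's app_error.h, id 0x4001 should correspond to NRF_FAULT_ID_SDK_ERROR, in which case pc is unused (0) and info is the address of an error_info_t record holding the line number, file name, and error code. The struct below is a local mirror of that layout and describe_fault() is a hypothetical helper; both should be checked against the SDK headers in use. On the target, info arrives as a uint32_t and is cast to a pointer; a plain pointer is used here so the sketch also builds on a host.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Fault IDs, values assumed from the nRF5 SDK's app_error.h. */
#define SKETCH_FAULT_ID_SDK_RANGE_START 0x00004000u
#define SKETCH_FAULT_ID_SDK_ERROR       (SKETCH_FAULT_ID_SDK_RANGE_START + 1u)

/* Local mirror of the SDK's error_info_t (layout is an assumption). */
typedef struct {
    uint32_t       line_num;    /* line where APP_ERROR_CHECK tripped */
    const uint8_t *p_file_name; /* source file name */
    uint32_t       err_code;    /* the error code that was checked */
} sketch_error_info_t;

/* Hypothetical helper: formats a one-line description of a fault into out.
   Returns the snprintf result (characters that would have been written). */
int describe_fault(uint32_t id, const void *info, char *out, size_t out_len)
{
    if (id == SKETCH_FAULT_ID_SDK_ERROR && info != NULL) {
        const sketch_error_info_t *e = (const sketch_error_info_t *)info;
        return snprintf(out, out_len, "SDK error 0x%08lX at %s:%lu",
                        (unsigned long)e->err_code,
                        (const char *)e->p_file_name,
                        (unsigned long)e->line_num);
    }
    /* Other IDs (asserts, SoftDevice faults) carry different info payloads. */
    return snprintf(out, out_len, "fault id 0x%08lX (not an SDK error record)",
                    (unsigned long)id);
}
```

Writing the decoded string to the external journal from inside app_error_fault_handler() would record which call failed and with what code, rather than only the raw pointer.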

  • Digging further back into my journal, APP_ERROR_CHECK was tripped when nrf_sdh_ble_enable returned an error code of 8. Leading up to this, nrf_sdh_enable_request had returned 8 also, but nrf_sdh_ble_default_cfg_set had returned 0.

    In attempting to troubleshoot this, I had added a call to nrf_sdh_disable_request at the very beginning of ble_stack_init, ignoring any returned error code from that. Removing that line doesn't change anything.

  • An additional clue: When I erase my external nonvolatile memory (via a command on the BLE link), I finish by sending an acknowledgement message via BLE and then call NVIC_SystemReset(), and the system restarts perfectly. The message never shows up on the BLE link, presumably because the system gets reset before it goes out. The message acknowledging the beginning of the erase process always shows up, and the erase is indeed successful.

  • If run with Build and Debug, it runs fine.

    What I wanted to confirm is whether it continues to run fine on subsequent reboots as well.

    One thing I record is RESETREAS after each startup. It appears to have a value of 0x00000004 on a successful startup, and 0 when stuck in a loop.

    Does your FW clear the register after reading it? It is important to remember that this is a retained register. If it is zero, it indicates a POR or BOR reset. Another question is whether you can trust the recorded value if the device is going into a boot loop. I would instead suggest that you read out the RESETREAS register with your debugger.

    DEBUG is apparently defined when doing Build and Debug, so my logging was tripped even though the code never (?) went thru the breakpoint.

    Whether DEBUG is defined or not depends on your build configuration. We include it in our debug build configurations in our SDK examples:

    id - 00004001, pc - 00000000, info - 2000FF6C

    You can get it to print out the file name, line number and error value by retrieving the error from the info pointer as done in the default handler here: 

    finish by sending an acknowledgement message via BLE and then call NVIC_SystemReset(), and the system restarts perfectly.

    Does it continue to work after this? E.g., after a power cycle.
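
The RESETREAS values discussed in this post can be decoded bit by bit. The sketch below assumes the bit positions from the nRF52832 POWER peripheral documentation (RESETPIN 0, DOG 1, SREQ 2, LOCKUP 3, OFF 16, LPCOMP 17, DIF 18, NFC 19); describe_resetreas() is a hypothetical helper, not an SDK function. Because the register is retained, flags from earlier resets accumulate until firmware clears them or a POR/BOR occurs.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} reset_flag_t;

/* RESETREAS flags (bit positions assumed from the nRF52832 documentation). */
static const reset_flag_t reset_flags[] = {
    { 1ul << 0,  "RESETPIN" }, /* pin reset */
    { 1ul << 1,  "DOG"      }, /* watchdog */
    { 1ul << 2,  "SREQ"     }, /* soft reset, e.g. NVIC_SystemReset() */
    { 1ul << 3,  "LOCKUP"   }, /* CPU lockup */
    { 1ul << 16, "OFF"      }, /* wake from System OFF via GPIO DETECT */
    { 1ul << 17, "LPCOMP"   }, /* wake from System OFF via LPCOMP */
    { 1ul << 18, "DIF"      }, /* wake from System OFF via debug interface */
    { 1ul << 19, "NFC"      }, /* wake from System OFF via NFC field */
};

/* Hypothetical helper: writes a '|'-separated list of set flags into out.
   A value of 0 means no source was flagged, i.e. power-on or brownout reset. */
void describe_resetreas(uint32_t reas, char *out, size_t out_len)
{
    size_t used = 0;
    if (out_len == 0) {
        return;
    }
    out[0] = '\0';
    if (reas == 0) {
        snprintf(out, out_len, "POR/BOR (no flags set)");
        return;
    }
    for (size_t i = 0; i < sizeof reset_flags / sizeof reset_flags[0]; i++) {
        if ((reas & reset_flags[i].mask) == 0) {
            continue;
        }
        int n = snprintf(out + used, out_len - used, "%s%s",
                         used ? "|" : "", reset_flags[i].name);
        if (n < 0 || (size_t)n >= out_len - used) {
            break; /* output truncated; stop appending */
        }
        used += (size_t)n;
    }
}
```

With this decoding, the 0x00000004 recorded on a successful startup reads as SREQ (a soft reset), and the 0 recorded in the loop reads as a power-on or brownout reset.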

  • ResetReason = NRF_POWER->RESETREAS;   // read the retained reset-reason flags
    NRF_POWER->RESETREAS = ResetReason;   // write the set bits back to clear them

    The debugger works poorly or not at all beyond being a program loader when using the softdevice. I added the code you suggested above, and never got it to output (never reached). I added some probes to see where we're going off the rails, and now we're resetting somewhere inside ble_stack_init(). I'll add some more probes inside there.

    How late are you working today? Pretty sure you're in a time zone several hours ahead of me.

  • Correction to the above: It's derailing in nrf_sdh_enable_request, which is the first function called from ble_stack_init.

  • In nrf_sdh_enable_request(), it's resetting in this critical region:

        CRITICAL_REGION_ENTER();
    #ifdef ANT_LICENSE_KEY
        ret_code = sd_softdevice_enable(&clock_lf_cfg, app_error_fault_handler, ANT_LICENSE_KEY);
    #else
        ret_code = sd_softdevice_enable(&clock_lf_cfg, app_error_fault_handler);
    #endif
        m_nrf_sdh_enabled = (ret_code == NRF_SUCCESS);
        CRITICAL_REGION_EXIT();

    According to the highlighting by SES, ANT_LICENSE_KEY is not defined, so it's resetting somewhere in sd_softdevice_enable, but apparently without getting to app_error_fault_handler. I have no visibility into the softdevice, so please wave your magic wand over it and tell me what I'm doing wrong.
