PM_EVT_ERROR_UNEXPECTED causes nRF52 crash randomly

I'm currently working on an nRF52 using SDK 14.1 and SoftDevice 5.0.0.

We're having a problem where the microcontroller crashes after a few days for seemingly no reason. We have Bluetooth services running, and timers performing tasks (such as reading from an accelerometer or an ADC) in the background. If we leave the device powered on but not connected to any other device (so it's only advertising), we see this error after a few days and it crashes. The bug is difficult to catch because the device has to stay connected to a debugger for several days, but here are some screenshots from the backtrace:

Going through the code manually, it looks like a BLE_GAP_EVT_SEC_PARAMS_REQUEST arrives and is passed on to the security dispatcher module, which receives a PM_EVT_CONN_SEC_PARAMS_REQ, and smd_params_reply then returns error code 6 (reported as PM_EVT_ERROR_UNEXPECTED). That code can be generated by a variety of functions within smd_params_reply, however, so this is where I've gotten a bit lost. Short of adding breakpoints in each of these functions to check their return codes, I'm not sure where to go from here. I also thought it might be my Bluetooth configuration, so here is a screenshot of that:

I've looked at other people's reports of this error, but their problems don't seem to be related to mine, so any help would be appreciated.

  • Hi,

     

    We have a known issue with GCC in SDK v14.x, where the breakpoint instruction in app_error_handler triggers a hardfault instead of halting at a breakpoint. Here's a workaround:

    https://devzone.nordicsemi.com/f/nordic-q-a/31951/unable-to-debug-with-hardfault_handler/123867#123867

     

    In the function security_dispatcher.c::smd_params_reply(), there are several places where you can get a return value of err_code = 6.

    The first is if the peer_manager cannot allocate a new bond (no free space in flash). The second is if the call to sd_ble_gap_sec_params_reply (line 840) fails with the return code "NRF_ERROR_NOT_SUPPORTED" (also defined as 6). The third is if your stack grows out of bounds and starts eating into your globals (i.e. a stack overflow).
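
When a raw number such as 6 shows up in the debugger, a small lookup can make it easier to read. The following is only a sketch: `nrf_error_name` is a hypothetical helper, not an SDK function, and the table assumes the generic error codes start at base 0 as in the SDK's nrf_error.h, so it is worth checking against the header in your SDK copy.

```c
#include <stdint.h>

/* Hypothetical helper: maps a generic nRF error code to its name.
 * Values assume NRF_ERROR_BASE_NUM is 0; verify against nrf_error.h. */
static const char *nrf_error_name(uint32_t err_code)
{
    static const char *const names[] = {
        "NRF_SUCCESS",                      /* 0 */
        "NRF_ERROR_SVC_HANDLER_MISSING",    /* 1 */
        "NRF_ERROR_SOFTDEVICE_NOT_ENABLED", /* 2 */
        "NRF_ERROR_INTERNAL",               /* 3 */
        "NRF_ERROR_NO_MEM",                 /* 4 */
        "NRF_ERROR_NOT_FOUND",              /* 5 */
        "NRF_ERROR_NOT_SUPPORTED",          /* 6 */
        "NRF_ERROR_INVALID_PARAM",          /* 7 */
        "NRF_ERROR_INVALID_STATE",          /* 8 */
        "NRF_ERROR_INVALID_LENGTH",         /* 9 */
        "NRF_ERROR_INVALID_FLAGS",          /* 10 */
        "NRF_ERROR_INVALID_DATA",           /* 11 */
        "NRF_ERROR_DATA_SIZE",              /* 12 */
        "NRF_ERROR_TIMEOUT",                /* 13 */
        "NRF_ERROR_NULL",                   /* 14 */
        "NRF_ERROR_FORBIDDEN",              /* 15 */
        "NRF_ERROR_INVALID_ADDR",           /* 16 */
        "NRF_ERROR_BUSY",                   /* 17 */
    };
    const uint32_t count = (uint32_t)(sizeof(names) / sizeof(names[0]));

    /* Codes outside the table (module-specific ranges) are not decoded. */
    return (err_code < count) ? names[err_code] : "(not in this table)";
}
```

Calling `nrf_error_name(6)` from a log statement or a debugger expression gives the symbolic name for the code discussed here. It does not say which of the three paths produced it; that still needs breakpoints or logging at each call site.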

     

    Q1: Are you able to determine which of the function calls inside smd_params_reply() fails?

    Q2: Is the return value consistent (always 6)? If yes, then a stack overflow is unlikely (but please do post the CPU registers anyway).

     

    Best regards,

    Håkon

  • Hi Håkon,

    Thanks for the quick reply. I set breakpoints for the three locations in smd_params_reply and I have it running again so when it fails I can give you an answer to question 1.

    Regarding question 2, I've encountered this error twice in a row so far, and both times it was the same error code. We'll see if it happens a third time. Are there specific registers whose values you would like to see to check for this?

    Thanks,

    Andrew

  • Hi Andrew,

     

    If you have seen the same error code twice, I doubt that this is a stack overflow issue. The CPU registers can be read out using the gdb command "mon regs", and the one I was interested in was the "SP" (stack pointer) register, to see if this one has gone too deep. No need to post the SP, but I wanted to mention how to read the registers.
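
Besides reading SP at the moment of the crash, a common complementary technique is stack painting: fill the stack region with a known pattern at startup and later count how much of the pattern is still intact. The sketch below shows only the counting logic on an ordinary array. The pattern value and both function names are assumptions for illustration, and on a real target the region bounds would come from the linker script, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT_PATTERN 0xDEADBEEFu /* arbitrary fill value (assumption) */

/* Fill a region with the pattern. On a real target this would run early at
 * startup and cover only the part of the stack that is not yet in use. */
static void stack_paint(uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        lowest[i] = STACK_PAINT_PATTERN;
    }
}

/* Count the words, starting at the lowest address, that still hold the
 * pattern. The Cortex-M stack grows downward, so the untouched words sit at
 * the low end; a result of 0 means the stack reached the end of its region. */
static size_t stack_unused_words(const uint32_t *lowest, size_t words)
{
    size_t unused = 0;
    while (unused < words && lowest[unused] == STACK_PAINT_PATTERN) {
        unused++;
    }
    return unused;
}
```

Checking `stack_unused_words` periodically, or once after a long run, gives a rough high-water mark. It can miss an overflow if the overwritten words happen to match the pattern, so it is a hint rather than proof.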

     

    Regardless of which call fails inside here, it would be good to see the automatic (local) variables in scope as well (especially if sd_ble_gap_sec_params_reply is the culprit).

     

    Best regards,

    Håkon
