nRF52 Hard Faults on Stack Push

I'm debugging on an nRF52832 (nRF52-DK, non-preview) using the ARM GCC toolchain and GDB under Eclipse on Ubuntu 14.04LTS, set up following the devzone tutorial.

At various points, the processor jumps to the HardFault_Handler when plain-vanilla function calls push values to the call stack. After I instruction-step over the push, execution seems to run to the end of the address space. For example:

[Image: Call Stack after HardFault]

The HardFault occurs on the push instruction, during instruction stepping:

[Image: Offending instruction]

From build to build, with small changes, the specific function whose push instruction causes the HardFault varies, but it always occurs on the stack push.

Frustratingly, the issue doesn't seem "reversible" (all builds clean by default):

1. Build, debug, run, hit HardFault_Handler on call to X()
2. Make a small change in main - set a variable differently, etc.
3. Build, debug, run, hit HardFault_Handler on call to a completely different function in a different compilation unit
4. Undo the small change in main
5. Build, debug, run, now the function call to X() goes just fine, rest of code runs normally

I check the SP each time and never find it more than 80-100 bytes below __StackTop, and the stack length is set to 8K, so a stack overflow seems unlikely. Given the unpredictability of the fault-inducing instruction's location, it "feels" like a word alignment issue or some compiler switch that isn't set correctly. The LR values in each push instruction pass a sanity check (they're within the .text section of FLASH and generally within a few bytes of the location of the original function call).
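
The sanity checks described above (SP close to __StackTop, LR inside flash) can be sketched as small host-testable helpers. This is only an illustration, not code from the original project: the function names are hypothetical, and the flash and RAM limits assume the nRF52832's 512 kB of flash and 64 kB of RAM. A real project should take these values from its linker script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed nRF52832 memory map; take the real limits from the linker script. */
#define FLASH_END 0x00080000u /* exclusive end of 512 kB flash at 0x00000000 */
#define RAM_START 0x20000000u
#define RAM_END   0x20010000u /* exclusive end of 64 kB RAM */

/* Hypothetical helper: true if sp lies inside the stack region
 * [stack_top - stack_size, stack_top], inside RAM, and is word aligned. */
bool sp_is_plausible(uint32_t sp, uint32_t stack_top, uint32_t stack_size)
{
    assert(stack_size <= stack_top); /* guard against unsigned underflow */
    uint32_t stack_limit = stack_top - stack_size;
    bool in_ram = sp >= RAM_START && sp <= RAM_END;
    return in_ram && sp >= stack_limit && sp <= stack_top && (sp & 0x3u) == 0;
}

/* Bytes currently in use on a full-descending stack. */
uint32_t stack_bytes_used(uint32_t sp, uint32_t stack_top)
{
    return stack_top - sp;
}

/* Hypothetical helper: true if lr looks like a Thumb return address in flash.
 * EXC_RETURN values such as 0xFFFFFFF9 are rejected by the range check. */
bool lr_is_plausible(uint32_t lr)
{
    bool thumb = (lr & 1u) != 0;  /* bit 0 must be set for Thumb state */
    uint32_t addr = lr & ~1u;     /* actual return address */
    return thumb && addr < FLASH_END;
}
```

With an 8K stack whose top is assumed to be at 0x20010000, an SP of 0x2000FFA0 corresponds to 96 bytes in use and passes the check. A value that fails either check would point toward stack or register corruption rather than the toolchain.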

How can I determine a root cause for these HardFault exceptions that occur on seemingly-innocuous plain-vanilla function call stack pushes, at random? It's very hard to trust the toolchain when an issue like this isn't repeatable or reversible.
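
One common way to approach this kind of question on a Cortex-M4 is to read the Configurable Fault Status Register (CFSR, at address 0xE000ED28) once the HardFault_Handler is reached and decode which fault bits are set. The sketch below is a host-runnable decoder for such a value. The function name cfsr_describe and the table layout are hypothetical; the bit positions follow the ARMv7-M fault status register layout. It is a starting point under those assumptions, not a definitive diagnosis tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fault flags in the ARMv7-M CFSR (MemManage, BusFault, UsageFault fields). */
static const struct {
    uint32_t mask;
    const char *name;
} cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL" },    { 1u << 1,  "DACCVIOL" },
    { 1u << 3,  "MUNSTKERR" },   { 1u << 4,  "MSTKERR" },
    { 1u << 5,  "MLSPERR" },     { 1u << 7,  "MMARVALID" },
    { 1u << 8,  "IBUSERR" },     { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR" },
    { 1u << 12, "STKERR" },      { 1u << 13, "LSPERR" },
    { 1u << 15, "BFARVALID" },   { 1u << 16, "UNDEFINSTR" },
    { 1u << 17, "INVSTATE" },    { 1u << 18, "INVPC" },
    { 1u << 19, "NOCP" },        { 1u << 24, "UNALIGNED" },
    { 1u << 25, "DIVBYZERO" },
};

/* Hypothetical helper: writes the names of all set flags into out,
 * separated by spaces, and returns how many flags were set.
 * Output is truncated if out is too small. */
size_t cfsr_describe(uint32_t cfsr, char *out, size_t out_len)
{
    assert(out != NULL);
    size_t count = 0;
    if (out_len == 0) {
        return 0;
    }
    out[0] = '\0';
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if (cfsr & cfsr_flags[i].mask) {
            if (count > 0) {
                strncat(out, " ", out_len - strlen(out) - 1);
            }
            strncat(out, cfsr_flags[i].name, out_len - strlen(out) - 1);
            count++;
        }
    }
    return count;
}
```

The raw value can be read from GDB with x/wx 0xE000ED28 while halted in the handler, or in firmware through the CMSIS SCB->CFSR field. For a fault on a push, flags such as PRECISERR, IMPRECISERR, or DACCVIOL would suggest a bad memory access, while INVSTATE or UNDEFINSTR would suggest the core is not executing the instruction stream it was built for.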

D. Rea, over 9 years ago

I found one possible issue, but have not confirmed it is the true root cause of this behavior:

In my debug configuration, the debugger's "Device Name" field was still set to an nRF51 device, resulting in a mismatch between the expected core and the core identified by the J-Link. (Aside: this should have thrown an error on debugger startup, not just a warning message!)

Connecting to target...WARNING: Identified core does not match configuration. (Found: Cortex-M4, Configured: Cortex-M0)

I changed the Device Name to "nRF52832_xxAA" per the Segger list of supported device names, and have not experienced the hard fault since. Will keep this thread updated if I encounter the HardFault issue again...

Vidar Berg, over 9 years ago, in reply to D. Rea

Thanks for the update. Agreed that it should have thrown an error rather than a warning, which is easy to miss.