We are developing a product based on nRF52832 and we need to verify that in case of problems, the device will "fail gracefully". So we looked at implementing the ARM handlers, and also making a test setup to verify that they are working OK. The test setup runs on the nRF52-DK and contains a minimal app (we call it "kamikaze") which has a simple UART interface, and which will do various nasty things to trigger the handlers when asked. (Of course you have to reset kamikaze after.)
All handlers print the stack frame and fault registers to a simple bit-banged UART, then flash LEDs in an infinite loop.
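To make the "fault registers" part concrete, here is a minimal sketch of turning a raw CFSR value into readable flag names. The bit positions are the ARMv7-M ones; the helper name describe_cfsr and the choice of flags listed are our own assumptions for illustration, not code from the actual handlers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One CFSR bit and the name to print for it. */
typedef struct {
    uint32_t mask;
    const char *name;
} cfsr_flag_t;

/* Selected CFSR bits (ARMv7-M): MMFSR is bits 0-7, BFSR bits 8-15, UFSR bits 16-31. */
static const cfsr_flag_t cfsr_flags[] = {
    { UINT32_C(1) << 0,  "IACCVIOL"    },
    { UINT32_C(1) << 1,  "DACCVIOL"    },
    { UINT32_C(1) << 4,  "MSTKERR"     },
    { UINT32_C(1) << 9,  "PRECISERR"   },
    { UINT32_C(1) << 10, "IMPRECISERR" },
    { UINT32_C(1) << 12, "STKERR"      },
    { UINT32_C(1) << 24, "UNALIGNED"   },
    { UINT32_C(1) << 25, "DIVBYZERO"   },
};

/* Hypothetical helper: writes the names of all set flags into out, separated
 * by spaces, and returns how many flags were set. Output is truncated if the
 * buffer is too small. */
int describe_cfsr(uint32_t cfsr, char *out, size_t out_len)
{
    int count = 0;
    size_t used = 0;

    if (out_len > 0) {
        out[0] = '\0';
    }
    for (size_t k = 0; k < sizeof cfsr_flags / sizeof cfsr_flags[0]; k++) {
        if ((cfsr & cfsr_flags[k].mask) == 0) {
            continue;
        }
        count++;
        if (used < out_len) {
            int n = snprintf(out + used, out_len - used, "%s%s",
                             used ? " " : "", cfsr_flags[k].name);
            if (n > 0) {
                used += (size_t)n;
            }
        }
    }
    return count;
}
```

On the target the input would be SCB->CFSR read inside the handler; in a real handler you would probably avoid snprintf and print the names directly, since the handler should use as little stack as possible.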
Initially we confirmed that our HARD FAULT handler works. Then we enabled the BUS FAULT, MEMMANAGE FAULT and USAGE FAULT handlers and were able to trigger each of them in different ways. (We also enabled the DIV_BY_ZERO and ALIGNMENT traps.)
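For anyone who wants to reproduce the setup, enabling these handlers and traps comes down to a few bits in SCB->SHCSR and SCB->CCR. The sketch below only computes the register values so that it can be checked off-target; the function names are hypothetical, and the bit positions are the ARMv7-M ones.

```c
#include <stdint.h>

/* SCB->SHCSR: enable bits for the configurable fault handlers (ARMv7-M). */
#define SHCSR_MEMFAULTENA (UINT32_C(1) << 16)
#define SHCSR_BUSFAULTENA (UINT32_C(1) << 17)
#define SHCSR_USGFAULTENA (UINT32_C(1) << 18)

/* SCB->CCR: trap bits that turn these conditions into usage faults. */
#define CCR_UNALIGN_TRP   (UINT32_C(1) << 3)
#define CCR_DIV_0_TRP     (UINT32_C(1) << 4)

/* Hypothetical helper: SHCSR value with all three fault handlers enabled. */
uint32_t shcsr_with_faults_enabled(uint32_t shcsr)
{
    return shcsr | SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA;
}

/* Hypothetical helper: CCR value with divide-by-zero and unaligned traps on. */
uint32_t ccr_with_traps_enabled(uint32_t ccr)
{
    return ccr | CCR_UNALIGN_TRP | CCR_DIV_0_TRP;
}

/* Assumed usage on the target, with the CMSIS SCB definitions available:
 *   SCB->SHCSR = shcsr_with_faults_enabled(SCB->SHCSR);
 *   SCB->CCR   = ccr_with_traps_enabled(SCB->CCR);
 */
```

Without the SHCSR bits set, these faults are escalated to the HARD FAULT handler instead of reaching their own handlers.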
Just to see how the controller responded, we thought we would try writing some "evil recursive" code to overflow the stack. It was harder than we thought, but we did it like this:
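#include <stdint.h>
#include <stdio.h>
#include "nrf.h" /* device header; pulls in the CMSIS __get_MSP() intrinsic */

/* volatile so the compiler cannot fold the increment away */
static volatile uint32_t dummy = 1;

/* Recurses forever; every call consumes another stack frame. */
uint32_t evil_recursive(uint32_t i)
{
    printf("N:%lu MSP: 0x%08lx", (unsigned long)i, (unsigned long)__get_MSP());
    return evil_recursive(i + dummy);
}

int main(void)
{
    uint32_t i = evil_recursive(1);
    (void)i; /* never reached */
    return 0;
}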
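We needed to compile with the
-fno-optimize-sibling-calls
flag (without it the compiler can turn the tail call into a loop that never grows the stack). With the flag set, the code does cause a BUS FAULT after eating stack for a few minutes. The printf is a slow, blocking call to the bit-banged UART.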
So far so good.
Next, to speed up the test, we removed the printf, expecting the same BUS FAULT, only sooner.
To our surprise, we didn't get that. The uC resets immediately and NO handler is triggered.
We also tried replacing the printf with a single UART putc() call, to make things happen quicker. That gave the same silent reset.
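One way to narrow down what kind of reset this is would be to read the POWER RESETREAS register at startup. Below is a minimal sketch of decoding it, assuming the nRF52832 bit layout (RESETPIN, DOG, SREQ, LOCKUP in bits 0 to 3); the function name and the returned strings are hypothetical, and we have not confirmed which flag is actually set in our case.

```c
#include <stdint.h>

/* RESETREAS bits as we understand them for the nRF52832. */
#define RESETREAS_RESETPIN (UINT32_C(1) << 0) /* pin reset */
#define RESETREAS_DOG      (UINT32_C(1) << 1) /* watchdog */
#define RESETREAS_SREQ     (UINT32_C(1) << 2) /* soft reset (SYSRESETREQ) */
#define RESETREAS_LOCKUP   (UINT32_C(1) << 3) /* CPU lockup */

/* Hypothetical helper: returns a short name for the most interesting flag
 * that is set. The register is cumulative, so several flags may be set at
 * once; lockup is checked first because it is the one we care about here. */
const char *reset_reason_name(uint32_t resetreas)
{
    if (resetreas & RESETREAS_LOCKUP) {
        return "LOCKUP";
    }
    if (resetreas & RESETREAS_DOG) {
        return "DOG";
    }
    if (resetreas & RESETREAS_SREQ) {
        return "SREQ";
    }
    if (resetreas & RESETREAS_RESETPIN) {
        return "RESETPIN";
    }
    if (resetreas == 0) {
        return "POWER-ON/BROWNOUT";
    }
    return "OTHER";
}

/* Assumed usage on the target, early in main():
 *   uint32_t reason = NRF_POWER->RESETREAS;
 *   NRF_POWER->RESETREAS = reason;   // flags are cleared by writing ones
 *   print reset_reason_name(reason) on the UART
 */
```

Because the flags accumulate across resets, they need to be cleared after reading, otherwise an old pin reset would still show up after the next silent reset.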
Maybe we are missing something, but this is not what we want or expect. With all handlers implemented and working, we don't want it to be possible to make the ARM core silently reset.
We realise that what we are doing is pretty abusive and not at all a usual situation, but we still cannot easily explain how this can happen.