Stack Guard and MPU

Question

Hi, I'm trying to get the nrf_stack_guard and nrf_mpu libraries to catch writes past the end of the stack. My stack is 8kB in size (0x2000E000-0x20010000). This is the log output after initializing the Stack Guard module:

<debug> nrf_mpu: MPU region creating (location: 0x2000E000-0x2000E07F)
<debug> nrf_mpu: MPU region 0 created (location: 0x2000E000-0x2000E07F, access: RO/RO, type: Normal, flags: XN).
<info> stack_guard: Stack Guard: 0x2000E000-0x2000E07F (usable stack area: 8064 bytes)

I'm having a little trouble understanding how the MPU works. I'd expect a write to 0x2000E000 to trigger the HardFault_Handler, but nothing is triggered. The write just happens and silently corrupts RAM below the stack.

I'm using the nrf gcc hardfault library implementation and am able to catch NULL dereferences and other faults, so that should be correctly set up.

What am I missing?

wpaul · Accepted Answer

You made me curious enough that I actually hauled out my nRF52840 DK reference board and re-flashed it to run the SDK demos.

The problem here is not that there's something wrong with you, the problem is there's something wrong with the universe:

this example is buggy.

Note that this is not the only example that uses the stack guard feature. The examples/peripheral/cli/main.c code also uses it, and that code does it right.

What's not immediately obvious is that, since this is a CLI example, there's actually an mpu command built into the CLI. If you run it, the problem becomes clear. Here's what I get:

uart_cli:~$ mpu info
MPU State: Disabled, 8 unified regions aviable.

Region 0: Enabled
        - Location:     0x2003E000-0x2003E07F (size: 128 bytes)
        - Access:       RO/RO
        - Type:         Normal
        - Caching:      WBWA/WBWA
        - Flags:        XN

Region 1: Enabled
        - Location:     0x20004301-0x20004400 (size: 256 bytes)
        - Access:       RO/RO
        - Type:         Normal
        - Caching:      WBWA/WBWA
        - Flags:        XN

Region 2: Disabled
Region 3: Disabled
Region 4: Disabled
Region 5: Disabled
Region 6: Disabled
Region 7: Disabled
[00:02:38.000,213] <info> app: Battery level update: 97
uart_cli:~$ mpu dump
MPU_TYPE:       0x00000800
MPU_CTRL:       0x00000000

MPU_RBAR[0]:    0x2003E000
MPU_RASR[0]:    0x1729000D

MPU_RBAR[1]:    0x20004301
MPU_RASR[1]:    0x1729000F

MPU_RBAR[2]:    0x00000002
MPU_RASR[2]:    0x00000000

MPU_RBAR[3]:    0x00000003
MPU_RASR[3]:    0x00000000

MPU_RBAR[4]:    0x00000004
MPU_RASR[4]:    0x00000000

MPU_RBAR[5]:    0x00000005
MPU_RASR[5]:    0x00000000

MPU_RBAR[6]:    0x00000006
MPU_RASR[6]:    0x00000000

MPU_RBAR[7]:    0x00000007
MPU_RASR[7]:    0x00000000
uart_cli:~$

The most important part is where it says: MPU State: Disabled, 8 unified regions aviable.

Yes, Disabled.

The book that I linked in my first reply documents the operation of the MPU control register in its MPU section. You must set bit 0 (ENABLE) in that register to actually turn the MPU on. But here we can see that hasn't been done:

MPU_CTRL: 0x00000000
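
To make the two status registers in the dump concrete, here is a minimal host-side sketch that decodes them using the ARMv7-M bit layout (ENABLE is bit 0 of MPU_CTRL, DREGION is bits 15:8 of MPU_TYPE, SEPARATE is bit 0 of MPU_TYPE). The helper names are made up for illustration and are not part of the nRF5 SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names; only the bit positions come from ARMv7-M. */

/* MPU_CTRL bit 0 (ENABLE): no region is enforced until this bit is set. */
static bool mpu_ctrl_is_enabled(uint32_t mpu_ctrl)
{
    return (mpu_ctrl & 0x1u) != 0u;
}

/* MPU_TYPE bits 15:8 (DREGION): number of supported data regions. */
static unsigned mpu_type_region_count(uint32_t mpu_type)
{
    return (unsigned)((mpu_type >> 8) & 0xFFu);
}

/* MPU_TYPE bit 0 (SEPARATE): 0 means unified instruction/data regions. */
static bool mpu_type_is_unified(uint32_t mpu_type)
{
    return (mpu_type & 0x1u) == 0u;
}
```

With the values from the dump, mpu_type_region_count(0x00000800) gives 8 and mpu_ctrl_is_enabled(0x00000000) gives false, which is exactly what "MPU State: Disabled, 8 unified regions" is reporting.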

You can also inspect the MPU registers from the debugger directly. They start at address 0xe000ed90. You can refer to table 4-38 in the manual above for the complete map.
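
If you do poke at the registers from a debugger, a rough sketch like the following can help make sense of the raw MPU_RASR values. The addresses and bit fields are the standard ARMv7-M ones; the struct and function names are hypothetical and are not SDK APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M MPU register addresses, for reading from a debugger. */
#define MPU_TYPE_ADDR 0xE000ED90u
#define MPU_CTRL_ADDR 0xE000ED94u
#define MPU_RNR_ADDR  0xE000ED98u /* selects which region RBAR/RASR show */
#define MPU_RBAR_ADDR 0xE000ED9Cu /* region base address */
#define MPU_RASR_ADDR 0xE000EDA0u /* region attributes and size */

/* Hypothetical container for the RASR fields used in this thread. */
struct rasr_fields {
    bool     enabled;    /* bit 0: region enable */
    uint64_t size_bytes; /* bits 5:1 (SIZE): region is 2^(SIZE+1) bytes */
    unsigned ap;         /* bits 26:24: access permissions */
    bool     xn;         /* bit 28: execute never */
};

/* Split a raw MPU_RASR value into its fields. */
static struct rasr_fields rasr_decode(uint32_t rasr)
{
    struct rasr_fields f;
    unsigned size_field = (unsigned)((rasr >> 1) & 0x1Fu);

    f.enabled    = (rasr & 0x1u) != 0u;
    f.size_bytes = (uint64_t)1 << (size_field + 1u);
    f.ap         = (unsigned)((rasr >> 24) & 0x7u);
    f.xn         = ((rasr >> 28) & 0x1u) != 0u;
    return f;
}
```

Decoding MPU_RASR[0] = 0x1729000D this way gives an enabled, 128-byte, execute-never region with AP = 7 (read-only for both privileged and unprivileged code), which agrees with the "mpu info" output. Keep in mind that RBAR and RASR are a window onto the region selected by the region number register, so write the region number there first. The low four bits of RBAR read back that region number, which would explain values such as 0x00000002 for the unused regions.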

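The reason this is happening is that it's not enough to call nrf_stack_guard_init(). You *also* have to call nrf_mpu_init(). Without it, the guard region gets configured, as the dump above shows, but the MPU itself is never switched on. This example _DOESN'T_ call it, which is incredibly dumb.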

Go to main.c in the example and do this:

    APP_ERROR_CHECK(nrf_mpu_init());  /* Add me! */
    APP_ERROR_CHECK(nrf_stack_guard_init());

Now re-run your test. When I do it, I get the following endless hard fault cycle:

[00:00:00.000,000] <info> stack_guard: Stack Guard (128 bytes): 0x2003E000-0x2003E07F (total stack size: 8192 bytes, usable stack area: 8064 bytes)
[00:00:00.000,000] <error> hardfault: HARD FAULT at 0x0003460E
[00:00:00.000,000] <error> hardfault:   R0:  0x00000000  R1:  0x0F81B159  R2:  0xBA5EBA11  R3:  0x2003E000
[00:00:00.000,000] <error> hardfault:   R12: 0x2000360C  LR:  0x00027EC9  PSR: 0x61000000
[00:00:00.000,000] <error> hardfault: Cause: The processor attempted a load or store at a location that does not permit the operation.
[00:00:00.000,000] <error> hardfault: MemManage Fault Address: 0x2003E000
[00:00:00.000,000] <info> stack_guard: Stack Guard (128 bytes): 0x2003E000-0x2003E07F (total stack size: 8192 bytes, usable stack area: 8064 bytes)
[00:00:00.000,000] <error> hardfault: HARD FAULT at 0x0003460E
[00:00:00.000,000] <error> hardfault:   R0:  0x00000000  R1:  0x0F81B159  R2:  0xBA5EBA11  R3:  0x2003E000
[00:00:00.000,000] <error> hardfault:   R12: 0x2000360C  LR:  0x00027EC9  PSR: 0x61000000
[00:00:00.000,000] <info> stack_guard: Stack Guard (128 bytes): 0x2003E000-0x2003E07F (total stack size: 8192 bytes, usable stack area: 8064 bytes)
[00:00:00.000,000] <error> hardfault: HARD FAULT at 0x0003460E
[00:00:00.000,000] <error> hardfault:   R0:  0x00000000  R1:  0x0F81B159  R2:  0xBA5EBA11  R3:  0x2003E000
[00:00:00.000,000] <error> hardfault:   R12: 0x2000360C  LR:  0x00027EC9  PSR: 0x61000000
[00:00:00.000,000] <error> hardfault: Cause: The processor attempted a load or store at a location that does not permit the operation.
[00:00:00.000,000] <error> hardfault: MemManage Fault Address: 0x2003E000
[...]
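
The numbers in the stack_guard log line can be sanity-checked with a little arithmetic. This is only a sketch, assuming the guard sits at the lowest address of the stack (as both logs in this thread suggest) and using the ARMv7-M rule that a region must be a power of two of at least 32 bytes with its base aligned to its size. The names are hypothetical and not taken from the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of where the guard lands inside the stack. */
struct guard_layout {
    uint32_t guard_start;  /* first guarded address */
    uint32_t guard_end;    /* last guarded address (inclusive) */
    uint32_t usable_bytes; /* stack left over above the guard */
};

/* Assumes the guard occupies the lowest guard_size bytes of the stack. */
static struct guard_layout guard_layout_compute(uint32_t stack_base,
                                                uint32_t stack_size,
                                                uint32_t guard_size)
{
    struct guard_layout g;

    g.guard_start  = stack_base;
    g.guard_end    = stack_base + guard_size - 1u;
    g.usable_bytes = stack_size - guard_size;
    return g;
}

/* ARMv7-M region rule: power-of-two size >= 32, base aligned to size. */
static bool mpu_region_is_valid(uint32_t base, uint32_t size)
{
    bool pow2 = (size >= 32u) && ((size & (size - 1u)) == 0u);

    return pow2 && ((base % size) == 0u);
}
```

For an 8192-byte stack starting at 0x2003E000 with a 128-byte guard, this yields 0x2003E000-0x2003E07F and 8064 usable bytes, matching the log, and the fault address 0x2003E000 falls inside that range.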

The examples/peripheral/cli/main.c code actually does call nrf_mpu_init() before nrf_stack_guard_init(). I don't know why this one doesn't.

-Bill
