
stack overflow and STACK GUARD

I am developing on the nRF52832 with SDK 14.2, and the system is now generating some strange exceptions. I suspect a stack overflow. Is there any way to confirm whether the stack has overflowed, and where the overflow happens?
In addition, I saw the STACK GUARD library. What is the role of this library? I only found the initialization function and did not see any exception notification or other handling functions. Can it help me confirm whether and where the stack overflows?

  • The stack guard library uses a feature of the Cortex-M4 processor called the Memory Protection Unit (MPU). You can use the MPU to set attributes and access rights on regions of memory. The stack guard library creates a 256-byte region at the bottom of the stack with access permissions set to read-only (for both privileged and unprivileged accesses). This means that any instruction that tries to write past the bottom of the stack will trigger a memory management fault (as opposed to just silently corrupting the stack memory).
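
    As a rough illustration of the region described above, the following host-side sketch computes the register values such a guard region would need. The helper names, the rounding of the base address and the choice to leave the memory-type bits at zero are assumptions made for illustration; this is not the SDK's stack guard source.

```c
#include <stdint.h>

/* Armv7-M MPU_RASR bit positions. Local names, not CMSIS or SDK macros. */
#define RASR_ENABLE    (1u << 0)
#define RASR_SIZE(n)   ((uint32_t)(n) << 1)    /* region spans 2^(n+1) bytes */
#define RASR_AP(ap)    ((uint32_t)(ap) << 24)
#define AP_RO_RO       0x7u                    /* read-only, priv and unpriv */

/* SIZE field for a power-of-two region of at least 32 bytes. */
static uint32_t rasr_size_field(uint32_t bytes)
{
    uint32_t n = 4;                 /* n = 4 means 32 bytes, the minimum */
    while (n < 31u && (2u << n) < bytes) {
        n++;
    }
    return n;
}

/* Lowest guard base at or above stack_bottom that is aligned to the
 * region size, since the MPU requires size-aligned region bases.
 * stack_bottom is the lowest address of the stack (hypothetical input). */
static uint32_t guard_base(uint32_t stack_bottom, uint32_t bytes)
{
    return (stack_bottom + bytes - 1u) & ~(bytes - 1u);
}

/* RASR value for an enabled read-only guard region of the given size. */
static uint32_t guard_rasr(uint32_t bytes)
{
    return RASR_AP(AP_RO_RO) | RASR_SIZE(rasr_size_field(bytes)) | RASR_ENABLE;
}
```

    For a 256-byte region, guard_rasr(256) evaluates to 0x0700000F. On the target, that value would be written to MPU->RASR after MPU->RNR and MPU->RBAR have been set.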

    Be aware that this only applies to the application stack, and I think this will only work if you use the "monolithic" SDK examples. If for example you use the FreeRTOS examples, then each thread will have its own stack.

    If you don't specifically enable memory management faults in the System Handler Control and State Register (SHCSR), then you'll just get a hard fault instead. This means that you should include both the stack guard library and the hard fault handler so that you will get an informative message on the serial port to let you know what happened.

    Note that you can use this same technique to also trap NULL pointer bugs. You just need to create a protected region at address 0x0. Since address 0x0 corresponds to the start of flash, reading from there is allowed and writing just fails silently, which means a NULL pointer bug could go unnoticed for some time. With the MPU it will show up immediately.

    If it were me, instead of using RO/RO (0x7) as the protection attributes, I would use NA/NA (0x0). This will trap both reads and writes. Reading past the end of the stack could also be a sign of a bug that is best fixed. I have found that you can get away with doing this even when setting a protection region at address 0x0. When I first tried this I was concerned it might have some impact on exception handling (since the exception vector table is also at address 0x0), but while reads and writes are successfully trapped, exception handling seems unaffected.
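
    To make the difference between the two settings concrete, here is a small host-side model of how the Armv7-M AP field is interpreted for data accesses. The function name is made up for this sketch, and it models only the AP encoding, not the rest of the MPU (region matching, overlap priority, the default memory map).

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a data access is permitted by the given AP encoding.
 * Comments give the access as "privileged / unprivileged". */
static bool ap_allows(uint32_t ap, bool privileged, bool is_write)
{
    switch (ap & 0x7u) {
    case 0x0: return false;                     /* NA / NA */
    case 0x1: return privileged;                /* RW / NA */
    case 0x2: return privileged || !is_write;   /* RW / RO */
    case 0x3: return true;                      /* RW / RW */
    case 0x5: return privileged && !is_write;   /* RO / NA */
    case 0x6:                                   /* RO / RO */
    case 0x7: return !is_write;                 /* RO / RO */
    default:  return false;                     /* 0x4 is reserved */
    }
}
```

    With 0x7, ap_allows(0x7, true, false) is true, so a stray read goes unnoticed; with 0x0, both reads and writes are refused and would raise a fault.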

    You can read more about the MPU in this document:

    http://infocenter.arm.com/help/topic/com.arm.doc.dui0553b/DUI0553.pdf

    -Bill

  • wpaul wrote:

    If it were me, instead of using RO/RO (0x7) as the protection attributes, I would use NA/NA (0x0). This will trap both reads and writes. Reading past the end of the stack could also be a sign of a bug that is best fixed. I have found that you can get away with doing this even when setting a protection region at address 0x0. When I first tried this I was concerned it might have some impact on exception handling (since the exception vector table is also at address 0x0), but while reads and writes are successfully trapped, exception handling seems unaffected.

    I tried setting the MPU with a region at location zero with both read and write access disabled (0x0 for AP field). My experiments show that it still allows reads--only disables writes.  So I'm unable to detect reads of NULL pointers.  Have you been able to get it to trigger on reads from zero as well?

    - Tony

  • My experiments

    You need to describe your experiments, otherwise I have no idea what you actually did.

    Have you been able to get it to trigger on reads from zero as well?

    Yes, I did. That's why I brought it up.

    The code that I used to enable the MPU is here:

    https://github.com/netik/dc27_badge/blob/4bba0e58a304671b48755b99113b3d489259760a/software/firmware/badge/nullprot_lld.c#L72

    This is done with the MPU APIs in ChibiOS, but all of the code for that is in the repo, so you should be able to resolve all the macros and see what values are actually being used.

    This also works for the Cortex-M7 CPU that I'm using now. I use test code like this:

         uint8_t * blah = (uint8_t *)0;
         printf ("moo: %x\n", blah[0]);

    The result I get with my trap handler support is this:

    ********** MEMMANAGE FAULT **********
    Data access violation
    Memory fault address: 0x00000000
    Fault while in thread mode
    Floating point context saved on stack
    Interrupt is pending
    Exception pending: 53
    Exception active: 4
    PC: 0x0021158C LR: 0x00204FF5 SP: 0x200007E8 SR: 0x61000000
    R0: 0x20000F38 R1: 0x20001FE4 R2: 0x00000001 R3: 0x00000000 R12: 0x00000820

    The PC value above is at exactly the instruction that does the load:

    21158c: 7823 ldrb r3, [r4, #0]

    The Nordic SDK has some similar code in it. You can find mine in the badge_vectors.c and badge_fault.S modules in the above github repo. Also, in main.c , I do this:

    /*
     * Enable memory management, usage and bus fault exceptions, so that
     * we don't always end up diverting through the hard fault handler.
     * Note: the memory management fault only applies if the MPU is
     * enabled, which it currently is (for stack guard pages).
     */

    SCB->SHCSR |= SCB_SHCSR_USGFAULTENA_Msk |
                  SCB_SHCSR_BUSFAULTENA_Msk |
                  SCB_SHCSR_MEMFAULTENA_Msk;
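
    For readers without a trap handler of their own, the "Data access violation" and "Memory fault address" lines in the output above can be derived from the MemManage status byte in SCB->CFSR. The sketch below is a host-side decoder with hypothetical function names and message strings; it is not the code from badge_fault.S or from the Nordic SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* MMFSR bits: the low byte of SCB->CFSR on Armv7-M. Local names. */
#define MMFSR_IACCVIOL   (1u << 0)  /* instruction fetch violation */
#define MMFSR_DACCVIOL   (1u << 1)  /* data access violation */
#define MMFSR_MUNSTKERR  (1u << 3)  /* fault unstacking on exception return */
#define MMFSR_MSTKERR    (1u << 4)  /* fault stacking on exception entry */
#define MMFSR_MLSPERR    (1u << 5)  /* fault in lazy FP state preservation */
#define MMFSR_MMARVALID  (1u << 7)  /* MMFAR holds the faulting address */

/* Short description of the first MemManage cause found in cfsr. */
static const char *memmanage_cause(uint32_t cfsr)
{
    if (cfsr & MMFSR_DACCVIOL)  return "Data access violation";
    if (cfsr & MMFSR_IACCVIOL)  return "Instruction access violation";
    if (cfsr & MMFSR_MSTKERR)   return "Stacking error on exception entry";
    if (cfsr & MMFSR_MUNSTKERR) return "Unstacking error on exception return";
    if (cfsr & MMFSR_MLSPERR)   return "Lazy FP state preservation error";
    return "No MemManage fault recorded";
}

/* True if SCB->MMFAR can be trusted as the fault address. */
static bool mmfar_valid(uint32_t cfsr)
{
    return (cfsr & MMFSR_MMARVALID) != 0u;
}
```

    On the target, a handler would pass SCB->CFSR to these helpers and print SCB->MMFAR only when mmfar_valid() returns true. A stack overflow that hits the guard region during exception entry would be expected to show up as the stacking error rather than as a plain data access violation.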

    -Bill

  • Hi Bill,

    I appreciate your timely and detailed response.

    I've taken a look at your code and have tried to replicate the exact settings, but I'm still only seeing a memory fault on writes to address 0; reads do not fault.

    Your code is essentially:

    #define mpuConfigureRegion(region, addr, attribs) {                         \
      MPU->RNR  = ((uint32_t)region);                                           \
      MPU->RBAR = ((uint32_t)addr);                                             \
      MPU->RASR = ((uint32_t)attribs);                                          \
    }
    
    mpuConfigureRegion (MPU_REGION_6, 0x0, 
        MPU_RASR_ATTR_AP_NA_NA | 
        MPU_RASR_ATTR_NON_CACHEABLE | 
        MPU_RASR_SIZE_1K | 
        MPU_RASR_ENABLE
    );
    

    Which, following out the macro values--if I'm not mistaken--becomes:

      MPU->RNR  = 6; 
      MPU->RBAR = 0;
      MPU->RASR = 0x00080013;
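
    As a cross-check of that arithmetic, the RASR value can be split back into its fields on a host machine. The struct and function names below are invented for this sketch; the bit positions are the Armv7-M MPU_RASR and MPU_CTRL layouts.

```c
#include <stdint.h>

/* MPU_CTRL bits (local names). */
#define CTRL_ENABLE      (1u << 0)
#define CTRL_HFNMIENA    (1u << 1)
#define CTRL_PRIVDEFENA  (1u << 2)

/* Decoded view of an MPU_RASR value. */
struct rasr_fields {
    unsigned enable;      /* bit 0 */
    uint64_t size_bytes;  /* 2^(SIZE+1), SIZE is bits 5:1 */
    unsigned b, c, s;     /* bits 16, 17, 18 */
    unsigned tex;         /* bits 21:19 */
    unsigned ap;          /* bits 26:24 */
    unsigned xn;          /* bit 28 */
};

static struct rasr_fields rasr_decode(uint32_t rasr)
{
    struct rasr_fields f;
    f.enable     = rasr & 1u;
    f.size_bytes = (uint64_t)1 << (((rasr >> 1) & 0x1Fu) + 1u);
    f.b          = (rasr >> 16) & 1u;
    f.c          = (rasr >> 17) & 1u;
    f.s          = (rasr >> 18) & 1u;
    f.tex        = (rasr >> 19) & 0x7u;
    f.ap         = (rasr >> 24) & 0x7u;
    f.xn         = (rasr >> 28) & 1u;
    return f;
}
```

    Decoding 0x00080013 gives an enabled 1024-byte region with AP = 0 (no access) and TEX = 1 with C = B = 0 (normal, non-cacheable memory), which matches the macros in the call. Likewise, a CTRL value of 0x05 is CTRL_ENABLE | CTRL_PRIVDEFENA.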

    When I execute those three register writes (followed by MPU->CTRL = 0x05), it protects from writes only. I can execute your code snippet just fine:

    uint8_t * blah = (uint8_t *)0;
    printf ("moo: %x\n", blah[0]);

    It's only when I execute a write that the memory fault triggers. I'm also enabling the individual faults:

        SCB->SHCSR |= (
          SCB_SHCSR_USGFAULTENA_Msk |
          SCB_SHCSR_BUSFAULTENA_Msk |
          SCB_SHCSR_MEMFAULTENA_Msk
          );

    And I do get a memory management-specific fault when I write, but never on a read.

    I've experimented with slightly different variations of attribute settings (from a previous project running on a STM32F437 MCU that tripped on reads at 0) but the results remain the same--I'm never able to trigger a fault on reading a NULL pointer.

    I'm using an nRF52840 and s140_nrf52_6.1.0_softdevice.hex (and I'm doing this MPU configuration before the SD is enabled--like you seem to do).  So I remain puzzled about what is going on.

    Wait a minute... I just realized that I've always stepped through this code in the debugger.  But, if I let the processor free-run through the fault-generating code (even in the debugger), then it does generate the memory fault on the read!  Wow, wish I had thought of that earlier.  Not exactly sure what aspect of single-stepping is suppressing the read memory fault, but that seems to be what made me think it wasn't working.

  • One other observation, regarding why exception vectors don't trip the MPU:

    "When the MPU is enabled, accesses to the System Control Space and vector table are always permitted. Other areas are accessible based on regions and whether PRIVDEFENA is set to 1."

    Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0552a/BABDJJGF.html
