Zephyr app at custom location causing hard fault

Hi all,

I am using an nRF52832. I have a custom bootloader because it needs to be able to receive an update file via BLE, even if the main application is broken. There is also no space for two slots for the main app.

So there is just the custom bootloader + main app. The bootloader seems to be working fine. The main app also works as expected when I use a minimal example (blinky with a button triggering an interrupt). However, once I enable BT in my config with CONFIG_BT=y and CONFIG_BT_PERIPHERAL=y, the app crashes in a very early startup phase in Zephyr, causing a hard fault.

My setup is like this:

Overlay flash partition used for main app.

/ {
    chosen {
        zephyr,code-partition = &app_partition;
    };

    aliases {
        flash0 = &flash0;
    };
};

&flash0 {
    partitions {
        compatible = "fixed-partitions";

        boot_partition: partition@0 {
            label = "bootloader";
            reg = <0x00000000 0x00030000>;
        };

        app_partition: partition@30000 {
            label = "app";
            reg = <0x00030000 0x00050000>;
        };
    };
};
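
As a sanity check on this layout, the arithmetic can be written out as compile-time assertions. This is only a sketch: the `Partition` struct and the constant names are made up for illustration, the numbers are copied from the overlay, and the 512 KB flash size is an assumption about the chip variant rather than something read from the hardware.

```cpp
#include <cstdint>

// Hypothetical helper type; offsets and sizes are copied from the overlay.
struct Partition {
    std::uint32_t offset;
    std::uint32_t size;
    constexpr std::uint32_t end() const { return offset + size; }
};

constexpr Partition kBoot{0x00000000u, 0x00030000u};
constexpr Partition kApp{0x00030000u, 0x00050000u};

// Assumption: the 512 KB flash variant of the nRF52832.
constexpr std::uint32_t kFlashSize = 512u * 1024u;

// True if the partition lies entirely inside a flash of the given size.
constexpr bool fits(const Partition& p, std::uint32_t flash_size) {
    return p.end() <= flash_size;
}

static_assert(kBoot.end() == kApp.offset, "app must start where the bootloader ends");
static_assert(fits(kApp, kFlashSize), "app partition must end inside the flash");
```

The app partition ends at 0x00080000, so this layout uses the full 512 KB and leaves no room on a part with less flash.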

Config of main app:

CONFIG_SERIAL=y

CONFIG_CPP=y
CONFIG_STD_CPP17=y

CONFIG_BT=y
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_DEVICE_NAME="name"

CONFIG_USE_DT_CODE_PARTITION=y
CONFIG_INIT_ARCH_HW_AT_BOOT=y
CONFIG_MPU_ALLOW_FLASH_WRITE=y

If the debugger can be trusted, then the fault occurs somewhere in system_nrf52.c near:

    #if NRF52_ERRATA_12_ENABLE_WORKAROUND
        /* Workaround for Errata 12 "COMP: Reference ladder not correctly calibrated" found at the Errata document
           for your device located at https://infocenter.nordicsemi.com/index.jsp */
        if (nrf52_errata_12()){
            *(volatile uint32_t *)0x40013540 = (*(uint32_t *)0x10000324 & 0x00001F00) >> 8;
        }
    #endif

I am not sure what causes this problem or how to debug it further. Maybe I am doing something wrong. Any help will be appreciated. Thanks!

  • OK, then we look somewhere else.

    Mark G. said:
    OK. Interestingly, the hard fault seems to occur once the offset (0x00030000) + size is bigger than 256 KB, which is very telling.

    Can you please tell me how you came to this conclusion? What debug info did you see, and what tests did you run?

    I suspect that if you came to this conclusion with a debugger, the debug info might be misleading if there is stack corruption.

    However, once I enable BT in my Config with CONFIG_BT=y and CONFIG_BT_PERIPHERAL=y then the app crashes in a very early startup phase in Zephyr, causing a hard fault.

    I think this smells like stack corruption. If you enabled new features while keeping minimal stack sizes, I would recommend increasing the stack sizes, such as CONFIG_MAIN_STACK_SIZE, CONFIG_BT_RX_STACK_SIZE and CONFIG_BT_HCI_TX_STACK_SIZE, in your prj.conf (double them for testing) and checking whether the issue persists. If it does, we can rule out insufficient RAM and stack and look again at your partition manager settings.

  • Thanks for the quick reply!

    I am testing with a minimal "blinky"-like main.cpp that is the only source for this build. I set the partition and the start of the partition. Then I tested by varying the offset addresses in the flash and by enabling different features in the config.

    I look at the output size of the binary:

    [100%] Linking CXX executable trigger.elf
    Memory region         Used Size  Region Size  %age Used
               FLASH:      163508 B       320 KB     49.90%
                 RAM:       23428 B        64 KB     35.75%
            IDT_LIST:          0 GB        32 KB      0.00%

    and then add the "flash offset" to it. By testing different combinations of start address and size, I figured out that a total size of 256 KB is somehow the magic boundary between working and not working.
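
    Written out for one concrete case, the calculation looks like the sketch below. The offset is the one from the overlay in the first post and the used size is the one from the linker output above; both were varied between test runs, so treat them as assumed sample values. The function names are hypothetical.

```cpp
#include <cstdint>

// Assumed sample values: offset from the overlay in the first post,
// used size from the "FLASH: Used Size" column of the linker output.
constexpr std::uint32_t kAppOffset = 0x00030000u;
constexpr std::uint32_t kUsedFlash = 163508u;
constexpr std::uint32_t kBoundary = 256u * 1024u;  // suspected limit in bytes

// First flash address after the image: start offset plus used size.
constexpr std::uint32_t image_end(std::uint32_t offset, std::uint32_t used) {
    return offset + used;
}

// True if the image extends past the given boundary address.
constexpr bool crosses_boundary(std::uint32_t offset, std::uint32_t used,
                                std::uint32_t boundary) {
    return image_end(offset, used) > boundary;
}

static_assert(image_end(kAppOffset, kUsedFlash) == 360116u,
              "196608 + 163508 bytes");
static_assert(crosses_boundary(kAppOffset, kUsedFlash, kBoundary),
              "this build extends past 256 KB");
```

    With these two numbers the image ends at byte 360116, well past the 262144-byte mark.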

    I even did further testing with Bluetooth completely deactivated. By controlling the flash memory occupied by the app with a constant array, I could reproduce the issue.

    // occupy 141 KB of flash memory; together with the offset this is > 256 KB
    constexpr uint8_t magicData[1024 * 141] = {1, 2}; // leads to the error
    
    // results in slightly less than 256kb flash memory
    // constexpr uint8_t magicData[1024 * 140] = {1, 2}; // no error!
    
    volatile uint8_t sum = 0;
    
    int main(void)
    {
        for (size_t i = 0; i < sizeof(magicData); ++i) {
            sum += magicData[i];
        }
        
        ....

    As soon as more than 256 KB is occupied, it fails.

    So it's pretty clear that it's not related to BT, but somehow to the partition / memory management.

  • Mark, 

    Thanks for that test and results. That helps. So it is not related to Bluetooth and the stack sizes then.

    Can you please check the ".config" file in your build folder and see if "CONFIG_PARTITION_MANAGER_ENABLED" is enabled? If it is, then you need to update the pm.yml file in your bootloader to reflect the sizes of your partitions correctly (your devicetree settings do not matter in this case, as the partition manager takes control).

    Also make sure that the names of the partitions are the same as your bootloader expects, something like the following in your pm.yml file:

    # pm.yml
    partitions:
      flash0:
        # FIXME: fix names of partitions as my bootloaders expect
        bootloader:
          address: 0x00000000
          size:    0x00030000
        app:
          address: 0x00030000
          size:    0x00050000
    

    The variable CONFIG_PARTITION_MANAGER_ENABLED does not even exist as a valid configuration option.

    One important piece of information may be that I currently use "raw Zephyr", no nRF SDK.

  • Hmm, then there is a lot that I cannot guess about your setup.

    Please check the flash size (FICR INFO.FLASH) in the hardware

    nrfjprog --memrd 0x10000110 --w 4

    and see what result you get. If you get 00000200, then you have 512 KB, and beyond that I do not know what could be wrong. In that case I strongly recommend that you follow this up in the Zephyr forum, as you are not using our tested release tags.
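
    For reading the result: INFO.FLASH gives the flash size in kilobytes, so 0x200 means 512 KB and 0x100 would mean 256 KB. A minimal sketch of that decoding follows, with hypothetical helper names and with the layout end (0x00080000) taken from the overlay in the first post as an assumption.

```cpp
#include <cstdint>

// INFO.FLASH holds the flash size in kilobytes (0x200 -> 512 KB).
constexpr std::uint32_t flash_bytes(std::uint32_t info_flash_kb) {
    return info_flash_kb * 1024u;
}

// True if a layout ending at layout_end fits the flash reported by INFO.FLASH.
constexpr bool layout_fits(std::uint32_t info_flash_kb, std::uint32_t layout_end) {
    return layout_end <= flash_bytes(info_flash_kb);
}

// Assumption: end of app_partition from the overlay (0x30000 + 0x50000).
constexpr std::uint32_t kLayoutEnd = 0x00080000u;

static_assert(flash_bytes(0x200u) == 512u * 1024u, "0x200 decodes to 512 KB");
static_assert(layout_fits(0x200u, kLayoutEnd), "layout fits a 512 KB part");
static_assert(!layout_fits(0x100u, kLayoutEnd), "layout does not fit a 256 KB part");
```

    A reading of 00000100 would therefore mean the partition layout from the first post extends beyond the physical flash.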
