Problem with flash_manager, NRF_MESH_ASSERT(p_manager->config.p_area[i].metadata.page_index < i);

We have a sudden problem with the flash_manager. The code stops in static uint32_t flash_area_build(flash_manager_t * p_manager), at the line NRF_MESH_ASSERT(p_manager->config.p_area[i].metadata.page_index < i);

The system was working well for several days. After a reboot, the system halts at the mentioned NRF_MESH_ASSERT.
I downloaded the memory area that p_manager.config.p_area points to (0x000F0000).

The area is all 0xFF until 0x000F1000. There I can see the settings we stored in flash.

In the debugger, I see the p_manager->config.p_area[i].metadata =
.metadata_len = 0x08
.entry_header_length = 0x04
.entry_len_length_bits = 0x10
.entry_type_length_bits = 0x10
.pages_in_area = 0x02
.page_index = 0x01
._padding = 0xffff

Any hints?
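
The dump above can be checked by hand with a small helper. This is a minimal sketch, not SDK code: `page_metadata_t`, `page_status_t` and `page_metadata_check` are hypothetical names, the struct only mirrors the fields listed in the debugger output, and the equality test on `page_index` is the one quoted from the SDK later in this thread. It separates an erased page (all 0xFF, as seen at 0x000F0000) from a page whose index does not match its position in the area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the fields shown in the debugger dump; hypothetical type name. */
typedef struct {
    uint8_t  metadata_len;
    uint8_t  entry_header_length;
    uint8_t  entry_len_length_bits;
    uint8_t  entry_type_length_bits;
    uint8_t  pages_in_area;
    uint8_t  page_index;
    uint16_t _padding;
} page_metadata_t;

/* The dump reports metadata_len = 0x08, so the mirror must be 8 bytes. */
static_assert(sizeof(page_metadata_t) == 8, "unexpected metadata size");

typedef enum { PAGE_OK, PAGE_ERASED, PAGE_INDEX_MISMATCH } page_status_t;

/* Inspect the first bytes of a flash page (e.g. copied from a memory dump). */
static page_status_t page_metadata_check(const uint8_t *p_page, uint8_t expected_index)
{
    bool erased = true;
    for (size_t k = 0; k < sizeof(page_metadata_t); k++) {
        if (p_page[k] != 0xFF) {
            erased = false;
        }
    }
    if (erased) {
        return PAGE_ERASED; /* no metadata was ever written, or it was wiped */
    }

    page_metadata_t meta;
    memcpy(&meta, p_page, sizeof(meta));
    if (meta.page_index != expected_index || meta.page_index >= meta.pages_in_area) {
        return PAGE_INDEX_MISMATCH;
    }
    return PAGE_OK;
}
```

Fed with the values from this post, the page at 0x000F1000 passes as page 1 of a two-page area, while the page at 0x000F0000 reports as erased, which is where one would start looking.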

Parents
  • Hi,

    I see that in the SDK, the line is:

    NRF_MESH_ASSERT(p_manager->config.p_area[i].metadata.page_index == i);

    This means your file is different from the one delivered in the SDK, possibly in multiple places. I assume that you have patched the SDK at some point on your end, and so we are not looking at the same code base.

    In general, those asserts fail if the configuration is wrong (they are there to prevent overlapping areas), but from what I understand a failure can also mean the flash page is corrupted (e.g. other parts of the application have written to the same area, for instance if you use fstorage and/or FDS configured to use the same flash pages). If this happens on a DK, or if you have a lot of flash operations, it may also be flash wear. (Each page is rated for 10 000 write/erase cycles.)

    If this is reproducible (it happens consistently after a given amount of time, after a device has been completely erased and reprogrammed), then it is probably something we could look further into. If it happens sporadically and/or is difficult or impossible to reproduce, then I am afraid we likely do not have the resources to look further into the issue, as nRF5 SDK is now in maintenance mode.

    Regards,
    Terje
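
The overlap case mentioned in the reply above (fstorage or FDS configured onto the same pages as the flash manager) comes down to a range comparison. A minimal sketch, assuming 4 kB flash pages, which matches the 0x1000 spacing of the addresses in this thread; `flash_range_t` and both helpers are hypothetical names, not SDK APIs, and any second region you compare against is your own configuration, not something taken from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 0x1000u /* assumption: 4 kB pages */

/* A page-aligned flash region; hypothetical type for illustration. */
typedef struct {
    uint32_t start;      /* first byte of the region */
    uint32_t page_count; /* number of pages in the region */
} flash_range_t;

/* One past the last byte of the region. */
static uint32_t flash_range_end(flash_range_t r)
{
    return r.start + r.page_count * FLASH_PAGE_SIZE;
}

/* True if the two half-open ranges [start, end) share at least one byte. */
static bool flash_ranges_overlap(flash_range_t a, flash_range_t b)
{
    return a.start < flash_range_end(b) && b.start < flash_range_end(a);
}
```

For example, a two-page flash manager area at 0x000F0000 overlaps any other storage region that begins at 0x000F1000, but not one that begins at 0x000F2000.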

Children
  • Hi Terje,

    Thanks for your reply!

    We found out that the main reason for the problem is the DFU. The application size seems to be too big, and the DFU is overwriting part of the flash.
    For now we switched on 'optimization for size' in Segger Embedded Studio (although I don't like to change those settings on a running project) and that solved the issue.

    Can you point me to a place where the DFU process is explained? In Zephyr I know there are several methods. Are the same or similar methods available when using the SoftDevice?

    regards,

    Bob

  • Hi,

    Flash overwrites from the DFU would explain the issues, yes.

    There are basically two DFU solutions in nRF5 SDK. One is specific for Bluetooth Mesh, while the other is for BLE or serial. (In addition there is a legacy version of the BLE/serial one, which was deprecated around SDK version 11 if I recall correctly.) Those solutions are not compatible, and the mesh specific solution is not compatible with the DFU solution later specified in the Mesh Specifications (as it is a proprietary solution predating that from the Bluetooth SIG.)

    The DFU solutions in nRF Connect SDK are not compatible with the solutions from nRF5 SDK. However, there the BLE and Bluetooth Mesh solutions are different transports for the common back-end solution, and the Bluetooth Mesh solution implements the Bluetooth Mesh specifications for DFU.

    Regards,
    Terje

  • Hi,

    We are facing the same issue. After a DFU we get the same assert when the system boots.
    Our application size is 464 kB (.hex file size) with optimization for size enabled.
    Maybe that is not the right way to determine the real size of the app?

    Is it too big for DFU? If so, what is the size limit when using DFU?
    Is it really a size issue, or rather a problem with where our custom data is stored in flash?

    We use the "mesh_config" (MESH_CONFIG_ENTRY) system to write to and read from flash, as is done in the Nordic examples for every model. We added an entry for a generic user property model.
    We use these IDs (in model_config_file.h):
    #define MESH_APP_MODEL_GENERIC_USERPROPERTY_ID_START (0x2700)
    #define MESH_APP_MODEL_GENERIC_USERPROPERTY_ID_END (0x27FF)

    We already tried removing the part we added, but the assert still appears, so we are not convinced it is related to this.

    We would like to know whether size is the only cause before we clean up the whole project.

    Waiting for your answer! Thanks :)

  • Here is the startup log when it succeeds:

    <t:          0>, main.c, 3652, ----- BLE Mesh LPN Demo -----
    <t:      11749>, main.c, 3667, ----- 4 -----
    <t:      12256>, flash_manager_defrag.c,  630, BOOTLOADERADDR(): 0x00072000 fm_recovery_area: 0x00070000
    <t:      12264>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006F000 file_id: 0x0000
    <t:      12268>, flash_manager.c,  822, FM area: 0x0006F000
    <t:      12276>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006E000 file_id: 0x0001
    <t:      12280>, flash_manager.c,  822, FM area: 0x0006E000
    <t:      12288>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006D000 file_id: 0x0002
    <t:      12292>, flash_manager.c,  822, FM area: 0x0006D000
    <t:      12300>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006C000 file_id: 0x0003
    <t:      12303>, flash_manager.c,  822, FM area: 0x0006C000
    <t:      12311>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006B000 file_id: 0x0004
    <t:      12315>, flash_manager.c,  822, FM area: 0x0006B000
    <t:      12323>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006A000 file_id: 0x0005
    <t:      12327>, flash_manager.c,  822, FM area: 0x0006A000
    <t:      12335>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x00069000 file_id: 0xFFFE
    <t:      12339>, flash_manager.c,  822, FM area: 0x00069000
    advertiser_interval_set : 1000
    <t:      12356>, userproperty_model_client.c,   63, INIT BUFFER DONE
    <t:      12359>, userproperty_model_client.c,   63, INIT BUFFER DONE
    <t:      12385>, main.c, 3509, advertiser_instance_init OK
    <t:      12387>, main.c, 3669, ----- 5 -----

    And here is the startup log after DFU, with the assert:

    lfclk_config ok
    rtc_config ok
    <t:          0>, main.c, 3651, ----- BLE Mesh LPN Demo -----
    <t:      11650>, main.c, 3666, ----- 4 -----
    <t:      12160>, flash_manager_defrag.c,  630, BOOTLOADERADDR(): 0x00072000 fm_recovery_area: 0x00070000
    <t:      12167>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006F000 file_id: 0x0000
    <t:      12171>, flash_manager.c,  822, FM area: 0x0006F000
    <t:      12174>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006E000 file_id: 0x0001
    <t:      12178>, flash_manager.c,  822, FM area: 0x0006E000
    <t:      12182>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006D000 file_id: 0x0002
    <t:      12185>, flash_manager.c,  822, FM area: 0x0006D000
    <t:      12189>, mesh_config_flashman_glue.c,  351, Mesh config area: 0x0006C000 file_id: 0x0003
    <t:      12193>, flash_manager.c,  822, FM area: 0x0006C000
    <t:      12195>, app_error_weak.c,  105, Mesh assert at 0x0002F3A6 (:0)
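
The size question above can be framed against the addresses in these logs: the lowest flash manager area is logged at 0x00069000, so the application image has to end below that. A minimal sketch of that comparison, with loud assumptions: the helper names are hypothetical, the application start address is not given in this thread and must be taken from your own linker settings, and the size must be the number of bytes actually placed in flash. A .hex file is a text encoding, so its file size is larger than the flash it occupies. The sketch also ignores any extra room the DFU process itself may need for the incoming image.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest "FM area" address printed in the successful boot log above. */
#define LOWEST_FM_AREA 0x00069000u

/* One past the last byte of the application image. */
static uint32_t app_end(uint32_t app_start, uint32_t app_size_bytes)
{
    return app_start + app_size_bytes;
}

/* True if the image would extend into the flash manager areas. */
static bool app_reaches_fm_areas(uint32_t app_start, uint32_t app_size_bytes)
{
    return app_end(app_start, app_size_bytes) > LOWEST_FM_AREA;
}

/* Bytes left between the end of the image and the lowest area (0 if none). */
static uint32_t bytes_free_below_fm(uint32_t app_start, uint32_t app_size_bytes)
{
    uint32_t end = app_end(app_start, app_size_bytes);
    return (end >= LOWEST_FM_AREA) ? 0u : (LOWEST_FM_AREA - end);
}
```

With a purely illustrative start address of 0x00026000, an image of 0x40000 bytes ends at 0x00066000 and leaves 0x3000 bytes, while an image of 0x44000 bytes ends at 0x0006A000 and reaches into the areas.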
