Program stops running, Mesh Serial network.

Hi,

My problem is described below. Please share a solution or any suggestion on how to fix it.

=== Background / Problem ===

  • I set up and ran a Mesh serial network for machine sensors. It works well for at least 2 weeks, then STOPS running.
  • Re-programming the application doesn't help.
  • Only erasing the entire flash and re-programming the SoftDevice and the application gets it running again.

=== Device ===

  • nRF52832 module

=== Code configuration ===

  • Keil compiler, preprocessor symbols:

CONFIG_NFCT_PINS_AS_GPIOS DONOT_CONFIG_GPIO_AS_PINRESET FLOAT_ABI_HARD NRF52 NRF52832_XXAA NRF52_PAN_74 NRF_SD_BLE_API_VERSION=6 S132 SOFTDEVICE_PRESENT SWI_DISABLE0 CONFIG_APP_IN_CORE NRF_MESH_LOG_ENABLE=1 PERSISTENT_STORAGE=1 NRF52_SERIES NRF52832 SD_BLE_API_VERSION=6 uECC_OPTIMIZATION_LEVEL=2 uECC_SUPPORTS_secp160r1=0 uECC_SUPPORTS_secp192r1=0 uECC_SUPPORTS_secp224r1=0 uECC_SUPPORTS_secp256r1=1 uECC_SUPPORTS_secp256k1=0 uECC_SUPPORT_COMPRESSED_POINT=0

  • The code is modified from the Mesh Serial example, "nrf5_SDK_for_Mesh_v2.1.1_src\examples\serial". No bootloader.
  • Added coexistence (following "how_to_nordicSDK.md") with ble_app_uart (nRF5_SDK_15.0.0_a53641a\examples\ble_peripheral\ble_app_uart).
  • Enabled persistent storage by changing the symbol definition from "PERSISTENT_STORAGE=0" to "PERSISTENT_STORAGE=1".
  • Applied the fix described in "https://devzone.nordicsemi.com/f/nordic-q-a/34542/bluetooth-5-ble-mesh-assert-0x0002dc36"
  • Added a quick setup inside the code instead of external serial commands (no external controller).

void mesh_quick_setup_(uint16_t address)
{
    uint32_t status;
    
    // dsm_subnet_add
    status = dsm_subnet_add(0, MESH_DEFAULT_SUBNET_KEY, &MESH_subnet_handle);
    __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, "dsm_subnet_add: %u\n", status);

    // dsm_appkey_add
    status = dsm_appkey_add(0, MESH_subnet_handle, MESH_DEFAULT_APP_KEY, &MESH_appkey_handle);
    __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, "dsm_appkey_add: %u\n", status); 

    // dsm_local_unicast_addresses_set
    dsm_local_unicast_address_t local = {
        .address_start = address, //DEFAULT_LOCAL_UNICAST_ADRESS_START,
        .count = ACCESS_ELEMENT_COUNT
    };
    status = dsm_local_unicast_addresses_set(&local);
    __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, "dsm_local_unicast_addresses_set: %u\n", status); 

    // publication
    uint16_t pub_count = get_setting_publication_count();
    for (uint16_t i=0; i< pub_count; i++) {
        status = mesh_addr_publication_add(get_setting_publication(i));
        __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, "mesh_addr_publication_add: %X, sta:%d\n", get_setting_publication(i), status); 
    } 
} 

  • Added an event handler to receive Mesh data

m_evt_handler.evt_cb = serial_handler_mesh_evt_handle;
nrf_mesh_evt_handler_add(&m_evt_handler);



=== Debugging ===
Below is the information from my debugger:

  • The CPU is stuck in the "flash_manager_wait" function and never returns until the watchdog resets the device (10 sec).

/** Waits for the flash manager to complete all its operations. */
static inline void flash_manager_wait(void)
{
#if !defined(HOST)
    while (!flash_manager_is_stable())
    {
        /* Temporary hack to make sure that bearer events are handled while waiting for
         * the flash manager to finish.
         * TODO: Find a solution for this that does not include busy-waiting. */
        if (bearer_event_handler())
        {
            __WFE();
        }
    }
#endif
}
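
A bounded version of this loop could report the hang instead of silently waiting for the watchdog. The following is only a minimal sketch of that idea, not SDK code: bounded_wait() and the fake_* helpers are hypothetical names, and the two callbacks merely stand in for flash_manager_is_stable() and bearer_event_handler().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback types standing in for flash_manager_is_stable()
 * and bearer_event_handler(). */
typedef bool (*stable_fn_t)(void);
typedef bool (*pump_fn_t)(void);

/* Polls is_stable() at most max_iterations times, pumping events in between.
 * Returns true if the stable state was reached, false if the budget ran out. */
bool bounded_wait(stable_fn_t is_stable, pump_fn_t pump_events, uint32_t max_iterations)
{
    for (uint32_t i = 0; i < max_iterations; i++)
    {
        if (is_stable())
        {
            return true;
        }
        (void) pump_events();
    }
    return is_stable();
}

/* Test doubles (hypothetical): one becomes stable after a few polls,
 * one never does, which mimics the hang described above. */
uint32_t g_polls_left = 0;

bool fake_is_stable(void)
{
    if (g_polls_left == 0)
    {
        return true;
    }
    g_polls_left--;
    return false;
}

bool fake_never_stable(void)
{
    return false;
}

bool fake_pump(void)
{
    return true;
}
```

When bounded_wait() returns false, the application could log the state or set a breakpoint before the watchdog fires. The iteration budget is an arbitrary choice here and would need tuning on real hardware.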

  • I observe that the MCU appends 8 bytes of data to flash every time it resets.

Here is the flash memory content read with "nrfjprog --readcode d:/code.hex":

:020000040007F3
:10900000080410100100FFFFFFFFFFFFFFFFFFFF3D
:10901000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF60
:10902000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF50
...
...
:10A00000080410100100FFFF040001000800080010
:10A010000A0020000800000006000000000000BB4D
:10A02000CCBBCCBBCCBBCCBBCCBBCCBBCCBBCC00B3
:10A030000600000000000000AABBAABBAABBAABB86
:10A04000AABBAABBAABBAABB020002000001010076
:10A05000020000102100000006000030000000BBDC
:10A06000CCBBCCBBCCBBCCBBCCBBCCBBCCBBCC0073
:10A070000600004000000000AABBAABBAABBAABB06
:10A08000AABBAABBAABBAABBFFFFFF7FFFFFFFFFC4
:10A09000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD0
:10A0A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC0
:10A0B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB0
...
...
:10B00000080410100100FFFF030002000000000010
:10B01000000000000200000000200000020000000C
:10B02000004000000200000000600000020000007C
:10B03000008000000200000000A0000002000000EC
:10B0400000C000000200000000E00000020000005C
:10B0500000000100020000000020010002000000CA
:10B06000004001000200000000600100020000003A
:10B07000008001000200000000A0010002000000AA
:10B0800000C001000200000000E00100020000001A
:10B090000000020002000000002002000200000088
:10B0A00000400200020000000060020002000000F8
...
...
...
:10BF800000A03D000200000000C03D0002000000D3
:10BF900000E03D000200000000003E000200000042
:10BFA00000203E000200000000403E0002000000B1
:10BFB00000603E000200000000803E000200000021
:10BFC00000A03E000200000000C03E000200000091
:10BFD00000E03E000200000000003F000200000000
:10BFE00000203F000200000000403F00020000006F
:10BFF00000603F000200010000803F00FFFFFF7F64
:10C00000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF40
:10C01000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF30

I observe that the problem appears when the write address reaches 0x7BFFF.
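
For reference, this is how the address 0x7BFFF follows from the Intel HEX records above. It is a minimal sketch: the helper names are my own (hypothetical), and it assumes well-formed records as shown in the dump, with no checksum validation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse n (<= 8) hex digits starting at s. */
static uint32_t hex_field(const char *s, int n)
{
    char buf[9] = {0};
    memcpy(buf, s, (size_t) n);
    return (uint32_t) strtoul(buf, NULL, 16);
}

/* Upper 16 address bits carried by a type-04 record such as ":020000040007F3". */
uint32_t ihex_upper16(const char *rec)
{
    return hex_field(rec + 9, 4);
}

/* Absolute address of the first data byte of a type-00 record. */
uint32_t ihex_first_addr(const char *rec, uint32_t upper16)
{
    return (upper16 << 16) | hex_field(rec + 3, 4);
}

/* Absolute address of the last data byte of a type-00 record. */
uint32_t ihex_last_addr(const char *rec, uint32_t upper16)
{
    uint32_t len = hex_field(rec + 1, 2);
    return ihex_first_addr(rec, upper16) + len - 1u;
}

/* Little-endian 32-bit word at byte offset off within the record's data. */
uint32_t ihex_le32(const char *rec, int off)
{
    const char *p = rec + 9 + 2 * off;
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
    {
        v |= hex_field(p + 2 * i, 2) << (8 * i);
    }
    return v;
}
```

With the upper address 0x0007 from the first record, the last data record (":10BFF000...") covers 0x7BFF0 to 0x7BFFF, which is the end of the 4 KB page starting at 0x7B000. Reading the words as little-endian, the values visible in that page increase in steps of 0x2000 (0x2000, 0x4000, ... up to 0x3F8000 in the last record).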

Thank you,

  • This sounds like the FDS garbage collector is never run, so at some point the storage runs out of space. As you mention that flash is written to at every reset, running the garbage collector once during initialisation should be fine.
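
    To put a rough number on "runs out of space", here is a back-of-the-envelope sketch. It assumes the 4096-byte flash page of the nRF52832, an 8-byte page header (inferred from the first 8 bytes of each page in the dump, not confirmed against the SDK), and the 8 bytes per reset reported above. The function name is hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096u  /* nRF52832 flash page size */
#define PAGE_HEADER_BYTES 8u     /* assumption: inferred from the dump */
#define ENTRY_BYTES       8u     /* from the report: 8 bytes appended per reset */

/* Upper bound on how many fixed-size entries fit in one page after the header.
 * Returns 0 for nonsensical inputs. */
uint32_t resets_until_page_full(uint32_t page_bytes, uint32_t header_bytes, uint32_t entry_bytes)
{
    if (entry_bytes == 0u || header_bytes > page_bytes)
    {
        return 0u;
    }
    return (page_bytes - header_bytes) / entry_bytes;
}
```

    With the constants above this gives an upper bound of about 511 entries, so on the order of 500 resets would fill one page if nothing is ever reclaimed. Other entries in the same page lower that number slightly.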

  • Hi,

    Q1: Do you have a call stack/backtrace when the device is looping in flash_manager_wait()?

    Q2: Did you add fstorage/peer_manager to the project? It might be problematic to have the flash_manager and fds/fstorage/peer_manager in the same project.

    Q3: Could you attach the whole code.hex readout, or post the .hex for the pages used by flash_manager? This will make it easier for us to reproduce the issue.

    Kind regards,

    Håkon