
NRF_ERROR_NO_MEM

After updating from THREAD SDK 2.0 to SDK 3.1 I am getting NRF_ERROR_NO_MEM in nrf_sdh.c.

The project is based on the multiprotocol BLE Thread dynamic example.

The code runs for hours. Then, after a reboot, during initialization, when calling the otThreadSetEnabled function, I get an NRF_ERROR_NO_MEM error reported from line 391 in nrf_sdh.c.

Looking in the app_sched_event_put function it seems that the event_index value is not being changed from the default value.

I have increased the SCHED_QUEUE_SIZE define from 32 to 48 but the issue is still happening.

The BLE DFU is enabled in this code.

Once this happens the radio is bricked.

What would cause the app queue to be full so early in startup?

  • In my code I never call app_sched_event_put directly. It is called from functions that are part of either the Nordic SDK or OPENTHREAD. In the code I have TWIS, BLUETOOTH and OPENTHREAD running. Before updating to SDK 3.1 we had a network of 256 nodes running this same code for weeks. After nodes with the SDK 3.1 code crashed, I connected a debugger to one of them. It is calling the app_sched_event_put function when the otThreadSetEnabled function is called, and it never returns from trying to start OPENTHREAD during startup. I have increased SCHED_QUEUE_SIZE compared to the pre-SDK 3.1 code. It still happens.

    The real question is what is happening to cause the crashing that bricks the radio in the first place. 

  • Hi Jay,

    I have forwarded this information to our Thread team, but unfortunately they are quite busy at the moment, so please be patient. I am out of ideas about what the problem could be here, so I will update this ticket when I get an answer from them.

    Best regards,

    Marjeris

  • Hi Jay,

    Sorry for the late answer. After finally discussing this case with the Thread team we think it would be good to take a look at your code to move this issue forward. Could you share your main.c, sdk config file and linker script with us? I can also make this case private if you want before sharing these files.

    I know you already have done some debugging at your end, but have you tried to set a breakpoint at NRF_ERROR_NO_MEM (line 211) and print the variables in app_scheduler.c?

    static event_header_t * m_queue_event_headers;  /**< Array for holding the queue event headers. */
    static uint8_t        * m_queue_event_data;     /**< Array for holding the queue event data. */
    static volatile uint8_t m_queue_start_index;    /**< Index of queue entry at the start of the queue. */
    static volatile uint8_t m_queue_end_index;      /**< Index of queue entry at the end of the queue. */
    static uint16_t         m_queue_event_size;     /**< Maximum event size in queue. */
    static uint16_t         m_queue_size;           /**< Number of queue entries. */

    These variables should be initialized to zero, but one of our theories is that maybe something is causing these variables to be corrupted somehow? Maybe the linker script is outdated for SDK 3.1? Are you using the same linker scripts as for SDK 2.0?

    If you haven't done it yet you should also turn off optimization, and set the -debug flag so you can get more information when debugging, it could be really helpful.

    You also mention having LEDs on the devices. Are you using bsp_thread.c or another bsp module for this? The bsp_thread.c module has a state_changed_callback() handler which is called every time a device role changes. It then calls the app timer, which uses the app scheduler (see the code in bsp_thread_ping_indication_set()), so another theory is that the problem may be related to this as well...

    Best regards,

    Marjeris

  • You can close this ticket 

    We went back to using the OPENTHREAD from git not the pre-compiled libraries from the SDK.

    Problem solved.
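
For readers hitting the same symptom, the following is a minimal sketch of why raising SCHED_QUEUE_SIZE may not help if the corruption theory from the reply above holds. It assumes the scheduler tracks its queue with a start index and an end index, as the variable comments quoted above suggest. It is a simplified model, not the SDK source: the `sketch_*` names and the struct are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's queue bookkeeping. */
typedef struct {
    volatile uint8_t start_index; /* index of the oldest queued entry */
    volatile uint8_t end_index;   /* index of the next free slot */
    uint16_t         size;        /* usable entries; buffer has size + 1 slots */
} sketch_queue_t;

/* Advance an index by one slot, wrapping back to 0 after slot `size`. */
static uint8_t sketch_next_index(const sketch_queue_t *q, uint8_t index)
{
    return (index < q->size) ? (uint8_t)(index + 1) : 0;
}

/* The queue counts as full when advancing end would collide with start. */
static bool sketch_queue_full(const sketch_queue_t *q)
{
    return sketch_next_index(q, q->end_index) == q->start_index;
}

/* Reserve one slot. Returns false where a real scheduler would
 * report an out-of-memory error. */
static bool sketch_put(sketch_queue_t *q)
{
    assert(q->size > 0 && q->size < UINT8_MAX); /* indices are 8-bit */
    if (sketch_queue_full(q)) {
        return false;
    }
    q->end_index = sketch_next_index(q, q->end_index);
    return true;
}

/* Release the oldest slot. Returns false if the queue is empty. */
static bool sketch_get(sketch_queue_t *q)
{
    if (q->start_index == q->end_index) {
        return false;
    }
    q->start_index = sketch_next_index(q, q->start_index);
    return true;
}
```

In this model a queue initialized with both indices at zero accepts `size` entries before `sketch_put` fails. If the indices are not zero at startup, for example `start_index == 1` and `end_index == 0`, the very first `sketch_put` fails no matter how large `size` is, which would be consistent with the queue appearing full immediately after a reboot.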
