Handling of errors in access_model_publish()

Hi,

As described in this post, I sometimes get error 4 (NRF_ERROR_NO_MEM) when calling:

error = access_model_publish(m_clients[0].model_handle, &msg);

How should this be handled (ignore, retry with a delay, etc.)?

Is there a way to find out whether the SoftDevice is still sending a message and is therefore not ready for the next one?

I'm using the mesh SDK 2.0.1.

Regards

Gerry

  • Hi Seger, 

    When access_model_publish() returns NRF_ERROR_NO_MEM, it means there is not enough buffer space for your message, and you have to wait for an NRF_MESH_EVT_TX_COMPLETE event before you continue. How often do you call the publish command? Was it a reliable message?
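
A minimal sketch of the "wait for TX complete, then retry" idea described above. It does not use the SDK headers: `sketch_publish_fn_t`, `retry_publisher_t`, and `sim_publish()` are hypothetical names invented for illustration, with `sim_publish()` standing in for a call such as access_model_publish(). Only one message is kept pending, so a newer message replaces an older unsent one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SUCCESS      0u
#define SKETCH_ERROR_NO_MEM 4u /* same value as the error 4 reported in this thread */

/* Hypothetical stand-in for a publish call such as access_model_publish(). */
typedef uint32_t (*sketch_publish_fn_t)(const void *p_msg);

typedef struct
{
    sketch_publish_fn_t publish;   /* hands a message to the stack */
    const void         *p_pending; /* message waiting for a free buffer, or NULL */
} retry_publisher_t;

/* Try to publish. On NO_MEM, remember the message so it can be retried later.
 * The caller must keep p_msg valid until it has actually been sent. */
static uint32_t retry_publisher_send(retry_publisher_t *p_pub, const void *p_msg)
{
    assert(p_pub != NULL && p_pub->publish != NULL);
    uint32_t status = p_pub->publish(p_msg);
    if (status == SKETCH_ERROR_NO_MEM)
    {
        p_pub->p_pending = p_msg;
    }
    return status;
}

/* Call from the handler of the TX-complete event.
 * Returns true if a pending message was sent by this call. */
static bool retry_publisher_on_tx_complete(retry_publisher_t *p_pub)
{
    assert(p_pub != NULL && p_pub->publish != NULL);
    if (p_pub->p_pending == NULL)
    {
        return false;
    }
    if (p_pub->publish(p_pub->p_pending) != SKETCH_SUCCESS)
    {
        return false; /* still no room; wait for the next TX-complete event */
    }
    p_pub->p_pending = NULL;
    return true;
}

/* Simulated stack with a limited number of free buffers (illustration only). */
static unsigned sim_free_buffers = 1;

static uint32_t sim_publish(const void *p_msg)
{
    (void) p_msg;
    if (sim_free_buffers == 0)
    {
        return SKETCH_ERROR_NO_MEM;
    }
    sim_free_buffers--;
    return SKETCH_SUCCESS;
}
```

In a real application the retry would presumably be triggered from the mesh event handler when NRF_MESH_EVT_TX_COMPLETE arrives, instead of polling or retrying after a fixed delay.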

  • Hi,

    I call access_model_publish() with an unreliable message 10 times per second, and I get NRF_ERROR_NO_MEM in about 50% of the calls. With Mesh SDK 1.0.0 I was able to send 20-30 messages per second with the same code without a problem. Is there a significant drop in speed with SDK 2.0.1?

    I didn't check the return value when using SDK 1.0.0, but now I sometimes get an assert as described here, so I thought this could be related. What do you think?

    Kind regards

        Gerry

  • Hi Hung

    BEARER_ADV_INT_DEFAULT_MS is set to 20. I didn't change any settings in nrf_mesh_config_bearer.

    GATT_PROXY is set to 0, so proxy.h is not included in my project. There are also no changes made to nrf_mesh_config_code. I use standard values and I don't use BLE.

    I only changed the settings in nrf_mesh_config_app.h. Could the problem be there?

    Here are my settings:

    SERVER_COUNT = 20

    CLIENT_MODEL_INSTANCE_COUNT = SERVER_COUNT + 1

    #define ACCESS_DEFAULT_TTL (SERVER_COUNT > NRF_MESH_TTL_MAX ? NRF_MESH_TTL_MAX : SERVER_COUNT)

    #define ACCESS_MODEL_COUNT (1 + /* Configuration client */ \
    1 + /* Health client */ \
    1 + /* Simple OnOff client (group) */ \
    2 + /* remote client */ \
    SERVER_COUNT /* Simple OnOff client (per server) */)

    #define ACCESS_ELEMENT_COUNT (1 + CLIENT_MODEL_INSTANCE_COUNT)

    #define ACCESS_SUBSCRIPTION_LIST_COUNT (ACCESS_MODEL_COUNT)

    #define ACCESS_FLASH_PAGE_COUNT (1)

    #define ACCESS_RELIABLE_TRANSFER_COUNT (ACCESS_MODEL_COUNT)

    #define DSM_SUBNET_MAX (1)
    #define DSM_APP_MAX (1)
    #define DSM_DEVICE_MAX (SERVER_COUNT) //(SERVER_COUNT)
    #define DSM_VIRTUAL_ADDR_MAX (1)
    #define DSM_NONVIRTUAL_ADDR_MAX (ACCESS_MODEL_COUNT + 1) 
    #define DSM_FLASH_PAGE_COUNT (2) 
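
As a quick check of what these settings expand to, the sketch below evaluates the macros at compile time for SERVER_COUNT = 20. The value 127 for NRF_MESH_TTL_MAX is an assumption made for this sketch (it is not stated in the thread), and the `static_assert` lines require a C11 compiler.

```c
#include <assert.h>

#define SERVER_COUNT                20
#define CLIENT_MODEL_INSTANCE_COUNT (SERVER_COUNT + 1)
#define NRF_MESH_TTL_MAX            127 /* assumed value, not taken from the thread */

#define ACCESS_DEFAULT_TTL (SERVER_COUNT > NRF_MESH_TTL_MAX ? NRF_MESH_TTL_MAX : SERVER_COUNT)

#define ACCESS_MODEL_COUNT (1 + /* Configuration client */ \
                            1 + /* Health client */ \
                            1 + /* Simple OnOff client (group) */ \
                            2 + /* remote client */ \
                            SERVER_COUNT /* Simple OnOff client (per server) */)

#define ACCESS_ELEMENT_COUNT    (1 + CLIENT_MODEL_INSTANCE_COUNT)
#define DSM_NONVIRTUAL_ADDR_MAX (ACCESS_MODEL_COUNT + 1)

/* Totals for SERVER_COUNT = 20, verified by the compiler. */
static_assert(ACCESS_DEFAULT_TTL == 20, "TTL follows SERVER_COUNT while it is below the maximum");
static_assert(ACCESS_MODEL_COUNT == 25, "5 fixed models + 20 per-server clients");
static_assert(ACCESS_ELEMENT_COUNT == 22, "1 + 21 client model instances");
static_assert(DSM_NONVIRTUAL_ADDR_MAX == 26, "one address per model + 1");
```

Every per-server client adds one model, one element, and one address entry, so the resource counts grow linearly with SERVER_COUNT.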

  • You have 21 elements (21 client models) on one device? I would suggest testing with fewer models/elements on one device. Do you really need one client for each server? If you have a large number of servers to control, I would suggest using a single client and changing the publication address instead.

    Please test and verify whether you see the issue with the stock light switch example.

  • Thanks for your answer Hung

    I think we don't really need one client per server, so I will eliminate this first and then set CLIENT_MODEL_INSTANCE_COUNT = 1. I only have to change the address using access_model_publish_address_set() when talking to the servers, right? Does this also affect the callbacks when receiving messages on the client?

  • No, it shouldn't affect receiving the reply from the server. The server replies to the unicast address of the client, so the client should always receive it, no matter which publication address it is currently pointing to.
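
A rough sketch of the single-client pattern discussed above: one client model, with the publication address switched before each send. All names here (`sketch_client_t`, `sketch_publish_address_set()`, `sketch_publish()`, the handle values) are hypothetical stand-ins and not the SDK's real types or signatures. In a real project the two stubs would presumably be replaced by access_model_publish_address_set() and access_model_publish().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SERVER_COUNT 3u

/* Hypothetical stand-in for an address handle type. */
typedef uint16_t sketch_addr_handle_t;

typedef struct
{
    sketch_addr_handle_t publish_address; /* address the client currently publishes to */
    sketch_addr_handle_t last_sent_to;    /* recorded by the stub, for illustration */
    uint8_t              last_value;      /* recorded by the stub, for illustration */
} sketch_client_t;

/* Stub: in a real project, a call that changes the model's publication address. */
static void sketch_publish_address_set(sketch_client_t *p_client, sketch_addr_handle_t handle)
{
    p_client->publish_address = handle;
}

/* Stub: in a real project, the publish call for the model. */
static void sketch_publish(sketch_client_t *p_client, uint8_t on_off)
{
    p_client->last_sent_to = p_client->publish_address;
    p_client->last_value   = on_off;
}

/* One address handle per server (made-up values). */
static const sketch_addr_handle_t m_server_handles[SKETCH_SERVER_COUNT] = { 10, 11, 12 };

/* Point the single client at the chosen server, then publish.
 * Returns 0 on success, -1 if the server index is out of range. */
static int client_send_to_server(sketch_client_t *p_client, size_t server_index, uint8_t on_off)
{
    assert(p_client != NULL);
    if (server_index >= SKETCH_SERVER_COUNT)
    {
        return -1;
    }
    sketch_publish_address_set(p_client, m_server_handles[server_index]);
    sketch_publish(p_client, on_off);
    return 0;
}
```

With this layout the per-server state shrinks to one address handle each, instead of one model and one element per server.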

  • I changed the code to one client now, thanks for pointing this out.

    access_model_publish() still returns 4 (NO_MEM) when sending approx. 5-10 messages per second (8 bytes).

    Also, the assert still happens in core_tx_complete_cb_set().

    This didn't happen in SDK 1.0.0. What could be the reason for that?

  • Hi, 

     

    Are you publishing reliable or unreliable messages? If you get NO_MEM, most likely reliable packets were not getting ACKed and piled up in the stack.

    Could you check what causes the assert in core_tx_complete_cb_set() and with which error code? The mesh stack is not a black box, so please try to debug when you see an error.

  • Hi Hung

    I'm publishing unreliable. The following line causes the assert in core_tx_complete_cb_set():

    NRF_MESH_ASSERT(m_packet.bearer_bitmap == 0);


    I didn't find a way to debug this any further, since it only happens during heavy communication, but I placed a log message in front of this line to print the actual m_packet.bearer_bitmap. It is suddenly 1 instead of 0 when this happens. I'm not publishing in an interrupt.

    What could cause this?

    What is the meaning of bearer_bitmap?

  • Hi Seger,

     

    We found that it's an issue with the interrupt priority. If you configure the mesh with NRF_MESH_IRQ_PRIORITY_LOWEST, you should call the mesh API at that priority to avoid the operation being interrupted by other mesh activity. Another option is to lower the mesh priority to thread mode.

    Here is a quote from one of our colleagues:

    The key points are:

    • in mesh_init(), the code sets the mesh IRQ priority to “NRF_MESH_IRQ_PRIORITY_LOWEST”
    • which means the mesh internal scheduling uses “level 7”
    • in the mesh SDK, we want all mesh events to run at the same level
    • however, if we call the mesh API from the main context, the priority will be “NRF_MESH_IRQ_PRIORITY_THREAD”
    • which is level “15”
    • this priority level is less critical than “NRF_MESH_IRQ_PRIORITY_LOWEST”
    • which means the procedure may be “interrupted by other mesh events”, corrupting the mesh stack state

    To solve this issue:

    1. Please configure the mesh IRQ priority as “NRF_MESH_IRQ_PRIORITY_THREAD” instead of “NRF_MESH_IRQ_PRIORITY_LOWEST”
    2. That way all mesh events have the same priority, and nested interrupts between mesh events won’t happen
    3. Please also modify the busy loop in main() like this:

     

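    for (;;)
    {
        /* Send from the main loop (thread context) when the application has flagged a pending update. */
        if (m_relay)
        {
            simple_on_off_control_send_state(&m_control);
            m_relay = false;
        }

        /* Process pending mesh events; sleep only when nothing is left to do. */
        bool done = nrf_mesh_process();
        if (done)
        {
            sd_app_evt_wait();
        }
    }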

    You can read more about interrupt priority levels here.
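
The loop above relies on a deferral pattern: code running in interrupt context only sets a flag, and the actual send happens later in the main loop. A minimal sketch of that pattern follows, with every name (`sketch_app_t`, `sketch_irq_handler()`, `sketch_send_state()`, `sketch_main_loop_step()`) being hypothetical. Unlike the loop above, this sketch clears the flag before sending, so a request that arrives while the send is in progress is not lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    volatile bool relay_requested; /* set in interrupt context, cleared in the main loop */
    unsigned      sent_count;      /* number of sends issued, for illustration */
} sketch_app_t;

/* Hypothetical interrupt or event handler: it only records that work is pending. */
static void sketch_irq_handler(sketch_app_t *p_app)
{
    assert(p_app != NULL);
    p_app->relay_requested = true;
}

/* Hypothetical stand-in for the application's mesh send call. */
static void sketch_send_state(sketch_app_t *p_app)
{
    p_app->sent_count++;
}

/* Deferred work for one main-loop iteration.
 * Returns true if a send was issued. */
static bool sketch_main_loop_step(sketch_app_t *p_app)
{
    assert(p_app != NULL);
    if (!p_app->relay_requested)
    {
        return false;
    }
    p_app->relay_requested = false; /* clear first so a new request is not overwritten */
    sketch_send_state(p_app);
    return true;
}
```

Several requests that arrive before the main loop runs collapse into a single send, which is usually acceptable when only the latest state matters.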
  • Thanks for your answer Hung

    When I change .core.irq_priority in mesh_init() from NRF_MESH_IRQ_PRIORITY_LOWEST to NRF_MESH_IRQ_PRIORITY_THREAD, I get a problem during initialization.

    My initialisation:

    initialize();
    execution_start(start);
    init_flash();

    .. flash read/write operation

    init_uart();
    key_update();
    provisioner_setup();

    In init_flash(), the line:

    ret_code = flash_manager_add(&m_flash_manager, &custom_data_manager_config);

    returns 4 (no memory)

    I also have problems with the code that needs to be added to the main() loop.

    simple_on_off_control_send_state() and m_relay are not defined.
  • Hi Seger, 

     

    I'm not 100% sure why you get no memory. You may need to step into the code and check what throws error 4.

    When you change the priority to thread mode, you need to update the main loop; please follow the guide here.
