
Time for configuring node increases after deleting from mesh network

Hi guys,

I have the same issue as coca1989 had 2 years ago. Did anybody find a solution?

I am using the health model and a simple message model. Client and provisioner are running on one device (a demo board). I can provision and configure up to 5 server nodes (dongles). I get health events from each connected server (every 10 seconds) and I can send and receive small messages on all server nodes.

Now I would like to remove nodes from the mesh network and reconnect them (re-provisioning and re-configuration). These are the steps I am doing:

  1. config_client_server_bind() and config_client_server_set() to the server node I would like to remove from the network
  2. config_client_node_reset()
  3. the server gets the node reset event (CONFIG_SERVER_EVT_NODE_RESET) from the client and performs node_reset() with mesh_stack_config_clear() and mesh_stack_device_reset() (see the sketch after this list)
  4. the server responds to the client with CONFIG_CLIENT_EVENT_TYPE_CANCELLED and I do dsm_devkey_delete()
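
For reference, step 3 on my server looks roughly like this (only a sketch; the callback name is from my application, the event and mesh_stack calls are the standard SDK ones):

    static void config_server_evt_cb(const config_server_evt_t * p_evt)
    {
        if (p_evt->type == CONFIG_SERVER_EVT_NODE_RESET)
        {
            /* Clear the persisted mesh configuration and reboot the node. */
            mesh_stack_config_clear();
            mesh_stack_device_reset();
        }
    }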

After removing the server node, I can re-provision and re-configure the node successfully (getting health events and sending/receiving messages). But the configuration takes longer than the first time. Repeating this process (removing the node and reconnecting it) increases the configuration time each time.

Here is a time table:

First configuration: 2-3 seconds
Second configuration (after removing node from mesh): 10-11 seconds
Third configuration (after removing node from mesh): 20-30 seconds
Fourth configuration (after removing node from mesh): 45-50 seconds
Fifth configuration (after removing node from mesh): >80 seconds

This is reproducible. Rebooting the client/provisioner device after removing a server node brings the configuration time back to 2-3 seconds, but then I get no health events and no messages.

During reconfiguration (after removing the server from the network) I am getting SAR mesh events on the server node. During the first configuration (fresh device) I don't get these SAR events.

I guess I have to delete more on the client side? Maybe the simple message or health model is still active on the old address handles?

  • 1. The replay list size is controlled by the `REPLAY_CACHE_ENTRIES` define. Define this macro with a suitable size in your application's `nrf_mesh_config_app.h` file. The size is limited by the free RAM your application has.

    2. Beware of a pitfall when frequently un-provisioning and re-provisioning nodes with a change of unicast address: once an entry for a certain source address is added to the replay list, this entry is not removed until the device is reset OR the device undergoes an IV index update cycle. Once the replay list is full, the device will stop accepting messages from new addresses that are not present in the replay list entries.

    3. Once you unprovision the node, remember to remove the address from the DSM; otherwise you will end up filling the available address space in the DSM module as well.

    4. Address selection for re-provisioning: a node can have multiple elements, so when you are re-provisioning a node, you have to exclude all the addresses that were present on that node. Irrespective of whether your node has one or more elements, you have to ensure that the elements on the node do not receive duplicate addresses (i.e. an address that is already in use on some active node in the network).

    5. The procedure would be (a short sketch follows at the end of this list):
    a. Unprovision the node
    b. Remove the address added for publication from the DSM using DSM API. Refer to `config_server.c` module's model publication set handlers to see how this is done.
    c. Use the fresh address for the node such that the address of the primary (and secondary, if any) element on the node is not already used on any other node in the network.
    d. Add this address to DSM and use it for publishing.
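
    For illustration, steps a-d on the provisioner could look roughly like this (only a sketch; old_publish_handle, fresh_unicast_address and client_model_handle stand in for your own variables, and error checking is omitted):

      dsm_handle_t fresh_publish_handle;

      /* a. Unprovision the node (this completes asynchronously via the
       *    config client event callback). */
      (void) config_client_node_reset();

      /* b. Once the reset has completed, remove the old publish address
       *    from the DSM. */
      (void) dsm_address_publish_remove(old_publish_handle);

      /* c./d. Pick a unicast address that is not used by any element of any
       *       active node, add it to the DSM and use it for publishing. */
      (void) dsm_address_publish_add(fresh_unicast_address, &fresh_publish_handle);
      (void) access_model_publish_address_set(client_model_handle, fresh_publish_handle);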

  • Hi, thanks for answering.

    I tried to do what you described, but no luck. I still have the increasing config time after re-provisioning.

    There is an m_addresses array in the DSM. It keeps the information about address handles together with their publish and subscription counts.

    My server node runs with address 500. After removing the node I do:

    On Server node:

      access_model_handle_t model_handle;
      dsm_handle_t publish_addr_handle = DSM_HANDLE_INVALID;
    
      //REMOVE PUBLISH Simple Message
      access_model_id_t sm_model_id;
      sm_model_id.company_id = MYCOMPANY_ID;
      sm_model_id.model_id = 0;
      LIBBTD_ERROR_CHECK(access_handle_get(0, sm_model_id, &model_handle));
        
      LIBBTD_ERROR_CHECK(access_model_publish_address_get(model_handle, &publish_addr_handle));
      LIBBTD_ERROR_CHECK(dsm_address_publish_remove(publish_addr_handle));
      
      //REMOVE PUBLISH Health   
      access_model_id_t health_model_id;
      health_model_id.company_id = ACCESS_COMPANY_ID_NONE;
      health_model_id.model_id = HEALTH_SERVER_MODEL_ID;
      LIBBTD_ERROR_CHECK(access_handle_get(0, health_model_id, &model_handle));
      
      publish_addr_handle = DSM_HANDLE_INVALID;
      LIBBTD_ERROR_CHECK(access_model_publish_address_get(model_handle, &publish_addr_handle));
      LIBBTD_ERROR_CHECK(dsm_address_publish_remove(publish_addr_handle));
    
      mesh_stack_device_reset();

    On client/provisioner:

      //After response event of config_client_node_reset()
      LIBBTD_ERROR_CHECK(dsm_address_publish_remove(m_addr_handle));
      LIBBTD_ERROR_CHECK(dsm_devkey_delete(m_devkey_handle));

    The publish count in m_addresses in the DSM is now handled correctly (add/remove).


    I did another test using a third, fresh device as server node (B) after unprovisioning the first server device (A):

    1. Power off server node A
    2. Power on server node B
    3. Client/provisioner starts provisioning server node B with the same address (500) that server node A had

    The same problem: the configuration time increased from 3 seconds to 10-15 seconds, with SAR FAILED events.

    My REPLAY_CACHE_ENTRIES is set to 40. 
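
    Per your point 1, it is defined in my application's nrf_mesh_config_app.h, i.e. something like:

      /* in nrf_mesh_config_app.h */
      #define REPLAY_CACHE_ENTRIES 40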

  • One more strange behavior:

    As I wrote above, after unprovisioning the server node I do this on the provisioner/client device:

      //After response event of config_client_node_reset()
      LIBBTD_ERROR_CHECK(dsm_address_publish_remove(m_addr_handle));
      LIBBTD_ERROR_CHECK(dsm_devkey_delete(m_devkey_handle));

    dsm_address_publish_remove() is new in my code. Now I am getting an NRF_MESH_ASSERT (NRF_ERROR_NOT_FOUND) when rebooting the provisioner/client device. This happens in mesh_stack_init() at access_flash_config_load() in restore_models().

    Looks like the publish for my address/address handle still exists even though I deleted it before rebooting after unprovisioning the server node. 

  • About increasing configuration time:
    I think you have misunderstood the point. The point is, you should not use "the same address" (i.e. address 500) when re-provisioning a node. You must use a different address when re-provisioning a node.

    About assert:

    JeffZ said:
    Looks like the publish for my address/address handle still exists even though I deleted it before rebooting after unprovisioning the server node. 

    Yes, this happens because you have removed the publish address from the DSM, but that does not reset the model publication settings. Therefore, the model publication settings still hold an address handle that no longer exists.

    So after doing:

      publish_addr_handle = DSM_HANDLE_INVALID;
      LIBBTD_ERROR_CHECK(access_model_publish_address_get(model_handle, &publish_addr_handle));
      LIBBTD_ERROR_CHECK(dsm_address_publish_remove(publish_addr_handle));

    You should use `access_model_publish_address_set()` to set the new publish address handle. Once model publication has been configured, it cannot be disabled (this is a shortcoming of the current `access_model_publish_address_set()` API), so once publication is configured, the publish address handle must be valid and must exist in the DSM.

    Unless a new publish address handle is available and configured in the model, the old one should not be deleted.
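
    So the order should roughly be the following (a sketch only; new_address, model_handle and old_publish_handle stand in for your own variables, error checking omitted):

      dsm_handle_t new_publish_handle;

      /* 1. Add the new publish address to the DSM first. */
      (void) dsm_address_publish_add(new_address, &new_publish_handle);

      /* 2. Re-point the model publication at the new, valid handle. */
      (void) access_model_publish_address_set(model_handle, new_publish_handle);

      /* 3. Only now remove the old publish address handle from the DSM. */
      (void) dsm_address_publish_remove(old_publish_handle);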

  • Mittrinh, thanks for your response.

    I understood you well, but I am looking for a way to re-use addresses. Just using a new address each time is not acceptable. I don't want to restart the device (provisioner/client) every time the 'REPLAY_CACHE_ENTRIES' limit is reached.

    Would this be possible?:

    1. First provisioning of a fresh server node with address 500.
    2. Setting handles (let's call them 500.handles) and configuring models (health, small message, etc.) for node 500.
    3. After a while, deleting node 500.
    4. Re-provisioning the node with address 501.
    5. Updating 500.handles with the new address 501 (500.handles -> 501.handles).
    6. Re-using address 500 for the next node.

    If this way is possible, what would be the best practice to do it?
    How can I free address 500 for the next nodes?

    Is always using a new address (uint16_t) and/or restarting the provisioner/client after reaching REPLAY_CACHE_ENTRIES your final answer?

    Thanks,
    Jeff
