
Lost callback messages when the function receives many simultaneous messages

Hello!

I am developing a solution that has a gateway and about 10 devices.

The devices have some sensors and are connected to the gateway. Currently I read the sensor values and send them to the gateway. This works fine, but when the solution runs with all devices, they send their messages at the same time and some messages are lost in the handle_cb function (gateway).

Can someone help me resolve or reduce this problem? I want an error rate < 3%.

  • Hi,

    Can you elaborate on the question? For instance, which radio protocol are you using, which SDK, more about your code, etc.?

  • Hi!

    I am using the light switch example from SDK14.

    Instead of having the server send a command to turn the lights on/off, I added some sensors and a timer that sends the sensor values every 10 seconds in a JSON object.

    On the client side, there is a handle_cb function that receives the messages from all servers, but when the messages arrive at the same time, some messages are lost.

    I did the following test: when I turn on the servers one after another with a time interval (1 sec, for example), the client side receives all the information, spaced by that gap, without loss. However, when I turn on all the servers at the same time, the gateway loses some messages.

  • Hi,

    I am still not sure which example and even which radio protocol you are referring to. Can you specify the full path to the example within the SDK? Are we talking about the BLE mesh light switch example, or something else? If mesh, which version of the Mesh SDK are you using?

  • Hi,

    My solution is based on the light switch example from SDK Mesh 1.0 (the BLE mesh light switch example).

  • Hi Renato,

    OK. So the problem is in the client/gateway, which is connected to 10 light switches based on the Mesh SDK example? Can you describe in more detail what you have done and how/where you see the message being lost?

    As this is in your code, it would also be very useful if you could upload it here, so that we can try to understand it. Seeing the handle_cb function you refer to as the problem would be particularly useful. When you debugged, how did you see that the data is lost in this function? How is it lost?

  • Hi Einar,

    OK. So the problem is in the client/gateway, which is connected to 10 light switches based on the Mesh SDK example?

    Yes, the problem is in the client/gateway.


    Can you describe in more detail what you have done and how/where you see the message being lost?

    My solution is based on the light switch example from SDK Mesh.

    Instead of having the server send a command to turn the lights on/off, I added some sensors and a timer that sends the sensor values every 10 seconds in a JSON object, using the packet_tx function in the access.c file.

    On the client side, there is a handle_data_cb function. This function is a callback that receives the messages from all servers, but when the messages arrive at the same time, some messages are lost.

    This is the function that receives the messages on the gateway side:

    static const access_opcode_handler_t m_opcode_handlers[] =
    {
        /* A new handler was created for the sensor data. */
        {{SIMPLE_ON_OFF_OPCODE_DATA, ACCESS_COMPANY_ID_NORDIC}, handle_data_cb}
    };

    static void handle_data_cb(access_model_handle_t handle, const access_message_rx_t * p_message, void * p_args)
    {
        char *rec;
        char src[3] = {0}, *str;
        uint8_t type;

        /* x is a received-message counter declared elsewhere (not shown in the post). */
        __LOG(LOG_SRC_APP, LOG_LEVEL_INFO, "[%d] Original message: %s \n", ++x, p_message->p_data);

    }

    I did the following test: when I turn on the servers one after another with a time interval (5 sec, for example), the client side receives all the information, spaced by that gap, without loss. For example: I turn on the first server, and 5 seconds later I turn on the second one, and so on.

    However, when I turn on all the servers at the same time, the gateway loses some messages. I know this because the JSON object includes a counter that shows how many messages the server has sent, and the gateway keeps another counter that shows how many messages it has received from each server.
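
The counter comparison described above can be sketched as follows. This is a minimal illustration and not code from the thread: `server_stats_t`, `stats_on_rx`, `stats_loss_percent` and `MAX_SERVERS` are hypothetical names, and it assumes each server numbers its messages starting at 1 and that the gateway can map a sender to an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SERVERS 10 /* assumption: matches the ~10 devices mentioned in the thread */

/* Hypothetical per-server bookkeeping kept on the gateway. */
typedef struct
{
    uint32_t last_sent_counter; /* highest "sent" counter seen in a payload */
    uint32_t received;          /* messages the gateway actually received */
} server_stats_t;

static server_stats_t m_stats[MAX_SERVERS];

/* Record one received message; sent_counter is the value carried in the payload. */
void stats_on_rx(size_t server_index, uint32_t sent_counter)
{
    assert(server_index < MAX_SERVERS);
    server_stats_t *s = &m_stats[server_index];
    if (sent_counter > s->last_sent_counter)
    {
        s->last_sent_counter = sent_counter;
    }
    s->received++;
}

/* Loss in percent for one server: (sent - received) / sent * 100. */
double stats_loss_percent(size_t server_index)
{
    assert(server_index < MAX_SERVERS);
    const server_stats_t *s = &m_stats[server_index];
    if (s->last_sent_counter == 0 || s->received >= s->last_sent_counter)
    {
        return 0.0; /* nothing sent yet, or nothing missing */
    }
    return 100.0 * (double)(s->last_sent_counter - s->received)
                 / (double)s->last_sent_counter;
}
```

Calling `stats_on_rx` from the receive callback and printing `stats_loss_percent` per server would give a figure to compare against the 3% target. Duplicate deliveries are counted as received here, so the result could understate loss if the same message arrives twice.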

  • Hi Renato,

    The way I understand it, the nodes only send a message every 10 seconds, but if they are turned on at the same time you lose some messages. There is a random delay which should reduce the number of collisions, but collisions will still happen from time to time (more often with more packets and more nodes). What retransmission count and TTL do you use? It would be interesting to increase those and see if it helps.

    There have been a lot of improvements in the Mesh SDK since version 1, so you should consider migrating to the latest release. This is generally a good idea, but the main reason I mention it is that the retransmit count is limited in version 1.
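
Since staggered start-up avoided the loss in the test described earlier, one way to experiment with the collision theory is to add application-level jitter to the 10-second reporting timer. The sketch below is an assumption for illustration, not something the Mesh SDK provides or the thread confirms as a fix: `next_send_delay_ms`, `MAX_JITTER_MS` and the xorshift generator are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEND_PERIOD_MS 10000u /* the 10 s reporting interval from the thread */
#define MAX_JITTER_MS   2000u /* assumption: illustrative spread, tune as needed */

/* Small xorshift32 PRNG so the sketch has no platform dependencies.
 * On a real device the seed could come from the device address or a hardware RNG. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Delay until the next transmission: base period plus 0..MAX_JITTER_MS. */
uint32_t next_send_delay_ms(uint32_t *rng_state)
{
    assert(rng_state != NULL && *rng_state != 0); /* xorshift must not be seeded with 0 */
    return SEND_PERIOD_MS + (xorshift32(rng_state) % (MAX_JITTER_MS + 1u));
}
```

Each server would seed the state differently and re-arm its timer with the returned delay after every send, so nodes powered on together drift apart instead of transmitting in lockstep.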

  • Hi Einar,

    The way I understand it, the nodes only send a message every 10 seconds, but if they are turned on at the same time you lose some messages.

    Yes, it happens because the callback function receives the messages from the servers at the same time.

    What retransmission count and TTL do you use?

    I did not change any of the code responsible for the retransmission count and TTL. I am probably using the defaults from the light switch example (SDK Mesh).

    Can you explain where in the light switch code I can find these settings and how I can change them to reduce the number of lost messages?

  • Hi,

    Renato Silva said:
    Yes, it happens because the callback function receives the messages from the servers at the same time.

    OK. Now I read your other comment again, where you stated that "On the client side, there is a handle_data_cb function. This function is a callback that receives the messages from all servers, but when the messages arrive at the same time, some messages are lost.", so this means that you actually get the packet then? If so, you can ignore my previous thoughts.

    Just to clarify: is it the case that you get a callback for all expected packets, but the data you print/log in handle_data_cb is corrupted in some cases? If so, it would be very interesting to see more of your client code.
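
One thing worth ruling out when logged data looks corrupted: the callback posted earlier prints `p_message->p_data` with `%s`, which reads until it finds a zero byte. If the payload is a length-delimited byte buffer without a terminator (an assumption about the sender here), the log could show trailing garbage. A minimal sketch of copying the payload into a bounded, null-terminated buffer first; `payload_to_cstr` is a hypothetical helper, and it assumes the receive struct exposes the payload length alongside the data pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a length-delimited payload into dst and null-terminates it.
 * Returns the number of payload bytes copied (at most dst_size - 1). */
size_t payload_to_cstr(char *dst, size_t dst_size, const uint8_t *p_data, size_t length)
{
    assert(dst != NULL && dst_size > 0);
    size_t n = (length < dst_size - 1) ? length : dst_size - 1;
    if (p_data == NULL)
    {
        n = 0; /* nothing to copy; return an empty string */
    }
    if (n > 0)
    {
        memcpy(dst, p_data, n);
    }
    dst[n] = '\0';
    return n;
}
```

Inside the callback, the copy would be logged instead of the raw pointer, and the returned length can be compared with the expected JSON size to spot truncated messages.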

    Renato Silva said:
    Can you explain where in the light switch code I can find these settings and how I can change them to reduce the number of lost messages?

    The light switch example sets the default TTL to SERVER_COUNT, which in turn is 3 if you have not changed it. The number of retransmissions is set to 1.
