Bluetooth mesh communication failure probability

Hello,

I am developing with reference to "Light switch example".
I am verifying communication reliability with one server and one client.

Communication failure occurs with a probability of about 10%.
I would like to ask you to comment on the validity of this failure rate.

The outline of the conditions is as follows.
 - Only one server and one client are used in the experiment
 - Send from server to client once per minute (Communication failure rate is aggregated daily)
 - At the access layer, use Unacknowledged messages
 - Applications use unsegmented 9-byte messages
 - The application is not retrying to send a message

When I switched to a segmented message (48 bytes), the communication failures no longer occurred.
Therefore, I think the 10% failure rate is caused not by the following analog factors,
but by a device state such as being busy.
 - The physical communication distance has been exceeded
 - The device is located at the dead point
 - There is strong jamming

I referred to the following in Mesh Profile Bluetooth Specification Revision: v1.0.1
3.9.3 Secure Network beacon
Since the Secure Network beacon is transmitted every 10 seconds,
I think reception will fail when a beacon transmission coincides with an incoming message.

I count sending a Secure Network beacon as one kind of "device busy" state.
Can you tell me how long sending one beacon takes?
If reception is impossible for about 1 second each time,
the observed communication failure rate would be reasonable.
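
As a rough check of this estimate, here is a minimal sketch. It assumes the receiver cannot receive for one fixed window per beacon interval and that the message arrives at a uniformly random time. The function name and the example window lengths are hypothetical and are not taken from the specification.

```python
def miss_probability(blocked_s: float, interval_s: float) -> float:
    """Chance that a message lands inside a periodic blocked window.

    Assumes one blocked window of `blocked_s` seconds per `interval_s`
    seconds and a message arrival time uniform over the interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Clamp so that a window longer than the interval gives probability 1.
    return min(max(blocked_s, 0.0) / interval_s, 1.0)


# A 1 s blocked window every 10 s would match a loss rate of about 10%.
print(miss_probability(1.0, 10.0))
# A blocked window of a few milliseconds (hypothetical value) would not.
print(miss_probability(0.005, 10.0))
```

Under these assumptions, a 10% loss rate needs the device to be unable to receive for roughly a tenth of every beacon interval.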

If there are other conditions or states in which the device cannot receive,
it would be helpful if you could give me some hints.

Best regards.

  • Hello! Sorry about the delay.

    Communication failure occurs with a probability of about 10%.

    This is surprising. Some failure rate is understandable, as GATT proxy advertising or other BLE activity, flash writes, etc. might make the radio miss a packet. And if the publish settings are configured to send an unacknowledged message only once, there might be some losses between the nodes. The beacon interval can be as low as 10 s, but even so your estimate of 10% seems very high.

    As for the segmented messages not showing any errors, that makes sense to me: segmented messages are ACKed, so you would not know whether some transmissions were in fact lost.

    You should note however that with multiple nodes in the network you will decrease this probability of the message not making it across, as all the nodes will relay the message forward, giving the receiving node multiple opportunities to catch it. 

    You should try it again in a less noisy environment, and disable GATT proxy.

    Regards,

    Elfving

  • Hello,

    Thanks for your comments and advice.

    The information that the device will not be able to receive while flash writing is useful to me.
    Thank you for the information.


    Please give me additional information. Under the conditions under discussion,
    what communication failure rate do you think is reasonable?


    Regarding "Communication failure occurs with a probability of about 10%":
    the server was sending the 9-byte message 6 times in quick succession for each communication,
    so it is possible that some messages were never sent in the first place. I'm sorry I did not mention this.
    Also, in my experiment, successful communication was defined as "the server receiving an ACK message
    (application layer) from the client", that is, a round trip.
    If it is one way(server -> client), it is estimated that the communication failure rate will drop a little more.
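
To put a number on the one-way estimate, here is a minimal sketch. It assumes the forward message and the application-layer ACK are lost independently and with the same probability, which is an assumption and not something measured in this thread. The function name is hypothetical.

```python
import math


def one_way_failure(round_trip_failure: float) -> float:
    """One-way loss rate implied by a measured round-trip loss rate.

    Assumes both directions fail independently with equal probability p,
    so round-trip success is (1 - p) ** 2.
    """
    if not 0.0 <= round_trip_failure <= 1.0:
        raise ValueError("round_trip_failure must be in [0, 1]")
    return 1.0 - math.sqrt(1.0 - round_trip_failure)


# A 10% round-trip failure rate corresponds to roughly 5% one way.
print(f"{one_way_failure(0.10):.4f}")  # 0.0513
```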


    I'm not familiar with GATT proxy, so I'll investigate and experiment for improvement.

    Best regards.

  • Hello,

    Kazu said:
    The information that the device will not be able to receive while flash writing is useful to me.

    Not exactly that it can't receive while flash writing, but that the CPU gets stopped when running flash operations, which might lead to some packet loss. Though this would depend on the SDK. 

    Kazu said:

    what communication failure rate do you think is reasonable?

    Hard to say, but maybe one in a thousand, or one in ten thousand. That at least sounds like the right ballpark.

    Kazu said:
    If it is one way(server -> client), it is estimated that the communication failure rate will drop a little more.

    That does make sense, however it wouldn't be enough to explain the entire discrepancy.

    Regards,

    Elfving

  • Hello,

    Thank you for your comments and information.

    I will study the specification and run experiments,
    treating the following point as a possible cause of the gap between the measured and expected failure rates.
    I will post the results in this thread by the end of April. I would appreciate your continued support.
    - Negative influence of "sending a message 6 times in quick succession "
        Since the message to the client uses the group address,
        I think that the client, after receiving the message, sends it on again through the relay function.

    <Other information>
    At the time of the experiment, the GATT proxy function was disabled.

     >Not exactly that it can't receive while flash writing, but that the CPU gets stopped when running
     >flash operations, which might lead to some packet loss. Though this would depend on the SDK. 
    Thank you for the supplementary information. OK.

    p.s.
    Thank you for the information on the specific application example
    and the expected value information on the specific communication failure rate.
    It's exciting to know the use cases of technology.

    Best regards.

  • Hello,

    Kazu said:
     Since the message to the client uses the group address,

    As in the client having the group address, or the server? Using group addresses in the context of ACKing is a dangerous road, as you wouldn't know the number of receivers of a message. Segmented messages are ACKed when they are transmitted to a unicast address, either by the recipient or by a friend.

    Kazu said:
    - Negative influence of "sending a message 6 times in quick succession "

    How fast are we talking here? It might be that the buffer is full, and the messages are in fact not being sent. Have you checked for errors like these?

    And as I mentioned before about noisy environments, you could check how noisy the environment is by using the RSSI Viewer in nRF Connect for Desktop, and see if anything stands out.

    Kazu said:
    Thank you for the information on the specific application example
    and the expected value information on the specific communication failure rate.

    I actually had that double-checked, and it seems I was mistaken: these factors can cause packet loss of maybe a couple of percent. However, 10% still sounds high. In the context of mesh this still wouldn't be too much of an issue, as relaying of messages would compensate. You could also send the messages as acknowledged (meaning that they require an ACK), which would make delivery reliable. You can also increase the number of retransmissions for all messages, though this has downsides, as it increases the traffic in the network.
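
To illustrate the retransmission trade-off, here is a minimal sketch. It assumes each transmission (or each relayed copy) is lost independently with the same probability. Real losses are often correlated, so the result is optimistic. The function names and the 5% loss figure are hypothetical.

```python
def failure_after(transmissions: int, loss: float) -> float:
    """Probability that every one of `transmissions` copies is lost."""
    if transmissions < 1:
        raise ValueError("transmissions must be at least 1")
    return loss ** transmissions


def transmissions_needed(loss: float, target: float) -> int:
    """Smallest copy count n for which loss ** n <= target."""
    if not (0.0 < loss < 1.0 and 0.0 < target < 1.0):
        raise ValueError("loss and target must be in (0, 1)")
    n = 1
    # Each extra copy multiplies the residual failure rate by `loss`.
    while failure_after(n, loss) > target:
        n += 1
    return n


# With 5% loss per transmission, copies needed for a 1-in-10,000 failure rate.
print(transmissions_needed(0.05, 1e-4))  # 4
```

Every extra copy also adds traffic to the network, which is the downside noted above.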

    Regards,

    Elfving
