NCS Mesh: Relay function delays response messages

We are facing the problem that when the relay feature is enabled on a mesh device (running nRF Connect SDK firmware), the device's responses are delayed by several seconds.

To reproduce the problem, we used the light sample project from NCS v1.8.0 and generated additional traffic with another device that transmits an unsegmented packet every 200 ms. In between that traffic, TTL packets are transmitted, and the responses to them are delayed.

If relay feature is off, everything seems to work as expected.

  • Hi,

    Could you elaborate on what you mean by "enabling the relay feature"? Bluetooth Mesh uses message relaying to send messages from device to device, so this "feature" will always be present when using the protocol: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/ug_bt_mesh_concepts.html#relays

    The flooding-based message relay causes a lot of redundant traffic that may impact the throughput and reliability of the network, so if you're generating additional traffic and/or flooding the frequency with too much data, there will be some delay.

    Kind regards,
    Andreas

  • Hi Andreas, 

    We are aware that relaying messages is a common task in Bluetooth Mesh, but you can enable and disable this feature on a device to reduce the flooding.

    Our observation was as follows:

    In a setup with multiple devices, none of which is relaying (all are in direct range of each other), we increased the traffic to approximately 5 messages/second by forcing one of the devices to send a message every 200 ms. (Note: no device is subscribed to these messages, so they should be handled on the network layer only.)

    A simple GET request to a device that supports the DTT-server model is answered immediately with nearly no delay (30-100ms) which is fine.

    If we activate the relay feature on exactly the device that we poll (which means that it now forwards one message every 200 ms), the STATUS sent in response to a simple GET request is delayed more and more, until the delay reaches about 8 seconds.

    I cannot imagine that this is the standard behavior. Relaying a message every 200 ms should never have such a crazy impact.

  • Hi,

    Thank you for elaborating on this and explaining the setup a bit more. I will look into this and discuss these numbers with the Mesh team. I will get back to you as soon as we land on anything conclusive or if we need more information.

    Kind regards,
    Andreas

  • Hi,

    The first thing we would like you to check is whether you observe the same behavior with NCS v2.2.0. In older versions there might be some delays depending on the traffic.

    Kind regards,
    Andreas

  • Hi Andreas,

    we have compared various NCS versions. The newer the version, the better the results, but even in the NCS v2.2.0 there are still some delays.

    Summary of the results (same setup as described before; delays are measured by sending a Config Default TTL Get message and waiting for the Config Default TTL Status message):

    NCS v1.8.0: delay of up to 8 seconds between the Get and Status messages. The delay increases over a period of 25 seconds after enabling the relay feature, then stays constant at 8 seconds.

    NCS v2.1.0: delay of approximately 1.3 seconds, and some STATUS messages are missing (no reply at all).

    NCS v2.2.0: reduced delay, varying randomly between 30 ms (immediate) and 900 ms; the most frequent values are in the range of 100 ms to 350 ms.

    NCS v2.2.0 / double traffic (1 message every 100 ms): the delay starts to increase, to a range of about 2-3 seconds. It also seems that an occasional STATUS message gets lost (no reply), even if the relay is deactivated (but we have to investigate this a bit more; maybe our scanning device is missing something).

    NCS v2.1.0 / double traffic (1 message every 100 ms): no change in behavior.

    FYI, we use the following network parameters:

    Network transmit and relay retransmit: 1 retransmission after 40 ms

    Kind regards
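
For reference, the "1 retransmission after 40 ms" setting can be written in the packed form used by the Bluetooth Mesh Network Transmit and Relay Retransmit states: a 3-bit retransmission count in the low bits plus a 5-bit interval step, where the interval is (steps + 1) * 10 ms. The following is a minimal Python sketch of that encoding and of the transmission rate it implies for the traffic described in this thread. The function names are illustrative only and are not part of any SDK API.

```python
def encode_transmit(count: int, interval_ms: int) -> int:
    """Pack a retransmission count (0..7) and interval (10..320 ms,
    multiple of 10) into one byte: count in bits 0-2, steps in bits 3-7."""
    if not 0 <= count <= 7:
        raise ValueError("count must be in 0..7")
    if interval_ms % 10 != 0 or not 10 <= interval_ms <= 320:
        raise ValueError("interval must be a multiple of 10 in 10..320 ms")
    steps = interval_ms // 10 - 1
    return count | (steps << 3)


def decode_transmit(value: int) -> tuple[int, int]:
    """Unpack the byte into (retransmission count, interval in ms)."""
    return value & 0x07, ((value >> 3) + 1) * 10


def transmissions_per_second(count: int, message_period_ms: int) -> float:
    """Each message goes out once plus `count` retransmissions."""
    return (count + 1) * 1000 / message_period_ms


# Settings from this thread: 1 retransmission after 40 ms.
packed = encode_transmit(1, 40)
print(hex(packed), decode_transmit(packed))

# Relaying one message every 200 ms, or every 100 ms with double traffic.
print(transmissions_per_second(1, 200), transmissions_per_second(1, 100))
```

With one retransmission, relaying a message every 200 ms means the relay has to put ten packets per second on air, and twenty per second in the double-traffic case, in addition to its own responses.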

     

  • urieder said:
    we have compared various NCS versions. The newer the version, the better the results, but even in the NCS v2.2.0 there are still some delays.

    Thank you for sharing the results for the different versions

    One more thing that came up when discussing your results just now was the Publish Retransmit Count (typically set to 1 retransmit, meaning that each message is sent a total of two times, i.e. as two separate messages). It could be that a buffer containing outbound packets fills up due to a high retransmit count, causing a longer and longer delay until the buffer is full, at which point you get a constant delay (but see some packet loss).

    Can you check what this number is configured to be in your setup, and reduce it if it's too high?

    Kind regards,
    Andreas
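
The buffer hypothesis above can be illustrated with a small queueing sketch in Python. It models a bounded FIFO of outbound packets served one at a time: when packets arrive faster than they can be sent, the delay grows with every arrival until the buffer is full, after which the delay stays roughly constant and new packets are dropped. The service time (250 ms) and buffer size (32) are hypothetical values picked only to show the shape of the curve; they are not measured from, or taken from, any NCS version.

```python
from collections import deque


def simulate(arrival_period_ms: int, service_ms: int,
             capacity: int, duration_ms: int) -> tuple[list[int], int]:
    """Bounded FIFO with a fixed service time per packet.

    Returns (delays, dropped): for every accepted packet, the time from
    enqueue until it has been sent, and the number of packets dropped
    because the buffer was full.
    """
    in_system = deque()   # departure times of packets not yet sent
    last_departure = 0
    delays = []
    dropped = 0
    for t in range(0, duration_ms, arrival_period_ms):
        # Remove packets that have finished sending by time t.
        while in_system and in_system[0] <= t:
            in_system.popleft()
        if len(in_system) >= capacity:
            dropped += 1
            continue
        start = max(t, last_departure)
        last_departure = start + service_ms
        in_system.append(last_departure)
        delays.append(last_departure - t)
    return delays, dropped


# Hypothetical overload: one packet every 200 ms, 250 ms to send each.
delays, dropped = simulate(200, 250, capacity=32, duration_ms=60_000)
print(delays[0], max(delays), dropped)
```

In this model the steady-state delay is bounded by capacity * service time, so lowering the retransmit count (which shortens the effective service time per message) reduces both the growth rate and the final delay.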
