
Very high response times of reliable unicast messages.

I adapted the light switch demo of the nRF5 SDK for Mesh to send a series of reliable on/off messages sequentially from an nRF52840 Preview DK to 20 Thingy:52 servers via send_reliable_message(). In this scenario I measure response times between 20 and 50 milliseconds in most cases, which is quite acceptable. However, in about one of 20 cases the response time is significantly higher, at many seconds. The delay in these cases always seems to be a multiple of 5 seconds, so I get response times of 5, 10, 15 seconds and so on. Is this normal behavior of the stack, and do I therefore have to accept these sporadic long response times? Or is there a parameter that can be tweaked to get better performance with reliable messages?

  • Hi Armin, 

    No, I don't think this is normal behavior. Do you see the same problem when you test with the unmodified version of the light switch examples?

    Do you see the same problem if you have fewer nodes?

     

    Did you run anything else besides mesh on the node? Running BLE or proxy on the node would affect the mesh performance.

     

    The mechanism for resending is similar to the trickle algorithm: the interval is multiplied by ACCESS_RELIABLE_BACK_OFF_FACTOR on each retry. But your observation of 5, 10, 15 seconds is quite strange; it doesn't match what we have.

    Which TTL value did you set? The timeout is calculated from the TTL value.

    Did you send the reliable message to a unicast address or to a group address?

    Also, do you see the packet arrive on the peer device before those 5, 10, ... seconds have passed? We need to find out whether it is the original packet or the response packet that goes missing.
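
To make the back-off described above concrete, here is a minimal sketch in plain C. It is not the SDK's implementation; it assumes only what the post states, namely that each retry interval is the previous one multiplied by a back-off factor. The helper names are hypothetical, and the 100 ms first interval and factor of 2 used in the note below are illustrative placeholders, not values taken from the SDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: waiting interval before retry number `retry`
 * (0 = first retry), assuming the interval is multiplied by `factor`
 * on every retry. */
static uint32_t retry_interval_ms(uint32_t first_interval_ms, uint32_t factor, uint32_t retry)
{
    uint32_t interval = first_interval_ms;
    uint32_t i;

    assert(factor >= 1);
    for (i = 0; i < retry; i++)
    {
        interval *= factor;
    }
    return interval;
}

/* Hypothetical helper: time elapsed since the original send at the moment
 * retry number `retry` goes out (sum of all intervals up to that retry). */
static uint32_t elapsed_at_retry_ms(uint32_t first_interval_ms, uint32_t factor, uint32_t retry)
{
    uint32_t elapsed = 0;
    uint32_t i;

    for (i = 0; i <= retry; i++)
    {
        elapsed += retry_interval_ms(first_interval_ms, factor, i);
    }
    return elapsed;
}
```

With the placeholder values (100 ms, factor 2), retries would go out 100, 300, 700 and 1500 ms after the original send. The gaps grow geometrically, which is why evenly spaced delays of 5, 10 and 15 seconds do not look like the output of this kind of back-off.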

  • The behavior is independent of the number of nodes. I tested with 3, 5, 20 and 30 nodes.

    The messages are sent from the light switch client node, which runs on the nRF52840 Preview DK with project light_switch_proxy_client_nrf52840_xxAA_s140_6.0.0. It is therefore also the proxy for commissioning via smartphone. All server nodes run on Thingy:52 with project light_switch_proxy_server_nrf52832_xxAA_s132_6.0.0, so all servers are proxies as well.

    The TTL value is the default set in nrf_mesh_config_app.h, #define ACCESS_DEFAULT_TTL (SERVER_NODE_COUNT), where SERVER_NODE_COUNT is set in light_switch_example_common.h as #define SERVER_NODE_COUNT (30).

    The reliable messages are sent to a unicast address. Group addresses can only be used with unreliable messages!

    Yes, the packet always arrives at the servers very quickly. An LED is switched on there when the message arrives, and I see almost no visible delay between the moment the message is sent (indicated by a log output in JLink Viewer) and the LED lighting up. So the long delay actually comes from the response!

  • It makes no difference whether the Smartphone is connected or not.

    You mention that you have to check the advertising interval. Please tell me exactly which parameter I should provide to you for that.

    I'm sending multiple reliable unicast messages to multiple nodes. After each send I wait for the reply, wait an additional time (200ms) and then send the next.

    I could attach the source here if it helps.
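
One way to quantify the "multiple of 5 seconds" pattern is to log every measured response time and classify it automatically. The sketch below assumes the latencies are already available in milliseconds (for example from a timestamp taken at send time and another in the reply callback); the helper name and the tolerance are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if latency_ms lies within tolerance_ms
 * of a non-zero integer multiple of period_ms (5000, 10000, 15000, ...). */
static bool is_near_multiple(uint32_t latency_ms, uint32_t period_ms, uint32_t tolerance_ms)
{
    uint32_t nearest;
    uint32_t diff;

    assert(period_ms > 0);
    assert(tolerance_ms < period_ms / 2);

    /* Round latency_ms to the nearest multiple of period_ms. */
    nearest = ((latency_ms + period_ms / 2) / period_ms) * period_ms;
    if (nearest == 0)
    {
        return false; /* ordinary fast response, not a delayed one */
    }

    diff = (latency_ms > nearest) ? (latency_ms - nearest) : (nearest - latency_ms);
    return diff <= tolerance_ms;
}
```

Calling is_near_multiple(latency, 5000, 100) for each reply would separate the normal 20 to 50 ms responses from the sporadic delayed ones and show how tightly the delayed ones cluster around 5 s steps.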

  • Yes, please attach the code.

    You can use a phone to see the advertising interval (nRF Connect app).

  • According to the nRF Connect app, the advertising intervals differ significantly between client and server:

    • Server (Thingy:52): 500ms
    • Client (nRF52840 Preview DK): 10ms

    Is it possible to attach a complete source code file here? I would prefer not to paste my complete main.c here.

  • (Sorry for the delay in the response, we are at the peak of summer vacation in Norway.)

    10 ms is a very short interval. This will strongly affect the mesh operation. Something must be wrong here. We will do some investigation.

    Yes, you can post a full file by using Insert -> Insert Image/video/file; then you can attach the file. We prefer to have the full project, just in case any setting in the compiler options is wrong.

  • Hi Armin,

    I can see that you use generic_on_off instead of simple_on_off as in our example. Did you use the generic on off model provided in our thingy example, or your own implementation of the model? Do you use that model on the server as well? I would need you to zip your whole light switch projects (because there are some definitions in the common header files), or better, just zip the whole SDK. You may want to delete the compiled binaries to reduce the size of the .zip.

  • Did you make some progress in your investigations for this issue?

    Regarding your questions:

    1. I use the generic on off model provided in the thingy example.

    2. I use that model on the server as well.

    I tried to attach the whole affected SDK as you requested. However, although I stripped out all binaries, builds and documentation, the .zip is still 53 MB. Apparently this is too big for your upload infrastructure, as uploading fails with an error.

    Alternatively here is at least the .zip with the whole light switch projects:

    /cfs-file/__key/communityserver-discussions-components-files/4/light_5F00_switch.zip

  • Hi Armin,

    Very sorry for the delayed response. Hung has been on vacation for the past two weeks & it seems we forgot to follow up on this case. I am very sorry about that. Hung will be back on Monday, but I will try my best to help you out & hopefully we can figure out the issue before he is back. Are you using mesh sdk v2.0.1?

    Kind Regards,

    Bjørn

  • Yes, according to the RELEASE_NOTES.md we use

    BLE Mesh v2.0.1
  • Hi,

    We are very sorry for the delays over the past days. I have been looking at this issue today.

    I think that your reported 10 ms advertising interval is way too short for letting the Mesh stack run properly. Did you test increasing it?

    It would be great if we could build and reproduce this issue locally, but it seems many of the files that are needed to build the project are still missing... Usually we ask for a minimal example showing the erroneous behavior. It may be an option to strip some of the unneeded functionality away from the project instead. Or, we would need all the included c and h files for building the projects.

    If I have understood things correctly, the projects are:

    For nRF52840 DK: examples\light_switch\proxy_client\light_switch_proxy_client_nrf52840_xxAA_s140_6_0_0.emProject

    For the Thingy:52s: examples\light_switch\thingy_provisioning_demo_generic_OnOff_BLINK\light_switch_proxy_server_nrf52832_xxAA_s132_6_0_0.emProject

    Can you confirm or correct that?

    Regards,
    Terje
