Bluetooth mesh configuration failure

board: nrf52832

nRF5 SDK version: v17.0.0

nRF5 SDK for Mesh version: v5.0.0

softdevice: S332

application: light switch client + light switch server + provisioner + ble_ant_app_hrm + ble_app_uart_coexist

My application has two roles to choose from: Provisioner/self-provisioned Client and Server. The user can send a UART command to select which one is initialized on the device,

and I removed the normal BLE service (NUS).

The problem is that when I set one device as Provisioner/self-provisioned Client and two devices as Server,

the first Server is provisioned and configured successfully, and the provisioning of the second Server succeeds, but the configuration of the second Server occasionally fails.

However, if I turn the first Server off before the provisioning of the second Server starts, both the provisioning and the configuration of the second Server succeed.

(Once the Provisioner/self-provisioned Client detects an unprovisioned Server, the ANT+ channel closes automatically.)

So I checked the configuration process and found that it often gets stuck waiting for opcode CONFIG_OPCODE_MODEL_APP_STATUS or CONFIG_OPCODE_APPKEY_STATUS. The Server receives opcode CONFIG_OPCODE_MODEL_APP_BIND or CONFIG_OPCODE_APPKEY_ADD and replies successfully (returns NRF_SUCCESS),

but the Provisioner/self-provisioned Client does not receive this reply. Even though the Provisioner/self-provisioned Client retries sending the AppKey Add message and waits for an ACK from the Server 2-3 times,

this situation keeps occurring.

Log:

The Provisioner/Client prints from config_step_execute() in node_setup.c to show which opcode it receives (the value after ":" is the status of the message).

The Server prints from send_reply() in config_server.c to show the status of the transmission and which opcode it sends (the first value is the return value of send_reply(), the second value is the opcode ID it sends).

  • Configuration of the first Server (success): log of the Provisioner/Client and log of the first Server

  • Configuration of the second Server (fail): log of the Provisioner/Client and log of the second Server

Then I checked the function scanner_rx() in scanner.c on the Provisioner/self-provisioned Client, filtering on the MAC address of the second Server to see whether the Provisioner/self-provisioned Client receives messages from it. I found that the Provisioner/self-provisioned Client keeps receiving messages from the second Server, as shown below.

  • Provisioner/Client:

            one "Get" means one message received from the second Server (recognized by MAC address) (scanner_rx() in scanner.c)

            "receive opcode" (mesh_msg_handle() in access.c)

  • The second Server:

             "receive opcode" (mesh_msg_handle() in access.c)

             "access_model_reply()" (send_reply() in config_server.c)

  • The first Server: it has already been provisioned and configured, and is used here to print the payload of the second Server (recognized by MAC address). Bytes 1-31 are the payload, the last byte is the header type, and '#' marks the end of a line (scanner_rx() in scanner.c)

It seems that the Provisioner/self-provisioned Client can receive messages from the second Server, but not the ACK of the configuration messages, which confuses me. Is there any way to analyze the payload from the second Server? Or is there any chance that the configuration ACK message is filtered out by the application on the Provisioner/Client?

BTW, once all devices have been provisioned and configured, communication between the two roles works well,

and I have tried increasing the advertising interval to 100 ms and this (changing SCANNER_BUFFER_SIZE to 1024), but it didn't work for me.
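As a starting point for looking at the raw payload printed from scanner_rx(), the sketch below walks the AD structures of an advertising payload and counts those with a given AD type. The AD type values (0x29 PB-ADV, 0x2A Mesh Message, 0x2B Mesh Beacon) come from the Bluetooth assigned numbers; the function name `count_ad_type` and the macro names are hypothetical and not part of the nRF5 SDK for Mesh. Note that the network PDU inside a Mesh Message is obfuscated and encrypted, so this only tells which bearer a packet belongs to, not which opcode it carries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AD types used by Bluetooth mesh (Bluetooth assigned numbers). */
#define AD_TYPE_PB_ADV       0x29
#define AD_TYPE_MESH_MESSAGE 0x2A
#define AD_TYPE_MESH_BEACON  0x2B

/* Hypothetical helper: count the AD structures of type `ad_type` in a raw
 * advertising payload. Each AD structure is laid out as
 * [length][type][data...], where `length` covers the type byte and the data. */
size_t count_ad_type(const uint8_t *payload, size_t len, uint8_t ad_type)
{
    assert(payload != NULL);

    size_t count = 0;
    size_t i = 0;

    while (i < len)
    {
        uint8_t field_len = payload[i];

        /* Stop on padding (length 0) or on a structure that runs past the end. */
        if (field_len == 0 || i + 1 + (size_t)field_len > len)
        {
            break;
        }

        if (payload[i + 1] == ad_type)
        {
            count++;
        }

        i += 1 + (size_t)field_len;
    }

    return count;
}
```

Calling `count_ad_type(payload, 31, AD_TYPE_MESH_MESSAGE)` on each packet captured from the second Server would show whether Mesh Message packets are arriving at all, as opposed to only beacons.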

  • Hey Erin!

    It seems that your provisioner is incrementing the address with 1, while there are two elements in the servers. This makes the addresses overlap (one node gets one address, which is also the address of its first element, and the second element of that node gets the next address). That might be what is making a mess.

    In either case it seems like we are looking at address collisions here, which typically lead to undefined behavior.

    Best regards,

    Elfving

  • Hi Elfving,

    Thank you for your reply!

    I changed #define ACCESS_ELEMENT_COUNT from 2 to 1 (nrf_mesh_config_app.h) on the server (migration ver), and it works!

    But my project (ble_ant_app_hrm + sdk_coexist (light switch client) + provisioner (light switch client, self-provisioned)) has two modes to switch between: when it switches to provisioner/client mode, 2 elements should be initialized, and when it switches to server mode, 1 element should be initialized.

    So I made some changes:

    modify #define ACCESS_ELEMENT_COUNT 1 (nrf_mesh_config_app.h)

    add #define CLIENT_ACCESS_ELEMENT_COUNT 2 (nrf_mesh_config_app.h)

    ACCESS_ELEMENT_COUNT ===> CLIENT_ACCESS_ELEMENT_COUNT (provisioner_helper.c)

    ACCESS_ELEMENT_COUNT ===> CLIENT_ACCESS_ELEMENT_COUNT (access.c)

    ACCESS_ELEMENT_COUNT ===> CLIENT_ACCESS_ELEMENT_COUNT (composition_data.h)

    ACCESS_ELEMENT_COUNT ===> CLIENT_ACCESS_ELEMENT_COUNT (composition_data.c)

    In the functions related to provisioner/client mode: ACCESS_ELEMENT_COUNT (value: 1) ===> CLIENT_ACCESS_ELEMENT_COUNT (value: 2)

    It solved this provisioning issue, but is this modification OK? Or do you have any suggestions for me to try?

    Also, "your provisioner is incrementing the address with 1": which address do you mean? And why did the second element of the first server get the next address without showing any log?

    Regards,

    Erin 

  • Hey Erin!

    erin_hong said:

    Also, "your provisioner is incrementing the address with 1": which address do you mean? And why did the second element of the first server get the next address without showing any log?

    The unicast address. The address of a node is, for instance, 0x003, which is also the address of its first element. If the node had 3 elements, their addresses would automatically be 0x003, 0x004 and 0x005. If the provisioner has already given 0x004 to another node, then we will have overlap, address collisions and generally undefined behavior.

    That is why it works if you start by provisioning the server with 1 element: the first server will e.g. get address 0x003 (which is also the address of its only element), and the second server gets address 0x004 (for its first element, with its second element getting address 0x005). If you start with the server with two elements, then you get addresses 0x003 and 0x004 on the first server and 0x004 on the second server, which leads to address collisions.

    erin_hong said:

    It solved this provisioning issue, but is this modification OK? Or do you have any suggestions for me to try?

    Yeah, that sounds like something that could work, though this isn't a very scalable solution, as you can see. This example uses a static provisioner, which isn't meant to be used as an embedded provisioner or in, for instance, a production context.

    As the documentation says: "It works in a fixed, predefined way and can be used as the static provisioner with the following examples(...)". And "The static provisioner has its own limitations and is provided as a tool to evaluate SDK examples without the need to use a mobile application provisioner."

    If you want an embedded provisioner, I would advise you to make your own and not base it completely on the provisioner example, but I would rather recommend that you don't use an embedded device as a provisioner at all. The provisioner is such a powerful role that something like a cellphone or host app would be better suited to it. An exception would be if you had some sort of gateway unit as an interface between an IP network and a mesh network.

    For a proof of concept, the nRF Mesh app can also be a great asset. The provisioner example is great as long as you don't change any of its assumptions, which is what was done here.

    Best regards,

    Elfving

  • Hi Elfving,

    Thank you for the information! I will consider it.

    Regards,

    Erin
