IV Index and LPN devices that leave mesh

Hello,

We're developing an LPN product that will leave the mesh and stay off the mesh for up to 3 days (possibly more). When it rejoins the mesh, the central server/gateway could have some resets in between. 

Is there a chance the IV index will prevent the nodes from communicating once reconnected to the mesh? 

I am looking at this thread and wondering how much is safe to change:
https://devzone.nordicsemi.com/f/nordic-q-a/80558/bt-mesh-iv-update-parameters-timers

Thank you

Parents
  • Hi,

    An IV Index update can happen at most once every 192 hours (8 days). This means you should have no IV Index related issues if away from the network for 3 days. If the node's IV Index has fallen too far behind for a normal IV Index update, the IV Index Recovery procedure will kick in, as long as the node is no more than around 40 IV updates behind. If it is further behind than that, the node is unable to rejoin the network and must be reprovisioned.

    If you experience issues after a week away from the network, then there might be some issues with the IV Index Recovery procedure. After around 10 months you risk enough IV Index updates to have passed for the node to have completely lost the network.
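
    As a rough check of the figures above, the arithmetic can be sketched in C. This is only an illustration: the function names are made up for this example, and the limit of 40 updates is the approximate number quoted above, not a value read from the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum time between two IV Index updates, as stated above. */
#define IV_UPDATE_MIN_INTERVAL_HOURS 192u

/* Shortest possible absence (in hours) after which a node could be
 * `updates_behind` IV Index updates behind the rest of the network.
 * Hypothetical helper, not part of any SDK. */
static uint32_t min_hours_for_updates(uint32_t updates_behind)
{
    /* Guard against overflow of the multiplication below. */
    assert(updates_behind <= UINT32_MAX / IV_UPDATE_MIN_INTERVAL_HOURS);
    return updates_behind * IV_UPDATE_MIN_INTERVAL_HOURS;
}

/* Whole days contained in a number of hours. */
static uint32_t hours_to_days(uint32_t hours)
{
    return hours / 24u;
}
```

    With 40 updates this gives 7680 hours, or 320 days, which is in line with the "around 10 months" estimate. That is the worst case, where the network updates as often as it is allowed to.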

    Regards,
    Terje

  • "If you experience issues after a week away from the network, then there might be some issues with the IV Index Recovery procedure."

    If this event happens, what can be done to recover the IV index? Is there a method of manually triggering it? I've noticed some units stop talking to us altogether after a couple of weeks separated, but some messages do manage to come back through (mainly those sent by the LPN, not the other way). The only fix I've found is unprovisioning and reprovisioning once it starts up the advertisement. This likely won't be acceptable to our users, so I'm hoping there's a safe way to send out the secure beacon, or to increase the rate it goes out so that the scenario is less likely to happen.

    From the link I posted they make a few suggestions about updating some defines.

    One is to change NETWORK_MIN_IV_RECOVERY_INTERVAL_MINUTES, or even set it to zero for debugging purposes, so that a node won't wait for the timeout before running the IV Index Recovery procedure. (Will this prevent said scenario from happening?)

    Our units are battery powered and have seasons where they're in use and not in use. If the whole mesh was powered off for up to 6 months at a time, would this pose any issue? I'm thinking we need a "storage mode" that would prevent the IV updates from happening. 

    I think I need to clarify the scenario a bit better:

    "IV Index update can only happen once every 192 hours (8 days). This means you should have no IV Index related issues if away from the network for 3 days."

    What could happen with our device is that it goes out of the mesh for 3 days, comes back, *maybe* talks back and forth with the server (sending stuff like battery level back), and then goes out of the mesh.

    A schedule could look like this (and why the 8 days is concerning)

    Day 1: Out of the mesh
    Day 3: Back on mesh briefly, leaves again
    Day 6: Back on mesh briefly, leaves again
    --Day 8: IV update happens--
    Day 9: Back on mesh but can no longer communicate as IV is now out of date


  • Hi,

    First of all, which SDK are you using? For new projects we recommend the nRF Connect SDK, while the older nRF5 SDK for Mesh is in maintenance mode and will not see new development. See our nRF Connect SDK and nRF5 SDK statement. The thread you refer to is about the nRF5 SDK for Mesh, which is the old solution.

    Are you currently experiencing issues with nodes losing track of the network, or is it a theoretical exercise? If you have issues in practice, then we can look further into those. For now I will assume it is a theoretical matter.

    Alex Ross said:
    If the whole mesh was powered off for up to 6 months at a time, would this pose any issue?

    If all nodes (provisioners, relay nodes, everything) are off, then there will be no IV Index updates. The updates are triggered by network activity. Messages sent from a node are tagged with a sequence number, for replay protection, and an IV Index update is triggered when a node gets close to depleting its sequence numbers. With no activity, no updates should be triggered. Similarly, if network traffic is low, the updates may be much less frequent than once every 192 hours.
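
    To get a feel for how traffic affects this, here is a minimal sketch in C. It assumes the 24-bit sequence number space defined by the mesh specification. The start threshold is passed in as a parameter because the actual value depends on the stack configuration, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sequence numbers are 24 bits wide in Bluetooth mesh. */
#define SEQNUM_MAX 0xFFFFFFu

/* Whole hours until a node that consumes `seqnums_per_hour` sequence
 * numbers reaches `start_threshold`, counting from sequence number 0.
 * `start_threshold` is whatever value the stack uses to start an
 * IV Index update (an assumption to be checked against your stack). */
static uint32_t hours_until_threshold(uint32_t start_threshold,
                                      uint32_t seqnums_per_hour)
{
    assert(seqnums_per_hour > 0u);
    assert(start_threshold <= SEQNUM_MAX);
    return start_threshold / seqnums_per_hour;
}
```

    For example, with a threshold at half the sequence number space (0x7FFFFF, an assumed value), a node using one sequence number per second would reach it after about 2330 hours (roughly 97 days), while a node using one per minute would need about 139810 hours.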

    Alex Ross said:

    What could happen with our device is that it goes out of the mesh for 3 days, comes back, *maybe* talks back and forth with the server (sending stuff like battery level back), and then goes out of the mesh.

    A schedule could look like this (and why the 8 days is concerning)

    Day 1: Out of the mesh
    Day 3: Back on mesh briefly, leaves again
    Day 6: Back on mesh briefly, leaves again
    --Day 8: IV update happens--
    Day 9: Back on mesh but can no longer communicate as IV is now out of date

    Thanks for elaborating. In order to perform the IV Index Recovery procedure, the node must listen for the Secure Network beacon, which is (on average) transmitted once every 10 seconds. It contains the network ID and the current IV Index. I see now that according to the Bluetooth Mesh Profile specification, in order to follow the current state of the IV Index, the LPN must poll its Friend node at least once every 96 hours (half the minimum interval between IV Index updates). This is most likely because the update procedure comprises two main steps, each (at least) 96 hours long. If it does so, the node will keep track of the IV Index.

    If the LPN is not in a friendship, it will try to initiate one. If there have been IV Index updates, this will fail repeatedly, at which point the LPN should check Secure Network beacons to see if an IV Index Recovery is needed. Some action may be needed from the application. Similarly, if the LPN is in a friendship but has not polled through an IV Index update, some action may be needed from the application. I have reached out to our mesh development team for further details, and will get back to you when I know more.

    Regards,
    Terje

Children
  • Thank you for the detailed response!


    First off, I am using the nRF5 SDK for Mesh with nRF5 SDK 17 (co-existence). I am a little hesitant to switch to nRF Connect SDK because of the project development timeline. I'll spend time investigating the new SDK and see what the transition would look like. Power consumption is the biggest concern, and I hope it won't be impacted by having an RTOS running. Are there numbers for LPN current draw? I assume the phone app compatibility hasn't changed? E.g. the iOS nRF Mesh app has worked well for us. The Android version has not, but that's a separate issue.

    "If the LPN is not in a friendship, it will try to initiate one. If there has been IV Index updates, this will fail repeatedly, at which point the LPN should check Secure Network Beacons to see if an IV Index Recovery is needed. Some action may be needed from the application. Similarly if the LPN is in a friendship, but has not polled through an IV Index Update, some action may be needed from the application. I have reached out to our mesh development team for further details, and will get back to you when I know more."

    I think this is what I'm seeing. I don't have any units displaying the behavior currently. Do you have any suggestions on how to create a scenario where the IV index is bad so I can see what's going wrong? I was thinking of just doing a test where the init value sets the IV index to something that would trigger the secure update process, which is where I think things aren't working right.



  • Hello,

    I'm really hoping there's a clear way to validate that the secure beacon update is actually being sent.

    For a client, I called:

        status = net_state_iv_index_set(30, p_prov_data->flags.iv_update);
        //status = net_state_iv_index_set(p_prov_data->iv_index, p_prov_data->flags.iv_update);


    And left it at default for the server.

    At no point was the server able to communicate to the client, even when trying to set up publishing and subscription in nRF Mesh. (Updating TTL fails)

    I'm in a situation now where I'm sending out a replacement unit to a client and I'm realizing that there's no real guarantee that the two will talk now.

    Does the server send out the secure beacon by default or is it only enabled by the setting in the app?

     

  • Hi,

    Alex Ross said:
    I am a little hesitant to switch to nRF Connect SDK because of the project development timeline.

    I have no issues understanding that rationale, especially if you are far into development (and have devices out there already). We are keeping some track of key values for the new SDK (memory requirements, power usage, etc.), although I do not currently have LPN power usage numbers at hand.

    Alex Ross said:
    I assume the phone app compatibility hasn't changed?

    The phone app is unchanged, as are the mesh specification, mesh models, etc. In fact, nRF Connect SDK has slightly better model coverage (and is the place where all new functionality will be developed).

    Alex Ross said:
    Do you have any suggestions on how to create a scenario where the IV index is bad so I can see what's going wrong?

    For testing purposes you could lift the restriction of 192 hours between IV Index Updates, or maybe even hard code some IV Index "jump" to a new value, for instance set back the IVI for a node to emulate it being "behind".

    I am yet to hear back from the team, but will check with them and (hopefully) get back to you tomorrow.

    Please note Thursday is public holiday here in Norway, and many take Friday off as well for the long weekend.

    Regards,
    Terje

  • Hi,

    I noticed you had a second set of questions.

    The beacon is a collective effort of the network, where at any one place in the network topology one is expected to get, on average, one beacon every ten seconds. I.e. with two nodes, each node would send a beacon every 20 seconds, for a 10 second average. All nodes participate in this, as mandated by spec, and there is no setting controlling it.
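
    The relationship described above can be written as a one-line calculation. This is a simplified sketch under the assumption that every node within range shares the load equally. The function name and the constant are illustrative and do not come from the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Target average spacing of Secure Network beacons at any one place
 * in the network, as described above. */
#define TARGET_BEACON_SPACING_S 10u

/* Approximate interval (in seconds) at which a single node sends its
 * own beacon when `nodes_in_range` nodes (itself included) share the
 * work of keeping the average spacing at the target. */
static uint32_t per_node_beacon_interval_s(uint32_t nodes_in_range)
{
    assert(nodes_in_range > 0u);
    return TARGET_BEACON_SPACING_S * nodes_in_range;
}
```

    With two nodes this gives 20 seconds per node, matching the example above. A node that only scans briefly, such as an LPN, may therefore need a listening window of several seconds before it sees any beacon at all.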

    Alex Ross said:
    I'm in a situation now where I'm sending out a replacement unit to a client and I'm realizing that there's no real guarantee that the two will talk now.

    It must be provisioned into the network, and as such "synced" with the IVI, yes. Does the system have provisioning support at the customer location, or is it "pre-provisioned" (so that provisioning is out-of-scope for the customer?)

    You will likely need some action from the application in order to recover the IVI. As mentioned in my previous reply, I will get back to you, hopefully before week-end (i.e. tomorrow.)

    Regards,
    Terje

  • Hi,

    I got some feedback from our mesh team.

    tesc said:
    If the LPN is not in a friendship, it will try to initiate one. If there have been IV Index updates, this will fail repeatedly, at which point the LPN should check Secure Network beacons to see if an IV Index Recovery is needed. Some action may be needed from the application.

    The IV Index Recovery will be triggered when the node receives a Secure Network beacon with an IV Index higher than its current one. Beacons are sent through the network every 10 seconds (on average), which means there is some waiting time, and the node must be in a scanning state in order to receive the beacon. The recovery procedure itself is instantaneous (it updates a few variables in memory).

    It does, however, depend on a couple of prerequisites: it can only happen after a timeout of 192 hours, and only if the node is not in the middle of an IV Index update. In order for the LPN to know that 192 hours have passed, it must have been on for that amount of time since the timer was last reset. Timer status is stored periodically, so the total "on time" might be divided across several sessions.

    tesc said:
    Similarly if the LPN is in a friendship, but has not polled through an IV Index Update, some action may be needed from the application.

    Staying in a friendship beyond one IV Index update is not possible, since the maximum PollTimeout corresponds to 96 hours. In other words, in order to keep the friendship, the LPN must complete at least one successful poll per IV Index value. I miscalculated the max PollTimeout for my previous answer and arrived at an erroneous value that was longer than 96 hours. The correct maximum is 96 hours.
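
    For reference, the 96 hour figure can be reproduced from the PollTimeout field itself. The sketch below assumes the range given in the Mesh Profile specification (0x00000A to 0x34BBFF, in units of 100 ms). The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* PollTimeout limits from the mesh specification, in 100 ms units. */
#define POLL_TIMEOUT_MIN     0x00000Au
#define POLL_TIMEOUT_MAX     0x34BBFFu
#define POLL_TIMEOUT_STEP_MS 100u

/* Convert a raw PollTimeout value to whole seconds. */
static uint32_t poll_timeout_to_seconds(uint32_t poll_timeout)
{
    assert(poll_timeout >= POLL_TIMEOUT_MIN);
    assert(poll_timeout <= POLL_TIMEOUT_MAX);
    return (poll_timeout * POLL_TIMEOUT_STEP_MS) / 1000u;
}

/* Convert a wanted timeout in hours to a raw PollTimeout value,
 * clamped to the allowed range. One hour is 36000 steps of 100 ms. */
static uint32_t hours_to_poll_timeout(uint32_t hours)
{
    assert(hours <= 1000u); /* keeps the multiplication far from overflow */
    uint32_t raw = hours * 36000u;
    if (raw < POLL_TIMEOUT_MIN) {
        return POLL_TIMEOUT_MIN;
    }
    if (raw > POLL_TIMEOUT_MAX) {
        return POLL_TIMEOUT_MAX;
    }
    return raw;
}
```

    The maximum raw value converts to 345599 seconds, just under 96 hours (345600 seconds), so a request for 96 hours or more is clamped to the maximum.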

    Regards,
    Terje
