
Friend node timeout when LPN is re-provisioned with same address

There seems to be a bug in the mesh stack friendship code: after an LPN node is re-provisioned with the same address, its friend requests to the friend node time out indefinitely.

Steps to reproduce:

1. Program an nRF52-DK with the stock SDK light_switch_server:  

ninja flash_light_switch_server_nrf52832_xxAA_s132_7.0.1

2. Provision the light switch server with the nRF Mesh App.

3. Program a second nRF52-DK with the stock SDK LPN example, then connect to the LPN node with JLinkRTTLogger:

ninja flash_lpn_nrf52832_xxAA_s132_7.0.1

4. Provision the LPN node with the nRF Mesh App:

<t:    1342412>, main.c,  141, Successfully provisioned
<t:    1342417>, main.c,  154, Node Address: 0x0004 

5. Press Button3 on the LPN node to establish a friendship with the light switch server:

<t:    8741395>, main.c,  337, Button 2 pressed
<t:    8741398>, main.c,  277, Initiating the friendship establishment procedure.
<t:    8745101>, main.c,  408, Received friend offer from 0x0003
<t:    8748460>, main.c,  455, Friendship established with: 0x0003

6. Reset the LPN node in the app, then re-provision it. Note that the app assigns the same address to the device:

<t:    6325479>, mai<t:          0>, main.c,  552, ----- BLE Mesh LPN Demo -----
<t:       8877>, main.c,  503, Initializing and adding models
<t:      13683>, mesh_app_utils.c,   65, Device UUID (raw): 6D6551FBA6ED1F4687A9A4973BE61F9A
<t:      13687>, mesh_app_utils.c,   70, Device UUID : FB51656D-EDA6-461F-87A9-A4973BE61F9A
<t:     420362>, ble_softdevice_support.c,  104, Successfully updated connection parameters
<t:     966607>, main.c,  141, Successfully provisioned
<t:     966612>, main.c,  154, Node Address: 0x0004 

7. Pressing Button3 on the LPN node now results in friendship timing out:

<t:    1562101>, main.c,  337, Button 2 pressed
<t:    1562104>, main.c,  277, Initiating the friendship establishment procedure.
<t:    1778877>, main.c,  441, Friend Request timed out

This friendship timeout continues:

- even when many attempts are made

- even after >20 minutes

8. When the light_switch_server is reset (turned off and back on), it will again accept friend requests:

<t:    1562101>, main.c,  337, Button 2 pressed
<t:    1562104>, main.c,  277, Initiating the friendship establishment procedure.
<t:    1778877>, main.c,  441, Friend Request timed out
<t:    7380842>, main.c,  337, Button 2 pressed
<t:    7380845>, main.c,  277, Initiating the friendship establishment procedure.
<t:    7597617>, main.c,  441, Friend Request timed out

# light_switch_server is reset here 

<t:   13608118>, main.c,  337, Button 2 pressed
<t:   13608121>, main.c,  277, Initiating the friendship establishment procedure.
<t:   13611518>, main.c,  408, Received friend offer from 0x0003
<t:   13618461>, main.c,  455, Friendship established with: 0x0003

Versions:

0. nRF52-DK PCA10040 == nRF52832

1. Mesh SDK version 4.0.0

2. SDK version 16.0.0

3. SoftDevice s132_7.0.1

Parents
  • Hi. 

    Thank you for the report. I'll try to reproduce this from the steps you described and investigate the issue. 

    Could you also tell me which version of the nRF5 SDK for Mesh you are working with?

    I'll get back to you with more information. 

    Best regards, 
    Joakim

    EDIT: 
    Just noticed that you listed the version at the bottom of your question. 

  • Thank you.

    Were you able to reproduce? Any update?

    Help much appreciated.

  • I see what you are saying, but the net effect is:

    1. Device is removed and reprovisioned with nRF Mesh App

    2. Device can no longer talk to network

    I understand that what the mesh stack is doing follows the specification, but the observable behavior for users is broken.

    Simply removing ONE device and adding a NEW, DIFFERENT device (which is given the same address by the nRF Mesh App) will trigger this bug. It is very easy to trigger in normal operation and results in "broken" behavior where the newly provisioned device cannot talk to the mesh.

    Perhaps the nRF Mesh App might:
    1. Set the correct sequence number on LPN when provisioning?

    2. Track all past addresses and only provision never-before-used addresses?

    3. Any other ideas?

    Thank you

  • We do appreciate the feedback, and I'll forward this internally so that it can be considered for any future releases. 

    I would like to note that a power cycle of the device shouldn't clear the replay list. For optimal security with regards to the replay protection, this should be saved to flash. I do believe this is going to be changed in a future release of the nRF5 SDK for Mesh. 

    Also, the nRF Mesh app isn't actually supposed to be used in a finalized product, but more as a development tool and a template for developing your own application. As a development tool, it might be good to have the option to provision a device with the same address. That way you can test that the replay protection actually works for your product.

    Best regards, 
    Joakim
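
The replay protection mentioned in the reply above can be illustrated with a small simulation. This is a minimal sketch, not the nRF5 SDK for Mesh implementation: the names `replay_entry_t`, `replay_check_and_update`, and `replay_clear` and the cache size are hypothetical, and the IV index, which real replay protection also takes into account, is ignored. It assumes the friend node keeps, per source address, the highest sequence number it has accepted, and that a re-provisioned node restarts its sequence number at 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPLAY_CACHE_SIZE 8 /* hypothetical size, for illustration only */

/* One entry per source address: the highest sequence number accepted so far. */
typedef struct {
    uint16_t src;
    uint32_t seq;
    bool     in_use;
} replay_entry_t;

static replay_entry_t m_cache[REPLAY_CACHE_SIZE];

/* Models a reset of a node that keeps its replay list in RAM only. */
static void replay_clear(void)
{
    memset(m_cache, 0, sizeof(m_cache));
}

/* Returns true if the message is accepted, false if it is treated as a
 * replay or if the cache has no room for a new source address. */
static bool replay_check_and_update(uint16_t src, uint32_t seq)
{
    replay_entry_t *free_slot = NULL;

    assert(src != 0x0000); /* the unassigned address is never a valid source */

    for (size_t i = 0; i < REPLAY_CACHE_SIZE; i++) {
        if (m_cache[i].in_use && m_cache[i].src == src) {
            if (seq <= m_cache[i].seq) {
                return false; /* not newer than what was already seen */
            }
            m_cache[i].seq = seq;
            return true;
        }
        if (!m_cache[i].in_use && free_slot == NULL) {
            free_slot = &m_cache[i];
        }
    }

    if (free_slot == NULL) {
        return false; /* cache full: unknown sources cannot be tracked */
    }
    free_slot->src    = src;
    free_slot->seq    = seq;
    free_slot->in_use = true;
    return true;
}
```

Under these assumptions, a node at 0x0004 that has already sent messages up to some sequence number is rejected when a re-provisioned device reuses that address and starts again from 0. Calling `replay_clear()` models the friend losing its RAM-only list on reset, after which the same message is accepted, which would be consistent with the behavior seen in steps 7 and 8.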

  • Thank you, response much appreciated.

    I fear this problem is deeper than just "don't use the nRF Mesh App for production" (we are not, but when reporting issues with another app in the past, I have been asked to reproduce them with your nRF Mesh App).

    As you have pointed out, once the mesh stack starts saving the replay list to flash, there will be NO way to re-provision a device with the same address and have it work reliably.

    The problem goes deeper: there seems to be no key in the underlying JSON format to track previously-used addresses.

    My understanding is that this JSON format follows a standard schema published by Bluetooth SIG, yes?

    So, again, the question arises: we should be able to do something to preempt this replay list issue. Should it be:

    1. Provisioner sets the correct sequence number on LPN when provisioning?

    2. Replay list is reset at provisioning-time somehow?

    3. JSON schema is extended to track all past addresses?

    4. Any other ideas?

    We are looking for a hint about the proper approach to tackle this from our end.
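
One of the options raised in this thread, provisioning only never-before-used addresses, could be sketched on the provisioner side as a high-water-mark allocator. This is an illustrative assumption, not part of the nRF Mesh App or the nRF5 SDK for Mesh: `addr_allocator_t`, `addr_allocator_init`, and `addr_allocate` are hypothetical names, and where the `next_unused` value would be persisted is left open, as discussed above. The only fact relied on is that unicast addresses lie in the range 0x0001 to 0x7FFF and that a node occupies one consecutive address per element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNICAST_ADDR_MIN 0x0001u
#define UNICAST_ADDR_MAX 0x7FFFu
#define ADDR_INVALID     0x0000u /* unassigned address, used here as "no address" */

/* Hypothetical provisioner state: the lowest address never handed out.
 * Stored as 32 bits so that "one past the last unicast address" fits. */
typedef struct {
    uint32_t next_unused;
} addr_allocator_t;

static void addr_allocator_init(addr_allocator_t *a)
{
    assert(a != NULL);
    a->next_unused = UNICAST_ADDR_MIN;
}

/* Reserves element_count consecutive addresses and returns the first one,
 * or ADDR_INVALID if the request is empty or the range is exhausted.
 * Addresses of removed nodes are deliberately never returned to the pool. */
static uint16_t addr_allocate(addr_allocator_t *a, uint8_t element_count)
{
    assert(a != NULL);

    if (element_count == 0) {
        return ADDR_INVALID;
    }

    uint32_t last = a->next_unused + element_count - 1u;
    if (last > UNICAST_ADDR_MAX) {
        return ADDR_INVALID;
    }

    uint16_t first = (uint16_t)a->next_unused;
    a->next_unused = last + 1u;
    return first;
}
```

The trade-off in this sketch is that the address space is consumed monotonically: a network that removes and re-adds devices often would eventually run out of unicast addresses, so it avoids stale replay entries at the cost of address reuse.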

  • Thanks. 

    I'll forward this to our Mesh developers, so that they can comment on this. Will update the ticket when I get any feedback from them. 

    Br, 
    Joakim

  • Hello, any new information on this?

    It would be great to be able to resolve this issue; it shows up quite a bit when deploying mesh networks in the field.
