Mesh LPN RX race condition

Suppose node L is in a friendship with node F. Consider the following: 

1. Some other node C sends a message to L. 

2. F hears the message, and saves the message into L's friend queue.

3. L also happens to be scanning (e.g. L is currently in a Friend Poll receive window), and hears the message itself.

4. F sends the recorded message to L as the Friend Poll reply.

5. However, when L receives the recorded message from F, it discards it at the network layer because of a message cache hit (L has already seen the message itself in step 3). This happens in the 3rd if statement in `deobfuscated_header_is_valid`.

6. Therefore, the LPN stack on L never receives its Friend Poll reply. It thus never updates fsn, and keeps requesting the same message from F.

7. After 5 Friend Polls, the LPN stack on L will terminate the friendship. 
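
To make the sequence concrete, here is a minimal, self-contained model of steps 1 to 7. It is only a sketch and not SDK code: `msg_id_t`, `network_rx` and `simulate_polls` are hypothetical names, the message cache is reduced to a (source, sequence number) list, and fsn is modelled as a single toggling bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SIZE        8
#define MAX_POLL_ATTEMPTS 5   /* the "5 Friend Polls" from step 7 */

/* Hypothetical message identity used as the cache key. */
typedef struct { uint16_t src; uint32_t seq; } msg_id_t;

static msg_id_t m_cache[CACHE_SIZE];
static int      m_cache_len;

static bool cache_has(msg_id_t m)
{
    for (int i = 0; i < m_cache_len; i++)
    {
        if (m_cache[i].src == m.src && m_cache[i].seq == m.seq)
        {
            return true;
        }
    }
    return false;
}

/* Simplified network-layer RX: returns true only if the message is new
 * and is therefore passed up the stack (and on to the LPN module). */
static bool network_rx(msg_id_t m)
{
    if (cache_has(m))
    {
        return false;              /* step 5: dropped on cache hit */
    }
    assert(m_cache_len < CACHE_SIZE);
    m_cache[m_cache_len++] = m;
    return true;
}

/* Returns the poll number on which the reply was accepted, or -1 if
 * every poll reply was dropped and the friendship would be terminated. */
int simulate_polls(bool heard_directly, uint8_t *p_fsn)
{
    const msg_id_t from_c = { .src = 0x0003, .seq = 42 };

    m_cache_len = 0;
    *p_fsn = 0;

    if (heard_directly)
    {
        (void) network_rx(from_c); /* step 3: L hears C's message itself */
    }

    for (int poll = 1; poll <= MAX_POLL_ATTEMPTS; poll++)
    {
        /* Step 4: fsn is unchanged, so F answers with the same message. */
        if (network_rx(from_c))
        {
            *p_fsn ^= 1u;          /* the LPN module saw the reply */
            return poll;
        }
    }
    return -1;                     /* step 7 */
}
```

In this model, `simulate_polls(false, &fsn)` accepts the reply on the first poll and toggles fsn, while `simulate_polls(true, &fsn)` returns -1 with fsn unchanged, which matches the behaviour described above.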

I think in this case the LPN stack should still somehow advance its fsn despite the message cache hit.
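
One possible shape for that behaviour is sketched below. This is an assumption for illustration, not the SDK's implementation or an official fix: `lpn_model_t`, `rx_source_t` and `lpn_model_rx` are hypothetical, and the sketch assumes the receiver can tell that a packet arrived from the Friend (for example because it was received under the friendship credentials) even when the message cache already contains it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the LPN state that matters for this issue. */
typedef struct
{
    uint8_t  fsn;        /* 1-bit Friend Sequence Number              */
    unsigned delivered;  /* payloads handed to the upper layers       */
    unsigned acked;      /* Friend Poll replies that were counted     */
} lpn_model_t;

typedef enum { RX_DIRECT, RX_FRIEND } rx_source_t;

/* Handle one received packet. A duplicate payload is still dropped,
 * but a packet that came from the Friend always counts as the poll
 * reply, so the next Friend Poll asks for the next queued message. */
static void lpn_model_rx(lpn_model_t *p_lpn, bool cache_hit, rx_source_t source)
{
    if (!cache_hit)
    {
        p_lpn->delivered++;   /* new message: pass it up once */
    }

    if (source == RX_FRIEND)
    {
        p_lpn->fsn ^= 1u;     /* advance fsn even on a cache hit */
        p_lpn->acked++;
    }
}
```

With this split, hearing the message directly and then receiving the same message from the Friend delivers the payload once and still toggles fsn, so the LPN would not keep polling for the same entry.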

To reproduce this, I had C continuously sending messages to L (3 messages every second). It usually reproduces within the first 20 seconds or so. 

Thanks! 

  • Hi Isundaylee, 

    Thanks for reporting this. However, by design, our LPN should not receive packet to itself when it's in friendship. You can have a look at the LPN life cycle figure here.

    But we suspect it might be due to another bug that we found in SDK v3.0 and fixed in SDK v3.1. Can you reproduce the issue on SDK v3.1?

    In addition, could you explain how you determined that the LPN discards the packet from the Friend? Do you have any logs?

  • Thanks for the information! I did upgrade to v3.1.0 due to the other LPN bug, but this seems to be a separate one that still repros on v3.1.0.

    Can you elaborate a bit on "LPN should not receive packet to itself when it's in friendship"? Do you mean that the LPN should not process packets that it hears directly (i.e. not from a Friend)? In this case what seems to happen is that: 1) the LPN sends a Friend Poll, 2) the LPN waits for the Receive Delay, then starts scanning for replies to the Friend Poll, 3) while the scanner is on, the LPN node hears a direct message before hearing the Friend Poll response. Is there code in the SDK that instructs the LPN to ignore all the packets it hears directly?

    To confirm:

    1) I added log statements to the 3rd and 4th if-statements in `deobfuscated_header_is_valid`. They are hit. 

    2) I added a log statement in `mesh_lpn_rx_notify` right before `m_lpn.fsn++`. It is never hit, which confirms that the Friend Poll reply messages never reach the LPN module.

    I can provide some log files if that is helpful. 

    Thank you!
