
Mesh low power node: Network layer dropping segmented mesh packets

Hello,

I am trying out the LPN sample from nRF5 SDK for Mesh v5.0.0 and I have an issue where certain mesh configuration messages (e.g. set publication, add application key) are not received by the LPN node when it is in a friendship. So far, the issue only seems to affect segmented messages. Looking at the log, the network layer seems to fail to decrypt the second segment. Below is the network layer log of one friend polling sequence.

<t:    1422754>, network.c,  302, Net TX: 3A80002006006600010101CF567E1EA4712BAD70607CE87A38F48B4C00
<t:    1426692>, network.c,  339,   Net RX (enc): 3A7E0069303F8857D1C85B7A8C3E6F7EAE82F337C754CD315D6FCF0BEA <-- First segment
<t:    1426702>, net_packet.c,  229, Unencrypted data: : 006680B54001DCBDBCC3FAB25C89CFF8CF1D
<t:    1426706>, network.c,  355, Net RX (unenc): 3A03002D500001006680B54001DCBDBCC3FAB25C89CFF8CF1D00000019
<t:    1426714>, transport.c,  931, Got segment 0
<t:    1428352>, network.c,  302, Net TX: 3A80002007006600010100EE8B190E4E7B3BDA70607CE87A38F48B4C00
<t:    1439909>, network.c,  302, Net TX: 3A80002008006600010100BED3FCECCBCFE50B70607CE87A38F48B4C00
<t:    1443821>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673 <-- Second segment
<t:    1445517>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673
<t:    1451466>, network.c,  302, Net TX: 3A800020090066000101002D9AB5B01979526B70607CE87A38F48B4C00
<t:    1455402>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673 <-- Second segment again
<t:    1457089>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673
<t:    1463022>, network.c,  302, Net TX: 3A8000200A0066000101004083F4C4DAE44EB470607CE87A38F48B4C00
<t:    1474579>, network.c,  302, Net TX: 3A8000200B00660001010026F4BD8BF7B8399970607CE87A38F48B4C00
<t:    1484340>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673 <-- Second segment again
<t:    1485986>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673
<t:    1486135>, network.c,  302, Net TX: 3A8000200C0066000101003C67791653B46C6070607CE87A38F48B4C00
<t:    1490033>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673 <-- Second segment again
<t:    1491727>, network.c,  339,   Net RX (enc): 3A9E242B02B9D157D1C85B7AAC06C5C74E5F7FB070C316F673
<t:    1497692>, transport.c,  499, Dropped SAR session 0, reason 6

Since the second segment is never decrypted, it is discarded and the LPN polls again. Eventually the friendship is terminated due to no reply from the friend.

I looked more closely at the network layer code and found that, in this case, the packet is discarded because of the following check:

https://github.com/NordicSemiconductor/nRF5-SDK-for-Mesh/blob/master/mesh/core/src/net_packet.c#L133

    /* We check the message cache now, as we'll either have the right deobfuscation, and the
     * src+seq won't change after decryption, or we'll have the wrong deobfuscation, and most likely
     * pass the cache check, but abandon it after decryption. In the unlikely event of a wrongly
     * deobfuscated src+seq matching an existing src+seq in the message cache, we'll wrongly abandon
     * the packet here, but since the decryption would have failed anyway, it doesn't matter. */
    if (msg_cache_entry_exists(p_net_metadata->src, p_net_metadata->internal.sequence_number))
    {
        __INTERNAL_EVENT_PUSH(INTERNAL_EVENT_PACKET_DROPPED, PACKET_DROPPED_NETWORK_CACHE, net_packet_len, p_net_packet);
        return false;
    }

If I remove this check, the segment is successfully decrypted and the message is correctly passed on to the application layer.
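To illustrate what the check does, here is a minimal sketch of a message cache keyed on source address and sequence number. It is not the SDK's implementation: `seen_cache_t`, `net_rx_accept` and `CACHE_SIZE` are hypothetical names chosen for this example, and the real cache is only populated after a packet has been decrypted successfully. The point is that any packet whose src+seq pair has already been seen is rejected before decryption is even attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 32u /* hypothetical size, not the SDK's value */

typedef struct
{
    bool     used;
    uint16_t src; /* 16-bit source address */
    uint32_t seq; /* 24-bit sequence number */
} cache_entry_t;

typedef struct
{
    cache_entry_t entries[CACHE_SIZE];
    uint32_t      next; /* next slot to overwrite (ring buffer) */
} seen_cache_t;

/* Returns true if this src+seq pair is already in the cache. */
static bool seen_cache_contains(const seen_cache_t * p_cache, uint16_t src, uint32_t seq)
{
    assert(p_cache != NULL);
    for (size_t i = 0; i < CACHE_SIZE; i++)
    {
        const cache_entry_t * p_entry = &p_cache->entries[i];
        if (p_entry->used && p_entry->src == src && p_entry->seq == seq)
        {
            return true;
        }
    }
    return false;
}

/* Stores a src+seq pair, overwriting the oldest entry when full. */
static void seen_cache_add(seen_cache_t * p_cache, uint16_t src, uint32_t seq)
{
    assert(p_cache != NULL);
    p_cache->entries[p_cache->next] = (cache_entry_t){ .used = true, .src = src, .seq = seq };
    p_cache->next = (p_cache->next + 1u) % CACHE_SIZE;
}

/* Models the receive path: false means "dropped as a duplicate". */
static bool net_rx_accept(seen_cache_t * p_cache, uint16_t src, uint32_t seq)
{
    if (seen_cache_contains(p_cache, src, seq))
    {
        return false;
    }
    seen_cache_add(p_cache, src, seq);
    return true;
}
```

With a zero-initialised `seen_cache_t`, a first call `net_rx_accept(&cache, 0x0001, 0x002D50)` accepts the packet, and a second call with the same pair rejects it, regardless of what the payload contains.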

I would appreciate some help figuring out whether this is a bug or whether my configuration/setup is faulty.

My setup is:

  • Nordic LPN example running on an nRF52840-DK
  • The provisioner and friend node is a Linux host running the BlueZ mesh stack (controlled with the mesh-cfgclient tool)

Please let me know your thoughts, and also whether any important information is missing.

Thank you!

  • Hi,

    Sorry, I should have mentioned that. I'm using BlueZ version 5.62.

  • Hi,

    Our developer suspects that the BlueZ mesh friend encodes the same sequence number for every segment of the segmented message. The log below supports this: SeqNum is the same (0x0001E2) in both segments. In the first segment (segment 0), it should have been 0x0001E1 instead of 0x0001E2. That would explain the drop: the message cache check is keyed on src+seq, so a second segment that reuses the first segment's sequence number is treated as a duplicate. Our developer took a look at the BlueZ code and it looked OK at first sight, but he is leaving for the holidays, so he won't be able to take a deeper look. We suggest asking this question in the BlueZ issue tracker as well: https://github.com/bluez/bluez/issues

    <t: 1168516>, network.c, 348, Net RX (unenc): 377F*0001E2*000100AC80878801B07010C5FBD74A9E3C1A5666000000C5
    <t: 1168524>, transport.c, 931, Got segment 0
    <t: 1170162>, network.c, 302, Net TX: 378000800800AC00010100F5B56107791623E75CF30AEE56DFA0287D00
    <t: 1174330>, network.c, 339, Net RX (enc): 377CBAD2D9B0E5EB3C24BAF04D3AC91C26046411F92DC7DC4C
    <t: 1174337>, net_packet.c, 228, Unencrypted data: : 00AC80878821A6B884485566454E
    <t: 1174341>, network.c, 348, Net RX (unenc): 377F*0001E2*000100AC80878821A6B884485566454E00000000
    <t: 1174349>, transport.c, 931, Got segment 1
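For anyone who wants to check the sequence numbers in their own logs, the following is a small sketch that pulls the header fields out of a "Net RX (unenc)" hex dump. It assumes the dump starts with the standard Bluetooth mesh network PDU header (IVI/NID, CTL/TTL, 24-bit SEQ, 16-bit SRC, 16-bit DST, all big-endian), which matches the values seen above. `net_hdr_t` and `net_hdr_from_hex` are hypothetical helper names, not part of the SDK, and the emphasis asterisks must be stripped from the log line before parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint8_t  ivi; /* least significant bit of the IV index */
    uint8_t  nid; /* 7-bit network ID */
    uint8_t  ctl; /* 1 for control messages, 0 for access messages */
    uint8_t  ttl; /* 7-bit time to live */
    uint32_t seq; /* 24-bit sequence number */
    uint16_t src; /* source address */
    uint16_t dst; /* destination address */
} net_hdr_t;

/* Converts one hex character to its value, or returns -1 if it is not hex. */
static int hex_nibble(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

/* Parses the first 9 bytes (18 hex characters) of a deobfuscated network PDU.
 * Returns 0 on success, -1 if the string is too short or not valid hex. */
static int net_hdr_from_hex(const char * p_hex, net_hdr_t * p_hdr)
{
    uint8_t b[9];

    assert(p_hex != NULL && p_hdr != NULL);
    for (size_t i = 0; i < sizeof(b); i++)
    {
        /* Check the high nibble first so we never read past the terminator. */
        int hi = hex_nibble(p_hex[2 * i]);
        if (hi < 0) return -1;
        int lo = hex_nibble(p_hex[2 * i + 1]);
        if (lo < 0) return -1;
        b[i] = (uint8_t)((hi << 4) | lo);
    }

    p_hdr->ivi = b[0] >> 7;
    p_hdr->nid = b[0] & 0x7F;
    p_hdr->ctl = b[1] >> 7;
    p_hdr->ttl = b[1] & 0x7F;
    p_hdr->seq = ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 8) | b[4];
    p_hdr->src = (uint16_t)((b[5] << 8) | b[6]);
    p_hdr->dst = (uint16_t)((b[7] << 8) | b[8]);
    return 0;
}
```

Running both unencrypted dumps above through `net_hdr_from_hex` gives `seq == 0x0001E2`, `src == 0x0001` and `dst == 0x00AC` for segment 0 and segment 1 alike, which is the duplicate src+seq pair that the message cache rejects.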

  • Hi,

    Yes, that looks correct. I will also post a question for BlueZ and keep you updated on any findings.

    Thanks a lot for taking the time and looking into it.
