Mesh SDK: Unable to publish to nodes from interactive_pyaci without doing a reset

Hi,

I have successfully provisioned some nodes using interactive_pyaci, and everything works as expected. As a test, I then shut interactive_pyaci down, rebooted the nRF module, and successfully re-added all the nodes from the database. I was still able to publish and subscribe to the nodes. This exercise was meant to emulate a gateway going down and coming back up.

Then I exited and waited over the weekend before coming back and repeating the same test. This time, it doesn't work. On the nodes, I see this on RTT:

00> <t:    7089681>, net_packet.c,  228, Unencrypted data: : 0011EE00C401A86E455CA8B264AACA685665

In interactive_pyaci, no response is received:

In [72]: cc.composition_data_get()

>>>>>>>>>>>>>>>>>> OpCode: 0xab
>>>>>>>>>>>>>>>>>> Data: 0x0e 0xab 0x09 0x00 0x01 0x00 0x01 0x00 0x08 0x00 0x00 0x00 0x80 0x08 0x00

<<<<<<<<<<<<<<<<<<<<< Event name:  CmdRsp
<<<<<<<<<<<<<<<<<<<<< OpCode: 0x84
<<<<<<<<<<<<<<<<<<<<< Payload:  {'opcode': 171, 'status': 0, 'data': bytearray(b'1\x00\x00\x00')}
In [73]: <<<<<<<<<<<<<<<<<<<<< Data field: 0x31 0x00 0x00 0x00

2022-02-15 10:00:29,807 - INFO - ttyAMA0: PacketSend: {'token': 49}
<<<<<<<<<<<<<<<<<<<<< Event name:  MeshTxComplete
<<<<<<<<<<<<<<<<<<<<< OpCode: 0xd2
<<<<<<<<<<<<<<<<<<<<< Payload:  {'token': 49}

2022-02-15 10:00:29,860 - INFO - ttyAMA0: {event: MeshTxComplete, data: {'token': 49}}

This probably means that the messages are received by the mesh but are not accepted by the nodes. I tried a reset on the node I'm publishing to and finally I get the output I expected.

I suspect the IV index may be outdated, but is there any other possible reason for this issue? IIRC, IV indexes shouldn't go out of range until well over 20 days have passed.

P.S. I am aware that interactive_pyaci is not meant to be used in a production setting for a gateway. I am monitoring its protocols and creating my own production gateway software.

  • Hi Muhammad, 

    That's correct. You can only set the net states externally once after the mesh stack is initialized.
    Here is the description of the net state set handler in the stack:


    /**
    * Sets the initial IV index and IV update state, and starts sequence number from the supplied block.
    *
    * @param[in] iv_index Initial IV index value.
    * @param[in] iv_update @c true if an IV index update is in progress.
    * @param[in] next_seqnum_block The next seqnum block to allocate.
    *
    * @retval NRF_SUCCESS Successfully set the IV index and IV update state.
    * @retval NRF_ERROR_INVALID_STATE The function was called when the IV index and update state was
    * already set.
    */
    uint32_t net_state_iv_index_and_seqnum_block_set(uint32_t iv_index, bool iv_update, uint32_t next_seqnum_block);

  • Great! One last question about this. I reset my gateway again, added the keys then set the gateway's sequence number to 8192 to talk to my nodes, but the nodes themselves are on sequence number 630+.

    I couldn't communicate with them again until I reset one more time and set my sequence number to 16384. I reset the gateway 5 minutes later and it worked with 16384. I tried again an hour later, and this time I had to set the sequence number to 24576.

    I know you said:

    "But if you reset you need to set it to higher number. I would suggest to make a jump of 8192 (That's what we do in normal mesh node, where we jump NETWORK_SEQNUM_FLASH_BLOCK_SIZE after each reset)"

    so that explains the first part. But why does it work sometimes if I don't jump one block size of 8192?
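For a custom gateway, the "jump one block after each reset" advice could be sketched roughly as follows. This is a minimal sketch, not the SDK's implementation: the function name `allocate_seqnum_block`, the JSON state file, and its `next_block_start` key are all hypothetical. Only the 8192 block size comes from this thread, and the 24-bit limit is the width of a mesh sequence number.

```python
import json
import os
import tempfile

SEQNUM_BLOCK_SIZE = 8192   # jump size suggested in this thread
SEQNUM_MAX = 0xFFFFFF      # mesh sequence numbers are 24 bits wide


def allocate_seqnum_block(state_path):
    """Return the sequence number to start from after a reset.

    Hypothetical helper: reads the next free block start from state_path,
    then stores the start of the following block before returning, so a
    crash right after this call cannot hand out the same block twice.
    """
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next_block_start"]
    if start + SEQNUM_BLOCK_SIZE > SEQNUM_MAX:
        raise RuntimeError("sequence number space exhausted")

    # Write to a temporary file and rename it over the old one, so a
    # power cut cannot leave a half-written state file behind.
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_block_start": start + SEQNUM_BLOCK_SIZE}, f)
    os.replace(tmp_path, state_path)
    return start
```

The gateway would call this once at startup and pass the returned value to whatever command sets its sequence number. It would also need to allocate a fresh block once it has sent 8192 messages from the current one.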

  • Hi Muhammad, 


    I don't have an explanation for that. I would suggest printing some logs on the node's side to check why it sometimes works with an older sequence number. This is handled in the replay_cache_has_elem() function, which is called inside transport_packet_in() in transport.c. If a packet from the same source is already in the cache and the new packet's sequence number is lower than the cached one, the new packet will be rejected.

    My suspicion is that no packet from the same source was stored in the cache, possibly because the node had been reset earlier (the replay cache is not stored in flash, as far as I remember).
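The suspected behavior can be illustrated with a toy model. This is only a sketch of the idea described above, not the code in transport.c: the class `ReplayCacheModel` and the address `GATEWAY` are hypothetical, the IV index is ignored, and the model assumes both that the cache lives only in RAM and that a sequence number equal to the cached one is also rejected.

```python
class ReplayCacheModel:
    """Toy model of a RAM-only replay cache keyed by source address."""

    def __init__(self):
        # source address -> highest sequence number accepted so far
        self._last_seq = {}

    def accept(self, src, seq):
        """Return True if the packet passes the replay check, and record it."""
        last = self._last_seq.get(src)
        if last is not None and seq <= last:
            return False  # stale or replayed sequence number
        self._last_seq[src] = seq
        return True

    def reset(self):
        """Model a node reboot: nothing is persisted, so the cache is empty."""
        self._last_seq.clear()


node = ReplayCacheModel()
GATEWAY = 0x0001  # hypothetical gateway unicast address

results = [
    node.accept(GATEWAY, 8200),  # first packet seen from the gateway
    node.accept(GATEWAY, 3),     # gateway restarted at a lower number
]
node.reset()                     # node reboots and loses its cache
results.append(node.accept(GATEWAY, 3))
print(results)  # [True, False, True]
```

Under these assumptions, a gateway that restarts at a lower sequence number is rejected only by nodes that still remember a higher one, which would match it working "sometimes".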
