Possible race conditions in pm_peers_delete?

Hi

I have been using the PeerManager/FDS/FStorage in my project for a few months. We have this software running on real hardware and in simulated environments. While investigating why it behaves differently in the simulated environment, I found some possible race conditions related to asynchronous calls to FDS.

Environment: The system has two or more bonded peers

Here is what I looked at:

  1. We call pm_peers_delete at startup to erase all bonded peers.
  2. Further down the call chain, peer_data_clear is called synchronously.
  3. This function starts the asynchronous erase operation by calling fds_file_delete.
  4. Steps 2 and 3 repeat for each peer to delete.

Possible race condition 1:

m_pds.clearing is set to true only after fds_file_delete has returned successfully. If the asynchronous erase completes before the fds_file_delete call returns, the operation-completed callback runs before m_pds.clearing is set to true. As a consequence, m_pds.clearing is never reset to false and erasing bonds is blocked forever.
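
To make the ordering problem concrete, here is a minimal stand-alone model of it. It is not the Peer Manager or FDS source: every name starting with fake_ is hypothetical, and the only assumptions taken from the description above are that the flag is set after the delete call succeeds and that the completion callback resets it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT the real Peer Manager / FDS code. */
static bool clearing = false;   /* models m_pds.clearing */
static bool pending  = false;   /* an erase is still in flight */

/* Models the "operation completed" callback: it resets the flag. */
static void fake_on_delete_done(void)
{
    clearing = false;
}

/* Models the erase call. In a simulated backend the erase may finish
 * (and the callback may run) before this function returns. */
static int fake_file_delete(bool completes_inline)
{
    if (completes_inline) {
        fake_on_delete_done();  /* callback runs before we return */
    } else {
        pending = true;         /* callback will run later */
    }
    return 0;                   /* 0 models success */
}

/* Models the order described above: flag is set only after success. */
static int fake_peer_data_clear(bool completes_inline)
{
    int rc = fake_file_delete(completes_inline);
    if (rc == 0) {
        clearing = true;
    }
    return rc;
}

/* Runs one clear to completion and reports whether the flag is stuck. */
bool flag_stuck_after_clear(bool completes_inline)
{
    clearing = false;
    pending  = false;

    (void)fake_peer_data_clear(completes_inline);

    if (pending) {              /* deliver the deferred callback */
        pending = false;
        fake_on_delete_done();
    }
    assert(!pending);           /* nothing is left in flight */
    return clearing;
}
```

In this model, flag_stuck_after_clear(true) returns true (the callback ran too early and nothing resets the flag afterwards), while flag_stuck_after_clear(false) returns false.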

Possible race condition 2:

The function peer_data_clear is called in a loop, once for each peer to delete. If the asynchronous delete takes longer than the time between two consecutive synchronous calls, the second call of peer_data_clear will not work properly, because m_pds.clearing is still set to true when it tries to erase the second bond.
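
The second scenario can be modelled in the same way. This is only a sketch: the model_ names and the MODEL_BUSY return code are hypothetical, and it assumes that a clear request is refused while the flag is still set, which the description above implies but does not show.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SUCCESS 0
#define MODEL_BUSY    1   /* hypothetical "clear already in progress" code */

static bool clearing = false;   /* models m_pds.clearing */

/* Models the completion callback: it resets the flag. */
static void model_on_delete_done(void)
{
    clearing = false;
}

/* Assumed behaviour: refuse a new clear while one is still in flight. */
static int model_peer_data_clear(void)
{
    if (clearing) {
        return MODEL_BUSY;
    }
    clearing = true;            /* erase queued; callback arrives later */
    return MODEL_SUCCESS;
}

/* Deletes n_peers in a loop, as pm_peers_delete is described to do.
 * If callback_between_calls is true, each erase finishes before the next
 * loop iteration. Returns how many clear requests were accepted. */
int delete_all_peers(int n_peers, bool callback_between_calls)
{
    int accepted = 0;

    assert(n_peers >= 0);
    clearing = false;

    for (int i = 0; i < n_peers; i++) {
        if (model_peer_data_clear() == MODEL_SUCCESS) {
            accepted++;
        }
        if (callback_between_calls) {
            model_on_delete_done();
        }
    }
    return accepted;
}
```

With two bonded peers, delete_all_peers(2, true) accepts both requests, whereas delete_all_peers(2, false) accepts only the first one because the flag is still set when the second request arrives.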

Race condition 1 is what happened in my simulated environment, and I could easily modify the code to work under these conditions as well. I have never seen the second race condition happen, but I'm not sure it never will.
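
The modification itself is not shown here. Purely as an illustration of one possible approach to race condition 1, the flag could be set before the erase is started and rolled back if the call fails. The sketch_ names are hypothetical and this is a model, not a patch for the SDK.

```c
#include <assert.h>
#include <stdbool.h>

static bool clearing = false;   /* models m_pds.clearing */
static bool pending  = false;   /* an erase is still in flight */

/* Models the completion callback: it resets the flag. */
static void sketch_on_delete_done(void)
{
    clearing = false;
}

/* Models the erase call; it can fail, finish inline, or finish later. */
static int sketch_file_delete(bool completes_inline, bool fail)
{
    if (fail) {
        return 1;               /* non-zero models an error */
    }
    if (completes_inline) {
        sketch_on_delete_done();
    } else {
        pending = true;
    }
    return 0;
}

/* Possible reordering: set the flag first, undo it on failure. */
static int sketch_peer_data_clear(bool completes_inline, bool fail)
{
    clearing = true;            /* set BEFORE the erase is started */

    int rc = sketch_file_delete(completes_inline, fail);
    if (rc != 0) {
        clearing = false;       /* no callback will come, so roll back */
    }
    return rc;
}

/* Runs one clear to completion and returns the final flag value. */
bool flag_after_reordered_clear(bool completes_inline, bool fail)
{
    clearing = false;
    pending  = false;

    (void)sketch_peer_data_clear(completes_inline, fail);

    if (pending) {              /* deliver the deferred callback */
        pending = false;
        sketch_on_delete_done();
    }
    assert(!pending);           /* nothing is left in flight */
    return clearing;
}
```

In this model the flag ends up false whether the erase completes inline, completes later, or fails. On real hardware the callback may run in a different context, so access to the flag would need the same care as in the original code. This reordering does not address race condition 2.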

Any thoughts about these race conditions?

Regards Adrian
