Socket API "send" hangs when sending multiple TCP messages

Hi, 

   I am developing an application that requires communication between two nRF52840-based devices. In one scenario the devices need to exchange multiple TCP packets; however, after transmitting roughly 50 messages, the send API hangs. The interval between messages is 70-200 ms. I am using TCP over IEEE 802.15.4 with POSIX names enabled. My application's message size is 14 bytes. I would be really thankful for your help. 

Best regards,

Omer

  • Hi Carl, 

        Thanks for the message. I did try to increase the buffers you have mentioned, however I still observe the same behavior. 

     

        TCP/UDP: My own application uses TCP, and it hangs. The echo client/server applications use UDP, and they also hang. Therefore, we cannot rule out an issue in the TCP/UDP implementation. 

    Best regards,

    Omer

  • Hi again, Omer!

    I've gotten some feedback from the developers. Listing it here:

    • For many devices the chosen entropy node is Cryptocell by default. This can immensely slow down the applications. You can change this by adding nrf52840dk_nrf52840.overlay files containing the code below in your applications.
       
      / {
      	/*
      	* In some default configurations within the nRF Connect SDK,
      	* e.g. on nRF52840, the chosen zephyr,entropy node is &cryptocell.
      	* This devicetree overlay ensures that default is overridden wherever it
      	* is set, as this application uses the RNG node for entropy exclusively.
      	*/
      	chosen {
      		zephyr,entropy = &rng;
      	};
      };
    • They suggest applying the changes below to the overlay-802154.conf of the applications to ensure correct operation on the nRF devices.

      diff --git a/samples/net/sockets/echo_client/overlay-802154.conf b/samples/net/sockets/echo_client/overlay-802154.conf
      index 2fc07cf685..5707ebbffe 100644
      --- a/samples/net/sockets/echo_client/overlay-802154.conf
      +++ b/samples/net/sockets/echo_client/overlay-802154.conf
      @@ -1,7 +1,8 @@
       CONFIG_BT=n
       
       # Disable TCP and IPv4 (TCP disabled to avoid heavy traffic)
      -CONFIG_NET_TCP=n
      +CONFIG_NET_TCP=y
      +CONFIG_NET_UDP=n
       CONFIG_NET_IPV4=n
       
       CONFIG_NET_CONFIG_NEED_IPV6=y
      @@ -14,5 +15,6 @@ CONFIG_NET_CONFIG_PEER_IPV6_ADDR="2001:db8::1"
       CONFIG_NET_L2_IEEE802154=y
       CONFIG_NET_L2_IEEE802154_SHELL=y
       CONFIG_NET_L2_IEEE802154_LOG_LEVEL_INF=y
      +CONFIG_NET_L2_IEEE802154_FRAGMENT_REASS_CACHE_SIZE=8
       
       CONFIG_NET_CONFIG_IEEE802154_CHANNEL=26
    • Lastly, they noticed something (at least in your echo client/server test) that may lead to hangs/instability. Quoting: 
    Their client app, after sending the UDP packet, enters the recv() function in a blocking manner, without any timeout configured. This means that the client assumes it will always receive a response from the server. This is not a correct approach in general, but it is even worse in lossy networks like 802.15.4, where it is not uncommon for either the request or the response to be lost. This matters all the more because Zephyr's 802.15.4 MAC does not use the ACK mechanism by default (it has to be enabled explicitly in the config file).

    Please make sure to take the above into account!

    If anything is unclear, please reach out!

    Best regards,
    Carl Richard

  • Hi Carl, 

        Thanks a lot for the detailed response. I was using UDP for the echo client/server application, and yes, reliability can be an issue. However, I did enable layer 2 ACKs by adding "CONFIG_NET_L2_IEEE802154_ACK_REPLY=y" to the proj.conf file. Based on the Zephyr documentation, I think this is the way to enable layer 2 ACKs, right? Even with all the modifications to the proj.conf and overlay files that you suggested in your recent answer, the problem still exists. 

      Second, the original application that I am developing is based on TCP. TCP is a reliable protocol, i.e., if the message gets lost then TCP will automatically retransmit the message. Hence, the hanging issue is not due to lost packets.   

      Third, I modified the echo client/server applications so that they use TCP instead of UDP for communication. Again, I incorporated the changes you suggested in the proj.conf and overlay files, but the hanging issue still exists (I am attaching all the sources to this message). Messages are exchanged for a few minutes, but afterwards the system hangs. 

      I suspect there might be an issue with net_pkt buffer allocation. Maybe buffers are not correctly deallocated, which then causes later buffer allocations to fail or block. My suggestion would be to run the attached code, so that your team can identify the problem. 

    Best regards,

    Omer

    Attachments: my_echo_server.zip, my_echo_client.zip

  • Hello again, Omer!

    We have not managed to reproduce your issue with the applications you attached. Our tests were done using NCS v1.5.0. What version are you running?

    The developers also noted that enabling ACKs should be done differently when using the nRF driver for 802.15.4. To do this at runtime, you need to configure the L2 so that it sets the ACK request bit in outgoing frames:

    net_mgmt(NET_REQUEST_IEEE802154_SET_ACK, iface, NULL, 0);

    For testing purposes this can also be done using the ieee802154 shell command: ieee802154 ack set
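
    For context, a minimal sketch of how the call above might be wrapped in application code. It assumes the Zephyr 2.x header layout used by NCS v1.5.0 (newer releases prefix these paths with zephyr/), that the network management API is enabled (CONFIG_NET_MGMT=y), and that the 802.15.4 radio is the default network interface. The function name enable_l2_ack_requests is hypothetical.

```c
#include <errno.h>
#include <net/net_if.h>           /* net_if_get_default() */
#include <net/net_mgmt.h>         /* net_mgmt() */
#include <net/ieee802154_mgmt.h>  /* NET_REQUEST_IEEE802154_SET_ACK */

/* Hypothetical helper: ask the 802.15.4 L2 to set the ACK request bit. */
int enable_l2_ack_requests(void)
{
    /* Assumption: the 802.15.4 radio is the default (only) interface. */
    struct net_if *iface = net_if_get_default();

    if (iface == NULL) {
        return -ENODEV;
    }

    /* Returns 0 on success or a negative errno value on failure. */
    return net_mgmt(NET_REQUEST_IEEE802154_SET_ACK, iface, NULL, 0);
}
```

    If the application has more than one network interface, the default one may not be the radio, and the interface would have to be looked up some other way.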

    Could you also share the HW version of your nRF52840 DKs and the file "<project_root>/zephyr/include/generated/autoconf.h"?

    Best regards,
    Carl Richard

  • Hi Carl, 

       Thanks very much for your reply and help. I was using NCS v1.3.0, and I have now switched to NCS v1.5.0. In my initial tests, the application no longer hangs when built with NCS v1.5.0. I will do more rigorous testing, and if I find something I will get in touch. 

    I tried to use net_mgmt(NET_REQUEST_IEEE802154_SET_ACK, iface, NULL, 0) to enable L2 ACKs, but I got compile-time errors. One of the errors is that "NET_REQUEST_IEEE802154_SET_ACK" is not defined. Could you please tell me which header files I need to include so that "NET_REQUEST_IEEE802154_SET_ACK" can be found? Second, could you please also tell me how I can obtain "iface" on the nRF52840 DK so that I can pass it to the net_mgmt function? 

    Once again thank you very much for your help. 

    Best regards,

    Omer
