This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Socket send function hangs

Hi,

I have a custom board with nrf9160 and I'm using it to send messages to the AWS server via mqtt. I managed to reproduce a bug where the TLS socket send function hangs (similar to -https://devzone.nordicsemi.com/f/nordic-q-a/62249/hang-in-sendmsg-on-the-nrf9160). I was using v1.3 of ncs.

Steps to reproduce - 

  • Connect to LTE netwrok. 
  • Connect to AWS via MQTT
  • When a successful connection is established, send a disconnect message
  • When successfully disconnected, reconnect and keep repeating this loop
  • On the 21st connection/disconnection cycle the function gets stuck in the mqtt_client_tls_write function while calling the 
    send(client->transport.tls.sock, data + offset, datalen - offset, 0);
    function

I switched to the latest master branch of ncs and changed the send function to send(client->transport.tls.sock, data + offset, datalen - offset, MSG_DONTWAIT) Now the send function doesn't hang but returns a value of -1 and the errno is -EAGAIN when I follow the steps to reproduce this bug.

How do I handle the socket timeout gracefully? After this timeout, if I try to reconnect to AWS I get 

aws_iot: getaddrinfo, error -10
aws_iot: client_broker_init, error: -10

and I have to restart the device to get it working again

I also noticed an error case which doesn't seem to be handled in aws_iot.c. When the socket send function returns -1, then in file mqtt.c, the function client_connect calls client_disconnect which in turn calls disconnect_event_notify function. This notifies the aws_iot client with 

evt.type = MQTT_EVT_CONNACK;
evt.result = -ECONNREFUSED;
which is not handled properly in aws_iot.c. Handling this properly still causes the same error I mentioned before though.
Any help will be appreciated. Thanks in advance!
Nikil
Parents Reply Children
No Data
Related