Random blocking issue with getaddrinfo function

Hello,

I am currently using some code from the HTTP client sample to send HTTP requests to a server from my application. I have encountered an issue where the thread from which I send the request occasionally blocks. The serial log indicates that the execution stops at the getaddrinfo function and hangs indefinitely. I have left it running for up to 20 minutes before resetting, but getaddrinfo did not return.

This issue occurs randomly, sometimes minutes after booting, and sometimes after hours of sending data to the server. Other threads (like the LCD) are working fine and remain responsive during this time.

Here are the details of my setup:

Module: Avnet AVT9152
SoC: nRF9160
SDK: nRF Connect SDK v2.5
Modem Firmware: Latest version

Here is the function I use for HTTP requests:

int http_client_send_request(struct http_request *request) {
    int err;
    int fd;
    struct addrinfo *res;
    struct addrinfo hints = {
        .ai_family = AF_INET,
        .ai_socktype = SOCK_STREAM,
    };

    /* Set socket timeout */
    struct timeval timeo = {
        .tv_sec = 10,
        .tv_usec = 0,
    };

    /* Set the callback function and receive buffer */
    if (request->response == NULL) {
        request->response = http_response_cb;
    } else {
        LOG_INF("Response callback already set");
    }

    LOG_INF("Getting address info\n");
    /* Resolve hostname */
    err = getaddrinfo(request->host, HTTPS_PORT, &hints, &res);
    if (err) {
        LOG_ERR("Failed to resolve hostname %s, error: %d\n", request->host, err);
        return err;
    }
    LOG_INF("Creating socket\n");
    /* Create socket */
    fd = socket(res->ai_family, SOCK_STREAM, IPPROTO_TLS_1_2);
    if (fd < 0) {
        /* Save errno before logging so a later call cannot overwrite it */
        err = -errno;
        LOG_ERR("Failed to create socket, error: %d\n", errno);
        goto clean_up;
    }

    /* Setup TLS socket options */
    err = tls_setup(fd);
    if (err) {
        goto clean_up;
    }

    err = setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &timeo, sizeof(timeo));
    if (err) {
        LOG_ERR("Failed to set socket timeout, error: %d\n", errno);
        err = -errno;
        goto clean_up;
    }

    err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeo, sizeof(timeo));
    if (err) {
        LOG_ERR("Failed to set socket timeout, error: %d\n", errno);
        err = -errno;
        goto clean_up;
    }
    LOG_INF("Socket created\n");
    /* Connect to server */
    /* Note: this overrides the port that getaddrinfo() resolved from HTTPS_PORT */
    ((struct sockaddr_in *)res->ai_addr)->sin_port = htons(HTTP_PORT);
    err = connect(fd, res->ai_addr, res->ai_addrlen);
    if (err < 0) {
        LOG_ERR("Failed to connect to server, error: %d\n", errno);
        err = -errno;
        goto clean_up;
    }
    LOG_INF("Connected to server\n");

    /* Send HTTP request */
    err = http_client_req(fd, request, 5000, request->recv_buf);
    if (err < 0) {
        /* http_client_req() returns a negative error code itself; errno may be stale */
        LOG_ERR("Failed to send HTTP request, error: %d\n", err);
        goto clean_up;
    }
    LOG_INF("HTTP request sent\n");
    
clean_up:
    freeaddrinfo(res);
    /* fd is negative when socket() failed, so there is nothing to close */
    if (fd >= 0) {
        close(fd);
        LOG_INF("Socket closed");
    }
    return err;
}
cellular.conf


I would appreciate any guidance or suggestions to resolve this issue. Thank you in advance for your help.

  • Hello,

    Have you tried running it in the debugger to see where in the code it hangs?

  • I caught it at this part:
        /* Enter low power state */
        _sleep_if_allowed wfi

        /*
         * Clear PRIMASK and flush instruction buffer to immediately service
         * the wake-up interrupt.
         */
        cpsie    i
        isb

        bx    lr



    Then I let the program run and paused it a few times; it mostly jumps around timeout.c, thread.c and timer.c.


    These are the last lines on the serial log:
    [00:04:01.572,967] <inf> livehub_api: lcd:Server ping

    [00:04:01.572,998] <inf> http_client: Sending HTTP POST request from server_sender_thread

    [00:04:01.573,364] <inf> http_client: Getting address info

    [00:04:03.166,351] <inf> modem_control: RRC mode: Idle

    According to the log output, execution stops at getaddrinfo() in the http_client_send_request function I posted.

  • Yes, it just enters idle mode. From what I have read, getaddrinfo() should return with a timeout if the DNS server cannot be reached, so this behavior is not expected. Can you capture a modem trace while reproducing the issue?
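
Until the root cause is found, one possible stopgap is to bound how long the sending thread waits for a blocking call such as getaddrinfo(). The following is only a minimal sketch in portable POSIX C, not something taken from the SDK or the sample: `call_with_timeout`, `blocking_fn`, `struct bounded_call` and `fake_resolve` are hypothetical names, and `fake_resolve` merely sleeps as a stand-in for a resolver that is slow to return. On Zephyr the same shape could presumably be built from a worker thread and a semaphore taken with a timeout. Note that this does not unblock the stuck call itself; it only lets the caller give up, log the event, and decide what to do next.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature for any blocking call we want to bound. */
typedef int (*blocking_fn)(void *arg);

struct bounded_call {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    blocking_fn fn;
    void *arg;
    int result;
    bool done;      /* set by the worker when fn() has returned */
    bool abandoned; /* set by the caller when it stops waiting */
};

static void bounded_call_free(struct bounded_call *c)
{
    pthread_cond_destroy(&c->done_cv);
    pthread_mutex_destroy(&c->lock);
    free(c);
}

static void *bounded_call_worker(void *p)
{
    struct bounded_call *c = p;
    int r = c->fn(c->arg); /* may block for a long time */
    bool abandoned;

    pthread_mutex_lock(&c->lock);
    c->result = r;
    c->done = true;
    abandoned = c->abandoned;
    pthread_cond_signal(&c->done_cv);
    pthread_mutex_unlock(&c->lock);

    /* If the caller already gave up, the worker owns the state and frees it. */
    if (abandoned) {
        bounded_call_free(c);
    }
    return NULL;
}

/* Runs fn(arg) in a detached thread and waits at most timeout_ms for it.
 * Returns 0 and stores fn's return value in *result, or -ETIMEDOUT.
 * After a timeout, arg must stay valid because fn may still be running. */
int call_with_timeout(blocking_fn fn, void *arg, int timeout_ms, int *result)
{
    struct bounded_call *c = calloc(1, sizeof(*c));
    struct timespec deadline;
    pthread_t tid;
    bool finished;
    int rc;

    if (c == NULL) {
        return -ENOMEM;
    }
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->done_cv, NULL);
    c->fn = fn;
    c->arg = arg;

    rc = pthread_create(&tid, NULL, bounded_call_worker, c);
    if (rc != 0) {
        bounded_call_free(c);
        return -rc;
    }
    pthread_detach(tid);

    /* Absolute deadline for pthread_cond_timedwait() (CLOCK_REALTIME). */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&c->lock);
    rc = 0;
    while (!c->done && rc == 0) {
        rc = pthread_cond_timedwait(&c->done_cv, &c->lock, &deadline);
    }
    finished = c->done;
    if (finished) {
        *result = c->result;
    } else {
        c->abandoned = true; /* hand ownership of c to the worker */
    }
    pthread_mutex_unlock(&c->lock);

    if (finished) {
        bounded_call_free(c);
        return 0;
    }
    return -ETIMEDOUT;
}

/* Stand-in for a blocking resolver: sleeps *arg milliseconds, returns 0. */
int fake_resolve(void *arg)
{
    int ms = *(const int *)arg;
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (long)(ms % 1000) * 1000000L };

    nanosleep(&ts, NULL);
    return 0;
}
```

In the application, the function passed in would be a thin wrapper that calls getaddrinfo() with the host, port and hints taken from a caller-owned struct. The caller could then treat -ETIMEDOUT as a signal to log the event and, for example, reset the modem or the device. Because an abandoned worker thread is never reclaimed while the call stays stuck, this is a diagnostic aid rather than a fix.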
