Help with TLS Socket Creation on Thingy53 - ENOMEM (Error 23)

I'm working on a Thingy53 project with Zephyr RTOS that requires TLS connections, but I'm encountering persistent "Failed to create TLS socket: 23" errors (ENOMEM - out of memory). I've implemented several memory optimizations and error handling strategies but still face issues with TLS socket creation.

  1. Initial Problem Identification

    • Thingy53 device experiencing ENOMEM errors during TLS socket creation
    • Memory allocation test function using undefined k_malloc_free_get

  2. Configuration Optimization

    • Consolidated duplicate definitions in prj.conf
    • Increased heap memory (CONFIG_HEAP_MEM_POOL_SIZE=180000)
    • Increased mbedTLS heap (CONFIG_MBEDTLS_HEAP_SIZE=80000)
    • Restored GPIO and WiFi management configurations

  3. TLS Cipher Configuration Issues

    • Identified problematic CONFIG_MBEDTLS_SSL_CIPHERSUITES configuration
    • Attempted various syntax fixes (removing quotes, escaping quotes)
    • Ultimately removed the problematic configuration to allow builds to proceed

  4. First Socket Implementation Approach

    • Tried incremental approach: TCP socket → Connect → TLS upgrade
    • Failed with ENOBUFS (error 109) when setting TLS_SEC_TAG_LIST

  5. Second Socket Implementation Approach

    • Created TLS socket directly using IPPROTO_TLS_1_2
    • Set all TLS options before connecting
    • Eliminated socket reconfiguration to allow proper resource allocation

  6. Memory Management Improvements

    • Implemented socket tracking system to prevent resource leaks
    • Added retry mechanism with increasing delays
    • Added specific ENOMEM error handling with resource cleanup
    • Improved TLS configuration sequence (hostname before security tags)

  7. Current Status

    • Despite improvements, still encountering ENOMEM (error 23)
    • Memory allocation test succeeds but TLS socket creation fails
    • Implemented more aggressive resource management with k_yield() and socket tracking

This is my current socket creation function:
static int create_tls_socket(const char *hostname, struct sockaddr_in *server_addr)
{
    int sock;
    int ret;
    int retry_count = 0;
    const int retry_delay_ms = 1000;
    
    /* Check memory status before attempting socket creation */
    check_memory_status("before TLS socket creation");
    
    /* We'll try a different approach if the previous one keeps failing with ENOMEM */
    while (retry_count < MAX_SOCKET_RETRIES) {
        /* First, ensure any previous socket resources are fully released */
        k_sleep(K_MSEC(retry_delay_ms));
        k_yield();  /* Allow other threads to run and potentially free resources */
        
        /* Create TLS socket */
        sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TLS_1_2);
        if (sock < 0) {
            if (errno == ENOMEM) {
                LOG_ERR("Failed to create TLS socket: Out of memory (ENOMEM)");
                /* Close every tracked socket so its resources can be reused */
                close_all_sockets();
            } else {
                LOG_ERR("Failed to create TLS socket: %d", errno);
            }
            retry_count++;
            continue;
        }
        
        LOG_INF("TLS socket created successfully");
        track_socket(sock);
        
        /* Set TLS socket options */
        sec_tag_t sec_tag_list[] = {
            CA_CERT_TAG,
        };
        
        /* Set receive and send timeout to avoid blocking indefinitely */
        struct timeval timeout = {
            .tv_sec = 10,
            .tv_usec = 0,
        };
        
        ret = setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
        if (ret < 0) {
            LOG_ERR("Failed to set socket receive timeout: %d", errno);
            close(sock);
            untrack_socket(sock);
            retry_count++;
            continue;
        }
        
        // [Additional socket option settings omitted for brevity]
        
        /* All options set successfully */
        LOG_INF("TLS socket configured successfully");
        return sock;
    }
    
    LOG_ERR("Failed to create TLS socket after %d attempts", MAX_SOCKET_RETRIES);
    return -ENOMEM;
}
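
The summary mentions a retry mechanism with increasing delays, while the loop above sleeps for the same `retry_delay_ms` on every pass. A minimal sketch of one way to grow the delay, assuming exponential doubling with an upper cap; `backoff_delay_ms` and its parameters are hypothetical names for illustration, not part of Zephyr:

```c
#include <stdint.h>

/* Hypothetical helper: delay to wait before retry number `attempt`
 * (0-based). Doubles base_ms once per attempt and clamps to max_ms. */
static uint32_t backoff_delay_ms(unsigned int attempt, uint32_t base_ms,
                                 uint32_t max_ms)
{
    uint32_t delay = base_ms;

    while (attempt-- > 0) {
        if (delay >= max_ms / 2) {
            return max_ms; /* doubling again would reach or pass the cap */
        }
        delay *= 2;
    }
    return delay < max_ms ? delay : max_ms;
}
```

In the loop this could replace the fixed delay, for example `k_sleep(K_MSEC(backoff_delay_ms(retry_count, 1000, 8000)));`, which also makes the first attempt wait only the base delay.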

I'm tracking sockets and cleaning them up:
void track_socket(int sock)
{
    /* Track socket for potential cleanup */
}

void untrack_socket(int sock)
{
    /* Remove socket from tracking */
}

void close_all_sockets(void)
{
    /* Close all tracked sockets to free resources */
}
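
The bodies of these three functions are not shown, so the following is only a sketch of what a slot-based tracker could look like, not the actual implementation. The table size `MAX_TRACKED_SOCKETS`, the `fd + 1` encoding, and the `int` return values are assumptions added for illustration:

```c
#include <unistd.h>

#define MAX_TRACKED_SOCKETS 4 /* assumed table size */

/* Each entry holds fd + 1 so that zero-initialised slots read as free. */
static int tracked[MAX_TRACKED_SOCKETS];

/* Returns the slot used, or -1 if the table is full. */
static int track_socket(int sock)
{
    for (int i = 0; i < MAX_TRACKED_SOCKETS; i++) {
        if (tracked[i] == 0) {
            tracked[i] = sock + 1;
            return i;
        }
    }
    return -1;
}

/* Returns the slot that was freed, or -1 if sock was not tracked. */
static int untrack_socket(int sock)
{
    for (int i = 0; i < MAX_TRACKED_SOCKETS; i++) {
        if (tracked[i] == sock + 1) {
            tracked[i] = 0;
            return i;
        }
    }
    return -1;
}

/* Closes every tracked descriptor and returns how many were closed. */
static int close_all_sockets(void)
{
    int closed = 0;

    for (int i = 0; i < MAX_TRACKED_SOCKETS; i++) {
        if (tracked[i] != 0) {
            close(tracked[i] - 1);
            tracked[i] = 0;
            closed++;
        }
    }
    return closed;
}
```

Returning the slot index makes a full table visible to the caller, which matters here: a socket that is created but never tracked would not be released by `close_all_sockets()`.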

I've made the following memory optimizations in prj.conf:

# Increase heap memory pool size
CONFIG_HEAP_MEM_POOL_SIZE=200000
CONFIG_HEAP_MEM_POOL_IGNORE_MIN=y

# TLS Configuration
CONFIG_MBEDTLS=y
CONFIG_MBEDTLS_BUILTIN=y
CONFIG_MBEDTLS_ENABLE_HEAP=y
CONFIG_MBEDTLS_HEAP_SIZE=85000
CONFIG_MBEDTLS_SSL_MAX_CONTENT_LEN=1536

# Reduce TLS resource usage
CONFIG_NET_SOCKETS_TLS_MAX_CONTEXTS=2
CONFIG_NET_SOCKETS_TLS_MAX_CREDENTIALS=2
CONFIG_NET_CONTEXT_MAX_CONN=4

Are there additional memory optimizations I should be making for TLS on constrained devices like the Thingy53?
Could there be any memory leaks or resource allocation issues that I'm missing in my implementation?
Are there specific mbedTLS configurations I should be adjusting to reduce memory usage further?
Is there a way to debug exactly what's happening during TLS socket creation that's causing the ENOMEM error?
Should I be implementing a different approach for TLS on this device?
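
On the debugging question, a cheap first step is to decode the raw numbers in the logs before tuning heap sizes. A minimal sketch, assuming the newlib-style errno numbering that Zephyr's libc headers are believed to follow; `errno_name` is a hypothetical helper, and the table should be checked against the `errno.h` shipped with the SDK in use:

```c
/* Hypothetical lookup for the numeric codes that appear in this thread.
 * The values assume newlib-style numbering; verify them against the
 * errno.h in your own toolchain before relying on them. */
static const char *errno_name(int code)
{
    switch (code) {
    case 5:   return "EIO";
    case 12:  return "ENOMEM";
    case 23:  return "ENFILE";
    case 105: return "ENOBUFS";
    case 109: return "ENOPROTOOPT";
    case 134: return "ENOTSUP";
    default:  return "unknown";
    }
}
```

If that numbering holds, 23 would decode to ENFILE (too many open files) rather than ENOMEM (12), and 109 to ENOPROTOOPT rather than ENOBUFS or ENOTSUP. That would point toward descriptor, context, or socket-option limits instead of heap exhaustion, so the names used in this thread are worth re-checking.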

  • Hi

    Just to get an overview: in the future, could you please insert code in a code block? That makes your replies much easier to read and review, since hundreds of lines of code pasted as plain text get messy quickly.

    The error codes from these "Failed to set TLS ..." messages all refer to there being too many references, so did you make any increases to the following configs:

    CONFIG_NET_SOCKETS_TLS_MAX_CONTEXTS=2
    CONFIG_NET_SOCKETS_TLS_MAX_CREDENTIALS=2
    CONFIG_NET_CONTEXT_MAX_CONN=4
    CONFIG_POSIX_MAX_FDS

    I agree with you that the Azure authentication failure is likely due to insufficient verification/security.

    Best regards,

    Simon

  • I have done this:

    CONFIG_NET_SOCKETS_TLS_MAX_CONTEXTS=4
    CONFIG_NET_SOCKETS_TLS_MAX_CREDENTIALS=2
    CONFIG_NET_CONTEXT_MAX_CONN is not available for Thingy53
    CONFIG_POSIX_MAX_FDS=24

    I also tried:
    CONFIG_NET_SOCKETS_TLS_MAX_CONTEXTS=2
    CONFIG_NET_SOCKETS_TLS_MAX_CREDENTIALS=2
    CONFIG_NET_CONTEXT_MAX_CONN=4
    CONFIG_POSIX_MAX_FDS

  • And you still see the same error? Can you try increasing them substantially, for example setting both TLS_MAX_CONTEXTS and TLS_MAX_CREDENTIALS to 10?

    Best regards,

    Simon

  • I have tried setting both TLS_MAX_CONTEXTS and TLS_MAX_CREDENTIALS to 10.

    What I've done to fix the Azure authentication issue:

    I have inspected the Zephyr code, which could not complete Azure Cognitive Services authentication.
    I've compared it with my curl-based C implementation that uses the same endpoint and API key.
    I have identified several issues:

    TLS socket options failing with error 109 (ENOTSUP)
    Possible HTTP request formatting issues
    Connection being reset by the server after sending the request

    I've tried multiple approaches:

    Fixed HTTP request formatting to match curl exactly
    Simplified TLS configuration
    Tried sending the HTTP headers and empty line separately
    Added more debugging

    None of these worked.

    The latest log shows:

    [00:00:11.880,798] <inf> main: State: COMPLETED
    [00:00:11.881,774] <inf> main: SSID: iPhone
    [00:00:11.882,659] <inf> main: Getting Azure authentication token...
    [00:00:11.883,605] <inf> main: Connecting to westeurope.api.cognitive.microsoft.com...
    [00:00:11.884,796] <inf> main: Checking for unhealthy sockets
    [00:00:11.885,589] <inf> main: Resolving hostname: westeurope.api.cognitive.microsoft.com
    [00:00:12.398,315] <inf> main: Tracking socket 14 in slot 0
    [00:00:12.399,139] <wrn> main: Failed to set receive timeout: 109
    [00:00:12.400,207] <wrn> main: Failed to set send timeout: 109
    [00:00:12.401,214] <inf> main: CA certificate added to TLS credentials
    [00:00:12.402,252] <inf> main: Proceeding with TLS connection
    [00:00:12.403,228] <inf> main: Connecting to westeurope.api.cognitive.microsoft.com:443...
    [00:00:12.644,805] <inf> main: Successfully connected to westeurope.api.cognitive.microsoft.com:443
    [00:00:12.646,209] <inf> main: Connected to Azure server, socket fd: 14
    [00:00:12.647,186] <inf> main: Sending HTTP request (265 bytes):
    [00:00:12.648,101] <inf> main: ------ BEGIN HTTP REQUEST ------
    [00:00:12.649,078] <inf> main: POST /sts/v1.0/issuetoken HTTP/1.1
    Host: westeurope.api.cognitive.microsoft.com
    Ocp-Apim-Subscription-Key: <redacted>
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 0
    [00:00:12.652,008] <inf> main: ------ END HTTP REQUEST ------
    [00:00:12.653,778] <inf> main: HTTP request sent: 265 bytes
    [00:00:12.654,724] <inf> main: Waiting for HTTP response...
    [00:00:12.759,216] <wrn> main: Connection reset by peer
    [00:00:12.760,009] <err> main: Failed to get TLS error: 109
    [00:00:12.760,986] <inf> main: Disconnecting from server, socket 14
    [00:00:12.762,054] <inf> main: Untracking socket 14 from slot 0
    [00:00:12.763,061] <wrn> main: Socket shutdown failed: 134
    [00:00:12.764,160] <inf> main: Socket 14 successfully closed
    [00:00:12.865,234] <err> main: Azure authentication failed: -5

    Code:

    /* TLS configuration in connect_to_server */
    if (use_tls) {
        ret = setup_tls_credentials();
        if (ret < 0) {
            LOG_ERR("Failed to set up TLS credentials: %d", ret);
            close(sock);
            untrack_socket(sock);
            zsock_freeaddrinfo(result);
            return ret;
        }

        /* Set TLS options */
        int tls_options = TLS_PEER_VERIFY_REQUIRED;
        ret = zsock_setsockopt(sock, SOL_TLS, TLS_PEER_VERIFY, &tls_options, sizeof(tls_options));
        if (ret < 0) {
            LOG_ERR("Failed to set TLS peer verification: %d", errno);
            close(sock);
            untrack_socket(sock);
            zsock_freeaddrinfo(result);
            return -EIO;
        }

        /* Set TLS certificate */
        struct tls_credential cert = {
            .type = TLS_CREDENTIAL_CA_CERTIFICATE,
            .tag = CA_CERT_TAG
        };
        ret = zsock_setsockopt(sock, SOL_TLS, TLS_CREDENTIAL, &cert, sizeof(cert));
        if (ret < 0) {
            LOG_ERR("Failed to set TLS certificate: %d", errno);
            close(sock);
            untrack_socket(sock);
            zsock_freeaddrinfo(result);
            return -EIO;
        }

        LOG_INF("Proceeding with TLS connection");
    }


    /* print_tls_error_info function to debug TLS issues */
    void print_tls_error_info(int sock)
    {
        int err;
        socklen_t optlen = sizeof(err);

        if (getsockopt(sock, SOL_TLS, TLS_ERROR_OPTION, &err, &optlen) == 0) {
            LOG_ERR("TLS error code: %d", err);

            switch (err) {
            case MBEDTLS_ERR_SSL_CONN_RESET:
                LOG_ERR("Connection reset by peer");
                break;
            case MBEDTLS_ERR_SSL_BAD_HS_SERVER_HELLO:
                LOG_ERR("Bad server hello message");
                break;
            /* Other error cases... */
            }
        }
    }
    

  • /* HTTP request sent in two parts in get_azure_auth_token */
    const char *http_headers =
        "POST /sts/v1.0/issuetoken HTTP/1.1\r\n"
        "Host: westeurope.api.cognitive.microsoft.com\r\n"
        "Ocp-Apim-Subscription-Key: <redacted>\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: 0\r\n";

    /* Empty line to separate headers from body */
    const char *empty_line = "\r\n";

    /* headers_len is not defined in this excerpt; strlen() is assumed here */
    size_t headers_len = strlen(http_headers);

    /* Send headers first */
    int sent = zsock_send(sock, http_headers, headers_len, 0);
    if (sent < 0) {
        LOG_ERR("Failed to send HTTP headers: %d", errno);
    }

    /* Add small delay to ensure headers are processed */
    k_sleep(K_MSEC(100));

    /* Now send the empty line separately */
    sent = zsock_send(sock, empty_line, 2, 0);
    if (sent < 0) {
        LOG_ERR("Failed to send header terminator: %d", errno);
    }
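
The two-part send above depends on a separately computed header length and a delay between the parts. A minimal sketch of an alternative that formats the complete request, including the terminating blank line, into one buffer; `build_token_request` is a hypothetical helper, and the host and key are passed in as parameters so that no secret is hard-coded:

```c
#include <stdio.h>

/* Hypothetical helper: writes the whole token request, including the
 * blank line that ends the headers, into buf. Returns the request
 * length, or -1 if buf is too small or formatting fails. */
static int build_token_request(char *buf, size_t buf_len,
                               const char *host, const char *key)
{
    int n = snprintf(buf, buf_len,
                     "POST /sts/v1.0/issuetoken HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Ocp-Apim-Subscription-Key: %s\r\n"
                     "Content-Type: application/x-www-form-urlencoded\r\n"
                     "Content-Length: 0\r\n"
                     "\r\n",
                     host, key);

    if (n < 0 || (size_t)n >= buf_len) {
        return -1; /* output was truncated or snprintf failed */
    }
    return n;
}
```

The returned length can then be passed to a single `zsock_send()` call, looping if the call reports a short write. Since TLS carries a byte stream, splitting the request should not matter to the server, so this mainly simplifies the length bookkeeping and removes the need for the 100 ms delay.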
    

  • Maybe I am doing something wrong with the TLS or HTTP request?
