Enabling the TLS layer to get an HTTPS connection going

Attachments: 7343.nrf7002dk_nrf5340_cpuapp_ns.conf, 3124.prj.conf

Hello everyone.

We're trying to make an HTTPS connection to google.com and execute a GET request.

The Wi-Fi connection is working, and DHCP seems to be working too (my personal assumption, given the log message we get: "Resolved: [(1, 1, 6, '', ('142.250.201.206', 443))]", which indicates that getaddrinfo() works). But when we try to set up the socket for TLS, something strange happens: we get the error "OSError: 109".

After inserting some debug prints inside subsys/net/lib/sockets/, we found the culprit to be the function "int zsock_setsockopt_ctx(struct net_context *ctx, int level, int optname, const void *optval, socklen_t optlen)".

The function call that triggers error 109 is:  res = setsockopt(socket->ctx, SOL_TLS, TLS_PEER_VERIFY, &verify, sizeof(verify));

No matter what other option we try to set via setsockopt(), it fails with error 109, because the setsockopt() implementation that gets called is somehow the one in sockets_inet.c (which does not recognise SOL_TLS as a valid level in its switch statements) instead of the one in sockets_tls.c (which does handle SOL_TLS in its switch statements). My personal hunch is that the config options set in the project are somehow wrong. Can someone please take a look at our .conf files? Maybe we can find the culprit. :)
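For readers following along, one way to picture the behaviour described above: the socket implementation is picked when socket() is called, based on its proto argument, and every later setsockopt() goes to whichever implementation was picked. The sketch below is a simplified host-side model, not Zephyr code. The names `model_socket` and `model_setsockopt` and all `MODEL_*` constants are hypothetical, and the numeric values (258 for a TLS 1.2 proto, 282 for SOL_TLS, 109 for "protocol option not available") are assumptions that should be checked against the headers of the SDK version in use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed values; verify against zephyr/net/socket.h and errno.h. */
#define MODEL_IPPROTO_TCP     6    /* the "6" in the "Resolved:" log line */
#define MODEL_IPPROTO_TLS_1_2 258  /* assumption */
#define MODEL_SOL_TLS         282  /* assumption */
#define MODEL_ENOPROTOOPT     109  /* assumption: the "OSError: 109" */

enum model_impl { MODEL_IMPL_INET, MODEL_IMPL_TLS };

struct model_sock {
    enum model_impl impl;
};

/* Hypothetical stand-in for socket(): the implementation is fixed here,
 * purely from the proto argument. */
static struct model_sock model_socket(int proto)
{
    struct model_sock s;

    s.impl = (proto == MODEL_IPPROTO_TLS_1_2) ? MODEL_IMPL_TLS
                                              : MODEL_IMPL_INET;
    return s;
}

/* Hypothetical stand-in for setsockopt(): 0 on success, -1 with errno set
 * when a TLS-level option reaches a socket that is not a TLS socket. */
static int model_setsockopt(const struct model_sock *s, int level)
{
    assert(s != NULL);

    if (level == MODEL_SOL_TLS && s->impl != MODEL_IMPL_TLS) {
        errno = MODEL_ENOPROTOOPT;
        return -1;
    }
    return 0;
}
```

Under these assumptions, a socket created with proto 6 (plain TCP) rejects every SOL_TLS option, which would match the symptom, while a socket created with a TLS proto accepts them.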

We can provide any extra code snippets that are necessary for debugging and/or run any tests. Have a great day and hope to hear from you soon!

  • Hi, 

     

    Is one config working, while the other is not working?

    The config CONFIG_NET_SOCKETS_POLL_MAX should be 10 or higher, as the WPA supplicant uses several sockets, and so does each TLS connection.

    It also looks like the MPI size is 256, while MBEDTLS_MPI_MAX_SIZE should be 512.

    Tudor B. said:
    Btw! I re-ran the example and could not for the life of me get it to connect to google.com and make the request. DNS resolved an IP, certificate steps passed, but it always fails with

    One thing that can help here is the mbedtls debug output configs that I shared earlier.

    However, I see from your updated response that the issue was related to the proto input variable.

    Tudor B. said:
    "OSError:23" 

    I assume this is errno?

    #define ENFILE 23 /**< File table overflow */

    You're exceeding the maximum number of open sockets.

    This could be due to the maximum configured in your build, or because you're not closing sockets.

    Try increasing CONFIG_ZVFS_OPEN_MAX and CONFIG_NET_SOCKETS_POLL_MAX (which again sets CONFIG_ZVFS_POLL_MAX).
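To illustrate the two causes named above (a limit that is too low, or sockets that are never closed), here is a minimal host-side sketch of a fixed-size descriptor table. It is not the Zephyr implementation: `MODEL_OPEN_MAX` is a hypothetical stand-in for a limit such as CONFIG_ZVFS_OPEN_MAX, and `model_fd_alloc` and `model_fd_free` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a build-time limit on open descriptors. */
#define MODEL_OPEN_MAX 4

static bool model_fd_used[MODEL_OPEN_MAX];

/* Returns the lowest free descriptor, or -1 with errno = ENFILE when
 * every slot is taken. */
static int model_fd_alloc(void)
{
    for (int fd = 0; fd < MODEL_OPEN_MAX; fd++) {
        if (!model_fd_used[fd]) {
            model_fd_used[fd] = true;
            return fd;
        }
    }
    errno = ENFILE;
    return -1;
}

/* Releases a descriptor so that a later allocation can reuse the slot. */
static void model_fd_free(int fd)
{
    assert(fd >= 0 && fd < MODEL_OPEN_MAX);
    model_fd_used[fd] = false;
}
```

In this model, allocation fails with ENFILE once all slots are in use, and succeeds again as soon as one descriptor is released. A leak (never calling the free function) therefore produces the same error as a limit that is simply too small.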

    Tudor B. said:
    Edit: extra point that I wanna add: must the board's time be updated before the TLS and/or connection steps take place? Or is the board's time relevant in any way for this?

    I believe you're referring to checking validity?

    https://github.com/nrfconnect/sdk-mbedtls/blob/main/include/mbedtls/mbedtls_config.h#L133-L152 

    mbedtls is capable of doing this.
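As a rough illustration of why the board's clock can matter when such a validity check is enabled: a certificate is only accepted if the current time falls inside its validity window. The sketch below is a plain C model with a hypothetical helper name, `model_cert_time_ok`. It is not the mbedtls API, and whether the check is active in a given build depends on the mbedtls configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: true when 'now' lies inside the certificate's
 * validity window [not_before, not_after], all as Unix timestamps. */
static bool model_cert_time_ok(time_t now, time_t not_before,
                               time_t not_after)
{
    assert(not_before <= not_after);
    return now >= not_before && now <= not_after;
}
```

With this model, a board whose clock still reads 0 (1 January 1970) sees every current certificate as not yet valid, which is why the time would need to be set first if the check is enabled.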

     

    Kind regards,

    Håkon

  • It also looks like the MPI size is 256, while MBEDTLS_MPI_MAX_SIZE should be 512.

    That's for the example. Comparing the two files, I see that our project does indeed have 512.

    Try increasing CONFIG_ZVFS_OPEN_MAX and CONFIG_NET_SOCKETS_POLL_MAX (which again sets CONFIG_ZVFS_POLL_MAX).

    Again an interesting point since I see that for the https_client sample project, CONFIG_NET_SOCKETS_POLL_MAX is set to 0, while for us it's set to 10. Will try increasing it to 16 and rerun the test.

    I believe you're referring to checking validity?

    Yes, my lead asked if it might be the case that TLS fails due to the devkit not getting its time set before attempting any TLS steps.

  • Set CONFIG_MBEDTLS_MPI_MAX_SIZE=512 and CONFIG_NET_SOCKETS_POLL_MAX=16. Same behaviour.

    Edit: forgot to mention:

    CONFIG_LOG=y
    CONFIG_MBEDTLS_DEBUG=y
    CONFIG_MBEDTLS_SSL_DEBUG_ALL=y
    CONFIG_MBEDTLS_LOG_LEVEL_DBG=y
    CONFIG_MBEDTLS_DEBUG_C=y
    CONFIG_MBEDTLS_DEBUG_LEVEL=4
    # Handle the large influx of prints
    CONFIG_LOG_BUFFER_SIZE=16384
    CONFIG_LOG_BACKEND_UART=y

    are active in the build, but the same sparse log output appears.

  • Hi,

     

    If mbedtls logging is enabled but nothing is output, the failure is not in mbedtls.

     

    The issue is local, and if you are still seeing -23 (and this is the errno??? please confirm) - then the issue is still due to socket overflow.

    Verify that the value you set here:

    Håkon Alseth said:
    CONFIG_NET_SOCKETS_POLL_MAX

    Is reflected in the config set here:

    Håkon Alseth said:
    CONFIG_ZVFS_POLL_MAX

     

    And please note that this file

    Tudor B. said:
    The file in question is located at: /build/zephyr/zephyr/misc/generated/configs.c

    is not the .config file. Files with a "." prefix are treated as hidden, especially if you are running Windows and have not enabled showing hidden files in Explorer.

    The .config file is located in ../build-folder/application-name/zephyr/.config 

     

    Tudor B. said:
    Yes, my lead asked if it might be the case that TLS fails due to the devkit not getting its time set before attempting any TLS steps.

    mbedtls does not print any debug logs, which indicates that the system does not get that far before the failure occurs. This means the failure is elsewhere in the network subsystem, most likely in the socket allocation.

     

    Kind regards,

    Håkon

  • GEN_ABSOLUTE_SYM_KCONFIG(CONFIG_NET_SOCKETS_POLL_MAX, 16);
    GEN_ABSOLUTE_SYM_KCONFIG(CONFIG_ZVFS_POLL_MAX, 16);

    Confirms they are identical and set to 16.

    is not the .config file. Files with a "." prefix are treated as hidden, especially if you are running Windows and have not enabled showing hidden files in Explorer.

    The .config file is located in ../build-folder/application-name/zephyr/.config 

    I saw mentions in other tickets about the "magical .config" file but I've searched my entire project folder and the SDK folder itself and no such file exists. I'm running on Mac. The file that I mentioned seems to be its equivalent (probably a v3.0.0 thing). I'm a bit convinced by this since every time I build it gets regenerated.

    and this is the errno??? please confirm

    Sadly I can't confirm with 100% certainty. The Zephyr stack + the Nordic libs are so complex and intricate that all I can say is... I think so. :))

    What I can say to elaborate a bit:

    Inside "static int ztls_socket(int family, int type, int proto)", which is in sockets_tls.c at one point the function reaches these lines:
    	printk("Dead here...15\n");
    	ctx = tls_alloc();
    	if (ctx == NULL) {
    		errno = ENOMEM;
    		printk("Dead here...16\n");
    		goto free_fd;
    	}
    	
    	printk("Dead here...17\n");
    
    	sock = zsock_socket(family, type, proto);
    	if (sock < 0) {
    		printk("Dead here...17.5: %d\n", sock);
    		goto release_tls;
    	}
    	printk("Dead here...18\n");


    Also, in my modsocket.c, I have the function "STATIC mp_obj_t socket_make_new(const mp_obj_type_t *type, size_t n_args, size_t n_kw, const mp_obj_t *args) {" which at one point reaches these lines:
         printk("Trying to initialize socket...\n");
         socket->ctx = zsock_socket(family, socktype, proto);
         RAISE_SOCK_ERRNO(socket->ctx);
         printk("Done initializing socket! \n");
    (as you can see, I've added some debug messages to try and figure things out)
    Given the above two code snippets, the logs look like this:
    >>> s = None
    >>> s = socket.socket()
    Trying to initialize socket...
    Dead here...15
    [00:02:07.799,316] <dbg> net_sock_tls: tls_alloc: (mp_main): Allocated TLS context, 0x2000a0d0
    Dead here...17
    Dead here...17.5: -1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: 23
    >>>

    Which shows us that "zsock_socket()" returns -1. But when a function with the same name gets called in my modsocket.c, the value returned is 23. I have no idea how to interpret it, or if the two "ztls_socket()" functions are one and the same or not.
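One possible reading of the -1 versus 23 difference, sketched under assumptions: POSIX-style socket calls report failure by returning -1 and storing the reason in errno, and a wrapper in the style of RAISE_SOCK_ERRNO would then report errno rather than the return value. The host-side model below uses hypothetical names (`model_failing_socket`, `model_error_to_report`), and I am assuming, not confirming, that the macro in modsocket.c behaves this way.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a socket call that fails because the
 * descriptor table is full: it returns -1 and records why in errno. */
static int model_failing_socket(void)
{
    errno = ENFILE;
    return -1;
}

/* Hypothetical stand-in for what an errno-raising wrapper reports:
 * 0 for a valid descriptor, otherwise the errno left by the call. */
static int model_error_to_report(int ret)
{
    assert(ret >= -1); /* socket calls return -1 or a descriptor >= 0 */
    return (ret < 0) ? errno : 0;
}
```

Seen this way, the "-1" printed inside ztls_socket() and the "OSError: 23" raised in the REPL would be two views of the same failure: the first is the return value, the second is the errno that was set alongside it.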