Enabling the TLS layer to get an HTTPS connection going.

7343.nrf7002dk_nrf5340_cpuapp_ns.conf

3124.prj.conf

Hello everyone.

We're trying to make an HTTPS connection to google.com and execute a GET request.

The Wi-Fi connection is working, and DHCP seems to be working too (my personal assumption, given the log message we get: "Resolved: [(1, 1, 6, '', ('142.250.201.206', 443))]", which indicates that getaddrinfo() works). But when we try to set up the socket for TLS, something strange happens: we get the error "OSError: 109".

After inserting some debug prints inside subsys/net/lib/sockets/, we found the culprit to be the function "int zsock_setsockopt_ctx(struct net_context *ctx, int level, int optname, const void *optval, socklen_t optlen)".

The function call that triggers error 109 is:  res = setsockopt(socket->ctx, SOL_TLS, TLS_PEER_VERIFY, &verify, sizeof(verify));

No matter which other option we try to set via setsockopt(), it fails with error 109, because the setsockopt() call somehow ends up in the implementation in sockets_inet.c (whose switch statements do not recognise SOL_TLS as a valid level) instead of the one in sockets_tls.c (which does handle SOL_TLS). My personal hunch is that the config options set in the project are somehow wrong. Can someone please take a look at our .conf files? Maybe we can find the culprit. :)
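
For anyone decoding the numbers in this thread, here is a minimal sketch of a lookup, assuming the newlib/Zephyr errno numbering (which differs from Linux for some values). The helper name `thread_errno_name` is hypothetical, and it only covers the three codes that show up in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: maps the errno numbers seen in this thread to names.
 * Assumption: newlib/Zephyr numbering, not the Linux one. */
const char *thread_errno_name(int err)
{
	switch (err) {
	case 12:
		return "ENOMEM";      /* not enough memory */
	case 23:
		return "ENFILE";      /* no free entry in the file table */
	case 109:
		return "ENOPROTOOPT"; /* protocol/option not available */
	default:
		return "unknown";
	}
}
```

Read that way, error 109 from setsockopt() is ENOPROTOOPT, which fits an implementation that does not recognise the requested option level.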

We can provide any extra code snippets that are necessary for debugging and/or run any tests. Have a great day and hope to hear from you soon!

  • Hi,

     

    I used net/https_client for this exercise.

    You need to download r1.pem from here: https://pki.goog/repository/

     

    Place this in the cert/ folder, change the file name in CMakeLists.txt, change the domain in Kconfig, and add the required configurations to the board .conf file:

    diff --git a/samples/net/https_client/CMakeLists.txt b/samples/net/https_client/CMakeLists.txt
    index 2a937786ed..39276fd2e2 100644
    --- a/samples/net/https_client/CMakeLists.txt
    +++ b/samples/net/https_client/CMakeLists.txt
    @@ -14,7 +14,7 @@ set(gen_dir ${CMAKE_CURRENT_BINARY_DIR}/certs)
     zephyr_include_directories(${gen_dir})
     generate_inc_file_for_target(
         app
    -    cert/DigiCertGlobalG2.pem
    +    cert/r1.pem
         ${gen_dir}/DigiCertGlobalG2.pem.inc
         )
     
    diff --git a/samples/net/https_client/Kconfig b/samples/net/https_client/Kconfig
    index 90ad33f42e..bb22e82794 100644
    --- a/samples/net/https_client/Kconfig
    +++ b/samples/net/https_client/Kconfig
    @@ -15,7 +15,7 @@ config SAMPLE_TFM_MBEDTLS
     
     config HTTPS_HOSTNAME
            string "HTTPS hostname"
    -       default "example.com"
    +       default "google.com"
     
     endmenu
     
    diff --git a/samples/net/https_client/boards/nrf7002dk_nrf5340_cpuapp_ns.conf b/samples/net/https_client/boards/nrf7002dk_nrf5340_cpuapp_ns.conf
    index 9eb362cb16..8366313af8 100644
    --- a/samples/net/https_client/boards/nrf7002dk_nrf5340_cpuapp_ns.conf
    +++ b/samples/net/https_client/boards/nrf7002dk_nrf5340_cpuapp_ns.conf
    @@ -69,3 +69,20 @@ CONFIG_MBEDTLS_TLS_LIBRARY=y
     CONFIG_TFM_PROFILE_TYPE_SMALL=y
     CONFIG_PM_PARTITION_SIZE_TFM_SRAM=0xc000
     CONFIG_PM_PARTITION_SIZE_TFM=0x20000
    +
    +CONFIG_MBEDTLS_SSL_SERVER_NAME_INDICATION=y
    +CONFIG_MBEDTLS_SSL_RENEGOTIATION=y
    +CONFIG_MBEDTLS_SSL_MAX_FRAGMENT_LENGTH=y
    +CONFIG_MBEDTLS_SSL_SESSION_TICKETS=y
    +CONFIG_PSA_WANT_RSA_KEY_SIZE_4096=y
    +CONFIG_MBEDTLS_MPI_MAX_SIZE=512
    +
    +CONFIG_LOG=y
    +CONFIG_MBEDTLS_DEBUG=y
    +CONFIG_MBEDTLS_SSL_DEBUG_ALL=y
    +CONFIG_MBEDTLS_LOG_LEVEL_DBG=y
    +CONFIG_MBEDTLS_DEBUG_C=y
    +CONFIG_MBEDTLS_DEBUG_LEVEL=4
    +# Handle the large influx of prints
    +CONFIG_LOG_BUFFER_SIZE=16384
    +CONFIG_LOG_BACKEND_UART=y
    

    I also needed to add CONFIG_NET_IPV6=n due to a local network issue at my end.

     

    Kind regards,

    Håkon

  • There are many options and suboptions in the link you sent me. Which one is the correct one?

    When attempting to get it working, I got r1.der and then created r1.der.inc. But I'm not sure which option I chose.

  • GEN_ABSOLUTE_SYM_KCONFIG(CONFIG_NET_SOCKETS_POLL_MAX, 16);
    GEN_ABSOLUTE_SYM_KCONFIG(CONFIG_ZVFS_POLL_MAX, 16);

    Confirms they are identical and set to 16.

    is not the .config file. Files with the prefix "." can be treated as hidden, especially if you are running Windows and have not enabled showing hidden files in Explorer.

    The .config file is located in ../build-folder/application-name/zephyr/.config 

    I saw mentions in other tickets about the "magical .config" file but I've searched my entire project folder and the SDK folder itself and no such file exists. I'm running on Mac. The file that I mentioned seems to be its equivalent (probably a v3.0.0 thing). I'm a bit convinced by this since every time I build it gets regenerated.

    and this is the errno??? please confirm

    Sadly I can't confirm with 100% certainty. The Zephyr stack + the Nordic libs are so complex and intricate that all I can say is... I think so. :))

    What I can say to elaborate a bit:

    Inside "static int ztls_socket(int family, int type, int proto)", which is in sockets_tls.c, the function at one point reaches these lines:
    	printk("Dead here...15\n");
    	ctx = tls_alloc();
    	if (ctx == NULL) {
    		errno = ENOMEM;
    		printk("Dead here...16\n");
    		goto free_fd;
    	}
    	
    	printk("Dead here...17\n");
    
    	sock = zsock_socket(family, type, proto);
    	if (sock < 0) {
    		printk("Dead here...17.5: %d\n", sock);
    		goto release_tls;
    	}
    	printk("Dead here...18\n");


    Also, in my modsocket.c, I have the function "STATIC mp_obj_t socket_make_new(const mp_obj_type_t *type, size_t n_args, size_t n_kw, const mp_obj_t *args) {" which at one point reaches these lines:
         printk("Trying to initialize socket...\n");
         socket->ctx = zsock_socket(family, socktype, proto);
         RAISE_SOCK_ERRNO(socket->ctx);
         printk("Done initializing socket! \n");
    (as you can see, I've added some debug messages to try and figure things out)
    Given the above two code snippets, the logs look like this:
    >>> s = None
    >>> s = socket.socket()
    Trying to initialize socket...
    Dead here...15
    [00:02:07.799,316] <dbg> net_sock_tls: tls_alloc: (mp_main): Allocated TLS context, 0x2000a0d0
    Dead here...17
    Dead here...17.5: -1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: 23
    >>>

    This shows us that "zsock_socket()" returns -1 inside ztls_socket(). But when a function with the same name gets called in my modsocket.c, the value I end up with is 23. I have no idea how to interpret it, or whether the two "zsock_socket()" functions are one and the same or not.
  • After injecting A LOT MORE debug code, I've reached this:
    /opt/nordic/ncs/v3.0.0/zephyr/lib/os/fdtable.c: static int _find_fd_entry(void)

    The contents of the function:

    static int _find_fd_entry(void)
    {
    	int fd;
    	for (fd = 0; fd < ARRAY_SIZE(fdtable); fd++) {
    		if (!atomic_get(&fdtable[fd].refcount)) {
    			return fd;
    		}
    	}
    	printk("Dead here...21\n");
    	errno = ENFILE;
    	return -1;
    }

    Notice the print that I've inserted in the function, because here's the log:

    >>> s = None
    >>> s = socket.socket()
    Trying to initialize socket...
    Family: 1, socktype: 1, proto: 258
    Dead here...15
    [00:00:58.780,853] <dbg> net_sock_tls: tls_alloc: (mp_main): Allocated TLS context, 0x2000a0d0
    Dead here...17
    Dead here...21
    Dead here...17.5: -1
    Dead here...20
    Returned value: -1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: 23

    So it seems that this function is the one that throws the ENFILE error. Also this answers your question about how certain I am that this is truly an errno. Now we can be certain. :D

    I'll try digging deeper to see what makes this function throw that errno. Any support is much appreciated.
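
To make the failure mode concrete, here is a minimal host-side sketch of a first-free-slot scan over a fixed-size table. It is a simplified stand-in, not the Zephyr implementation: `FD_TABLE_SIZE`, `fd_table` and `find_free_fd` are hypothetical names, the table size stands in for whatever Kconfig value sizes the real fdtable, and unlike `_find_fd_entry()` this version also marks the slot as used.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the Kconfig-sized descriptor table. */
#define FD_TABLE_SIZE 4

struct fd_slot {
	int refcount; /* 0 means the slot is free */
};

static struct fd_slot fd_table[FD_TABLE_SIZE];

/* Returns the first free descriptor and marks it used,
 * or -1 with errno set to ENFILE when every slot is taken. */
int find_free_fd(void)
{
	for (int fd = 0; fd < FD_TABLE_SIZE; fd++) {
		if (fd_table[fd].refcount == 0) {
			fd_table[fd].refcount = 1;
			return fd;
		}
	}
	errno = ENFILE;
	return -1;
}
```

Once every slot holds a non-zero refcount, the next call fails with ENFILE no matter how much RAM is free, so the limit to look at is the size of the table itself.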

  • A breakthrough happened! It seems that I had to increase CONFIG_ZVFS_OPEN_MAX even more!

    By setting:

    CONFIG_NET_SOCKETS_POLL_MAX=20
    CONFIG_ZVFS_OPEN_MAX=20

    I managed to move forward a bit! The new log is:

    >>> s = None
    >>> s = socket.socket()
    Trying to initialize socket...
    Family: 1, socktype: 1, proto: 258
    Allocated fd: 15
    [00:01:37.937,255] <dbg> net_sock_tls: tls_alloc: (mp_main): Allocated TLS context, 0x2000a190
    Allocated fd: 16
    [00:01:37.952,911] <dbg> net_sock: zsock_socket_internal: (mp_main): socket: ctx=0x2000b0e8, fd=16
    Returned value: 15
    Done initializing socket!
    >>>
    >>>
    >>>
    >>> print("Connecting to:", result[0][-1])
    Connecting to: ('142.250.201.206', 443)
    >>> s.connect(result[0][-1])
    DNS message size: 44
    DNS message content (hex):af 84 81 80 00 01 00 01 00 00 00 00 06 67 6f 6f 67 6c 65 03 63 6f 6d 00 00 01 00 01 c0 0c 00 01 00 01 00 00 00 4a 00 04 8e fa c9 ce
    [00:01:52.708,740] <dbg> mbedtls: zephyr_mbedtls_debug: WEST_TOPDIR/modules/crypto/mbedtls/library/ssl_tls.c:1331: The SSL configuration is tls12 only.
    [00:01:52.723,052] <err> mbedtls: WEST_TOPDIR/modules/crypto/mbedtls/library/ssl_tls.c:1401: alloc(zu bytes) failed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 12] ENOMEM
    >>> print("TCP Connected.")
    [00:02:02.875,854] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x2000b0e8, pkt=(nil), st=0, user_data=(nil)
    [00:02:02.886,566] <dbg> net_sock: zsock_received_cb: (rx_q[0]): Marked socket 0x2000b0e8 as peer-closed

    (for brevity's sake, I removed some of the "Dead here..." messages that I've inserted)

  • Slight problem with the library. In the file ssl_tls.c, inside the function "mbedtls_ssl_setup()", the line:

    MBEDTLS_SSL_DEBUG_MSG(1, ("alloc(%" MBEDTLS_PRINTF_SIZET " bytes) failed", in_buf_len));
    actually prints what you saw in my logs:
    [00:00:36.237,976] <err> mbedtls: WEST_TOPDIR/modules/crypto/mbedtls/library/ssl_tls.c:1402: alloc(zu bytes) failed
    I inserted my own debug code above:
    printk("Dead here...55. Tried to allocate %d bytes and failed\n", in_buf_len);
    and it printed:
    Dead here...55. Tried to allocate 16717 bytes and failed
    The way things are looking, I think we need to find a solution to reduce the RAM usage since we're currently at:
    RAM:      383248 B       416 KB     89.97%
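
A side note on the garbled "alloc(zu bytes)" message: the literal "zu" suggests that the printf-style formatter in this build does not handle the `z` length modifier (an assumption, not verified here). Passing a size_t to `%d`, as in the debug print above, is also a type mismatch, even though it happened to print a plausible value. A minimal sketch of a workaround that avoids both, using a hypothetical helper name and only standard C:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats an allocation-failure message without
 * relying on %zu. The size_t is cast to unsigned long to match %lu. */
int format_alloc_failure(char *out, size_t out_len, size_t requested)
{
	return snprintf(out, out_len, "alloc(%lu bytes) failed",
			(unsigned long)requested);
}
```

The same cast works with printk() style calls, as long as the formatter supports `%lu`.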
  • Hi!

     

    Great to hear that you fixed the socket issue.

    Tudor B. said:
    actually prints what you saw in my logs:
    [00:00:36.237,976] <err> mbedtls: WEST_TOPDIR/modules/crypto/mbedtls/library/ssl_tls.c:1402: alloc(zu bytes) failed
    I inserted my own debug code above:
    printk("Dead here...55. Tried to allocate %d bytes and failed\n", in_buf_len);
    and it printed:
    Dead here...55. Tried to allocate 16717 bytes and failed

    Could you share the full .config file? I suspect the configured mbedtls heap is too low here.

    Tudor B. said:
    The way things are looking, I think we need to find a solution to reduce the RAM usage since we're currently at:
    RAM:      383248 B       416 KB     89.97%

    You have enabled both station and SoftAP mode, whereas SoftAP alone uses approx. 222 kB of RAM, as shown here:

    https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/protocols/wifi/sap_mode/mem_requirements_sap.html

    In addition, mbedtls will add approx. 80 kB.

     

    Your memory fit is tight, especially when adding micropython into the feature list.

     

    Kind regards,

    Håkon

  • We can drop the AP mode (comment it out) and make an internal note akin to: "if you want sockets and https communication -> AP off; AP on -> sockets and https off".

    8360.nrf7002dk_nrf5340_cpuapp_ns.conf

    8787.prj.conf

  • How can I disable SoftAP and leave just STA mode active?

  • Running the "west build -d ./build/zephyr -t ram_report" command produces the following output:
    ram_report.txt

    I've filtered through it and found only a few major RAM hoggers:
    │ ├── _k_mem_slab_buf_tcp_conns_slab 3000 0.49% 0x20063440 noinit
    │ ├── fw_patch 81572 13.36% 0x000b7d68 rodata
    │ ├── heap.lto_priv.0 32768 5.37% 0x2000f024 bss
    │ ├── iface_wq_stack 4400 0.72% 0x20030108 noinit
    │ ├── kheap__system_heap 65536 10.74% 0x2001b1e8 noinit
    │ ├── kheap_net_buf_mem_pool_rx_bufs 4096 0.67% 0x20031238 noinit
    │ ├── kheap_net_buf_mem_pool_tx_bufs 4096 0.67% 0x20032238 noinit
    │ ├── kheap_wifi_drv_ctrl_mem_pool 20000 3.28% 0x20056f60 noinit
    │ ├── kheap_wifi_drv_data_mem_pool 130000 21.30% 0x20037390 noinit
    │ ├── mbedtls_heap 16384 2.68% 0x200171c4 bss
    │ ├── mgmt_stack 4600 0.75% 0x20033638 noinit
    │ ├── mp_thread_stack_array 20480 3.36% 0x2005de80 noinit
    │ ├── supplicant_thread_stack 5600 0.92% 0x2002eb28 noinit
    │ ├── z_main_stack 12288 2.01% 0x2002b328 noinit

    Also worth noting: I suspect AP is disabled based on the most recent .conf files that I've uploaded. I even tried to tune some RX and TX buffers, but it only lowered RAM usage by ~1.65%:

    RAM:      374936 B       416 KB     88.02%

    Also, I've explicitly set: CONFIG_NRF70_AP_MODE=n

    and I found and took in virtually everything from: https://github.com/nrfconnect/sdk-nrf/blob/main/samples/wifi/throughput/overlay-memory-optimized.conf

    which was suggested in the "WiFi stack configuration and performance" documentation page: https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/protocols/wifi/stack_configuration.html

    I even tried disabling CONFIG_WIFI_NM_WPA_SUPPLICANT, but RAM usage didn't drop by much, and anyway I suspect we need this one for STA mode so that we can connect to a Wi-Fi network.
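
Putting the numbers from this thread side by side: the report lists mbedtls_heap at 16384 bytes, while the failed allocation earlier asked for 16717 bytes. A minimal sketch of that arithmetic follows. The helper names are hypothetical, and treating mbedtls_heap as the pool that mbedtls allocates from is an assumption based on its name.

```c
#include <stddef.h>

/* Total RAM as printed by the build summary in this thread: 416 KB. */
#define RAM_TOTAL_BYTES (416UL * 1024UL)

/* Bytes left unallocated at link time for a given "used" figure. */
unsigned long ram_headroom_bytes(unsigned long used_bytes)
{
	return RAM_TOTAL_BYTES - used_bytes;
}

/* A heap can never satisfy a single request larger than itself.
 * Real allocators also need per-block bookkeeping, so this check
 * is optimistic: returning 1 does not guarantee success. */
int heap_can_fit(size_t heap_bytes, size_t request_bytes)
{
	return request_bytes <= heap_bytes;
}
```

With the figures above, `heap_can_fit(16384, 16717)` is false, so that single request cannot succeed from a 16 KB heap regardless of fragmentation, and `ram_headroom_bytes(374936)` leaves 51048 bytes to distribute between heaps and stacks.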

    I tried playing with various stacks, including CONFIG_MAIN_STACK_SIZE. By lowering other stacks and squeezing in CONFIG_MBEDTLS_HEAP_SIZE=81920, I reach a RAM usage of:

    RAM:      407704 B       416 KB     95.71%

    The image flashes, but when I try the same scenario of connecting to WiFi, doing getaddrinfo(), then opening a socket, I get a Stack Overflow when trying to connect to the WiFi network:

    Network ID: DIGI-4TYa & Network MAC: AC:CC:36:55:68:E9
    Network ID: DIGI-4uE3 & Network MAC: AC:CC:36:4D:9A:51
    Network ID: DIGI-C7uG & Network MAC: 28:F8:D6:C7:E5:91
    Network ID: DIGI-kTWh & Network MAC: 1C:BF:CE:9E:E2:38
    Network ID: Tea2.4 & Network MAC: AE:CC:36:1D:9A:51
    Network ID: DIGI-9x4D & Network MAC: F0:A7:31:6F:2D:7A
    Network ID: DIGI-Wpk7 & Network MAC: 74:31:AF:15:E0:41
    Network ID: HomeA&A & Network MAC: 92:A2:F4:9E:B3:D8
    MAC: F4:CE:36:00:1C:F4
    [(b'', b'62:ED:00:CD:74:72\x00', 44, -47, 1, False), (b'TP-Link_7474', b'40:ED:00:CD:74:72\x00', 44, -47, 1, False), (b'', b'62:ED:00:CD:74:73\x00', 4, -47, 1, False), (b'TP-Link_7474', b'40:ED:00:CD:74:]
    >>>
    >>> wlan.connect("TP-Link_7474", "55920322", network.SECURITY_PSK, 4)
    [00:00:27.243,804] <err> os: ***** USAGE FAULT *****
    [00:00:27.249,450] <err> os:   Stack overflow (context area not valid)
    [00:00:27.256,713] <err> os: r0/a1:  0x20033528  r1/a2:  0x00000001  r2/a3:  0x00078000
    [00:00:27.265,472] <err> os: r3/a4:  0x00000000 r12/ip:  0x00008000 r14/lr:  0x0002446d
    [00:00:27.274,200] <err> os:  xpsr:  0x41000200
    [00:00:27.279,479] <err> os: s[ 0]:  0xaaaaaaaa  s[ 1]:  0xaaaaaaaa  s[ 2]:  0xaaaaaaaa  s[ 3]:  0xaaaaaaaa
    [00:00:27.289,978] <err> os: s[ 4]:  0xaaaaaaaa  s[ 5]:  0xaaaaaaaa  s[ 6]:  0xaaaaaaaa  s[ 7]:  0xaaaaaaaa
    [00:00:27.300,445] <err> os: s[ 8]:  0xaaaaaaaa  s[ 9]:  0xaaaaaaaa  s[10]:  0xaaaaaaaa  s[11]:  0xaaaaaaaa
    [00:00:27.310,943] <err> os: s[12]:  0x00000000  s[13]:  0x000730db  s[14]:  0x2000a060  s[15]:  0x2000a060
    [00:00:27.321,411] <err> os: fpscr:  0x000f4240
    [00:00:27.326,690] <err> os: Faulting instruction address (r15/pc): 0x0002879a
    [00:00:27.334,655] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
    [00:00:27.342,590] <err> os: Current thread: 0x20009e08 (mp_main)
    [00:00:27.349,426] <err> os: Halting system
