nRF54L15 DK: Zephyr POSIX UDP socket over OpenThread/NAT64 sends CoAP successfully, but poll()/recv() does not receive returned packet

Hello Nordic team,

I am testing a plain CoAP telemetry upload from an nRF54L15 DK over Thread using a Raspberry Pi OTBR and NAT64 to ThingsBoard Cloud.

Hardware / software:

  • Board: nRF54L15 DK

  • nRF Connect SDK: v3.2.4-4c3fc0d44534

  • Zephyr: v4.2.99-9673eec75908

  • OpenThread with a Raspberry Pi OTBR

  • Destination: ThingsBoard Cloud CoAP endpoint

  • NAT64 IPv6 address used by the device: fd57:cacb:e8cf:2::343a:6f12

  • Plain CoAP port: 5683

The application uses Zephyr/POSIX UDP sockets together with Zephyr’s CoAP packet builder.

The relevant socket flow is:

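int sock = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);

/* Destination: NAT64 IPv6 address of the ThingsBoard CoAP endpoint. */
struct sockaddr_in6 server_addr;
memset(&server_addr, 0, sizeof(server_addr));
server_addr.sin6_family = AF_INET6;
server_addr.sin6_port = htons(5683);
server_addr.sin6_scope_id = 0U;
inet_pton(AF_INET6, "fd57:cacb:e8cf:2::343a:6f12", &server_addr.sin6_addr);

connect(sock, (struct sockaddr *)&server_addr, sizeof(server_addr));

/* request is the packet built with Zephyr's CoAP packet builder. */
send(sock, request.data, request.offset, 0);

/* Wait up to 5 s for the response, then read it without blocking. */
struct pollfd fds = { .fd = sock, .events = POLLIN };
poll(&fds, 1, 5000);
recv(sock, response_buf, sizeof(response_buf), MSG_DONTWAIT);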

The CoAP message is a Confirmable POST to:

/api/v1/<ThingsBoard device access token>/telemetry

with JSON payload, for example:

{"temperature":25,"counter":0,"source":"zephyr-coap-ack"}

The generated CoAP packet length is 113 bytes.

Sending works reliably. ThingsBoard receives the telemetry. The OTBR tcpdump shows that ThingsBoard sends a 12-byte CoAP response back to the same UDP source port.

Example OTBR tcpdump:

wpan0 In  IP 192.168.255.3.48998 > 52.58.111.18.5683: UDP, length 113
eth0  Out IP 192.168.178.43.48998 > 52.58.111.18.5683: UDP, length 113
eth0  In  IP 52.58.111.18.5683 > 192.168.178.43.48998: UDP, length 12
wpan0 Out IP 52.58.111.18.5683 > 192.168.255.3.48998: UDP, length 12
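As a cross-check, the IPv4 address 52.58.111.18 in this trace is the one carried in the last 32 bits of the NAT64 IPv6 address the device connects to. The following is a minimal sketch of that mapping, assuming the NAT64 prefix is a /96 as described in RFC 6052; the function name nat64_embedded_ipv4 is made up for illustration and is not part of Zephyr or OpenThread.

```c
#include <arpa/inet.h>
#include <string.h>

/* Extract the IPv4 address embedded in the last 32 bits of a NAT64
 * address. ASSUMPTION: the NAT64 prefix is a /96 (RFC 6052 layout).
 * Writes dotted-quad text into out.
 * Returns 0 on success, -1 if the input cannot be parsed or out is too small. */
int nat64_embedded_ipv4(const char *ipv6_text, char *out, size_t out_len)
{
    struct in6_addr addr6;
    struct in_addr addr4;

    if (inet_pton(AF_INET6, ipv6_text, &addr6) != 1) {
        return -1;
    }

    /* Bytes 12..15 of the IPv6 address hold the IPv4 address. */
    memcpy(&addr4, &addr6.s6_addr[12], sizeof(addr4));

    if (inet_ntop(AF_INET, &addr4, out, (socklen_t)out_len) == NULL) {
        return -1;
    }
    return 0;
}
```

For fd57:cacb:e8cf:2::343a:6f12 the last four bytes are 0x34 0x3a 0x6f 0x12, which is 52.58.111.18.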

The response is then retransmitted by ThingsBoard, which is expected because the device does not ACK the separate Confirmable CoAP response:

eth0  In  IP 52.58.111.18.5683 > 192.168.178.43.48998: UDP, length 12
wpan0 Out IP 52.58.111.18.5683 > 192.168.255.3.48998: UDP, length 12
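For reference, the empty ACK that would stop such retransmissions is only four bytes long: version 1, type ACK, token length 0, code 0.00, and the Message ID copied from the received message. The sketch below builds it by hand so the layout is visible; the function name coap_build_empty_ack is hypothetical. In a Zephyr application this would normally be done with the CoAP library (for example coap_ack_init()) rather than manually, and it only helps once the response actually reaches the socket.

```c
#include <stddef.h>
#include <stdint.h>

/* Build an empty CoAP ACK (RFC 7252) for a received Confirmable message.
 * rx points at the received datagram, out receives the 4-byte ACK.
 * Returns 4 on success, 0 if rx is too short, is not CoAP version 1,
 * or is not Confirmable (only CON messages are acknowledged). */
size_t coap_build_empty_ack(const uint8_t *rx, size_t rx_len, uint8_t out[4])
{
    if (rx_len < 4 || (rx[0] >> 6) != 1u) {
        return 0;
    }
    if (((rx[0] >> 4) & 0x3u) != 0u) {   /* type 0 = Confirmable */
        return 0;
    }

    out[0] = (uint8_t)((1u << 6) | (2u << 4)); /* ver=1, type=ACK, TKL=0 */
    out[1] = 0x00;                             /* code 0.00 (Empty) */
    out[2] = rx[2];                            /* Message ID, high byte */
    out[3] = rx[3];                            /* Message ID, low byte */
    return 4;
}
```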

With tcpdump -X, the CoAP response was decoded as:

48 41 ... <8 byte token>

So this is a separate Confirmable 2.01 Created response.

I also tested a Non-confirmable POST. In that case ThingsBoard responded with:

58 41 ... <8 byte token>

So ThingsBoard also sends a Non-confirmable 2.01 Created response for a NON request. This shows that the problem is not limited to missing ACK handling: even a NON response, which needs no ACK, is not delivered to the application.
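The interpretation of the two leading bytes follows the fixed CoAP header layout from RFC 7252: byte 0 holds version (2 bits), type (2 bits), and token length (4 bits); byte 1 holds the code as class (3 bits) and detail (5 bits). A small decoder sketch, with hypothetical names (coap_hdr_info, coap_decode_first_bytes) chosen only for this example:

```c
#include <stdint.h>

/* Fields carried in the first two bytes of a CoAP message (RFC 7252). */
struct coap_hdr_info {
    uint8_t version;     /* always 1 for RFC 7252 */
    uint8_t type;        /* 0=CON, 1=NON, 2=ACK, 3=RST */
    uint8_t token_len;   /* 0..8 */
    uint8_t code_class;  /* e.g. 2 for 2.xx success */
    uint8_t code_detail; /* e.g. 1 for x.01 */
};

/* Decode the first two header bytes.
 * Returns 0 on success, -1 if the version or token length is invalid. */
int coap_decode_first_bytes(uint8_t b0, uint8_t b1, struct coap_hdr_info *out)
{
    out->version     = (uint8_t)(b0 >> 6);
    out->type        = (uint8_t)((b0 >> 4) & 0x3u);
    out->token_len   = (uint8_t)(b0 & 0xFu);
    out->code_class  = (uint8_t)(b1 >> 5);
    out->code_detail = (uint8_t)(b1 & 0x1Fu);

    if (out->version != 1u || out->token_len > 8u) {
        return -1;
    }
    return 0;
}
```

With this layout, 48 41 decodes to version 1, type CON, token length 8, code 2.01, and 58 41 differs only in the type field (NON). A 12-byte datagram is therefore exactly the 4-byte header plus the 8-byte token, with no options or payload.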

The problem:

  • The outgoing packet works.

  • ThingsBoard receives the telemetry.

  • ThingsBoard replies.

  • The response is visible on the OTBR.

  • The response is sent back to the same UDP source port.

  • But on the nRF54L15 DK, poll() times out and recv() does not receive the response.

Additional tests already tried:

  • connect() + send() + poll() + recv()

  • sendto() without connect(), with receive path unchanged

  • explicit bind() to in6addr_any and port 0

  • poll() with timeout and with infinite timeout

  • recv(MSG_DONTWAIT) after poll()

  • logging reduced and switched from printf() to Zephyr LOG to avoid UART timing effects

The result remains that the response is visible on the OTBR but not delivered to the application socket.
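To make the receive-side result unambiguous in the logs, the wait can be wrapped so that a timeout, a poll() error, and a socket error condition are reported separately. This is only a sketch using standard POSIX calls; the helper name recv_with_timeout and the choice of errno values are assumptions for this example, not what the attached application does.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Wait up to timeout_ms for a datagram on fd, then read it.
 * Returns the number of bytes received, or -1 with errno set:
 *   ETIMEDOUT  poll() expired without any event
 *   EIO        poll() reported an event other than POLLIN
 *   other      error from poll() or recv() itself */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0) {
        return -1;              /* errno set by poll() */
    }
    if (rc == 0) {
        errno = ETIMEDOUT;      /* nothing arrived in time */
        return -1;
    }
    if (!(pfd.revents & POLLIN)) {
        errno = EIO;            /* POLLERR, POLLHUP or POLLNVAL */
        return -1;
    }
    /* Data is ready, so a non-blocking read should not return EAGAIN. */
    return recv(fd, buf, len, MSG_DONTWAIT);
}
```

In the failing case described here, such a helper would be expected to return -1 with errno set to ETIMEDOUT, which matches a packet that never reaches the socket rather than one that is read incorrectly.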

As additional context: I already tested the native OpenThread CoAP API for plain CoAP, and sending telemetry with it worked. So using OpenThread’s native CoAP path may be a viable option for the unencrypted case.

However, for the encrypted case I would like to use CoAPS with X.509 client certificates against ThingsBoard Cloud. With OpenThread’s CoAP Secure API I was not able to establish the X.509/DTLS connection. otCoapSecureConnect() failed locally before any DTLS packet was visible on the OTBR. PSK-based CoAP Secure worked, but X.509 did not.

The client certificate and private key worked from a CoAPS client on the Raspberry Pi against ThingsBoard Cloud. My current suspicion is that the ThingsBoard Cloud server certificate chain or selected cipher suite may require RSA-based authentication or a configuration that is not supported by the OpenThread CoAP Secure setup I am using. My OpenThread/NCS build has ECDHE-ECDSA enabled, and ECJPAKE could not be disabled because the prebuilt Nordic OpenThread library requires it.

Questions:

  1. Is using Zephyr/POSIX UDP sockets over OpenThread/NAT64 expected to work for this use case on nRF54L15 DK / NCS v3.2.4?

  2. Is the OTBR tcpdump view with IPv4-like addresses on wpan0 expected in this NAT64 setup, and should the end device still receive the packet through an AF_INET6 UDP socket connected to the synthetic NAT64 IPv6 address?

  3. Are there known limitations or required Kconfig options for receiving UDP responses through POSIX sockets over OpenThread/NAT64?

  4. Would you recommend using native OpenThread UDP/CoAP APIs instead of Zephyr/POSIX sockets for this use case?

  5. If native OpenThread CoAP is the recommended path: what would be the recommended approach for CoAPS with X.509 client certificates against a public cloud endpoint such as ThingsBoard Cloud?

  6. Is OpenThread CoAP Secure with X.509 expected to work against such a cloud endpoint, provided the certificates and Kconfig options are correct?

  7. What would be the best way to verify whether the returned packet reaches the OpenThread IPv6/UDP layer on the device but is not delivered to the Zephyr socket?

Relevant prj.conf options include:

CONFIG_POSIX_API=y
CONFIG_ZVFS_POLL_MAX=4
CONFIG_LOG=y
CONFIG_LOG_MODE_DEFERRED=y
CONFIG_LOG_DEFAULT_LEVEL=3
CONFIG_OPENTHREAD_THREAD_STACK_SIZE=8192

I can provide the complete minimal source file, prj.conf, UART logs, and OTBR tcpdump logs. The ThingsBoard device access token and credentials are replaced by placeholders.

Best regards,
Markus

CoAP_Zephyr_API_Nordic.zip

  • Hi Markus,

    I am working on your case and need some time to prepare an answer. I will update the case once I have gathered enough information.

    Regards,
    Amanda H. 

  • Hi,

    I wanted to give a quick update: the issue is now resolved.

    I cannot say with 100% certainty what the original root cause was, because I changed a few things during debugging. It may have been a combination of an inconsistent OTBR/RCP setup and the client-side OpenThread configuration.

    The original OTBR/RCP setup was replaced with a clean native OTBR installation and a freshly flashed RCP. After that, the Thread network itself became stable.

    On the Zephyr/OpenThread client side, the important finding was that the device did not initially create an OMR/SLAAC address from the prefix advertised by the border router. It only had mesh-local and link-local addresses, and therefore packets to the NAT64 prefix were not sent at all.

    After building OpenThread from source instead of using the prebuilt Nordic OpenThread library, and enabling SLAAC with:

    CONFIG_OPENTHREAD_SLAAC=y

    the device now automatically receives an address from the OMR prefix, for example:

    fd2a:4782:c31a:1:....

    With that address present, NAT64 works correctly and the Zephyr socket-based CoAP client can send telemetry to ThingsBoard Cloud via the OTBR. I now receive a CoAP 2.01 Created response from ThingsBoard.
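A quick way to confirm this precondition is to compare each address reported by the device (for example from the OpenThread CLI ipaddr output) against the OMR prefix advertised by the border router. The sketch below does that comparison on address strings; it assumes the OMR prefix is a /64, and the function name ipv6_in_prefix64 is hypothetical.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <string.h>

/* Return true if addr_text lies inside the /64 given by prefix_text.
 * ASSUMPTION: the prefix length is 64 bits, so only the first 8 bytes
 * are compared. Returns false if either string fails to parse. */
bool ipv6_in_prefix64(const char *addr_text, const char *prefix_text)
{
    struct in6_addr addr;
    struct in6_addr prefix;

    if (inet_pton(AF_INET6, addr_text, &addr) != 1 ||
        inet_pton(AF_INET6, prefix_text, &prefix) != 1) {
        return false;
    }
    return memcmp(addr.s6_addr, prefix.s6_addr, 8) == 0;
}
```

If none of the device's addresses match the advertised prefix, the device only has mesh-local and link-local addresses, which is the situation described above.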

    So the current working setup is:

    • native OTBR with NAT64 enabled

    • fresh RCP firmware

    • Zephyr/OpenThread client built without the prebuilt Nordic OpenThread library

    • CONFIG_OPENTHREAD_SLAAC=y

    • Thread 1.3

    • CoAP over NAT64 to ThingsBoard Cloud working successfully

    The main relevant client configuration options are:

    CONFIG_NETWORKING=y
    CONFIG_NET_L2_OPENTHREAD=y
    CONFIG_OPENTHREAD=y
    CONFIG_OPENTHREAD_THREAD_VERSION_1_3=y
    CONFIG_OPENTHREAD_FTD=y
    CONFIG_OPENTHREAD_SLAAC=y
    CONFIG_OPENTHREAD_COAP=y
    CONFIG_NET_SOCKETS=y
    CONFIG_NET_UDP=y
    CONFIG_NET_IPV6=y
    CONFIG_COAP=y

    CONFIG_OPENTHREAD_JOINER=y was enabled during debugging for Commissioner/Joiner tests, but it is not required for my current fixed-dataset setup.

    As a next step, I will try to enable security and test CoAPS/DTLS with the ThingsBoard certificate setup, using the Zephyr CoAP/socket APIs.

    I will continue with this configuration for now, since it is stable. Thanks for your help.

    Regards,
    Markus
