Hello Nordic team,
After many hours I have not been able to get socket dispatching working with Zephyr's MQTT client, and I am not sure how to proceed.
First of all, I would like to ask whether anything has changed since that post was made (about a year ago) regarding the modem requiring Socket Offloading, or whether there is a more idiomatic way (perhaps with an example) to use both Ethernet and Cellular network interfaces together in the same project.
The issue: how to get both Ethernet and Cellular interfaces working in the same nRF Connect SDK application using Zephyr RTOS, with the Zephyr MQTT client and DNS hostname resolution via getaddrinfo() both working.
Dev Setup:
Hardware:
- nRF9151 SiP by Nordic Semi - contains internal cellular modem we are using for LTE network connection. (HW version 0.9)
- W5500 Ethernet controller by Wiznet - communicates with the nRF9151 over SPI, and contains its own TCP/IP stack.
Software:
- nRF Connect SDK v2.9.1
- nRF Connect VS code extension
What have I tried? / Journey so far / Background:
After many days I have not been able to get the Zephyr MQTT client library / API working in an application that uses both Ethernet and Cellular interfaces. I have found many posts on the Nordic DevZone that mention this sort of problem, but they are over a year old, so I am optimistic that things may have changed since then. Here are some of those posts:
This post mentions issues with getaddrinfo() not working when Socket Offloading is enabled. That thread also includes a reply from a Nordic modem team member who states that Socket Offloading is REQUIRED in order to use the internal modem of the nRF9151, and who shares links related to socket offloading and the socket dispatcher.
I have been able to set up an example program, which can be seen here, with both the socket offloading and native sockets Kconfig symbols enabled. It demonstrates creating sockets and using the socket dispatcher by binding them to either interface, and it tests TCP, UDP, and DNS. However, DNS hostname resolution (via getaddrinfo()) does not work unless the modem is initialized, which leads me to believe it does not actually use the Ethernet interface, despite my code explicitly binding the created socket to the eth0 interface using setsockopt().
https://github.com/dheadrick1618/zephyr_socket_dispatching_eth_and_cell_example
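For readers who do not want to open the repo, the interface binding I am describing comes down to one setsockopt() call made before connect(). Below is a minimal host-side sketch of that idea, not the code from the repo. The helper name `bind_fd_to_iface` is hypothetical, and it passes the interface name as a plain string the way Linux expects; on Zephyr the same SO_BINDTODEVICE option is given a `struct ifreq` with `ifr_name` filled in instead.

```c
#include <errno.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: bind an already-created socket to a named interface.
 * Returns 0 on success or a negative errno value on failure.
 */
int bind_fd_to_iface(int fd, const char *ifname)
{
    size_t len;

    if (ifname == NULL) {
        return -EINVAL;
    }

    /* Reject empty names and names that do not fit an interface name buffer. */
    len = strlen(ifname);
    if (len == 0 || len >= IF_NAMESIZE) {
        return -EINVAL;
    }

#ifdef SO_BINDTODEVICE
    /* Linux form: the option value is the NUL-terminated interface name. */
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname,
                   (socklen_t)(len + 1)) < 0) {
        return -errno;
    }
    return 0;
#else
    /* Platforms without SO_BINDTODEVICE cannot do this at the socket level. */
    (void)fd;
    return -ENOTSUP;
#endif
}
```

The intended call order is socket(), then `bind_fd_to_iface(fd, "eth0")`, then connect(), so that the connection is routed through the chosen interface.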
Another major issue is that the Zephyr MQTT client library creates and manages the TCP sockets used for the broker connection, publishing, and subscribing in the background. The user cannot directly assign a socket fd to the MQTT client; instead they must call mqtt_connect(), which internally calls the static 'client_connect()' function, which looks like this:
```
static int client_connect(struct mqtt_client *client)
{
    int err_code;
    struct buf_ctx packet;

    err_code = mqtt_transport_connect(client);
    if (err_code < 0) {
        return err_code;
    }

    tx_buf_init(client, &packet);
    MQTT_SET_STATE(client, MQTT_STATE_TCP_CONNECTED);

    err_code = connect_request_encode(client, &packet);
    if (err_code < 0) {
        goto error;
    }

    /* Send MQTT identification message to broker. */
    err_code = mqtt_transport_write(client, packet.cur,
                                    packet.end - packet.cur);
    if (err_code < 0) {
        goto error;
    }

    client->internal.last_activity = mqtt_sys_tick_in_ms_get();

    /* Reset the unanswered ping count for a new connection */
    client->unacked_ping = 0;

    NET_INFO("Connect completed");

    return 0;

error:
    mqtt_client_disconnect(client, err_code, false);
    return err_code;
}
```
I then discovered that the 'mqtt_transport_connect()' function can be made to use a custom transport layer by setting the Kconfig option 'CONFIG_MQTT_LIB_CUSTOM_TRANSPORT=y' and by defining transport layer functions with exactly the names referenced by the function pointers in the following struct array:
```
/**@brief Function pointer array for TCP/TLS transport handlers. */
const struct transport_procedure transport_fn[MQTT_TRANSPORT_NUM] = {
    {
        mqtt_client_tcp_connect,
        mqtt_client_tcp_write,
        mqtt_client_tcp_write_msg,
        mqtt_client_tcp_read,
        mqtt_client_tcp_disconnect,
    },
#if defined(CONFIG_MQTT_LIB_TLS)
    {
        mqtt_client_tls_connect,
        mqtt_client_tls_write,
        mqtt_client_tls_write_msg,
        mqtt_client_tls_read,
        mqtt_client_tls_disconnect,
    },
#endif /* CONFIG_MQTT_LIB_TLS */
#if defined(CONFIG_MQTT_LIB_WEBSOCKET)
    {
        mqtt_client_websocket_connect,
        mqtt_client_websocket_write,
        mqtt_client_websocket_write_msg,
        mqtt_client_websocket_read,
        mqtt_client_websocket_disconnect,
    },
#if defined(CONFIG_MQTT_LIB_TLS)
    {
        mqtt_client_websocket_connect,
        mqtt_client_websocket_write,
        mqtt_client_websocket_write_msg,
        mqtt_client_websocket_read,
        mqtt_client_websocket_disconnect,
    },
#endif /* CONFIG_MQTT_LIB_TLS */
#endif /* CONFIG_MQTT_LIB_WEBSOCKET */
#if defined(CONFIG_MQTT_LIB_CUSTOM_TRANSPORT)
    {
        mqtt_client_custom_transport_connect,
        mqtt_client_custom_transport_write,
        mqtt_client_custom_transport_write_msg,
        mqtt_client_custom_transport_read,
        mqtt_client_custom_transport_disconnect,
    },
#endif /* CONFIG_MQTT_LIB_CUSTOM_TRANSPORT */
};
```
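For anyone unfamiliar with this pattern: the table is an array of handler rows indexed by the client's transport type, so the generic connect call only has to pick a row and call through the pointer. Below is a minimal, self-contained sketch of that dispatch idea. It is not the Zephyr source; every `demo_` / `DEMO_` name is hypothetical, and the handlers only record which one ran.

```c
#include <stddef.h>

/* Hypothetical transport identifiers, standing in for the library's enum. */
enum demo_transport_type {
    DEMO_TRANSPORT_TCP,
    DEMO_TRANSPORT_CUSTOM,
    DEMO_TRANSPORT_NUM
};

struct demo_client {
    enum demo_transport_type type; /* selects the row in the table */
    int last_handler;              /* set by whichever handler ran */
};

/* One row of handlers per transport (only connect is shown here). */
struct demo_transport_procedure {
    int (*connect)(struct demo_client *client);
};

static int demo_tcp_connect(struct demo_client *client)
{
    client->last_handler = 1; /* stand-in for the built-in TCP path */
    return 0;
}

static int demo_custom_connect(struct demo_client *client)
{
    client->last_handler = 2; /* stand-in for a user-supplied transport */
    return 0;
}

static const struct demo_transport_procedure
demo_transport_fn[DEMO_TRANSPORT_NUM] = {
    [DEMO_TRANSPORT_TCP] = { demo_tcp_connect },
    [DEMO_TRANSPORT_CUSTOM] = { demo_custom_connect },
};

/* Generic entry point: index the table by transport type and call through. */
int demo_transport_connect(struct demo_client *client)
{
    if (client == NULL || (unsigned int)client->type >= DEMO_TRANSPORT_NUM) {
        return -1;
    }
    return demo_transport_fn[client->type].connect(client);
}
```

With a client whose type is `DEMO_TRANSPORT_CUSTOM`, `demo_transport_connect()` ends up in `demo_custom_connect()`, which is the hook the custom transport option relies on.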
This led me to write my own definitions of these transport layer functions, taking inspiration from those in the 'mqtt_transport_socket_tcp.c' Zephyr source file, but with mine using the socket dispatcher when sockets are created (via setsockopt()).
Unfortunately, I could not find A SINGLE example online that actually uses an MQTT custom transport layer. With my implementation the MQTT connection appears to complete, but it eventually times out before I am able to use it.
That said, with socket offloading enabled I cannot make getaddrinfo() use a specific interface, so I cannot resolve a hostname via DNS over the Ethernet interface. This leaves me stuck and unsure how to proceed with this project. Separately, I am getting 'SECURE FAULT' errors after the MQTT client connects but before I am able to publish messages to the broker. I have not been able to debug this further because it appears to be somewhat inconsistent, and I have already spent too much time on it and must work on other things for the time being. Even once I resolve these 'SECURE FAULT' errors, I will still need to figure out how to get DNS hostname resolution working.
```
*** Starting MQTT Demo ***
[00:00:15.101,104] <inf> socket_dispatcher_demo: ==========================================
[00:00:15.101,165] <inf> socket_dispatcher_demo: Starting MQTT test on interface: eth0
[00:00:15.101,165] <inf> socket_dispatcher_demo: ==========================================
[00:00:15.101,196] <inf> socket_dispatcher_demo: Testing socket creation on interface 'eth0'
[00:00:15.101,898] <dbg> net_sock: zsock_socket_internal: (main): socket: ctx=0x20010660, fd=6
[00:00:15.101,989] <inf> socket_dispatcher_demo: Successfully created and bound socket to interface 'eth0'
[00:00:15.102,020] <dbg> net_sock: zsock_close_ctx: (main): close: ctx=0x20010660, fd=5
[00:00:15.102,691] <inf> socket_dispatcher_demo: Socket closed
[00:00:15.102,722] <inf> mqtt_api: Connecting to MQTT broker via interface: eth0
[00:00:15.460,449] <inf> mqtt_api: Resolved broker.hivemq.com to 3.73.173.107:1883
[00:00:15.460,540] <inf> mqtt_custom_transport: MQTT client initialized with interface binding to: eth0
[00:00:15.460,662] <inf> mqtt_api: Generated MQTT client ID: nrf9151_demo_000001
[00:00:15.460,662] <inf> mqtt_api: Sending MQTT CONNECT packet to broker...
[00:00:15.460,723] <inf> mqtt_custom_transport: Creating custom transport connection to broker with interface: eth0
[00:00:15.461,578] <dbg> net_sock: zsock_socket_internal: (main): socket: ctx=0x20010660, fd=6
[00:00:15.877,044] <inf> mqtt_custom_transport: Successfully connected to MQTT broker via interface eth0 (socket 5)
[00:00:15.879,425] <inf> net_mqtt: Connect completed
[00:00:15.879,425] <inf> mqtt_api: Waiting for MQTT CONNACK...
[00:00:16.026,153] <dbg> net_sock: zsock_received_cb: (rx_q[0]): ctx=0x20010660, pkt=0x20019988, st=0, user_data=(nil)
[00:00:16.881,561] <inf> mqtt_api: MQTT event received: type=0, result=0
[00:00:16.881,591] <inf> mqtt_api: MQTT client connected successfully!
[00:00:16.881,652] <inf> mqtt_api: MQTT connected successfully via interface: eth0
[00:00:16.881,683] <inf> socket_dispatcher_demo: MQTT connected successfully via eth0
[00:00:16.885,009] <err> os: ***** SECURE FAULT *****
[00:00:16.885,009] <err> os: Address: 0x41f1
[00:00:16.885,009] <err> os: Attribution unit violation
[00:00:16.885,040] <err> os: r0/a1: 0x000041f1 r1/a2: 0x00000000 r2/a3: 0x000041f0
[00:00:16.885,070] <err> os: r3/a4: 0x000041f1 r12/ip: 0x0000000a r14/lr: 0x0003636d
[00:00:16.885,070] <err> os: xpsr: 0x21000000
[00:00:16.885,101] <err> os: Faulting instruction address (r15/pc): 0x0004995c
[00:00:16.885,131] <err> os: >>> ZEPHYR FATAL ERROR 41: Unknown error on CPU 0
[00:00:16.885,162] <err> os: Current thread: 0x2000f508 (main)
[00:00:17.016,265] <err> os: Halting system
```
Finally, there is a post here that describes a very hacky and messy way to get DNS resolution working, but I want to avoid modifying the Zephyr library (and then having to maintain a fork to keep it up to date).
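To make the DNS problem concrete: getaddrinfo() takes only a host name, a service, and hints. There is no socket or interface argument, so a per-socket option such as SO_BINDTODEVICE has nothing to attach to during the lookup. Below is a minimal host-side POSIX sketch of the call; the helper name `resolve_ipv4` is hypothetical and this is not the code from my project.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: resolve host/port to the first IPv4 TCP result.
 * Returns 0 on success or the non-zero getaddrinfo() error code.
 */
int resolve_ipv4(const char *host, const char *port, int flags,
                 struct sockaddr_in *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;       /* IPv4 only, so ai_addr is sockaddr_in */
    hints.ai_socktype = SOCK_STREAM; /* MQTT over TCP */
    hints.ai_flags = flags;

    /* Arguments: name, service, hints, result. No fd and no interface. */
    rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0) {
        return rc;
    }

    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

Calling `resolve_ipv4("127.0.0.1", "1883", AI_NUMERICHOST, &addr)` exercises the helper without sending any DNS traffic; passing a real hostname with `flags` set to 0 performs an actual lookup through whatever resolver the system selects, which is exactly the part I cannot steer toward eth0.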
Any help would be appreciated, as this is absolutely crucial for our project.