MQTT TLS connect blocking

Hi

I am currently working on a custom nrf9160 board and I have a random bug that appears to be in this function mqtt_client_connect_tls.

For 3min, the board consumption is 4mA and then the mqtt returns -116 error (timeout).

I have found in the devzone that is a known bug that should have been corrected in modem version 1.3.0.

 Blocking socket calls in MQTT library 

I am sure that only one thread is accessing the socket.

I have also tested increase the config like said. It doesn't change the problem.

CONFIG_NRF_MODEM_LIB_HEAP_SIZE=4096
CONFIG_NRF_MODEM_LIB_SENDMSG_BUF_SIZE=2048

09:46:17.528) (COM2) (>>) [16:42:30.076,232] <inf> MQTT_STANDARD_MODULE: TLS enabled<CR><LF>
(09:46:17.528) (COM2) (>>) [16:42:30.076,690] <dbg> net_mqtt_sock_tls: (): Created socket 0<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,224] <dbg> lte_lc: +CSCON notification<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,346] <dbg> lte_lc_helpers: Dispatching events:<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,377] <dbg> lte_lc_helpers:  - handler=0x0002ED49<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,377] <inf> module_modem: MODEM notification: RRC_UPDATE<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,377] <inf> module_modem: RRC mode: Idle<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.424,407] <dbg> lte_lc_helpers: Done<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.818,756] <dbg> lte_lc: +CEREG notification: +CEREG: 5,"C3D5","00136CCF",9,,,"00001010","11000001"<CR><LF>
(09:47:04.943) (COM2) (>>) <CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.818,969] <dbg> lte_lc_helpers: Network registration status: 5<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,000] <dbg> lte_lc_helpers: LTE mode: 9<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,061] <dbg> lte_lc_helpers: Dispatching events:<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,061] <dbg> lte_lc_helpers:  - handler=0x0002ED49<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,061] <inf> module_modem: MODEM notification: CELL_UPDATE<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,091] <inf> module_modem: LTE cell changed: Cell ID: 1273039, Tracking area: 50133<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,091] <dbg> lte_lc_helpers: Done<CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,824] <dbg> lte_lc_helpers: TAU: 1152000 sec, active time: 20 sec<LF><CR><LF>
(09:47:04.943) (COM2) (>>) [16:43:17.819,824] <dbg> lte_lc: TAU: 1152000 sec, active time: 20 sec<LF><CR><LF>
(09:47:24.895) (COM2) (>>) [16:43:37.425,964] <dbg> lte_lc: %XMODEMSLEEP notification<CR><LF>
(09:47:24.895) (COM2) (>>) [16:43:37.426,116] <dbg> lte_lc_helpers: Dispatching events:<CR><LF>
(09:47:24.895) (COM2) (>>) [16:43:37.426,116] <dbg> lte_lc_helpers:  - handler=0x0002ED49<CR><LF>
(09:47:24.895) (COM2) (>>) [16:43:37.426,116] <inf> module_modem: MODEM notification: MODEM_SLEEP_ENTER<CR><LF>
(09:47:24.895) (COM2) (>>) [16:43:37.426,147] <dbg> lte_lc_helpers: Done<CR><LF>
(09:49:03.930) (COM2) (>>) [16:45:16.989,501] <err> MQTT_STANDARD_MODULE: mqtt_connect, error: -116<CR><LF>
(09:49:04.060) (COM2) (>>) [16:45:16.989,532] <wrn> MQTT_STANDARD_MODULE: connection_poll_start failed, error: -116<CR><LF>
(09:49:04.060) (COM2) (>>) [16:45:16.989,532] <inf> MODULE_MQTT: standard_cloud_connect -116<CR><LF>
(09:49:04.060) (COM2) (>>) [16:45:16.989,562] <err> MODULE_MQTT: mqtt_connect -116<CR><LF>

I attach the log that I can have.

Do you know if there is a correction available ?

Thank you in advance

Antoine

(sdk 1.9.1, modem FW 1.3.2)

Parents Reply Children
  • Using TCP over NB-IoT is not recommended, you should rather use UDP if possible.

    "There are three TLS connections in the log. 1st and 3rd TLS connection is successfully established. The 2nd TLS connection fails to timeout.

    The same problem is visible in all three TLS connections. The NB-IoT connection is so slow that server starts to send TCP re-transmissions during the TLS handshake.

    The Server Hello message contains TLS handshake protocol messages: Server Hello, Certificate, Server Key Exchange, Certificate Request, and Server Hello Done. It's length is 3583 bytes and it is split to six TCP segments. In all TCP connections the whole content of Server Hello is re-transmitted in TCP level. In the failing connection the TCP re-transmission is not completed in time and timeout ends the TLS handshake.

    I am afraid that the only option here is to increase the TCP re-transmission interval in  server side. Constant re-transmissions are lowering the throughput and preventing the TLS handshake to complete in time."

  • Hi,

    thank you for your answer. It really help our understanding of this behavior.

    You said that we should use UDP if possible. Do you know the minimum version of the nrf SDK needed to support UDP with MQTT ? and for the modem firmware version ?

    We are currently using version 1.9.1 and I would like to know if we need to upgrade.

    Regards

    Antoine

  • Achevillard said:
    Do you know the minimum version of the nrf SDK needed to support UDP with MQTT ?

    It is not supported at all. MQTT is TCP based exclusively, so if you want to use UDP you should consider using CoAP instead.

Related