MQTT_CONNECT fails with error -95

Hi,

I'm trying to run the MQTT sample with TLS against an AWS IoT backend. Everything worked fine in the beginning, but then after I tweaked the code a bit I suddenly started getting error -95 during the connect.

Error -95 is not documented in NRF, but I found that it might be EOPNOTSUPP. More debugging showed that it is probably coming from the modem and might be related to the certificates somehow. However, the same certificates worked before, and I have also tested them from the command line.

A few hours later I narrowed it down and found that practically any code change produces the error.

As an example below, a simple printk on line 325 basically breaks the mqtt connection. By commenting that line everything works again. This is 100% consistent.

Can it be somehow related to memory management or something?

Just in case, my prj.conf file is attached below.

# General config
CONFIG_NEWLIB_LIBC=y
CONFIG_NEWLIB_LIBC_FLOAT_PRINTF=y
CONFIG_ASSERT=y
CONFIG_REBOOT=y
CONFIG_LOG=y
CONFIG_LOG_STRDUP_MAX_STRING=164
CONFIG_LOG_STRDUP_BUF_COUNT=10

CONFIG_DEBUG=y
CONFIG_LOG_IMMEDIATE=y

CONFIG_TEST_RANDOM_GENERATOR=y

# Network
CONFIG_NETWORKING=y
CONFIG_NET_NATIVE=n

CONFIG_NET_SOCKETS=y
CONFIG_NET_SOCKETS_OFFLOAD=y
CONFIG_NET_SOCKETS_POSIX_NAMES=y

# LTE link control
CONFIG_LTE_LINK_CONTROL=y
CONFIG_LTE_AUTO_INIT_AND_CONNECT=n
CONFIG_LTE_LINK_CONTROL_LOG_LEVEL_DBG=y

CONFIG_LTE_NETWORK_MODE_NBIOT=n
CONFIG_LTE_LEGACY_PCO_MODE=n
CONFIG_LTE_PSM_REQ_RPTAU="00000110"
CONFIG_LTE_PSM_REQ_RAT="00000000"

# Modem info
CONFIG_MODEM_INFO=y

# BSD library
CONFIG_BSD_LIBRARY=y
CONFIG_BSD_LIBRARY_TRACE_ENABLED=n

# AT Host
CONFIG_UART_INTERRUPT_DRIVEN=y
CONFIG_AT_HOST_LIBRARY=y

# MQTT
CONFIG_MQTT_LIB=y
CONFIG_MQTT_LIB_TLS=y

# Application
CONFIG_MQTT_PUB_TOPIC="topic1"
CONFIG_MQTT_SUB_TOPIC="topic2"
CONFIG_MQTT_CLIENT_ID="same as certificate common name to conform attached policy in AWS"
CONFIG_MQTT_BROKER_HOSTNAME="a44d87bf4oe9dc-ats.iot.ap-southeast-2.amazonaws.com"
CONFIG_MQTT_BROKER_PORT=8883

# Main thread
CONFIG_MAIN_THREAD_PRIORITY=7

# Heap and stacks
CONFIG_MAIN_STACK_SIZE=8192

CONFIG_HEAP_MEM_POOL_SIZE=16384
CONFIG_UART_0_NRF_TX_BUFFER_SIZE=8192

Thanks

  • Hello, 

    Can you provide more information on what version of NCS you are using? Master or v1.1.0?

    Everything worked fine in the beginning, but then after I tweaked the code a bit I suddenly started getting error -95 during the connect.

    What tweak did you add? Have you tried reverting back to the original?

    As an example below, a simple printk on line 325 basically breaks the mqtt connection. By commenting that line everything works again. This is 100% consistent.

     Is this where you are receiving ERRNO -95? What happens if you add the printk in the main loop after calling client_init()?

    Error -95 is not documented in NRF, but I found that it might be EOPNOTSUPP

     Yes, this looks correct. As you have CONFIG_NEWLIB_LIBC=y in prj.conf, the error number is most likely EOPNOTSUPP (Operation not supported on transport endpoint).

    Kind regards,
    Øyvind

  • Hi Øyvind,

    Thanks for the quick response.

    Can you provide more information on what version of NCS you are using? Master or v1.1.0?

    v1.1.0

    What tweak did you add? Have you tried reverting back to the original?

    Basically I have just added TLS support to MQTT.

    Is this where you are receiving ERRNO -95? What happens if you add the printk in the main loop after calling client_init()?

    Sorry, I should've been more clear. The error happens here

    rc = mqtt_connect(c);
    if (rc != 0) {
        ...

    But it is triggered by a 'printk' in a completely different function. I know this sounds stupid, but basically unrelated code somehow influences the mqtt_connect function. It is a single-threaded application, though.

    I was thinking that maybe it is something to do with memory management.

    I'm currently checking a few other things and will let you know if I find something.

    Thanks.

    Upd.

    Deeper debugging showed that the actual error is 45, which gets converted to 95 in bsd_os_errno_set in bsd_os.c:214.
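
A rough sketch of the kind of translation described here, assuming the modem library uses 45 for "operation not supported" as the debugging above suggests. The macro and function names are hypothetical and this is not the actual bsd_os.c code:

```c
#include <errno.h>

/* Assumption taken from the thread: the modem library reports 45
 * for "operation not supported". Hypothetical macro name. */
#define MODEM_EOPNOTSUPP_SKETCH 45

/* Translate a modem-side error code into the C library's errno value.
 * Only the one code discussed in the thread is mapped here. */
int modem_errno_to_libc_sketch(int modem_err)
{
    switch (modem_err) {
    case MODEM_EOPNOTSUPP_SKETCH:
        return EOPNOTSUPP; /* 95 with newlib */
    default:
        return modem_err;  /* unmapped codes are passed through unchanged */
    }
}
```

Socket-style APIs usually return the negated errno value, which would explain why the application sees -95 while the modem-side value is 45.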

  • _dev_ said:
    Sorry, I should've been more clear. The error happens here

     

    rc = mqtt_connect(c);
    if (rc != 0) {
        ...

    This code is only found in the Zephyr samples. Have you had a look at the mqtt_simple or AWS_FOTA samples? These are officially written by Nordic Semiconductor and have been tested with the nRF9160. I have had few or no issues with MQTT and TLS using them.

    Kind regards,
    Øyvind

  • Hi Øyvind,

    Thanks for the suggestion.

    In fact, AWS_FOTA works with my own certificates, but if I use exactly the same certificates in my project I get 'ERROR: mqtt_connect -45'

    I have also tried writing the certificates with the AWS_FOTA project and then using them in my project, with no luck. I'm currently comparing my project code with the example and so far see no differences other than cosmetic ones. I have a feeling that there is something dumb in my code.

    Is there a way to enable some verbose logging around using certificates?

    Thanks

  • Hello,

    Can you please provide which MQTT sample you based your project on?
     

    _dev_ said:
    In fact, AWS_FOTA works with my own certificates, but if I use exactly the same certificates in my project I get 'ERROR: mqtt_connect -45'

     OK, so there is an issue with your certificates. Do you connect to the same server in your project as in AWS_FOTA? In AWS_FOTA, have a look at client_init(). What is CONFIG_CLOUD_CERT_SEC_TAG set to in your project versus AWS_FOTA? The security tag identifies where the certificates are stored in the modem, and your project needs to point to the same tag.

    Kind regards,
    Øyvind

  • Hi Øyvind,

    Sorry for taking your time.

    Such a shame, but I had moved the following line from global scope into the client_init function and, of course, its memory (stack) gets released when the function returns, as it is now a local variable!

    sec_tag_t sec_tag_list[] = {SEC_TAG};

    So once the memory is released, something else reuses it, which is why unrelated code such as a printk affected the connection.
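
A minimal sketch of this fix, using stand-in types rather than the real Zephyr headers. `tls_config_sketch`, `client_init_sketch`, and the `SEC_TAG` value are hypothetical; only `sec_tag_t` and `sec_tag_list` come from the line above. It assumes, as the behaviour in this thread suggests, that the TLS configuration stores a pointer to the tag list rather than copying it, so the array must outlive client_init:

```c
#include <stddef.h>

/* Stand-ins for the real types; assumed for illustration only. */
typedef int sec_tag_t;

struct tls_config_sketch {
    const sec_tag_t *sec_tag_list; /* pointer is stored, the data is not copied */
    size_t sec_tag_count;
};

#define SEC_TAG 12345 /* hypothetical tag value */

/* File scope gives the array static storage duration,
 * so it stays valid for the lifetime of the program. */
static sec_tag_t sec_tag_list[] = { SEC_TAG };

void client_init_sketch(struct tls_config_sketch *cfg)
{
    /* Broken variant (the bug described above):
     *
     *     sec_tag_t local_list[] = { SEC_TAG };
     *     cfg->sec_tag_list = local_list;
     *
     * local_list lives on the stack, so the stored pointer dangles as
     * soon as this function returns, and later calls can overwrite it.
     */
    cfg->sec_tag_list = sec_tag_list;
    cfg->sec_tag_count = sizeof(sec_tag_list) / sizeof(sec_tag_list[0]);
}
```

Declaring the array `static` inside the function would work equally well, since that also gives it static storage duration.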

    It is all good now and the ticket can be closed.

    Thanks
