mqtt_connect error no 22 appearing after several successful connections.

Hello,

We've been developing some software,for the nrf9160, based on the asset tracker that posts to an AWS server periodically using the aws_iot library. After many days of running without incident the device appears to have stopped posting altogether; checking the output we see that our connect request is being met with "mqtt_connect, error: -22". 
 
This appears to be  indicate some sort of argument error, but nothing relating to the connection should be changing (endpoints and ports are set in config files, certificates are written to the modem, client is setup at initialization which happens on initial boot-up). The device does appear to recover after a simple power cycle, so I don't think anything is fundamentally wrong with the code. 


Does anyone have any ideas why this might be happening, or some way to mitigate it if it does happen?


Output Log:

[09:19:37.271,697] <inf> app_event_manager: MAIN_EVT_TRANSMIT
[09:19:37.346,282] <inf> app_event_manager: MODEM_EVT_LTE_CONNECTING
[09:19:37.353,851] <dbg> modem_module: state_set: State transition STATE_DISCONN                                                                                                                                                             ECTED --> STATE_CONNECTING
%CESQ: 41,2,23,3
[09:19:38.899,322] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -99
+CEREG: 2,"ED3F","005A35C9",9
[09:19:38.913,024] <dbg> modem_module: lte_evt_handler: LTE cell changed: Cell I                                                                                                                                                             D: 5912009, Tracking area: 60735
[09:19:38.923,797] <inf> app_event_manager: MODEM_EVT_LTE_CELL_UPDATE
%CESQ: 41,2,34,4
[09:19:38.939,666] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -99
+CSCON: 1
[09:19:39.665,405] <dbg> modem_module: lte_evt_handler: RRC mode: Connected
%CESQ: 41,2,28,4
[09:19:39.680,664] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -99
+CGEV: ME PDN ACT 0,0
[09:19:41.458,709] <dbg> modem_module: pdn_event_handler: PDN_EVENT_ACTIVATED
+CNEC_ESM: 50,0
[09:19:41.467,834] <wrn> modem_module: Event: PDP context 0, PDN type IPv4 only                                                                                                                                                              allowed
%MDMEV: SEARCH STATUS 2
[09:19:41.478,759] <dbg> modem_module: lte_evt_handler: Modem domain event, type                                                                                                                                                             : Search done
+CEREG: 5,"ED3F","005A35C9",9,,,"00001010","11000001"
[09:19:41.493,072] <dbg> modem_module: lte_evt_handler: Network registration sta                                                                                                                                                             tus: Connected - roaming
%XTIME: "0A","3220020200600A","00"
[09:19:41.506,988] <inf> app_event_manager: MODEM_EVT_LTE_CONNECTED
[09:19:41.514,434] <dbg> modem_module: state_set: State transition STATE_CONNECT                                                                                                                                                             ING --> STATE_CONNECTED
[09:19:41.524,322] <wrn> aws_module: Cloud connection establishment in progress
[09:19:41.532,379] <wrn> aws_module: New connection attempt in 32 seconds if not                                                                                                                                                              successful
[09:19:41.541,534] <inf> app_event_manager: AWS_MODULE_EVT_CONNECTING
%CESQ: 40,2,23,3
[09:19:43.800,842] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -100
[09:19:43.811,065] <inf> app_event_manager: MODEM_EVT_LTE_SIGNAL_QUALITY_UPDATE
UART power down timer expired.
[09:19:48.082,244] <err> aws_iot: mqtt_connect, error: -22
[09:19:48.088,470] <inf> app_event_manager: AWS_MODULE_EVT_CONNECTING
%CESQ: 41,2,28,4
[09:19:50.224,060] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -99
+CSCON: 0
[09:19:59.757,904] <dbg> modem_module: lte_evt_handler: RRC mode: Idle
%CESQ: 42,2,23,3
[09:20:00.704,406] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -98
[09:20:13.541,595] <inf> app_event_manager: AWS_MODULE_EVT_CONNECTION_TIMEOUT
[09:20:13.549,652] <wrn> aws_module: Cloud connection establishment in progress
[09:20:13.557,678] <wrn> aws_module: New connection attempt in 64 seconds if not                                                                                                                                                              successful
[09:20:13.566,894] <inf> app_event_manager: AWS_MODULE_EVT_CONNECTING
%CESQ: 40,2,29,4
[09:20:13.696,014] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -100
+CSCON: 1
[09:20:14.226,501] <dbg> modem_module: lte_evt_handler: RRC mode: Connected
%CESQ: 40,2,24,3
[09:20:16.940,826] <dbg> modem_module: modem_rsrp_handler: Incoming RSRP status                                                                                                                                                              message, RSRP value is -100
[09:20:19.346,740] <err> aws_iot: mqtt_connect, error: -22
[09:20:19.352,966] <inf> app_event_manager: AWS_MODULE_EVT_CONNECTING

Parents
  • Hi,

    The error you see is EINVAL 22 Invalid Argument error. 

    Could you provide more information about your application?

    Best regards,
    Dejan

  • Sure, I'm not sure what specifically might be important. I'll cover what I can, but if you've got a more specific question let me know.

    We have a custom board with an NRF9160 being used to post data to AWS. The application's purpose is to generally sit idle, waiting for an external 'Post' command. That trigger causes the modem to connect, if the modem connects it triggers AWS to  connect, if AWS connects it triggers a publish; then we format a JSON message and attempt to publishes that to AWS.

    The general purpose of the app is post data to the cloud periodically, every hour or so, otherwise it sits 'idle/sleeping'. 
    Code-wise it's largely based off the Asset Tracker example, and uses the application event structure. Once fully provisioned the devices will be booted once, and then expected to run 'forever'. Before running into this "EINVAL" issue, the device has happily posted dozens, if not hundreds of times. This error also appears to be latching, as the device will never post again, unless we re-boot via power cycle. 

  • Hi,

    Could you provide application log after you enable CONFIG_NET_LOG and CONFIG_MQTT_LOG_LEVEL_DBG configuration options in your prj.conf file?

    Best regards,
    Dejan

  • Sure, I'll get working on that though it's a very non-deterministic problem so it might take a while. These devices often run for months on end without encountering issues. I can add that configuration and setup more devices, but it's not something that's occurring often though when it does occur it appears to latch. 

    In the meantime is it possible to look into possible causes for EINVAL responses out of aws_iot_connect()?

    I've been looking through myself and it seems that most of the time that function returns EINVAL when there's some incorrect configuration information. If that's consistent it would mean that some part of our configuration is being corrupted during runtime which we can look into. 

    If there's other situations where an EINVAL could originate from we could do debugging on that. 

    If at all possible I'd like to have some idea of where I could start looking for these sort of things, while I wait for more data from the devices themselves. 

  • Hi,

    In order to have better insight into the problem, please provide modem trace.

    Best regards,
    Dejan

Reply Children
No Data
Related