Modem Firmware Version 1.2.3 Cannot connect to MQTT broker using certificates

Hello,

We are trying to connect to an AWS MQTT broker using the nRF9160 with modem firmware version 1.2.3 and are not having any luck.

Project background:

Using modem firmware version 1.2.0, we have been able to load a root CA, Client Certificate, and private key using the following commands:

AT+CFUN=4
AT%CMNG=0,16842753,0,"root_ca"
AT%CMNG=0,16842753,1,"client_cert"
AT%CMNG=0,16842753,2,"private_key"

After which, we reboot the device and the application firmware goes through the normal connection process. This results in a successful connection to the AWS IoT Core MQTT broker.
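As an illustration of the provisioning steps above, here is a minimal Python sketch that builds the same AT command sequence from PEM strings. The helper names (`cmng_write`, `provisioning_sequence`) are hypothetical and not part of any Nordic tool; the sketch only assembles the command strings and assumes you send them to the modem yourself, for example over a serial AT interface.

```python
# Credential types used in the AT%CMNG commands quoted above.
ROOT_CA, CLIENT_CERT, PRIVATE_KEY = 0, 1, 2


def cmng_write(sec_tag: int, cred_type: int, pem_text: str) -> str:
    """Build one AT%CMNG write command (opcode 0) for a credential."""
    if '"' in pem_text:
        # The content is wrapped in double quotes, so it must not contain any.
        raise ValueError("credential must not contain double quotes")
    return f'AT%CMNG=0,{sec_tag},{cred_type},"{pem_text.strip()}"'


def provisioning_sequence(sec_tag: int, root_ca: str,
                          client_cert: str, private_key: str) -> list[str]:
    """Return the commands in the order used in this post."""
    return [
        "AT+CFUN=4",  # put the modem in offline mode before writing credentials
        cmng_write(sec_tag, ROOT_CA, root_ca),
        cmng_write(sec_tag, CLIENT_CERT, client_cert),
        cmng_write(sec_tag, PRIVATE_KEY, private_key),
    ]
```

Calling `provisioning_sequence(16842753, ca_pem, cert_pem, key_pem)` returns four strings matching the commands listed above, with the placeholders replaced by the actual PEM contents.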

Using the same method on modem firmware version 1.2.3, the modem successfully connects to LTE, but fails to connect to the MQTT broker. Our application firmware then falls back to a locally hosted backup MQTT broker that uses a username and password for authentication. That connection succeeds.

I know the certs load correctly, because I am able to load the certs onto the modem when it is running 1.2.3, then go back to 1.2.0, and this results in a successful connection.

I'm not really sure what else to try. Any inside knowledge of the changes between 1.2.0 and 1.2.3 would be appreciated.

Thanks,

Jordan

  • Hi,

    This sounds strange. I don't know about any changes between those two versions that could cause this.

    Are you using the same application FW with both modem FW versions?

    Do you have any logs from the application?

    Could you take a modem trace (at least a failing one with mfw 1.2.3, though a working one with 1.2.0 to compare with would also be nice)?

    Best regards,

    Didrik

  • Traces were shared privately, here are my comments:

    Hi, and thanks for the traces.

    Unfortunately, I am not able to successfully decode the 1.2.0 trace.

    However, the 1.2.3 trace shows that the TLS handshake is fine, but the server closes the TLS connection after the device has sent the first packet with (TLS) application data. Presumably, this is the MQTT Connect request.

    I assume you are running the same application in both cases?

    Is the device ID correct?

    Does AWS have any information about why the device is rejected?

    Best regards,

    Didrik

  • Ok, let me see if I can get another version of the 1.2.0 trace. All elements of the connection process are the same except the modem firmware version.

    Edit: To answer your question about AWS, there is no logging on AWS below the MQTT data layer, so I have not been able to get any access logs from IoT Core.

  • This trace is better, I was able to decode it. However, in the trace, the device is not able to connect to the network.

    There is one attach attempt, but it is rejected (by the network) with ESM reject cause 38:

    Cause #38 – Network failure
    This ESM cause is used by the network to indicate that the requested service was rejected due to an error situation in the network.

    After that, the device continues to search for a network, but doesn't find one.

  • That's strange, because I watched traffic come through on MQTT when this trace was recorded.

  • I looked through the trace again, but I couldn't find any obvious reason why.

    Could you try to take another trace?

  • Thanks for the working traces.

    One difference I could spot in the traces from mfw v1.2.0 and 1.2.3 is that in the trace from 1.2.3, you are using the Server Name Indication (SNI) TLS extension, while in the 1.2.0 traces, you are not.

    SNI support was added in mfw v1.2.1.

    My guess is that this is what causes your connection problems, although I must admit that I would have expected it to be the other way around: that the connection would have been refused if you didn't have SNI. SNI is enabled by the MQTT client automatically when the hostname is provided, and the hostname is necessary for peer verification to work.

    AWS uses the certificates to route MQTT requests. Are you sure you are using the right hostname in the application?
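To illustrate what changes on the wire when a hostname is supplied, here is a small desktop Python sketch (not nRF9160 or Zephyr code) that generates a TLS ClientHello in memory with the standard `ssl` module and checks whether the hostname appears in it. The function name `client_hello` and the hostname `broker.example.invalid` are placeholders chosen for this example. No network connection is made.

```python
import ssl


def client_hello(server_hostname):
    """Return the raw bytes of the first TLS flight (the ClientHello).

    When server_hostname is a string, the ssl module adds the SNI
    extension; when it is None, no SNI extension is sent.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Hostname checking needs a hostname, so disable it in the None case.
    ctx.check_hostname = server_hostname is not None

    incoming = ssl.MemoryBIO()   # bytes "from the server" (stays empty here)
    outgoing = ssl.MemoryBIO()   # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits
        # for a server reply that never arrives in this sketch.
        pass
    return outgoing.read()
```

Comparing `client_hello("broker.example.invalid")` with `client_hello(None)` shows the hostname in clear text in the first case only. That is the same kind of difference seen between the two modem traces: a server that routes or filters on SNI sees whatever name the application supplied, so a wrong hostname can lead to a rejected connection even when the certificates are valid.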

  • Hey Didrik,

    We were able to get it working! It looks like our application firmware was attempting to specify the (incorrect) hostname using SNI. Since this feature was not available in the previous MFW version, I assume the hostname was treated then the way a NULL is treated in this MFW version. So all we did was change the hostname to NULL in our application firmware and allow the AWS backend to route our connection request, and it connected.

    This may be a matter for a different ticket, but we have noticed a significant increase in the amount of time it takes the device to connect to LTE with the new MFW version with no change to the application firmware. If you have a short answer for that, I would appreciate it. If not, we can investigate further before opening another ticket.

    Thanks!

    Jordan

  • jbax92 said:
    We were able to get it working!

    That's great to hear!

    jbax92 said:
    This may be a matter for a different ticket, but we have noticed a significant increase in the amount of time it takes the device to connect to LTE with the new MFW version with no change to the application firmware. If you have a short answer for that, I would appreciate it. If not, we can investigate further before opening another ticket.

    That sounds strange. I would have expected the performance to improve, or at least be quite similar, not get worse.

    The one possibility I can think of is that the stored network information got removed in the update. When connected to a network, the modem will regularly (though not very often) store information about the current network and cell it is connected to. When it tries to connect to a network, it will use that stored information to help speed up the search. If the device had that information with the old modem FW, but it got removed when updating the modem, that can explain the reduced performance.

    The fastest way to store the network information is to connect with AT+CFUN=1, then disconnect with AT+CFUN=0. That will write the network parameters to flash, without you having to wait for the next modem file system sync.
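The sequence described above could be scripted roughly as follows. This is only a sketch under some assumptions: `send_at` is a hypothetical callable that sends one AT command and returns the modem's reply as text, and registration is detected by polling the standard `AT+CEREG?` command for status 1 (home network) or 5 (roaming).

```python
import time


def is_registered(reply: str) -> bool:
    """Check a +CEREG read response of the form '+CEREG: <n>,<stat>[,...]'."""
    for line in reply.splitlines():
        if line.startswith("+CEREG:"):
            fields = line.split(":", 1)[1].split(",")
            if len(fields) >= 2 and fields[1].strip() in ("1", "5"):
                return True
    return False


def store_network_info(send_at, timeout_s=120.0, poll_s=2.0, sleep=time.sleep):
    """Connect, wait for registration, then power the modem off.

    Returns True if the modem registered and AT+CFUN=0 was sent,
    False if the timeout expired first (AT+CFUN=0 is then not sent).
    """
    send_at("AT+CFUN=1")                      # start the network search
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_registered(send_at("AT+CEREG?")):
            send_at("AT+CFUN=0")              # power off after registration
            return True
        sleep(poll_s)
    return False
```

The `sleep` parameter is there only so the polling delay can be replaced when trying the function against a fake modem.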

    However, it might be that this only fixes the symptom, and not the actual problem. But in that case, another ticket is probably the best place to take that discussion.

    Best regards,

    Didrik
