Download client (via AWS FOTA) stops at 50% and can't reconnect to AWS

Hello,

I am using the AWS IoT library and ran into an issue when downloading a new image (FOTA) from AWS. The FOTA process starts and the image downloads up to 50%, at which point the peer apparently closes the connection (received length is 0) and the download client can't reconnect to the server. The overall FOTA process then fails.

My board is the nRF52840 DK with the W5500 Ethernet Arduino shield on top. MQTT pub and sub work fine (also with device shadow). I am using nRF SDK 2.6.0.

I attach here my prj.conf file 1072.prj.conf

Here is the error:

More details: as you can see from the prj.conf file, I am using the Zephyr built-in mbedTLS and all communication uses TLS (MQTTS and HTTPS).

As you can see from the log, at around 50% the download client prints "Peer closed connection, will reconnect" and then fails to reconnect with errno 2 (No such file or directory).

Looking at the download client source code, the "Peer closed connection" message is printed when the received length is 0.
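
For context, a receive call on a stream socket that returns 0 conventionally means the peer closed the connection in an orderly way, which is different from an error return. The sketch below illustrates that distinction in plain POSIX-style C. It is not the download client's code, and `classify_recv` and the `recv_outcome` labels are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical outcome labels; these are not names from the download client. */
enum recv_outcome {
    RECV_DATA,        /* len > 0: payload bytes were received */
    RECV_PEER_CLOSED, /* len == 0: peer shut the connection down cleanly */
    RECV_RETRY,       /* len < 0 with a transient errno: try again */
    RECV_ERROR        /* len < 0 with any other errno: treat as failure */
};

/*
 * Classify the result of a recv()-style call on a stream socket.
 * `len` is the return value and `err` is the errno captured right after it.
 * Assumes the caller asked for at least one byte, so 0 cannot mean
 * "zero bytes were requested".
 */
enum recv_outcome classify_recv(ssize_t len, int err)
{
    if (len > 0) {
        return RECV_DATA;
    }
    if (len == 0) {
        return RECV_PEER_CLOSED;
    }
    if (err == EAGAIN || err == EINTR) {
        return RECV_RETRY;
    }
    return RECV_ERROR;
}
```

With this reading, the log line at 50% would correspond to the `RECV_PEER_CLOSED` case: the server side ended the connection, and the failure comes from the reconnect attempt that follows.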

It always fails in the same way, and I have no idea why.

Please help.

Thank you.

Marco 

  • Hi, I have fixed the log as you suggested but the 50% download issue remains. I then tried to set CONFIG_NET_BUF_DATA_SIZE=1024, but that meant also setting CONFIG_NET_BUF_FIXED_DATA_SIZE=y. Unfortunately this increases the RAM footprint a lot and the linker fails. I managed to get it to link anyway, but the firmware crashes with a stack overflow when DHCP starts (or when it gets a response from the DHCP server). I then increased both my task stack and the system work queue stack, but I can't solve the stack overflow issue. Note that I can't make the stacks too big, as they would not fit in RAM.

    Any suggestions?

  • I tried setting CONFIG_NET_BUF_DATA_SIZE=1024 without setting CONFIG_NET_BUF_FIXED_DATA_SIZE=y and it compiles fine. The stack overflow still happens, though.

  • Hi,

    As a general note: if you have faults occurring, please report the full log.

    In this case, revert to your original NET_BUF_* configuration and instead adjust the overall buffer size via CONFIG_NET_BUF_DATA_POOL_SIZE, then see if this changes the behavior.

    We have not tested the download sample with devices other than the IP-capable nRFs (nRF91-series, nRF70-series), so I would expect that you will need some adjustments to the generic configuration to make it fit the W5500.

    Kind regards,

    Håkon

  • Adding CONFIG_NET_BUF_DATA_POOL_SIZE implies enabling CONFIG_NET_BUF_VARIABLE_DATA_SIZE, which is experimental. I have tried it and a task hits a stack overflow. I would not expect that.

    Anyway I am playing with all those config macros and I guess I'll need to spend days trying to find the right combination with little clue of what I am doing. 

    Thank you for your help so far. 

  • When experiencing a fatal error / fault, please share the output.

    Marco Russi said:
    Anyway I am playing with all those config macros and I guess I'll need to spend days trying to find the right combination with little clue of what I am doing. 

    The default setting for CONFIG_NET_BUF_DATA_SIZE is 128. Try raising it to at least 256 and see if the failure point moves from 50% to something else.

    Kind regards,

    Håkon
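
The buffer-size experiment suggested in the last reply could be tried with a prj.conf fragment along these lines. This is only a sketch: 256 is the starting value named in the reply, not a tuned figure, and it assumes the rest of the original NET_BUF_* configuration is left as it was.

```
# Raise the network buffer data size from its default (128, per the reply above).
# 256 is a starting point for the experiment, not a tuned value.
CONFIG_NET_BUF_DATA_SIZE=256
```

If the failure point moves away from 50% after this change, that would point toward buffer sizing as a factor. Larger buffers cost RAM, so the value may need to be balanced against the stack sizes discussed earlier in the thread.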
