FOTA download breaking up

Product: NRF9160

SDK: 2.6.1

Modem: 1.3.6

My problem is that FOTA download is breaking up more or less every time our device is doing a SW upgrade. We download SW directly from AWS S3 bucket using download client. But that S3 is a pain to debug so instead I installed nginx to a test server and made a small script to offer HTTP file download using nodejs and express. It looks like I can reproduce similar problems there. My binary ATM is huge(300kB) and connection seems to break at least 3 times while it is being downloaded. Here is what the device logs look like

[00:00:53.951,507] <inf> download_client: Downloaded 100352/310339 bytes (32%)
[00:00:53.964,050] <dbg> download_client: download_thread: Receiving up to 2048 bytes at 0x2000f4d0...
[00:00:54.280,944] <dbg> download_client: handle_received: Read 1391 bytes from socket
[00:00:54.281,158] <dbg> download_client: http_header_parse: GET header size: 367
[00:00:54.282,226] <dbg> download_client: http_parse: Copying 1024 payload bytes
[00:00:54.282,318] <inf> download_client: Downloaded 101376/310339 bytes (32%)
[00:00:54.297,698] <dbg> download_client: download_thread: Receiving up to 2048 bytes at 0x2000f4d0...
[00:00:54.600,982] <dbg> download_client: handle_received: Read 1386 bytes from socket
[00:00:54.601,165] <dbg> download_client: http_header_parse: GET header size: 362
[00:00:54.601,318] <wrn> download_client: Peer closed connection, will re-connect
[00:00:54.601,348] <dbg> download_client: http_parse: Copying 1024 payload bytes
[00:00:54.601,440] <inf> download_client: Downloaded 102400/310339 bytes (32%)
[00:00:54.612,854] <inf> download_client: Reconnecting...
[00:00:54.614,135] <dbg> download_client: client_connect: Protocol not specified, defaulting to HTTP(S)
[00:00:54.614,166] <dbg> download_client: client_connect: Port not specified, using default: 443
[00:00:54.735,382] <wrn> download_client: Failed to resolve hostname our.test.server.com on IPv6
[00:00:54.739,288] <dbg> download_client: client_socket_connect: family: 1, type: 1, proto: 258
[00:00:54.739,685] <inf> download_client: Setting up TLS credentials, sec tag count 1
[00:00:54.739,807] <inf> download_client: Connecting to 123.456.789.123
[00:00:54.739,837] <dbg> download_client: client_socket_connect: fd 1, addrlen 8, fam IPv4, port 443
[00:00:56.220,733] <dbg> download_client: download_thread: Receiving up to 2048 bytes at 0x2000f4d0...
[00:00:56.510,192] <dbg> download_client: handle_received: Read 1391 bytes from socket
[00:00:56.510,406] <dbg> download_client: http_header_parse: GET header size: 367
[00:00:56.511,474] <dbg> download_client: http_parse: Copying 1024 payload bytes
[00:00:56.511,566] <inf> download_client: Downloaded 103424/310339 bytes (33%)

The funny(not really) thing is that now when I am actually able to get pcaps from this server setup I would like to accuse the server(nginx) of breaking up the connection with no reason.

I have tried many ways to simulate a slow connection and mimic the partial download that nrf9161 does against AWS S3 and that test server, but so far no luck, everything just always works. So I thought that maybe you guys have some more ideas what to try. Maybe there are some known incompatibility issues that I would need to workaround?

And I would still emphase that from the nrf91 point of view the problem looks identical to what we see in real environment in AWS S3, but in reality I have no idea whether the problem I see in this test server is the same problem or not.

I can see that   had pretty similar issues lately with another product, but since I have a TLS encryption used here using 4kB fragments is not really an option.
devzone.nordicsemi.com/.../download-client-via-aws-fota-stops-at-50-and-can-t-reconnect-to-aws

Parents
  • One funny observation from this is that the download seems to break at the same binary offset. Like below with 200 and 300kB files:

    [00:00:56.926,757] <inf> download_client: Downloaded 101376/227227 bytes (44%)
    [00:00:57.152,160] <wrn> download_client: Peer closed connection, will re-connect
    [00:00:57.152,252] <inf> download_client: Downloaded 102400/227227 bytes (45%)
    [00:00:57.163,726] <inf> download_client: Reconnecting...
    
    ...
    
    [00:01:45.487,915] <inf> download_client: Downloaded 203776/227227 bytes (89%)
    [00:01:45.727,203] <wrn> download_client: Peer closed connection, will re-connect
    [00:01:45.727,416] <inf> download_client: Downloaded 204800/227227 bytes (90%)
    [00:01:45.738,800] <inf> download_client: Reconnecting...

    [00:01:00.961,517] <inf> download_client: Downloaded 101376/310339 bytes (32%)
    [00:01:01.291,076] <wrn> download_client: Peer closed connection, will re-connect
    [00:01:01.291,168] <inf> download_client: Downloaded 102400/310339 bytes (32%)
    [00:01:01.302,551] <inf> download_client: Reconnecting...
    
    ...
    
    [00:01:38.770,080] <inf> download_client: Downloaded 203776/310339 bytes (65%)
    [00:01:38.996,246] <wrn> download_client: Peer closed connection, will re-connect
    [00:01:38.996,368] <inf> download_client: Downloaded 204800/310339 bytes (65%)
    [00:01:39.007,781] <inf> download_client: Reconnecting...


    Maybe I'll try to sniff the flashwriting a bit. Maybe it causes some unwanted side effects somehow

Reply Children
No Data
Related