
NRF9160 Socket send hangs forever

Hi there,

I have developed a small sensor device that generates an event whenever a reed contact is triggered. Each event causes an NB-IoT message to be sent.

However, when I generate several events within a short period of time, the nRF socket send function stalls and the watchdog is triggered.

For demonstration purposes I have reduced the code to this:

    
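    /* fd is an already-connected socket (setup omitted) */
    uint8_t buf[16];
    u32_t size = sizeof(buf);
    u32_t flags = 0;
    int err;
    int retries = 0;

    while (1)
    {
        err = send(fd, buf, size, flags); /* blocking call; this is where it hangs */
        k_sleep(K_MSEC(1000));
        LOG_DBG("Tx retries = %d", retries);
        retries++;
    }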

After 16 or 17 cycles I see that the send function hangs and that the watchdog resets the device. 

Is it somehow possible to add a timeout to the send function? Or any other way to recover from such a situation?

Of course, the best approach would be to gather all the sensor events and send them periodically, but I would also like an explanation for this behavior.

Eric

  • Hello,

    These are the updates from R&D:

    "Issue two (this case) can not currently be fixed because the modem does not share any information about its TX buffers with the application core. So from the application core's point of view it can look like everything is ready to send data, but when the application core requests the modem core to send the data, the modem core may respond with an error."

    "For the second issue, we already support the NRF_MSG_DONTWAIT flag for the send function. So it should be possible to use that flag to not get stalled on the send call. In NCS 1.4 we will also support timeout on send."

  • I can confirm that non-blocking socket operations do fix this issue. Unfortunately, the non-blocking socket has worse performance in terms of latency and packets delivered, when compared to the same setup, signal conditions, and blocking sockets. So that's not something people like to use. Timeout/error is the only correct solution for this.
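
For readers who land here, the two options discussed above (a non-blocking send and a send timeout) can be sketched with plain POSIX socket calls. This is only a rough sketch under assumptions, not Nordic's implementation: the helper names `send_with_timeout` and `set_send_timeout` are hypothetical, and on the nRF9160 the flag is spelled `NRF_MSG_DONTWAIT` as quoted above. Whether a given NCS release honours `SO_SNDTIMEO` or reports `POLLOUT` reliably is also an assumption to verify, since R&D notes that the modem does not expose its TX buffer state.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: ask the stack to bound how long a blocking send()
 * may wait. Uses the standard SO_SNDTIMEO option; support on a given
 * platform is an assumption. Returns 0 on success, -1 with errno set. */
int set_send_timeout(int fd, int timeout_ms)
{
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };

    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper: wait at most timeout_ms for the socket to become
 * writable, then send without blocking. Returns the number of bytes
 * sent, or -1 with errno set (ETIMEDOUT if the wait expired). */
ssize_t send_with_timeout(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0) {
        return -1; /* errno was set by poll() */
    }
    if (ready == 0) {
        errno = ETIMEDOUT;
        return -1;
    }

    /* MSG_DONTWAIT guards against the case where the socket looked
     * writable but the send would still block. */
    return send(fd, buf, len, MSG_DONTWAIT);
}
```

In the loop from the original question, the call `send(fd, buf, size, flags)` could be replaced by `send_with_timeout(fd, buf, size, 5000)`. A return value of -1 would then be logged and the event queued or dropped, so the loop keeps running and the watchdog is fed.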
