Bug report: Non-blocking send fails with EWOULDBLOCK even though poll returned POLLOUT

Sample code for nRF9160:

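#include <zephyr/kernel.h>
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <stdbool.h>  /* bool */
#include <errno.h>    /* errno */
#include <modem/lte_lc.h>
#include <zephyr/net/socket.h>
#include <fcntl.h>

/* Print errno and stop if the condition does not hold. */
void my_assert(bool b) {
	if (!b) {
		printk("failed %d\r\n", errno);
		exit(1);
	}
}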

void main(void)
{
	int err;

	printk("Waiting for network.. ");
	err = lte_lc_init_and_connect();
	if (err) {
		printk("Failed to connect to the LTE network, err %d\n", err);
		return;
	}

	int sk = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (sk < 0) {
		printk("Failed to create socket\n");
		return;
	}
	struct sockaddr_in sa;
	sa.sin_family = AF_INET;
	sa.sin_port = htons(9632);
	sa.sin_addr.s_addr = ...; // FILL IN IP ADDRESS

	if (connect(sk, (const struct sockaddr *)&sa, sizeof(sa)) < 0) {
		printk("Failed to connect: %d\n", errno);
		return;
	}
	printk("Connected\n");

	err = fcntl(sk, F_SETFL, O_NONBLOCK);
	my_assert(!err);

	printk("Connected to server\r\n");
	for (int i = 0; i < 20; i++) {
		struct pollfd pfd = {sk, POLLIN | POLLOUT, 0};
		int res = poll(&pfd, 1, -1);
		my_assert(res == 1);
		printk("Poll result: 0x%02x\r\n", pfd.revents);
		if (pfd.revents & POLLOUT) {
			char buf[1] = {'A' + i};
			err = send(sk, buf, 1, 0);
			if (err == -1) {
				printk("Send failed with errno %d\n", errno);
			} else {
				printk("send done\r\n");
			}
		}
	}
	printk("Done\r\n");
	close(sk);
}

Sample code for the server (Node.js):

const net = require('net');

const server = net.createServer((c) => {
    console.log("connected");
    c.on('data', (d) => {
        console.log(d.toString('utf8'));
    });
    c.on('error', (e) => {
        console.log(e);
    });
    c.on('end', () => {
        console.log('end');
    });
    c.on('close', () => {
        console.log('close');
    });
});

server.listen(9632, () => {
    console.log('bound');
});

Output nRF9160:

*** Booting Zephyr OS build v3.2.99-ncs2 ***
Waiting for network.. Connected
Connected to server
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
Send failed with errno 11
Poll result: 0x04
send done
Poll result: 0x04
send done
Poll result: 0x04
send done
Done

Output server:

bound
connected
A
BCDEFGHIJKLMNOP
RST
end
close

This error reproduces 100% of the time for me, in exactly the same way.

In the nRF9160 output, send returns -1 with errno set to 11 (EAGAIN, i.e. EWOULDBLOCK) on the 17th call. This violates the POSIX contract: poll returned POLLOUT for the socket, which indicates that more data can be put in the socket's send buffer. The expected behavior is that poll blocks and returns POLLOUT only once there is at least 1 byte free in the socket's send buffer, so that the following send call succeeds.

We see that Q is missing in the output from the server, indicating that the send call indeed failed.
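
Until this is resolved, one possible application-side workaround is to treat EWOULDBLOCK after POLLOUT as a spurious wakeup and poll again instead of dropping the byte. The following is only a minimal sketch, written against plain POSIX headers so it can be tried on a desktop; on the nRF9160 the socket calls would come from <zephyr/net/socket.h>, and the delay would presumably be replaced by something like k_msleep(). The helper names set_nonblocking and send_all_retry, the 10 ms back-off and the retry limit are my own assumptions, not part of any SDK.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Set O_NONBLOCK while preserving the other file status flags. */
static int set_nonblocking(int sk)
{
	int flags = fcntl(sk, F_GETFL, 0);

	if (flags < 0) {
		return -1;
	}
	return fcntl(sk, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: send exactly len bytes on a non-blocking socket.
 * If send() reports EAGAIN/EWOULDBLOCK right after poll() reported POLLOUT,
 * back off briefly and poll again, at most max_retries times in a row.
 * Returns len on success, -1 on error (errno is set).
 */
static ssize_t send_all_retry(int sk, const void *data, size_t len,
			      int max_retries)
{
	const char *p = data;
	size_t sent = 0;
	int retries = 0;

	while (sent < len) {
		struct pollfd pfd = { .fd = sk, .events = POLLOUT, .revents = 0 };

		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR) {
				continue;
			}
			return -1;
		}
		if (!(pfd.revents & POLLOUT)) {
			/* Only POLLERR/POLLHUP/POLLNVAL: treat as a broken connection. */
			errno = EPIPE;
			return -1;
		}

		ssize_t n = send(sk, p + sent, len - sent, 0);
		if (n >= 0) {
			sent += (size_t)n;
			retries = 0;
			continue;
		}
		if (errno != EAGAIN && errno != EWOULDBLOCK) {
			return -1;
		}
		if (++retries > max_retries) {
			return -1; /* errno is still EAGAIN/EWOULDBLOCK */
		}

		/* Spurious POLLOUT: wait 10 ms before polling again. */
		struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
		nanosleep(&delay, NULL);
	}
	return (ssize_t)sent;
}
```

In the sample above, the send call inside the loop would become something like send_all_retry(sk, buf, 1, 5). This only hides the symptom; it does not change the fact that poll reports POLLOUT when send cannot accept data.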

nRF Connect SDK version: 2.3.0.

Modem firmware: 1.3.4.

Mobile operator: Telenor SE.

Reply:
  • Hi,

    Achim Kraus said:
    This ticket demonstrates, that using poll with POLLOUT doesn't work. It does that for TCP.

    Sorry, I read your reply as a different issue. The investigation is related to the return code, i.e. that poll() does not honor the POSIX API. This depends directly on the buffering mechanism (or rather, the communication) within nrf_modem, regardless of the type of socket-based shared-memory transaction.

    Kind regards,

    Håkon
