When using L2CAP to stream data at a high rate from the device to an app, we are seeing a hard fault in l2cap_data_pull when there is queued L2CAP send data.
Hi,
The race condition could be in your application. Please check whether you already do something like the linked code snippet, which avoids calling APIs with the conn pointer while you unref that pointer in the disconnected callback:
Regards,
Amanda H.
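For readers following along, the kind of guard being discussed can be sketched as below. This is not the linked snippet and not Zephyr code: `struct mock_conn`, `mock_conn_ref`, `mock_conn_unref`, `on_connected`, `on_disconnected` and `send_guarded` are hypothetical stand-ins for `struct bt_conn`, `bt_conn_ref`, `bt_conn_unref`, the connection callbacks and the application's send path, and a pthread mutex stands in for a kernel mutex. The idea is only that the sender takes its own reference under the same lock the disconnected handler uses to clear the shared pointer.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bt_conn: only the refcount is modelled. */
struct mock_conn {
	atomic_int refs;
};

/* Stand-in for bt_conn_ref(): take one more reference. */
static struct mock_conn *mock_conn_ref(struct mock_conn *conn)
{
	atomic_fetch_add(&conn->refs, 1);
	return conn;
}

/* Stand-in for bt_conn_unref(): drop one reference. */
static void mock_conn_unref(struct mock_conn *conn)
{
	int old = atomic_fetch_sub(&conn->refs, 1);

	assert(old > 0);
	(void)old;
}

/* The application's shared pointer and the lock that protects it. */
static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct mock_conn *current_conn;

/* Connected handler: the application keeps its own reference. */
static void on_connected(struct mock_conn *conn)
{
	pthread_mutex_lock(&conn_lock);
	current_conn = mock_conn_ref(conn);
	pthread_mutex_unlock(&conn_lock);
}

/* Disconnected handler: clear the pointer under the lock, then unref. */
static void on_disconnected(void)
{
	struct mock_conn *old;

	pthread_mutex_lock(&conn_lock);
	old = current_conn;
	current_conn = NULL;
	pthread_mutex_unlock(&conn_lock);

	if (old != NULL) {
		mock_conn_unref(old);
	}
}

/* Send path: hold a private reference for the duration of the call. */
static int send_guarded(void)
{
	struct mock_conn *conn = NULL;

	pthread_mutex_lock(&conn_lock);
	if (current_conn != NULL) {
		conn = mock_conn_ref(current_conn);
	}
	pthread_mutex_unlock(&conn_lock);

	if (conn == NULL) {
		return -ENOTCONN;
	}

	/* Real code would call the stack's send API here, using conn. */

	mock_conn_unref(conn);
	return 0;
}
```

With this shape, a send that races with a disconnect either sees a NULL pointer and returns -ENOTCONN, or holds its own reference until it finishes. It only protects the application's use of the pointer; it does not rule out a race inside the host stack itself.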
The bt_l2cap_chan_send function that we use checks the channel connection state. Is that not adequate?
Just to check whether bt_conn_unref was the issue, I commented it out as a test. The same hard fault in l2cap_data_pull occurs, so it doesn't seem to be an issue with an early unref.
Could you try the suggested code snippet?
I can't do that exactly because I'm using the l2cap calls, but I did test adding a ref/unref around the send and the same hard fault still happens.
Hi,
It might be a race condition between l2cap_data_pull and bt_l2cap_chan_del or l2cap_chan_shutdown. It should not happen, since l2cap_data_pull runs on the system work queue, which is forced to be non-preemptible by the Bluetooth subsystem Kconfig. Have you overridden this restriction and made the system work queue preemptible?
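As a rough illustration of what "non-preemptible" means here: in Zephyr, threads with a negative priority are cooperative, so the question comes down to whether CONFIG_SYSTEM_WORKQUEUE_PRIORITY is still negative in the final build. The sketch below is standalone C rather than Zephyr code; the fallback value of -1 and the helper name `prio_is_cooperative` are assumptions made for the example. In a real build the value comes from the generated Kconfig output.

```c
#include <assert.h>
#include <stdbool.h>

/* In a Zephyr build this macro comes from Kconfig; -1 is assumed here. */
#ifndef CONFIG_SYSTEM_WORKQUEUE_PRIORITY
#define CONFIG_SYSTEM_WORKQUEUE_PRIORITY (-1)
#endif

/* Zephyr convention: negative thread priorities are cooperative. */
static bool prio_is_cooperative(int prio)
{
	return prio < 0;
}

/* Fail the build if the system work queue has been made preemptible. */
static_assert(CONFIG_SYSTEM_WORKQUEUE_PRIORITY < 0,
	      "system work queue must run at a cooperative priority");
```

A compile-time check like this would flag a project configuration that overrides the priority to zero or a positive value.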
We have not modified the SYSTEM_WORKQUEUE_PRIORITY configuration.
Could you try wrapping bt_l2cap_chan_send in k_sched_lock()/k_sched_unlock() as a debugging step? This makes the calling thread temporarily non-preemptible. It can be done by applying the attached diff to NCS v2.9.0.
#!/usr/bin/env -S git apply
# Attachment to Nordic Devzone https://devzone.nordicsemi.com/f/nordic-q-a/118471/l2cap_data_pull-hard-fault-on-disconnect-with-queued-tx-data
diff --git a/subsys/bluetooth/host/l2cap.c b/subsys/bluetooth/host/l2cap.c
index ed185d12fd7..58bb4cf13aa 100644
--- a/subsys/bluetooth/host/l2cap.c
+++ b/subsys/bluetooth/host/l2cap.c
@@ -3283,7 +3283,18 @@ static int bt_l2cap_dyn_chan_send(struct bt_l2cap_le_chan *le_chan, struct net_b
 	return 0;
 }
 
-int bt_l2cap_chan_send(struct bt_l2cap_chan *chan, struct net_buf *buf)
+static int bt_l2cap_chan_send_(struct bt_l2cap_chan *chan, struct net_buf *buf);
+int bt_l2cap_chan_send(struct bt_l2cap_chan *chan, struct net_buf *buf)
+{
+	int err;
+
+	k_sched_lock();
+	err = bt_l2cap_chan_send_(chan, buf);
+	k_sched_unlock();
+
+	return err;
+}
+static int bt_l2cap_chan_send_(struct bt_l2cap_chan *chan, struct net_buf *buf)
 {
 	if (!buf || !chan) {
 		return -EINVAL;
Please let me know whether this changes the behavior.
I added that change and was still able to reproduce the issue.
Could you provide a simple project to help us reproduce the issue?