l2cap_data_pull hard fault on disconnect with queued tx data

When using L2CAP to stream data at a high rate from the device to an app, we are seeing a hard fault in l2cap_data_pull whenever L2CAP send data is still queued at disconnect.

When the fault occurs in l2cap_data_pull, conn is in the disconnected state. It appears that the pdu->data pointer is 0. net_buf_push then moves pdu->data back by sizeof(*hdr), which results in 0xfffffffc. Dereferencing that address on the second line below causes the hard fault:

    hdr = net_buf_push(pdu, sizeof(*hdr));
    hdr->len = sys_cpu_to_le16(pdu_len);

We are using nRF Connect SDK 2.9.0 on an nRF5340.
Should there be a check for conn being in the disconnected state? Or for pdu->data == NULL? Or is this a race condition?
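  • Could you try wrapping the call to bt_l2cap_dyn_chan_send in k_sched_lock()/k_sched_unlock() as a debugging step? This makes the calling thread temporarily cooperative, so other threads cannot preempt it during the send. The attached diff does this by wrapping bt_l2cap_chan_send and can be applied to NCS v2.9.0.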

    #!/usr/bin/env -S git apply
    # Attachment to Nordic Devzone https://devzone.nordicsemi.com/f/nordic-q-a/118471/l2cap_data_pull-hard-fault-on-disconnect-with-queued-tx-data
    diff --git a/subsys/bluetooth/host/l2cap.c b/subsys/bluetooth/host/l2cap.c
    index ed185d12fd7..58bb4cf13aa 100644
    --- a/subsys/bluetooth/host/l2cap.c
    +++ b/subsys/bluetooth/host/l2cap.c
    @@ -3283,7 +3283,18 @@ static int bt_l2cap_dyn_chan_send(struct bt_l2cap_le_chan *le_chan, struct net_b
     	return 0;
     }
    
    -int bt_l2cap_chan_send(struct bt_l2cap_chan *chan, struct net_buf *buf)
    +static int bt_l2cap_chan_send_(struct bt_l2cap_chan *chan, struct net_buf *buf);
    +int bt_l2cap_chan_send(struct bt_l2cap_chan *chan, struct net_buf *buf)
    +{
    +	int err;
    +
    +	k_sched_lock();
    +	err = bt_l2cap_chan_send_(chan, buf);
    +	k_sched_unlock();
    +
    +	return err;
    +}
    +static int bt_l2cap_chan_send_(struct bt_l2cap_chan *chan, struct net_buf *buf)
     {
     	if (!buf || !chan) {
     		return -EINVAL;
    

    Please let me know whether this makes the fault go away.

  • I added that change and was still able to reproduce the issue.

  • Could you provide a simple project to help us reproduce the issue?
