I'm seeing a softdevice assertion failure at PC=0xa806 on an nRF52840 with S140 7.3.0, while using L2CAP.
I know the most common cause of softdevice assertion failures is interrupt/timing issues, so I have triple-checked that this is not the case. I have no critical sections that disable softdevice interrupts, and at the time of the crash only softdevice interrupts are using the reserved priorities 0, 1, or 4. These are the interrupts enabled at the time of the crash:
ERROR device > irq 0 prio 0x80
ERROR device > irq 1 prio 0x0
ERROR device > irq 3 prio 0x40
ERROR device > irq 6 prio 0x40
ERROR device > irq 11 prio 0x0
ERROR device > irq 16 prio 0x40
ERROR device > irq 17 prio 0x40
ERROR device > irq 20 prio 0x60
ERROR device > irq 22 prio 0xc0
ERROR device > irq 25 prio 0x80
ERROR device > irq 32 prio 0x20
ERROR device > irq 47 prio 0x40
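To make the list above easier to check against the reserved levels, here is a small host-side sketch that decodes it. It assumes the logged values are raw NVIC priority bytes and that the nRF52840 implements 3 priority bits (so level = raw >> 5). The names `priority_level`, `irqs_on_reserved_levels` and `LOGGED` are made up for this example and are not part of nrf-softdevice.

```rust
/// Number of implemented NVIC priority bits (assumption: 3 on the nRF52840).
const NVIC_PRIO_BITS: u8 = 3;

/// Priority levels the softdevice reserves for itself, as stated in the post.
const SD_RESERVED_LEVELS: [u8; 3] = [0, 1, 4];

/// (irq number, raw priority byte) pairs copied from the log above.
const LOGGED: [(u16, u8); 12] = [
    (0, 0x80), (1, 0x00), (3, 0x40), (6, 0x40),
    (11, 0x00), (16, 0x40), (17, 0x40), (20, 0x60),
    (22, 0xc0), (25, 0x80), (32, 0x20), (47, 0x40),
];

/// Convert a raw NVIC priority byte into a priority level.
/// Only the top NVIC_PRIO_BITS bits of the byte are significant.
fn priority_level(raw: u8) -> u8 {
    raw >> (8 - NVIC_PRIO_BITS)
}

/// Return the IRQ numbers whose priority sits on a softdevice-reserved level.
fn irqs_on_reserved_levels(entries: &[(u16, u8)]) -> Vec<u16> {
    entries
        .iter()
        .filter(|(_, raw)| SD_RESERVED_LEVELS.contains(&priority_level(*raw)))
        .map(|(irq, _)| *irq)
        .collect()
}
```

Calling `irqs_on_reserved_levels(&LOGGED)` leaves IRQs 0, 1, 11, 25 and 32 on levels 0, 1 or 4. If I read the nRF52840 interrupt table correctly, those are POWER_CLOCK, RADIO, RTC0, SWI5_EGU5 and MWU, which matches the claim that only softdevice interrupts use the reserved priorities.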
I strongly suspect an issue related to L2CAP buffer management, since the crash always follows the same pattern: an RX is started, three TXs are started, the first TX finishes, the RX finishes, and another RX is started with the same buffer as the TX that just finished. (I'm pulling RX and TX buffers from the same pool; I'm not sure whether that is relevant.) Possibly also relevant: fragmentation is in use (SDU size larger than the MPS).
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_RX 0x2002661a
INFO device > sd_ble_l2cap_ch_rx 0x20026808
INFO device > sd_ble_l2cap_ch_tx 0x20026be4 len=0x22
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_TX 0x20026be4
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > sd_ble_l2cap_ch_tx 0x2002661a len=0x24
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_TX 0x2002661a
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_RX 0x20026808
INFO device > sd_ble_l2cap_ch_rx 0x2002661a
INFO device > sd_ble_l2cap_ch_tx 0x200269f6 len=0x22
INFO device > sd_ble_l2cap_ch_tx 0x20026808 len=0x32
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_TX 0x200269f6
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_TX 0x20026808
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_RX 0x2002661a
INFO device > sd_ble_l2cap_ch_rx 0x20026808
// HERE the evil pattern starts
INFO device > sd_ble_l2cap_ch_tx 0x20026be4 len=0x1e0
INFO device > sd_ble_l2cap_ch_tx 0x2002661a len=0x1e0
INFO device > sd_ble_l2cap_ch_tx 0x200269f6 len=0x1d3
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_TX 0x20026be4
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_CREDIT
INFO device > BLE_L2CAP_EVTS_BLE_L2CAP_EVT_CH_RX 0x20026808
INFO device > sd_ble_l2cap_ch_rx 0x20026be4
ERROR device > panicked at 'Softdevice assertion failed: an assertion inside the softdevice's code has failed. Most common cause is disabling interrupts for too long. Make sure you're using nrf_softdevice::interrupt::free instead of cortex_m::interrupt::free, which disables non-softdevice interrupts only. PC=0xa806'
If I send data more slowly, so that the streak of three TXs never happens, it runs fine indefinitely.
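To rule out a double hand-out from my shared pool, the logged sequence can be replayed against a simple ownership tracker. This is only a host-side sketch (it uses `std`, so it does not run on the device as-is), and `BufferTracker`, `Owner` and `replay_logged_pattern` are hypothetical names for this example, not nrf-softdevice APIs. It assumes a buffer belongs to the softdevice from the `sd_ble_l2cap_ch_rx`/`sd_ble_l2cap_ch_tx` call until the matching `EVT_CH_RX`/`EVT_CH_TX` event.

```rust
use std::collections::HashMap;

/// Who currently owns a buffer, under the assumption stated above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Owner {
    App,
    SoftdeviceRx,
    SoftdeviceTx,
}

/// Tracks buffer ownership by address; unknown addresses count as app-owned.
#[derive(Debug, Default)]
struct BufferTracker {
    owners: HashMap<u32, Owner>,
}

impl BufferTracker {
    /// Move a buffer from `expected` to `next`, or report the mismatch.
    fn transfer(&mut self, addr: u32, expected: Owner, next: Owner) -> Result<(), String> {
        let current = *self.owners.get(&addr).unwrap_or(&Owner::App);
        if current != expected {
            return Err(format!(
                "buffer {addr:#010x}: expected {expected:?}, found {current:?}"
            ));
        }
        self.owners.insert(addr, next);
        Ok(())
    }

    /// Mirrors a sd_ble_l2cap_ch_rx call in the log.
    fn queue_rx(&mut self, addr: u32) -> Result<(), String> {
        self.transfer(addr, Owner::App, Owner::SoftdeviceRx)
    }

    /// Mirrors a sd_ble_l2cap_ch_tx call in the log.
    fn queue_tx(&mut self, addr: u32) -> Result<(), String> {
        self.transfer(addr, Owner::App, Owner::SoftdeviceTx)
    }

    /// Mirrors a BLE_L2CAP_EVT_CH_RX event in the log.
    fn rx_done(&mut self, addr: u32) -> Result<(), String> {
        self.transfer(addr, Owner::SoftdeviceRx, Owner::App)
    }

    /// Mirrors a BLE_L2CAP_EVT_CH_TX event in the log.
    fn tx_done(&mut self, addr: u32) -> Result<(), String> {
        self.transfer(addr, Owner::SoftdeviceTx, Owner::App)
    }
}

/// Replays the last seven buffer operations from the log, up to the crash.
fn replay_logged_pattern() -> Result<BufferTracker, String> {
    let mut t = BufferTracker::default();
    t.queue_rx(0x2002_6808)?; // RX is started
    t.queue_tx(0x2002_6be4)?; // three TXs are started
    t.queue_tx(0x2002_661a)?;
    t.queue_tx(0x2002_69f6)?;
    t.tx_done(0x2002_6be4)?; // the first TX finishes
    t.rx_done(0x2002_6808)?; // the RX finishes
    t.queue_rx(0x2002_6be4)?; // new RX reuses the finished TX buffer
    Ok(t)
}
```

Under this model the replay completes without an error: buffer 0x20026be4 had already been returned by `EVT_CH_TX` before it was passed to `sd_ble_l2cap_ch_rx`, so the log does not show the application handing out a buffer that the softdevice still owns. The other two TX buffers (0x2002661a and 0x200269f6) are still queued at the moment of the crash.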
This is the softdevice configuration:
let sd_config = nrf_softdevice::Config {
    clock: Some(sd::nrf_clock_lf_cfg_t {
        source: sd::NRF_CLOCK_LF_SRC_XTAL as u8,
        rc_ctiv: 0,
        rc_temp_ctiv: 0,
        accuracy: 7,
    }),
    conn_gap: Some(sd::ble_gap_conn_cfg_t {
        conn_count: 20,
        event_length: 15,
    }),
    conn_gatt: Some(sd::ble_gatt_conn_cfg_t { att_mtu: 114 }),
    conn_gattc: Some(sd::ble_gattc_conn_cfg_t {
        write_cmd_tx_queue_size: 0,
    }),
    conn_gatts: Some(sd::ble_gatts_conn_cfg_t { hvn_tx_queue_size: 0 }),
    gatts_attr_tab_size: Some(sd::ble_gatts_cfg_attr_tab_size_t { attr_tab_size: 1024 }),
    gap_role_count: Some(sd::ble_gap_cfg_role_count_t {
        adv_set_count: 1,
        periph_role_count: 4,
        central_role_count: 16,
        central_sec_count: 0,
        _bitfield_1: sd::ble_gap_cfg_role_count_t::new_bitfield_1(0),
    }),
    conn_l2cap: Some(sd::ble_l2cap_conn_cfg_t {
        ch_count: 1,
        rx_mps: 247,
        tx_mps: 247,
        rx_queue_size: 3,
        tx_queue_size: 3,
    }),
    ..Default::default()
};
The issue is also present in older softdevice versions:
7.2.0 PC=0xa822
7.0.1 PC=0xa7f6
How can I debug this? Thank you!