Hello,
We are facing an issue where the sum of count in BLE_GATTS_EVT_HVN_TX_COMPLETE events (member of the ble_gatts_evt_hvn_tx_complete_t structure) doesn't match with the number of BLE notifications submitted to the SoftDevice.
Here's some info on what we're using:
NRF5 SDK: v17.0.2
Softdevice: S140
MCU: NRF52840
We implemented a custom BLE submission queue similar to the BLE GATT Queue of the SDK but with some tweeks.
Here's how we are submitting notifications to the SoftDevice:
/**
* @brief Submits a single `hvx_request` from the queue to the SoftDevice.
*
* @param[in] req Pointer to a `hvx_request`.
*
* @return @ref error_ok if the request has been successfully submitted.
* @return @ref error_busy if the SoftDevice can't currently handle the request.
* @return @ref error_internal otherwise.
*/
static error submit_request(const struct hvx_request *req);
static void process_queue() {
size_t request_count = nrf_queue_utilization_get(p_queue);
while (!is_process_ending) {
struct hvx_request req = {0};
nrf_queue_peek(p_queue, &req);
error submit_status = ok;
CRITICAL_REGION_ENTER();
submit_status = submit_request(&req);
request_count--;
if (ok == submit_status) {
nrf_atomic_u32_add(&op_count, 1);
}
if ((0 == request_count) || (error_busy == submit_status)) {
SEGGER_RTT_printf(0, "Ending process with op_count: %d\n", op_count);
nrf_atomic_flag_set(&is_process_ending);
}
CRITICAL_REGION_EXIT();
if (error_busy != submit_status) {
nrf_queue_pop(p_queue, &req);
notify_request_processed(&req, submit_status);
}
}
}
Here's how we decrease the op_count variable in the context of the BLE event callback (hvx_queue_on_ble_evt is a NRF_SDH_BLE_OBSERVER callback):
void hvx_queue_on_ble_evt(const ble_evt_t *ble_evt, void *context) {
...
uint8_t tx_count = ble_evt->evt.gatts_evt.params.hvn_tx_complete.count;
uint32_t old_count = nrf_atomic_u32_fetch_sub_hs(op_count, tx_count);
SEGGER_RTT_printf(0, "Update op_count with old_count: %d, tx_count: %d\n", old_count, tx_count);
if (old_count < tx_count) {
// We shouldn't end up in this condition
nrf_atomic_u32_store(&op_count, 0);
}
if (is_process_ending && (0 == op_count)) {
end_processing();
}
...
}
We have as a requirement, to keep track of the number of submitted notifications in order to know when they are actually sent. The is_process_ending flag is used to indicate whether all notifications have been submitted. If it's set and the op_count fall to 0, then we consider that all notifications were transmitted.
The submission of notifications to the SoftDevice are inside a critical section because we need to prevent the hvx_queue_on_ble_evt() callback to be invoked before is_process_ending is set.
To optimize the throughput, we submit notifications using sd_ble_gatts_hvx() until it returns NRF_ERROR_RESOURCES.
Here's an abstract of logs which shows the issue we're facing and it's related Wireshark capture:
Ending process with op_count: 3 Update op_count with old_count: 3, tx_count: 1 // Start of event counter N°5476 Update op_count with old_count: 2, tx_count: 1 Update op_count with old_count: 1, tx_count: 1 Ending process with op_count: 3 Update op_count with old_count: 3, tx_count: 1 Update op_count with old_count: 2, tx_count: 3 // This is the problematic line

We can see that there's 6 transmissions happening in the connection event N°5476. However, the last transmission is only ACKed by the LL in the connection event N°5477.
There's one BLE_GATTS_EVT_HVN_TX_COMPLETE event per notification except for the last 2. The last BLE_GATTS_EVT_HVN_TX_COMPLETE event contains a tx_count set to 3. This is where we face our issue since we expected tx_count = 2. Indeed, at that moment, we submitted 3 notifications and already received a BLE_GATTS_EVT_HVN_TX_COMPLETE event for the first one.
According to the Wireshark capture, there are no other notifications made on any other characteristic to explain the missmatch on tx_count.
What could be the cause of this missmatch ?
Thanks for your help.