Duplicate and then dropped notification

Hi,

I'm working on a sensor application that sends fairly large quantities of data via BLE notifications. I understand how Bluetooth works and have read and followed the advice at:

devzone.nordicsemi.com/.../

and

devzone.nordicsemi.com/.../

I comfortably achieve the throughput I need. I also understand that notifications are acknowledged at the link layer, so I should see all notifications arriving on the central device (although possibly out of order). Anyway, I add a counter to each notification that I send, and then reassemble the stream on the central device to ensure correct packet order. While doing this I've noticed an issue: sometimes I receive the same notification twice, and then the next notification is dropped. It's not always the very next notification that is dropped, but one soon after (e.g. it could duplicate 4, deliver 5 and drop 6).

I'm sending 6 notifications per connection interval, with a 7.5 ms interval. You can see the issue in the following packet dump. Each notification's first two bytes are a counter (little-endian), and you'll see that 0x0029 is received twice, and then 0x002A is never received.

Edit: I have also attached a Wireshark packet dump from Nordic's BTLE Sniffer. I've included the entire dump, but the relevant part is packets 17798 to 17813, specifically 17811 and 17813. There are no failed deliveries, and the duplicate packet even has a different data header, so it's not an exact copy. From my application's point of view, the duplicated packet is 0x0D0D.

nrf51-dump.pcapng

[Jul 26 23:36:29.937]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:22 00 40 FF EF FF 96 00 40 FF EE FF 95 00 40 FF F0 FF 92 00 
[Jul 26 23:36:29.938]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:23 00 40 FF EF FF 93 00 
[Jul 26 23:36:30.017]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:24 00 41 FF EF FF 93 00 41 FF F1 FF 94 00 3F FF F0 FF 93 00 
[Jul 26 23:36:30.025]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:25 00 3F FF EE FF 96 00 40 FF EF FF 95 00 41 FF ED FF 94 00 
[Jul 26 23:36:30.025]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:26 00 40 FF EE FF 95 00 3E FF EF FF 95 00 41 FF EE FF 93 00 
[Jul 26 23:36:30.026]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:27 00 40 FF F0 FF 94 00 3F FF F0 FF 94 00 3F FF EE FF 92 00 
[Jul 26 23:36:30.027]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:28 00 3F FF F0 FF 93 00 40 FF EF FF 95 00 40 FF EF FF 94 00 
[Jul 26 23:36:30.027]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:29 00 41 FF ED FF 94 00 
[Jul 26 23:36:30.028]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:29 00 41 FF ED FF 94 00 
[Jul 26 23:36:30.107]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:2B 00 40 FF EF FF 93 00 42 FF F0 FF 94 00 40 FF EE FF 94 00 
[Jul 26 23:36:30.108]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:2C 00 3F FF EF FF 93 00 40 FF EF FF 92 00 40 FF EE FF 96 00 
[Jul 26 23:36:30.109]  [ATT Receive]  Handle Value Notification - Handle:0x0014 - Value:2D 00 3F FF EF FF 94 00 3F FF EE FF 94 00 41 FF ED FF 94 00 
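
A check along the following lines on the receiving side is enough to flag both the duplicate and the gap in a dump like the one above. This is only a minimal sketch: the names `seq_tracker_t`, `seq_check` and `read_counter` are made up for illustration, and it assumes the counter is the first two payload bytes in little-endian order, as in the dump.

```c
#include <stddef.h>
#include <stdint.h>

/* Outcome of comparing one received counter against the expected one. */
typedef enum { SEQ_OK, SEQ_DUPLICATE, SEQ_GAP, SEQ_MALFORMED } seq_status_t;

/* Hypothetical bookkeeping for the counter stream (not an SDK type). */
typedef struct {
    uint16_t expected;   /* next counter value we expect to see */
    uint32_t duplicates; /* notifications whose counter was already seen */
    uint32_t missing;    /* counter values that were skipped over */
} seq_tracker_t;

/* Read the little-endian 16-bit counter from the first two payload bytes. */
static uint16_t read_counter(const uint8_t *payload)
{
    return (uint16_t)(payload[0] | ((uint16_t)payload[1] << 8));
}

/* Classify one notification payload and update the tracker. */
static seq_status_t seq_check(seq_tracker_t *t, const uint8_t *payload, size_t len)
{
    if (len < 2) {
        return SEQ_MALFORMED; /* too short to carry a counter */
    }

    uint16_t counter = read_counter(payload);
    /* Modular distance from the expected value; survives 16-bit wraparound. */
    uint16_t delta = (uint16_t)(counter - t->expected);

    if (delta == 0) {
        t->expected = (uint16_t)(counter + 1);
        return SEQ_OK;
    }
    if (delta >= 0x8000u) {
        /* Counter is behind the expected value: already delivered once. */
        t->duplicates++;
        return SEQ_DUPLICATE;
    }
    /* Counter jumped ahead: delta values in between never arrived. */
    t->missing += delta;
    t->expected = (uint16_t)(counter + 1);
    return SEQ_GAP;
}
```

Fed the payloads starting at counter 0x0029 from the dump, this would report the second 0x0029 as a duplicate and the jump to 0x002B as a gap of one.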

I'm not too sure what could be causing this issue, although it seems peculiar. I've included the transmission code below, but I think it's fine.

while (true) {
	uint8_t bufferIndex = 0;
	bool hasData = true;
	
	measurement.counter = packetCounter;
	
	do {
		hasData = measurementFactory(acc, bufferIndex, &measurement.data[bufferIndex]);
		if (hasData) {
			bufferIndex++;
		}
	} while (hasData && bufferIndex < 3);
	
	// No more data to send
	if (bufferIndex == 0) {
		break;
	}

	ble_gatts_hvx_params_t params;
	uint16_t sendLength = 2 + (bufferIndex * sizeof(acc_data_point_t));
	
	params.type = BLE_GATT_HVX_NOTIFICATION;
	params.p_len = &sendLength;
	params.p_data = (unsigned char*)&measurement;
	params.handle = acc->measurementHandles.value_handle;
	params.offset = 0;

	// The SoftDevice makes a copy of params (and the buffer)
	err = sd_ble_gatts_hvx(acc->connectionHandle, &params);
	
	if (err == BLE_ERROR_NO_TX_BUFFERS ||
		err == NRF_ERROR_INVALID_STATE || 
		err == BLE_ERROR_GATTS_SYS_ATTR_MISSING)
	{
		err = NRF_SUCCESS;
		break;
	} else if (err != NRF_SUCCESS) {
		break;
	}
	
	packetCounter++;
}

Finally, here's how I set up the characteristic:

// Mainly copied/pasted from nAN-36_v1.1 example at 
// www.nordicsemi.com/.../nAN-36.zip
ble_gatts_char_md_t char_md;
ble_gatts_attr_md_t cccd_md;
ble_gatts_attr_t    attr_char_value;
ble_uuid_t          ble_uuid;
ble_gatts_attr_md_t attr_md;

memset(&cccd_md, 0, sizeof(cccd_md));

BLE_GAP_CONN_SEC_MODE_SET_OPEN(&cccd_md.read_perm);
BLE_GAP_CONN_SEC_MODE_SET_OPEN(&cccd_md.write_perm);
cccd_md.vloc = BLE_GATTS_VLOC_STACK;

memset(&char_md, 0, sizeof(char_md));

char_md.char_props.read   = 1;
char_md.char_props.notify = 1;
char_md.p_char_pf         = NULL;
char_md.p_user_desc_md    = NULL;
char_md.p_cccd_md         = &cccd_md;
char_md.p_sccd_md         = NULL;

ble_uuid.type = acc->uuidType;
ble_uuid.uuid = measurement_uuid;

memset(&attr_md, 0, sizeof(attr_md));

BLE_GAP_CONN_SEC_MODE_SET_OPEN(&attr_md.read_perm);
BLE_GAP_CONN_SEC_MODE_SET_NO_ACCESS(&attr_md.write_perm);
attr_md.vloc       = BLE_GATTS_VLOC_STACK; // SoftDevice will make a copy
attr_md.rd_auth    = 0;
attr_md.wr_auth    = 0;
attr_md.vlen       = 1; // Can send fewer than 3 samples

memset(&attr_char_value, 0, sizeof(attr_char_value));

attr_char_value.p_uuid       = &ble_uuid;
attr_char_value.p_attr_md    = &attr_md;
attr_char_value.init_len     = 0;
attr_char_value.init_offs    = 0;
attr_char_value.max_len      = 20;
attr_char_value.p_value      = NULL;

return sd_ble_gatts_characteristic_add(acc->serviceHandle, 
									   &char_md,
									   &attr_char_value,
									   &acc->measurementHandles);

Sorry for the long post, but any help would be appreciated :)

  • Agreed, that's a new packet. In the link layer you can see the sequence number flip from 1 to 0, which shows it's the next packet in the series and not a retransmit. You did capture a retransmit back at 17800; I wonder what happened there and whether it's relevant.

    I can't see a path through your code that could enqueue the same sequence number twice. The data is also identical between the two (one point, same value), which seems unlikely to happen by chance since your data changes from frame to frame, so it really does look like the same data sent twice. You only increment the sequence number if the hvx function reports success, so I don't see how you skip one either.

    Observation - the sequence which ends in a double send is 7 instead of 6 readings (although one is doubled), the next sequence is 6 again, so that seems complete. It looks like you queued 0d0e, got success, but 0d0d was sent twice and 0d0e dropped.

  • What SDK and SoftDevice are you using, by the way? It may be important. I'm fairly out of ideas. What I'd probably do at this point is get a pointer to the end of used RAM and, after the end of the while loop, start writing the value of packetCounter to that memory, creating a list of the bursts you queued up. If you hit a double, you can then see what you queued. If there's a burst in which you managed to queue 7 instead of 6, and it doubled the second-to-last and dropped the last, you may have found a bug.

  • I'm using 8.0.2 of S130 with SDK 8.0. My transmission function isn't reentrant and is called from both the main loop and the BLE_EVT_TX_COMPLETE event. Of course BLE_EVT_TX_COMPLETE is executed from an interrupt context. If transmission is interrupted with a completion event then the same packet counter can be used twice (once in the interrupt, once outside of it). I noticed that this issue only occurs when the final notification isn't full (i.e. all measurements have been consumed), which of course means a network buffer may be left free. Then, when normal execution resumes, a duplicate is sent.

    The simplest fix was to pass a scheduler into SOFTDEVICE_HANDLER_INIT so that events are no longer invoked from an interrupt context. This also made my marshalling code for other events redundant, so win-win.
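
For anyone hitting the same thing, the idea behind the fix (only one execution context ever touches the packet counter) can be sketched without the SDK. Note this is not the SDK scheduler itself: it swaps in a plain "set a flag in the handler, do the work in the main loop" pattern to show the same principle. Every name here (`fake_hvx`, `on_tx_complete`, `transmit_pending`, `tx_requested`, `free_buffers`) is a hypothetical stand-in, and `fake_hvx` only imitates a send call that fails when no buffer is free.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOG_SIZE 32u

/* Hypothetical application state; none of these are SDK names. */
static volatile bool    tx_requested = false; /* set by the event handler */
static volatile uint8_t free_buffers = 6;     /* simulated stack TX buffers */
static uint16_t packet_counter = 0;           /* touched from main context only */
static uint16_t sent_log[LOG_SIZE];           /* counters in the order queued */
static uint8_t  sent_count = 0;

/* Stand-in for the real send call: succeeds while a buffer is free. */
static bool fake_hvx(uint16_t counter)
{
    if (free_buffers == 0 || sent_count >= LOG_SIZE) {
        return false;
    }
    free_buffers--;
    sent_log[sent_count++] = counter;
    return true;
}

/* Event handler: may run in interrupt context, so it only records that
 * buffers were freed and that the main loop should transmit again. */
static void on_tx_complete(uint8_t buffers_freed)
{
    free_buffers = (uint8_t)(free_buffers + buffers_freed);
    tx_requested = true;
}

/* Called from the main loop only. Because this is the single place that
 * reads and increments packet_counter, a counter cannot be used twice. */
static void transmit_pending(void)
{
    if (!tx_requested) {
        return;
    }
    /* Clear first so a request that arrives while draining is not lost. */
    tx_requested = false;

    while (fake_hvx(packet_counter)) {
        packet_counter++;
    }
}
```

In a main loop this would look like calling `transmit_pending()` on every pass, with the application setting `tx_requested` once when new data first becomes available. The SDK scheduler achieves the same effect more generally by queueing the whole event for the main context.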
