Error when trying to update to 2M PHY with some dongles

I have an nRF52832 running SDK 15.2 as a peripheral and have tried connecting to two different Windows PCs: a laptop with a native Bluetooth 5 adapter and a desktop PC with a Laird 851 Bluetooth dongle (also BLE 5).

On the connection event, I call sd_ble_gap_phy_update to request an update to the 2M PHY, as shown in the following code.

static void ble_stack_init( void )
{
	ret_code_t err_code;

	err_code = nrf_sdh_enable_request();
	APP_ERROR_CHECK( err_code );

	// Configure the BLE stack using the default settings.
	// Fetch the start address of the application RAM.
	uint32_t ram_start = 0;
	err_code = nrf_sdh_ble_default_cfg_set( APP_BLE_CONN_CFG_TAG, &ram_start );
	APP_ERROR_CHECK( err_code );

	// Enable BLE stack.
	err_code = nrf_sdh_ble_enable( &ram_start );
	APP_ERROR_CHECK( err_code );

	// Register a handler for BLE events.
	NRF_SDH_BLE_OBSERVER( m_ble_observer, APP_BLE_OBSERVER_PRIO, ble_evt_handler, NULL );
}

static void ble_evt_handler( ble_evt_t const * p_ble_evt, void * p_context )
{
	uint16_t conn_handle = p_ble_evt->evt.gap_evt.conn_handle;
	uint16_t role        = ble_conn_state_role( conn_handle );

	pm_handler_secure_on_connection( p_ble_evt );

	switch ( role ) {
	case BLE_GAP_ROLE_PERIPH:
		periph_mgmt_evt_handler( p_ble_evt, p_context );
		break;
	case BLE_GAP_ROLE_CENTRAL:
		central_evt_handler( p_ble_evt, p_context );
		break;
	default:
		break;
	}
}
	
void periph_mgmt_evt_handler( ble_evt_t const * p_ble_evt, void * p_context )
{
	ret_code_t err_code = NRF_SUCCESS;
	uint16_t conn_handle;

	switch ( p_ble_evt->header.evt_id ) {

	case BLE_GAP_EVT_CONNECTED: {
		NRF_LOG_INFO( "Connected." );
		conn_handle = p_ble_evt->evt.gap_evt.conn_handle;

		periph_on_connected_event( common_mgmt_get_conn( conn_handle ), conn_handle ); 

		ble_gap_phys_t const phys = {
			.rx_phys = BLE_GAP_PHY_2MBPS,
			.tx_phys = BLE_GAP_PHY_2MBPS,
		};
		err_code = sd_ble_gap_phy_update( p_ble_evt->evt.gap_evt.conn_handle, &phys );
		APP_ERROR_CHECK_NON_CRITICAL( err_code );

		adjust_advertising();
	}
	break;

With the Windows laptop this works as expected, and the connection switches to the 2M PHY. With the Laird dongle, however, the two sides get stuck in a loop in which the nRF says it prefers the 2M PHY and the PC says it prefers the coded PHY.

The whole exchange is included here. It appears that the nRF sends an LL_PHY_REQ but, for some reason, the PC answers with four identical LL_PHY_RSP packets.

The nRF sends this request, stating that it prefers 2M, as expected.

The PC answers with four identical packets (even the sequence number is the same), saying that it prefers coded (not sure why). However, they never seem to agree. I'd expect the nRF to stop sending the request, because the PC is not offering 2M.

If I don't send the request, everything works as expected, i.e. they stay on the 1M PHY and are able to communicate successfully.

Thanks!

0 Kenneth over 5 years ago

Hi,

In general, I have found that initiating exchanges directly on the connected event is not a good idea; I have seen many reports that some peers don't handle this well. They likely have internal state machines that are not ready, or that can't handle the exchanges, on the first connection event. So, if possible, start an app_timer that executes the exchange after, for instance, 100 ms or later. This doesn't really address your issue; it's more of an FYI.

The 4 identical packets are simply retransmits: you can see that the packet content and sequence numbers do not change. The PC keeps retransmitting until the peer, in this case the nRF52, sends something in the reverse direction to acknowledge that the packet was received, at which point the sequence numbers are incremented.
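As a rough illustration of the acknowledgement scheme described above, here is a minimal host-side sketch of the 1-bit SN/NESN handling that each side of a BLE connection performs. It is a simplified model, not SoftDevice code; the type and function names (ll_ack_state_t, ll_rx_result_t, ll_on_pdu_received) are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-connection acknowledgement state kept by each link-layer device.
 * Hypothetical names, used only for this illustration. */
typedef struct {
    uint8_t transmit_seq_num;      /* SN placed in the next PDU we send (1 bit)   */
    uint8_t next_expected_seq_num; /* NESN placed in the next PDU we send (1 bit) */
} ll_ack_state_t;

typedef struct {
    bool peer_pdu_is_new;   /* false: the peer retransmitted an old PDU */
    bool our_pdu_was_acked; /* false: we must retransmit our last PDU   */
} ll_rx_result_t;

/* Update the acknowledgement state for one received data channel PDU. */
ll_rx_result_t ll_on_pdu_received(ll_ack_state_t *st,
                                  uint8_t rx_sn, uint8_t rx_nesn, bool crc_ok)
{
    ll_rx_result_t res = { false, false };

    if (!crc_ok) {
        /* A PDU with a bad CRC is ignored: nothing is accepted or
         * acknowledged, so the peer keeps retransmitting it. */
        return res;
    }
    if ((rx_sn & 1u) == st->next_expected_seq_num) {
        /* New data from the peer: acknowledge it in our next PDU. */
        res.peer_pdu_is_new = true;
        st->next_expected_seq_num ^= 1u;
    }
    if ((rx_nesn & 1u) != st->transmit_seq_num) {
        /* The peer expects our next PDU, so our last one was received. */
        res.our_pdu_was_acked = true;
        st->transmit_seq_num ^= 1u;
    }
    return res;
}
```

In this model, a receiver that never sees a PDU with a valid CRC never toggles its NESN, so the sender's SN stays the same and the sender retransmits the same packet indefinitely.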

In this case it does indeed look like you have back-and-forth PHY update requests, so I suggest searching your project for every place where sd_ble_gap_phy_update() may be called; likely some module in your application causes it to be called more than once.

Which SoftDevice and version are you using here?

Best regards,
Kenneth

0 Federico over 5 years ago in reply to Kenneth
Hi Kenneth,

Thank you very much for the advice and the answer. I'm working with s132_nrf52_6.1.0_softdevice.

I can see that the identical packets are re-transmits now, that makes sense. However, I've traced all calls to sd_ble_gap_phy_update and this is the only one that is executing.

I tried the setup again and this time it behaves a little differently (not sure what changed; I'm on a different PC but everything else is the same). Now the nRF never answers. I only get a bunch of LL_PHY_RSP packets after the first and only LL_PHY_REQ. All the LL_PHY_RSP packets have the same sequence number. After 10 seconds the communication stops; it would appear that a timeout expires on the nRF because it never got the expected answer. However, if this were the case, I would expect retransmissions, as in the original post.

What event should the LL_PHY_RSP raise in the nRF? Is it possible I'm not answering and that's why it's retransmitting all the time?

The LL_PHY_RSP packets are flagged with an incorrect CRC in the sniffer. Is this relevant? Could they be malformed packets, and is that why the nRF never answers?

This link shows a few message sequence charts, according to which the Laird dongle appears to be out of spec. When the Slave initiates the PHY change (6.31 - 6.33), the Master never answers with LL_PHY_RSP; it can answer with either LL_PHY_UPDATE_IND or LL_UNKNOWN_RSP. I have tested both of these cases and they work fine: a BT5 laptop accepts and sends LL_PHY_UPDATE_IND to change to 2 Mbps, and a 4.0 dongle sends LL_UNKNOWN_RSP because it doesn't support PHY changes. Could this be the issue?
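To make the preference mismatch concrete, here is a small sketch of how a master could pick the PHY for one direction from the preference bitmasks carried in the PHY PDUs. The bit values match the PHY bits used on the air (and the BLE_GAP_PHY_* values). The function name phy_select and the "fastest common PHY" tie-break are assumptions for this example; real controllers may choose differently when several PHYs are common.

```c
#include <stdint.h>

#define PHY_1M    0x01u /* LE 1M    */
#define PHY_2M    0x02u /* LE 2M    */
#define PHY_CODED 0x04u /* LE Coded */

/* Pick one PHY for a single direction of the link.
 * sender_tx_phys:   PHYs the transmitting side prefers to send on.
 * receiver_rx_phys: PHYs the receiving side prefers to receive on.
 * Returns 0 when there is no common PHY, meaning "leave unchanged". */
uint8_t phy_select(uint8_t sender_tx_phys, uint8_t receiver_rx_phys)
{
    uint8_t common = sender_tx_phys & receiver_rx_phys;

    /* Assumed tie-break: prefer the highest data rate. */
    if (common & PHY_2M)    return PHY_2M;
    if (common & PHY_1M)    return PHY_1M;
    if (common & PHY_CODED) return PHY_CODED;
    return 0;
}
```

With one side asking only for 2M and the other only for coded, phy_select(PHY_2M, PHY_CODED) returns 0, so under this model the link would simply stay on 1M rather than loop.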

Thank you very much for your time.

Edit: all LL_PHY_RSP packets have the exact same (incorrect, according to Wireshark) CRC. This would seem to rule out interference and point to a systematic error in either transmission or reception. Also, the LL_PHY_REQ packets that the nRF keeps sending are retransmits as well (they have the same sequence number; the SN and NESN seem to indicate that the nRF is not processing the LL_PHY_RSP).

This is the nRF log while this is happening:
[00:00:13.761,962] <debug> nrf_ble_gatt: Requesting to update ATT MTU to 247 bytes on connection 0x1.
[00:00:13.761,993] <debug> nrf_ble_gatt: Updating data length to 251 on connection 0x1.
[00:00:13.762,207] <info> app: Connected.
[00:00:13.762,268] <info> app: Restarted advertising.
[00:00:14.014,770] <debug> nrf_ble_gatt: Data length updated to 251 on connection 0x1.
[00:00:14.014,770] <debug> nrf_ble_gatt: max_rx_octets: 251
[00:00:14.014,801] <debug> nrf_ble_gatt: max_tx_octets: 251
[00:00:14.014,801] <debug> nrf_ble_gatt: max_rx_time: 2120
[00:00:14.014,801] <debug> nrf_ble_gatt: max_tx_time: 2120
[00:00:14.014,831] <debug> nrf_ble_gatt: Data length updated to 251 on connection 0x1.
[00:00:14.014,862] <debug> nrf_ble_gatt: max_rx_octets: 251
[00:00:14.014,862] <debug> nrf_ble_gatt: max_tx_octets: 251
[00:00:14.014,862] <debug> nrf_ble_gatt: max_rx_time: 2120
[00:00:14.014,892] <debug> nrf_ble_gatt: max_tx_time: 2120
[00:00:14.014,892] <info> app: Data length updated to 251 bytes.
[00:00:14.074,707] <debug> nrf_ble_gatt: Peer on connection 0x1 requested a data length of 251 bytes.
[00:00:14.074,737] <debug> nrf_ble_gatt: Updating data length to 0 on connection 0x1.
[00:00:14.074,768] <debug> nrf_ble_gatt: Peer on connection 0x1 requested a data length of 251 bytes.
[00:00:14.074,798] <debug> nrf_ble_gatt: Updating data length to 251 on connection 0x1.
[00:00:23.795,196] <info> app: Disconnected.
[00:00:23.795,227] <info> app: Restarted advertising.

0 Kenneth over 5 years ago in reply to Federico

From a quick look at the screenshot, it looks like the peripheral just stops responding. Are you sure you are not hitting an assert? Maybe check app_error_fault_handler() for any error.

Kenneth

0 Federico over 5 years ago in reply to Kenneth

That would log an error to the debug console, right? It would also stop execution at the NRF_BREAKPOINT_COND at the end of app_error_fault_handler, or reset the nRF. I can't see any of these things happening; execution seems to continue normally. What other condition could cause the nRF to stop transmitting altogether?

0 Kenneth over 5 years ago in reply to Federico

What ppm tolerance have you set for the 32 kHz LFCLK when enabling the BLE SoftDevice? Can you try increasing it (e.g. 500 ppm as a test)?

Kenneth

0 Federico over 5 years ago in reply to Kenneth

I was using the default NRF_SDH_CLOCK_LF_ACCURACY value of 7, which corresponds to NRF_CLOCK_LF_ACCURACY_20_PPM. I changed it to 1, which should be NRF_CLOCK_LF_ACCURACY_500_PPM.
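For context on why the declared accuracy matters: the peripheral widens its receive window around each expected anchor point in proportion to the combined sleep clock accuracy of both devices. The sketch below computes that widening using the formula from the Bluetooth Core Specification; the function name window_widening_us and the 50 ppm central accuracy used in the note after it are hypothetical, since the dongle's actual value is not known from this thread.

```c
#include <stdint.h>

/* Extra time (in microseconds) the peripheral listens on each side of the
 * expected anchor point, given both sleep clock accuracies in ppm:
 *   widening = (central_ppm + peripheral_ppm) / 1e6 * time_since_last_anchor */
uint32_t window_widening_us(uint32_t central_sca_ppm,
                            uint32_t peripheral_sca_ppm,
                            uint32_t time_since_last_anchor_us)
{
    /* 64-bit intermediate so the multiplication cannot overflow. */
    uint64_t total_ppm = (uint64_t)central_sca_ppm + peripheral_sca_ppm;
    return (uint32_t)((total_ppm * time_since_last_anchor_us) / 1000000u);
}
```

Assuming a 50 ppm central and one second since the last anchor point, a 20 ppm setting gives 70 us of widening while a 500 ppm setting gives 550 us, so the larger value makes the peripheral more tolerant of clock drift at the cost of more listening time.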

I got the same result: the nRF aborts transmission anyway and I'm still getting the wrong CRC on the sniffer. At this point the only two clues to go on are the wrong CRC on the sniffer and the logged errors:

[00:00:39.677,825] <debug> nrf_ble_gatt: Peer on connection 0x2 requested an ATT MTU of 527 bytes.
[00:00:39.677,856] <debug> nrf_ble_gatt: Updating ATT MTU to 0 bytes (desired: 0) on connection 0x2.
[00:00:39.677,886] <error> nrf_ble_gatt: sd_ble_gatts_exchange_mtu_reply() returned NRF_ERROR_INVALID_PARAM.

But they might not be related at all.
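On the logged error itself: the server RX MTU passed to sd_ble_gatts_exchange_mtu_reply() must be at least the default ATT MTU of 23 bytes and no larger than the ATT MTU configured for the connection, so a reply of 0 is rejected. The sketch below is a standalone illustration of that range check and of the resulting effective MTU; the helper names are made up for this example, and it does not explain why the desired MTU was 0 on connection 0x2.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATT_MTU_MIN 23u /* default (and minimum) ATT MTU on LE */

/* Hypothetical helper: would this server RX MTU be accepted in an
 * Exchange MTU reply, given the maximum configured for the connection? */
bool att_server_mtu_is_valid(uint16_t server_rx_mtu, uint16_t configured_max_mtu)
{
    return server_rx_mtu >= ATT_MTU_MIN && server_rx_mtu <= configured_max_mtu;
}

/* Hypothetical helper: the MTU both sides end up using is the smaller of
 * the two RX MTUs, never below the default. */
uint16_t att_effective_mtu(uint16_t client_rx_mtu, uint16_t server_rx_mtu)
{
    uint16_t m = (client_rx_mtu < server_rx_mtu) ? client_rx_mtu : server_rx_mtu;
    return (m < ATT_MTU_MIN) ? (uint16_t)ATT_MTU_MIN : m;
}
```

With the values from the logs, a peer request of 527 against a desired MTU of 247 would yield an effective MTU of 247, whereas a desired MTU of 0 fails the range check, which is consistent with the NRF_ERROR_INVALID_PARAM shown above.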

0 Kenneth over 5 years ago in reply to Federico

Have you updated to the latest Windows 10 build and drivers?

I found these threads, which seem to indicate that Bluetooth is problematic on this specific Dell model (Dell XPS 13 9350 laptop). They contain some suggestions you can try:
https://www.dell.com/community/Networking-Internet-Bluetooth/Bluetooth-not-working-on-new-XPS-13-9350-Windows-10/td-p/4760065/page/3

https://www.dell.com/community/Networking-Internet-Bluetooth/Windows-10-Bluetooth-Problem/td-p/4626667/page/3

Also, try commenting out pm_handler_secure_on_connection() and/or pm_conn_secure(), if you have them in main.c, to see whether that makes a difference.

One final comment: try setting SEC_PARAM_LESC to 0, in case the laptop can't handle this bit.

Kenneth