FreeRTOS + NUS high-throughput application crashes the BLE stack without error

CONFIGURATION:

NRF52 + SDK15 + SD132 + FREERTOS + NUS.

I have a task sending many packets to a central. I'm seeing completely random behaviour:

- At some point (it could be seconds or minutes) the transmission stops and I start to get NRF_ERROR_RESOURCES. Next, the device is disconnected, but I don't get the BLE_GAP_EVT_DISCONNECTED or BLE_GATTC_EVT_TIMEOUT events in the BLE event handler.

- In addition, the device disappears from the scan. I have tried to restart advertising, without success.

- I'm not receiving any error on the peer, and the program is still running correctly, but it looks like the SoftDevice (SD) is not responding anymore.

- I have tried many debugging strategies and different configurations without success. The only temporary workaround was a system reset (NVIC_SystemReset) triggered by a timeout in the communication task.

Could you give me a clue about what could be happening? Could updating the SDK solve this problem?

Here are the two functions involved: ble_transmition_test is the one I'm using to reproduce the issue. I'm calling it from a low-priority FreeRTOS task.

uint32_t ble_string_send(uint8_t *data, uint16_t len)
{
	uint32_t err_code;
	uint16_t time_out = 10;
	uint16_t m_len = 0;
	
	if (len == 0)
		return 0;
	
	if (ble_state == BLE_DISCONNECTED)
		return 0;
	
	if (len > m_ble_nus_max_data_len)
		m_len = m_ble_nus_max_data_len;
	else
		m_len = len;
	do
	{
		err_code = ble_nus_data_send(&m_nus, data, &m_len, m_conn_handle);
		time_out--;
	} while ((err_code == NRF_ERROR_BUSY || err_code == NRF_ERROR_RESOURCES)  && time_out);
	
	if (err_code == NRF_SUCCESS)
		return m_len;
	else
		return 0;
}

void ble_transmition_test(void)
{
						
	uint32_t i = 0;
	while(1)
	{
		uint16_t len = sprintf((char*) ble_tx_data, "%u,%.2f,%d,[%.1f,%.1f,%.1f,%d],%d\r\n", i, (float) 20.0, 1,0.066, 1.0,2.0,10,90);
		uint8_t retrys = 100;
		while (ble_string_send(ble_tx_data, len) < len && retrys)
		{
			vTaskDelay(20);
			retrys--;
		}
		vTaskDelay(10);
		i++;
	}	
}

/**@brief Function for handling BLE events.
 *
 * @param[in]   p_ble_evt   Bluetooth stack event.
 * @param[in]   p_context   Unused.
 */
static void ble_evt_handler(ble_evt_t const * p_ble_evt, void * p_context)
{
    uint8_t err_code;
    switch (p_ble_evt->header.evt_id)
    {
        case BLE_GAP_EVT_CONNECTED:
            m_conn_handle = p_ble_evt->evt.gap_evt.conn_handle;
            err_code = nrf_ble_qwr_conn_handle_assign(&m_qwr, m_conn_handle);
            APP_ERROR_CHECK(err_code);
            ble_state = BLE_CONNECTED;
            break;

        case BLE_GAP_EVT_DISCONNECTED:
            ble_state = BLE_DISCONNECTED;
            break;

        case BLE_GAP_EVT_PHY_UPDATE_REQUEST:
        {
            NRF_LOG_DEBUG("PHY update request.");
            ble_gap_phys_t const phys =
            {
                .rx_phys = BLE_GAP_PHY_AUTO,
                .tx_phys = BLE_GAP_PHY_AUTO,
            };
            err_code = sd_ble_gap_phy_update(p_ble_evt->evt.gap_evt.conn_handle, &phys);
            APP_ERROR_CHECK(err_code);
            break;
        }

        case BLE_GATTC_EVT_TIMEOUT:
            // Disconnect on GATT Client timeout event.
            NRF_LOG_DEBUG("GATT Client Timeout.");
            err_code = sd_ble_gap_disconnect(p_ble_evt->evt.gattc_evt.conn_handle,
                                             BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION);
            APP_ERROR_CHECK(err_code);
            break;

        case BLE_GATTS_EVT_TIMEOUT:
            // Disconnect on GATT Server timeout event.
            NRF_LOG_DEBUG("GATT Server Timeout.");
            err_code = sd_ble_gap_disconnect(p_ble_evt->evt.gatts_evt.conn_handle,
                                             BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION);
            APP_ERROR_CHECK(err_code);
            break;

        case BLE_GATTS_EVT_HVN_TX_COMPLETE:
            send_wait = false;
            NRF_LOG_INFO("Send ends");
            break;

        case BLE_GAP_EVT_ADV_SET_TERMINATED:
            //command_to_hmi.type = HMI_COMMAND_SLEEP;
            break;

        default:
            // No implementation needed.
            break;
    }
}

Parents
  • Unfortunately I am not familiar with FreeRTOS. But if you are not getting a disconnect event, are you sure the code is running at all, and is not, for instance, stuck in app_error_fault_handler() or waiting indefinitely on ble_nus_data_send()? From looking at the code, that may be the case. NRF_ERROR_RESOURCES simply means that the SoftDevice buffer is full, and the application should wait for the BLE_GATTS_EVT_HVN_TX_COMPLETE event before trying again. So I suggest changing your code a bit: either ignore NRF_ERROR_RESOURCES (which means the packet will be lost), or have the application buffer the packet on NRF_ERROR_RESOURCES and try again on the BLE_GATTS_EVT_HVN_TX_COMPLETE event. You may also consider changing the connection interval (make it shorter) to allow more throughput in general. You may also try to configure NRF_SDH_CLOCK_LF_ACCURACY 1 to see if your problem is related to LFCLK accuracy of any kind.

    Kenneth
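
The buffer-and-retry approach described in this reply can be sketched roughly as follows. This is a minimal host-side illustration, not SDK code: fake_nus_send() and fake_sd_complete() are hypothetical stubs standing in for ble_nus_data_send() and for the SoftDevice freeing a notification buffer, and the queue depth of 2 is an arbitrary assumption. On the target, app_on_tx_complete() would presumably be called from the BLE_GATTS_EVT_HVN_TX_COMPLETE case of ble_evt_handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NRF_SUCCESS          0u
#define NRF_ERROR_RESOURCES  19u  /* same numeric value as in nrf_error.h */

/* --- Hypothetical stand-in for the SoftDevice notification queue --- */
#define FAKE_SD_QUEUE_DEPTH  2u   /* assumed depth, for illustration only */
static uint8_t fake_sd_queued;    /* notifications currently held by the fake stack */

/* Stub for ble_nus_data_send(): fails with NRF_ERROR_RESOURCES when full. */
uint32_t fake_nus_send(const uint8_t *data, uint16_t len)
{
    (void)data;
    (void)len;
    if (fake_sd_queued >= FAKE_SD_QUEUE_DEPTH)
        return NRF_ERROR_RESOURCES;
    fake_sd_queued++;
    return NRF_SUCCESS;
}

/* Stub for the stack finishing `count` queued notifications. */
void fake_sd_complete(uint8_t count)
{
    if (count >= fake_sd_queued)
        fake_sd_queued = 0;
    else
        fake_sd_queued = (uint8_t)(fake_sd_queued - count);
}

/* --- Application side: one-packet retry buffer --- */
static uint8_t  pending[64];
static uint16_t pending_len;      /* 0 means no packet is waiting */

bool app_has_pending(void)
{
    return pending_len != 0;
}

/* Returns true if the packet was sent or parked for a later retry,
 * false if it was rejected (empty, too long, or a packet is already parked). */
bool app_send(const uint8_t *data, uint16_t len)
{
    assert(data != NULL);
    if (len == 0 || len > sizeof pending || pending_len != 0)
        return false;

    uint32_t err = fake_nus_send(data, len);
    if (err == NRF_SUCCESS)
        return true;
    if (err == NRF_ERROR_RESOURCES) {
        memcpy(pending, data, len);   /* keep a copy instead of busy-retrying */
        pending_len = len;
        return true;
    }
    return false;
}

/* Retry the parked packet once the stack reports a free buffer. */
void app_on_tx_complete(void)
{
    if (pending_len != 0 && fake_nus_send(pending, pending_len) == NRF_SUCCESS)
        pending_len = 0;
}
```

In a FreeRTOS task, the sender would typically block (for example on a semaphore or task notification given when the event arrives) until app_has_pending() clears, rather than polling in a tight loop; that part is left out of the sketch.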

Children
  • Hi! I have also tried what you said about BLE_GATTS_EVT_HVN_TX_COMPLETE, with the same result. As I said, I can see the program running OK; it is not stuck in any fault handler.

    "You may also consider changing the connection interval (make it shorter) to allow more throughput in general. You may also try to configure NRF_SDH_CLOCK_LF_ACCURACY 1 to see if your problem is related to LFCLK accuracy of any kind."

    About that: I'm using the synthesized LF clock with this accuracy: NRF_CLOCK_LF_XTAL_ACCURACY_250_PPM. Should I change the accuracy? I'm not sure whether that could be the problem, but I have noticed that vTaskDelay (the FreeRTOS task delay) is rather imprecise.
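
For reference, the clock settings discussed here usually live in the project's sdk_config.h. The fragment below is only a sketch: the value mappings in the comments are quoted from memory of the stock SDK 15 file and should be checked against your own copy, and the values shown illustrate the suggested experiment rather than a confirmed fix.

```c
/* sdk_config.h fragment (illustrative; verify the mappings in your own file). */

/* LF clock source: 0 = RC, 1 = XTAL, 2 = SYNTH (synthesized from HFCLK). */
#define NRF_SDH_CLOCK_LF_SRC 2

/* Accuracy the link layer assumes when computing timing windows:
 * 0 = 250 ppm, 1 = 500 ppm (higher values select tighter tolerances). */
#define NRF_SDH_CLOCK_LF_ACCURACY 1
```

The general idea is that the declared accuracy should not be tighter than what the clock really delivers: a looser setting makes the stack widen its receive windows at some cost in power, while a setting that is too optimistic can cause missed connection events.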
