nrf52840 appears to stop advertising

We're having an issue where our nrf52840 appears to stop advertising and is unable to resume.

Our device is a custom PCB based on the Thingy91 (has both an nrf9160 and an nrf52840). Our firmware is based on asset_tracker_v2 and connectivity bridge.

One of the modifications we made is to wake up the nrf52840 using a DETECT signal. Here's our main:

int main(void)
{
	int err;

	// This pin is responsible for waking up the nRF52840 from system off mode which happens
	// when nrf_power_system_off(NRF_POWER) is called in power_down_handler().
	nrf_gpio_cfg_input(bt_enable_gpio.pin, NRF_GPIO_PIN_PULLUP);
	nrf_gpio_cfg_sense_set(bt_enable_gpio.pin, NRF_GPIO_PIN_SENSE_LOW);

	err = app_event_manager_init();
	if (err) {
		LOG_ERR("Application Event Manager could not be initialized, error: %d", err);
		return err;
	}

	module_set_state(MODULE_STATE_READY);

	return 0;
}

We have another GPIO pin that disables Bluetooth by sending the BLE_CTRL_DISABLE event. Here's how we set up the interrupt inside ble_handler.c:

static int bt_gpio_triggers_configure(void)
{
	int err;

	err = gpio_pin_configure_dt(&bt_disable_gpio, (GPIO_INPUT | GPIO_PULL_UP | GPIO_ACTIVE_LOW));
	if (err) {
		LOG_ERR("Failed to configure bt_disable_gpio: %d", err);
		return err;
	}

	err = gpio_pin_interrupt_configure_dt(&bt_disable_gpio, GPIO_INT_EDGE_TO_ACTIVE);
	if (err) {
		LOG_ERR("Failed to configure bt_disable_gpio interrupt: %d", err);
		return err;
	}

	gpio_init_callback(&bt_disable_gpio_cb, ble_ctrl_disable, BIT(bt_disable_gpio.pin));
	err = gpio_add_callback(bt_disable_gpio.port, &bt_disable_gpio_cb);
	if (err) {
		LOG_ERR("Failed to add bt_disable_gpio callback: %d", err);
		return err;
	}

	/* All steps succeeded; a non-void function must return a value on every path. */
	return 0;
}

This seems to work well. We've created a test that wakes up the device, connects to a phone, receives some data, goes back to sleep, and repeats every couple of minutes. That test can run for days when the device is connected to a debug probe.

However, in the field, over a longer period of time (sometimes days, sometimes weeks), the device appears to stop advertising over Bluetooth. Once this happens, the only "fix" is to reset the 52840, after which it resumes working. So far, we have only been able to reproduce this on battery-powered devices, so I suspect it might be power-related, but I'm not sure. It's very difficult to reproduce.

What's strange is that the nrf52840 still appears to be responsive. For example, we set a GPIO pin active when it wakes to "acknowledge" receipt of the DETECT signal, and we do receive that signal.

I suspect an error is occurring at some point when setting up Bluetooth advertising (bt_enable, bt_le_adv_start), so I've added an NVIC_SystemReset() call in those error paths as a band-aid, but I still want to understand the root cause. We're working on writing error codes to flash memory so we can retrieve a record of what happened, but that isn't implemented yet.
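
To make the error-record idea concrete, here is a minimal sketch of what such a record could look like, written in plain C so it can be checked on a host. It is only an illustration under stated assumptions: the names (err_log, err_log_record, the ERR_STAGE_* values), the ring size, and the magic value are all hypothetical, and the sketch does not show where the struct is stored. It assumes the struct is placed somewhere that survives a reset (for example flash or a RAM region that is not re-initialized at boot), and it uses a checksum so that garbage contents after a power loss are detected rather than trusted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERR_LOG_MAGIC   0x45524C47u /* arbitrary marker value (assumption) */
#define ERR_LOG_ENTRIES 8u          /* ring size (assumption) */

/* Hypothetical stage identifiers for the calls we suspect. */
enum err_stage {
	ERR_STAGE_BT_ENABLE = 1,
	ERR_STAGE_ADV_START = 2,
};

/* All fields are 32 bits wide so the structs contain no padding bytes. */
struct err_entry {
	uint32_t seq;   /* record number, starting at 0 */
	uint32_t stage; /* one of enum err_stage */
	int32_t code;   /* errno-style return value */
};

struct err_log {
	uint32_t magic;
	uint32_t count; /* total records written since the log was initialized */
	struct err_entry entries[ERR_LOG_ENTRIES];
	uint32_t checksum; /* FNV-1a over every byte before this field */
};

static uint32_t err_log_checksum(const struct err_log *log)
{
	const uint8_t *p = (const uint8_t *)log;
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < offsetof(struct err_log, checksum); i++) {
		h = (h ^ p[i]) * 16777619u;
	}
	return h;
}

/* Call once at boot. Returns 1 if earlier records were kept, 0 if the log was reset. */
static int err_log_init(struct err_log *log)
{
	assert(log != NULL);
	if (log->magic == ERR_LOG_MAGIC && log->checksum == err_log_checksum(log)) {
		return 1;
	}
	memset(log, 0, sizeof(*log));
	log->magic = ERR_LOG_MAGIC;
	log->checksum = err_log_checksum(log);
	return 0;
}

/* Store a record, overwriting the oldest one when the ring is full. */
static void err_log_record(struct err_log *log, uint32_t stage, int32_t code)
{
	struct err_entry *e = &log->entries[log->count % ERR_LOG_ENTRIES];

	e->seq = log->count;
	e->stage = stage;
	e->code = code;
	log->count++;
	log->checksum = err_log_checksum(log);
}

/* Most recent record, or NULL if nothing has been recorded. */
static const struct err_entry *err_log_latest(const struct err_log *log)
{
	if (log->count == 0) {
		return NULL;
	}
	return &log->entries[(log->count - 1u) % ERR_LOG_ENTRIES];
}
```

The intended use would be to call err_log_record() with the failing stage and return value just before the NVIC_SystemReset() band-aid, then call err_log_init() on the next boot and report err_log_latest() if it returns 1.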

Any ideas on what could be happening, or how to diagnose this issue?
