Possible reasons for disconnect reason 8

Hi,

I have an application with custom boards using the nRF54L15 (running NCS 2.9.1) as the peripheral and, currently, a modified nRF9160DK (running NCS 2.5.1) as the central, which connects to multiple peripherals at a time. I am seeing quite a few disconnects in unattended field tests, and I would like to understand whether there is a way to diagnose and fix them. I have already started logging more metrics to get more context, and this is what I have so far.

# Connection parameters
CONFIG_BT_PERIPHERAL_PREF_MIN_INT=400
CONFIG_BT_PERIPHERAL_PREF_MAX_INT=400
CONFIG_BT_PERIPHERAL_PREF_LATENCY=2
CONFIG_BT_PERIPHERAL_PREF_TIMEOUT=1000
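
For reference, these Kconfig values are in Bluetooth units rather than milliseconds: the interval is in steps of 1.25 ms and the timeout in steps of 10 ms. Below is a minimal sketch of the conversion in plain C. The helper names are hypothetical and not part of the NCS API. It also checks the Bluetooth Core Specification rule that the supervision timeout must be larger than (1 + latency) * interval * 2:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not NCS/Zephyr APIs. */

/* Connection interval: the Kconfig value is in units of 1.25 ms (1250 us). */
static inline uint32_t conn_interval_us(uint32_t units)
{
    return units * 1250U;
}

/* Supervision timeout: the Kconfig value is in units of 10 ms (10000 us). */
static inline uint32_t supervision_timeout_us(uint32_t units)
{
    return units * 10000U;
}

/* Lower bound from the spec: timeout must be > (1 + latency) * interval * 2. */
static inline uint32_t min_supervision_timeout_us(uint32_t latency,
                                                  uint32_t max_int_units)
{
    return (1U + latency) * conn_interval_us(max_int_units) * 2U;
}

/* True when the requested timeout satisfies the lower bound above. */
static inline bool timeout_is_valid(uint32_t timeout_units, uint32_t latency,
                                    uint32_t max_int_units)
{
    return supervision_timeout_us(timeout_units) >
           min_supervision_timeout_us(latency, max_int_units);
}

/* Whole connection events that fit into one supervision timeout. */
static inline uint32_t events_per_timeout(uint32_t timeout_units,
                                          uint32_t int_units)
{
    return supervision_timeout_us(timeout_units) / conn_interval_us(int_units);
}
```

With the values above, this works out to a 500 ms connection interval and a 10 s supervision timeout, which is above the 3 s lower bound for a latency of 2. About 20 consecutive connection events would have to be missed before the link is dropped.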

There are no resets on either the peripheral or the central side, and the RSSI is fairly stable: for the device that is furthest away, the 10th percentile sits at a steady -80 dBm. I know this is on the low end, but I assume it should still be high enough for a stable connection. In this case, the central has 4 peripherals connected to it. The disconnects occur on all peripherals, but at widely varying (random) intervals.

Could you give me any pointers on what could be going wrong and how I can try to identify it? It is hard for me to get a sniffer trace because this field test runs unattended at a public site, the disconnects occur at random, and it can take some time before they start happening.

Best,

Wout

  • Hi

    Can you upload the full log of the connection process, the connection itself, and the disconnect as a .txt file so we can review it? A single line only shows us that the parameters were set as expected at one point; it doesn't mean they will stay that way. Also, how long after this connection parameter update does the disconnection occur? If nothing is received on the link for 10 seconds, the connection supervision timeout will trigger and disconnect the devices.

    Best regards,

    Simon

  • Hi Simon,

    Nothing specific is logged once the connection is set up. The peripheral just sends sensor data (a payload of around 100 bytes) every second. The disconnections occur randomly, sometimes within a minute, sometimes only after 24+ hours.

    Here is a log from the connection process (with a peripheral that was just started up, hence the reset reason).

    [00:00:11.579,681] <inf> ble: Scanning ..
    [00:00:11.827,819] <inf> ble: Device found: DE:56:FE:56:A4:A9 (random), RSSI: -70
    [00:00:12.331,939] <inf> sensor: sensor_connect: Connecting to sensor with address DE:56:FE:56:A4:A9 (random)
    [00:00:14.683,898] <inf> pss_client: Reset reason received: 4096
    [00:00:14.684,326] <inf> sensor: Sensor with address DE:56:FE:56:A4:A9 (random) has UUID 996b1211dbecbdfd, FW version 0.7.2, reset reason 4096
    [00:00:14.783,813] <inf> sensor: Time synchronized for sensor 996b1211dbecbdfd
    [00:00:14.784,851] <inf> gateway: sensor_connected: Connected sensors = 1
    [00:00:14.885,620] <inf> ble: Scanning ..
    [00:00:17.683,746] <inf> ble: Connection parameters updated: interval 500.00 ms, latency 2 intervals, timeout 10000 ms

    I don't have a full log of a disconnect at hand, but I can tell you that the disconnect reason is recorded in the on_disconnected callback on the nRF52840 of the nRF9160DK and forwarded to the nRF9160, which includes it in a periodic update message to our cloud infrastructure. That's how I can see how often and when it happens without having actual logs.

    I'll try to get some logs from a disconnect event, but it may take some time until I get it.

    In the meantime, I wanted to check whether there are specific things I should log in the application to provide as much information as possible. My thinking is that the main reasons for the peripheral not responding within the timeout would be a reset on the peripheral or a low-quality connection, which should show up as low RSSI. I'm monitoring both factors, and they both seem fine. Are there other things I should monitor or investigate?

    Best,

    Wout
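
Since the only evidence so far is the numeric reason forwarded from the on_disconnected callback, it may help to decode it before it goes into the cloud message. The value is an HCI error code from the Bluetooth Core Specification; reason 8 (0x08) is Connection Timeout, which Zephyr exposes as BT_HCI_ERR_CONN_TIMEOUT. Below is a minimal sketch of such a lookup in plain C. It covers only a handful of common codes, and the function names are hypothetical, not part of the NCS API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: maps a subset of HCI error codes (as listed in the
 * Bluetooth Core Specification) to short descriptions for logging. */
static inline const char *disconnect_reason_str(uint8_t reason)
{
    switch (reason) {
    case 0x08: return "connection timeout (supervision timeout expired)";
    case 0x13: return "remote user terminated connection";
    case 0x16: return "connection terminated by local host";
    case 0x22: return "LL response timeout";
    case 0x28: return "instant passed";
    case 0x3B: return "unacceptable connection parameters";
    case 0x3D: return "connection terminated due to MIC failure";
    case 0x3E: return "connection failed to be established";
    default:   return "other (not in this table)";
    }
}

/* Hypothetical helper: true for the reason discussed in this thread. */
static inline bool is_supervision_timeout(uint8_t reason)
{
    return reason == 0x08;
}
```

Reason 8 means that the side reporting it received no valid packet for the whole supervision timeout. It shows that the link went silent, not why it did.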
