disconnecting while operations are in progress never gives BLE_GAP_EVT_DISCONNECTED event

I’m developing an application based on pc-ble-driver to talk to an nRF52840-based dongle (from Fanstel).

I’m having trouble disconnecting cleanly when a connection has operations in progress. For example, I call sd_ble_gattc_write, which returns NRF_SUCCESS, but I never receive BLE_GATTC_EVT_WRITE_RSP (even after waiting 60 seconds), so I decide to disconnect. sd_ble_gap_disconnect then returns NRF_SUCCESS, but I do not receive BLE_GAP_EVT_DISCONNECTED even after waiting 30 seconds, although the connection supervision timeout is 4 seconds. What could cause the disconnect to not generate any BLE_GAP_EVT_DISCONNECTED event?

What I’m trying to accomplish here: if a connection is not responsive, I want to end that connection, without disturbing other connections I have open.
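
A minimal host-side sketch of that goal: keep a deadline per connection for each outstanding request, and disconnect only the connections whose deadline has passed. Every name in it (conn_watch_t, watch_arm, watch_disarm, watch_remove, watch_expired, MAX_CONNS) is hypothetical and not part of pc-ble-driver, and the caller is assumed to supply a millisecond clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONNS   8u          /* hypothetical table size */
#define NO_DEADLINE UINT64_MAX  /* marks "no request outstanding" */

/* One slot per tracked connection (hypothetical layout). */
typedef struct {
    bool     in_use;
    uint16_t conn_handle;
    uint64_t deadline_ms;
} conn_watch_t;

static conn_watch_t g_watch[MAX_CONNS];

/* Return the slot for conn_handle, or NULL if it is not tracked. */
static conn_watch_t *watch_find(uint16_t conn_handle)
{
    for (size_t i = 0; i < MAX_CONNS; i++) {
        if (g_watch[i].in_use && g_watch[i].conn_handle == conn_handle) {
            return &g_watch[i];
        }
    }
    return NULL;
}

/* Call after a request on this connection was accepted.
 * Returns false if the table is full. */
bool watch_arm(uint16_t conn_handle, uint64_t now_ms, uint64_t timeout_ms)
{
    conn_watch_t *slot = watch_find(conn_handle);
    for (size_t i = 0; slot == NULL && i < MAX_CONNS; i++) {
        if (!g_watch[i].in_use) {
            slot = &g_watch[i];
        }
    }
    if (slot == NULL) {
        return false;
    }
    slot->in_use      = true;
    slot->conn_handle = conn_handle;
    slot->deadline_ms = now_ms + timeout_ms;
    return true;
}

/* Call when the matching response event arrives. */
void watch_disarm(uint16_t conn_handle)
{
    conn_watch_t *slot = watch_find(conn_handle);
    if (slot != NULL) {
        slot->deadline_ms = NO_DEADLINE;
    }
}

/* Call when the connection is gone, to free its slot. */
void watch_remove(uint16_t conn_handle)
{
    conn_watch_t *slot = watch_find(conn_handle);
    if (slot != NULL) {
        slot->in_use = false;
    }
}

/* Write the handles whose deadline has passed into out; return how many. */
size_t watch_expired(uint64_t now_ms, uint16_t *out, size_t out_cap)
{
    assert(out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < MAX_CONNS && n < out_cap; i++) {
        if (g_watch[i].in_use && g_watch[i].deadline_ms != NO_DEADLINE &&
            now_ms >= g_watch[i].deadline_ms) {
            out[n++] = g_watch[i].conn_handle;
        }
    }
    return n;
}
```

A caller would arm the watch after sd_ble_gattc_write returns NRF_SUCCESS, disarm it on BLE_GATTC_EVT_WRITE_RSP, and call sd_ble_gap_disconnect only for the handles that watch_expired reports. This keeps other connections untouched, but it does not by itself explain the missing BLE_GAP_EVT_DISCONNECTED.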

Thanks,

Paul Bradford

  • Hi Paul,

    Today we managed to track down a potential race condition in the connectivity firmware. When the scheduler queue gets close to full (above 75% utilization), the processing of events from the SoftDevice is suspended in order to leave room for serial events from USB, etc. This happens in ser_conn_handlers.c:191. Once the queue has been processed and utilization drops below 50%, the processing of SoftDevice events resumes (main.c:376). When the issue occurs, the suspend function appears to be called twice in a row, with a call to resume right after the second suspend:

    [00:03:16.456,695] <info> app: nrf_sdh_suspend
    [00:03:16.692,565] <info> app: nrf_sdh_suspend
    [00:03:16.692,596] <info> app: nrf_sdh_resume

    The solution we are now testing is to wrap the resume section in a critical region:

    CRITICAL_REGION_ENTER();
    if (nrf_sdh_is_suspended())
    {
        // Resume pulling new events if queue utilization drops below 50%.
        if (app_sched_queue_space_get() > (SER_CONN_SCHED_QUEUE_SIZE >> 1))
        {
            nrf_sdh_resume();
        }
    }
    CRITICAL_REGION_EXIT();

    The critical region prevents the application from processing other application events while the check and the resume call run.
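
    The suspend and resume thresholds described above can be modelled on a host machine. The following is a minimal sketch, not the connectivity firmware source: the gate_t type, the gate_* functions, and QUEUE_SIZE are hypothetical names, and the exact comparisons are assumptions based on the 75% and 50% figures.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define QUEUE_SIZE 16u  /* assumed queue depth for this model */

    /* Hypothetical model of the event gate in front of the scheduler queue. */
    typedef struct {
        uint16_t used;       /* entries currently queued */
        bool     suspended;  /* true while SoftDevice event pulling is paused */
    } gate_t;

    /* Free entries left in the modelled queue. */
    uint16_t gate_free(const gate_t *g)
    {
        return (uint16_t)(QUEUE_SIZE - g->used);
    }

    /* Model of enqueueing one event: suspend once utilization exceeds 75%,
     * i.e. fewer than a quarter of the entries are free. */
    void gate_on_enqueue(gate_t *g)
    {
        assert(g->used < QUEUE_SIZE);
        g->used++;
        if (!g->suspended && gate_free(g) < (QUEUE_SIZE >> 2)) {
            g->suspended = true;
        }
    }

    /* Model of processing one event: resume once more than half of the
     * entries are free, i.e. utilization has dropped below 50%. */
    void gate_on_dequeue(gate_t *g)
    {
        assert(g->used > 0);
        g->used--;
        if (g->suspended && gate_free(g) > (QUEUE_SIZE >> 1)) {
            g->suspended = false;
        }
    }
    ```

    The gap between the two thresholds is deliberate hysteresis: the gate does not flip on every event near a single limit. The model is single-threaded, so it shows only the intended behavior, not the race itself.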

    I have now had this workaround running with a clean connectivity FW for 3 hours without hitting the issue (it typically occurs within 5-15 minutes on my setup).

    Best regards,
    Jørgen

  • After more testing on several different computers, I'm still seeing the original problem. On one computer I don't see it but on two others I do. I also see some errors from pc-ble-driver when doing write operations to the USB serial port, including:

    Error:  serial port write operation on port COM15 failed. Error: A device which does not exist was specified.[433]

    Error:  serial port write operation on port COM15 failed. Error: The device does not recognize the command.[22]

    Error:  serial port write operation on port COM15 failed. Error: A device attached to the system is not functioning.[31]

    I'll continue testing.

  • The serial port errors may indicate that the USB device was detached from the computer, for instance by a device reset. I have seen this behavior with the approach used in the other customer case I linked earlier. I traced it to an issue with the scheduler queue size, and increasing the size seemed to solve the issue on my setup. I posted an updated hex-file in this post, which increases the scheduler queue size to 32 (from 16). If this is not enough, I also built a version with a queue size of 64: connectivity_4.1.1_usb_with_s140_6.1.1_critical_region_fix_increased_sched_queue_size64_log_enabled.hex

    This one also has UART logging enabled for errors only. If you are still able to reproduce the issue with this new hex-file, I would appreciate it if you could check the logs.

    Did you always see the original problem together with this new problem? Did you manage to reproduce this on the Nordic Dongle/DK, or only on the Fanstel dongle? I received the firmware for Fanstel and have had it running without issues for 2 hours now.
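
    The bracketed numbers in the serial port errors quoted earlier in the thread (433, 22, 31) are Windows system error codes. A host application could treat them as a hint that the port has gone away and reopen it rather than retry the write. A minimal sketch under that assumption: the function name and the SKETCH_ constants are hypothetical (the constants mirror the Windows names ERROR_BAD_COMMAND, ERROR_GEN_FAILURE, and ERROR_NO_SUCH_DEVICE so the code builds without windows.h), and whether each code really means a detach on a given system is not confirmed here.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Numeric values of the Windows error codes seen in this thread. */
    enum {
        SKETCH_ERROR_BAD_COMMAND    = 22,  /* "The device does not recognize the command." */
        SKETCH_ERROR_GEN_FAILURE    = 31,  /* "A device attached to the system is not functioning." */
        SKETCH_ERROR_NO_SUCH_DEVICE = 433  /* "A device which does not exist was specified." */
    };

    /* Return true if a failed serial write reported one of the codes above,
     * which this sketch assumes may mean the USB device was detached or reset. */
    bool port_error_suggests_detach(uint32_t win_error)
    {
        switch (win_error) {
        case SKETCH_ERROR_BAD_COMMAND:
        case SKETCH_ERROR_GEN_FAILURE:
        case SKETCH_ERROR_NO_SUCH_DEVICE:
            return true;
        default:
            return false;
        }
    }
    ```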

  • It doesn't fail on every computer. I will test the new connectivity firmware on the Fanstel dongle, the Nordic dev-dongle, and the Nordic DK.
