Connection timeout at central

Hi,

I'm using the example "ble_central/ble_app_uart_c" from nRF5_SDK_17.0.2. The connection to the peripheral works fine.

If the central restarts while a connection is established, the peripheral gets a "disconnect" event and waits for a new connection.

If the peripheral restarts while a connection is established, the central gets no "disconnect" event and runs into a fatal error.

So, how do I set up the central to handle a disconnect caused by a timeout?

Thanks for helping!

  • Hello,

    I'm using the example "ble_central/ble_app_uart_c" of nRF5_SDK_17.0.2.

    Have you made any changes to the central example? If so, what changes have you made?

    the central gets no "disconnect" event and runs into a fatal error.

    That sounds strange - the central should resume scanning for a peripheral device when a link is broken.
    Could you make sure that DEBUG is defined in your preprocessor defines, as shown in the included image?

    With DEBUG defined, a detailed error message is written to your logger whenever an error code other than NRF_SUCCESS is passed to APP_ERROR_CHECK. Please do this, and let me know what the error message says when the device resets.

    Looking forward to resolving this issue together!

    Best regards,
    Karl

  • Hello Karl,

    thank you for your quick response. First, the error message:

    <info> app_timer: RTC: initialized.
    <info> app: BLE UART central example started.
    <info> app: Connecting to target 340D3750ACE0
    <info> app: ATT MTU exchange completed.
    <info> app: Ble NUS max data length set to 0xF4(244)
    <info> app: Discovery complete.
    <info> app: Connected to device with Nordic UART Service.
    <error> nrf_ble_gq: SD GATT procedure (1) failed on connection handle 0 with error: 0x00000013.
    <error> app: ERROR 19 [NRF_ERROR_RESOURCES] at C:\Projekte\MyCompany\0288_TT_BLE_Modul\nRF5_SDK_17.0.2_d674dde\examples\ble_central\ble_app_uart_c\main.c:123
    PC at: 0x0002F57D
    <error> app: End of error report

    I have made some "small" changes in main.c:

    • set ECHOBACK_BLE_UART_DATA to 0
    • in void uart_event_handle(app_uart_evt_t * p_event):
      Check the connection state before sending data:
          if (m_ble_nus_c.conn_handle != BLE_CONN_HANDLE_INVALID)
          {
              switch (p_event->evt_type)
              {
                  case APP_UART_DATA_READY:
                      ...
              }
          }

    Here is some background information:

    The BLE module is connected to our µC (STM32) via UART. The STM32 sends a heartbeat every 250 ms via UART -> BLE to the peripheral, independent of the BLE connection state. If no connection is established, the central runs into a fatal error. Therefore we check the connection handle before sending the data via BLE.

  • Hello Karl,

    sorry for the late response.

    The initialization and enabling of the BLE stack does not take very long, and the nRF52833 can still receive over UART regardless of the BLE stack status. Which error is generated, and from which function, when the STM32 sends data too soon?

    In the debug build I get the following error message:

    Line 305 of main.c is in the following function:

    void uart_event_handle(app_uart_evt_t * p_event)
    {
        static uint8_t data_array[BLE_NUS_MAX_DATA_LEN];
        static uint16_t index = 0;
        uint32_t ret_val;
    
        switch (p_event->evt_type)
        {
            /**@snippet [Handling data from UART] */
            case APP_UART_DATA_READY:
                ...
                break;
    
            /**@snippet [Handling data from UART] */
            case APP_UART_COMMUNICATION_ERROR:
                index = 0;
                NRF_LOG_ERROR("Communication error occurred while handling UART.");
    
    /* line 305 */
                APP_ERROR_HANDLER(p_event->data.error_communication);
    
                break;
    
            case APP_UART_FIFO_ERROR:
                index = 0;
                NRF_LOG_ERROR("Error occurred in FIFO module used by UART.");
                APP_ERROR_HANDLER(p_event->data.error_code);
                break;
    
            default:
                break;
        }
    }
    

    When I stop the debugger before entering the APP_ERROR_HANDLER, the call stack changes to

    Since the error occurs while function "ble_stack_init()" is running, I suspected that the problem was in the area of initialization.

    I am sorry, but I do not understand what you mean to say here fully. Are you saying that you do not see the DISCONNECTED event when a APP_ERROR_HANDLER execution has just finished?

    If I reset the peripheral while there is a BLE connection and the cyclic heartbeat is being sent, I get the following error:

    Since we use the default APP_ERROR_HANDLER, the application stops and has to be reset! How should we handle this error?

    I do not understand what you mean by this. Why can you simply not increase the queue, if this is what is causing your NRF_ERROR_RESOURCES?

    The project we are working on is a safety application. The only message sent via BLE is a heartbeat (every 250 ms). If this heartbeat does not occur, the machine must be switched off immediately.

    If there is a corresponding error, the Central no longer sends a heartbeat and thus wants to force shutdown of the machine.

    If we handle messages in a queue, heartbeats could accumulate in the event of a disturbed transmission and lead to a delayed shutdown. That would be unacceptable!

    Kind regards,
    Andi

  • Hello Andi,

    Andi_Frueh said:
    sorry for the late response.

    No problem at all - we continue whenever you have the time, no worries.

    Andi_Frueh said:
    Line 305 of main.c is in the following function:

    Exactly which APP_ERROR_CHECK is at line 305?
    Could it be that you have uninitialized the UART peripheral while not in a connection, without having disabled the related UART functionality in the application, or similar?

    Andi_Frueh said:
    If I reset the peripheral while there is a BLE connection and the cyclic heartbeat is being sent, I get the following error:

    Which function returned the NRF_ERROR_RESOURCES passed to the APP_ERROR_CHECK on line 119?

    Andi_Frueh said:
    Since we use the default APP_ERROR_HANDLER, the application stops and has to be reset! How should we handle this error?

    You could implement specific error handling for specific errors. The default error handling treats every error as fatal, no matter which error has occurred: the device is reset (or, in a debug build, halted until it is reset). This is not always necessary; it depends on your application. For example, when NRF_ERROR_RESOURCES is returned by the sd_ble_gatts_hvx call, it only means that the queue of notifications to send is already full. That is not a good reason to reset the application.
    In this case, a more fitting way to handle the error would be to note which data was not successfully queued and try to queue it again later, after having received an hvn tx event (the notification-successfully-sent event). This way you do not restart the device unnecessarily, and you do not lose data that could not be queued immediately.
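
As a rough illustration of the "specific error handling for specific errors" idea above, the sketch below maps the return code of a send call to an action instead of passing every code to APP_ERROR_CHECK. The enum, the function name and the SKETCH_ constants are hypothetical stand-ins so that the example compiles without the SDK; only the value 19 for NRF_ERROR_RESOURCES is taken from the log in this thread, the other values are assumptions that should be replaced by the constants from the SDK headers.

```c
#include <stdint.h>

/* Local stand-ins so the sketch compiles without the SDK. In a real project
 * the codes come from the SDK headers; the numeric values are assumptions,
 * apart from 19, which matches the NRF_ERROR_RESOURCES log line above. */
#define SKETCH_SUCCESS            0u
#define SKETCH_ERR_INVALID_STATE  8u
#define SKETCH_ERR_RESOURCES      19u
#define SKETCH_ERR_INVALID_CONN   0x3002u

/* What the application should do with a send result (hypothetical names). */
typedef enum {
    SEND_DONE,        /* data was queued */
    SEND_RETRY_LATER, /* queue full: keep the data, retry after a TX-complete event */
    SEND_DROP,        /* link not usable right now: drop the data, do not reset */
    SEND_FATAL        /* unexpected error: pass on to the fatal error handler */
} send_action_t;

send_action_t classify_send_result(uint32_t err_code)
{
    switch (err_code) {
        case SKETCH_SUCCESS:
            return SEND_DONE;
        case SKETCH_ERR_RESOURCES:
            return SEND_RETRY_LATER;
        case SKETCH_ERR_INVALID_STATE: /* fall through */
        case SKETCH_ERR_INVALID_CONN:
            return SEND_DROP;
        default:
            return SEND_FATAL;
    }
}
```

Only SEND_FATAL would then be forwarded to the fatal error handler; which codes belong in which group is a design decision for the application.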

    Andi_Frueh said:

    The project we are working on is a safety application. The only message sent via BLE is a heartbeat (every 250 ms). If this heartbeat does not occur, the machine must be switched off immediately.

    If there is a corresponding error, the Central no longer sends a heartbeat and thus wants to force shutdown of the machine.

    If we handle messages in a queue, heartbeats could accumulate in the event of a disturbed transmission and lead to a delayed shutdown. That would be unacceptable!

    Thank you for elaborating on the requirements and constraints of the project - this makes it much easier for me to understand your issues and help you resolve them.

    Best regards,
    Karl

  • Hi Karl,

    Exactly which APP_ERROR_CHECK is at line 305?

    Could it be that you have uninitialized the UART peripheral while not in a connection, without having disabled the related UART functionality in the application, or similar?

    Since the call stack shows that execution was still inside ble_stack_init(), the UART had already been initialized at that point:

    int main(void)
    {
        // Initialize.
        log_init();
        timer_init();
        uart_init();
        buttons_leds_init();
        db_discovery_init();
        power_management_init();
        ble_stack_init();
        gatt_init();
        nus_c_init();
        scan_init();
    
        // Start execution.
        printf("BLE UART central example started.\r\n");
        NRF_LOG_INFO("BLE UART central example started.");
        scan_start();
    
        // Enter main loop.
        for (;;)
        {
            idle_state_handle();
        }
    }
    

    Which function returned the NRF_ERROR_RESOURCES passed to the APP_ERROR_CHECK on line 119?

    In nrf_ble_gq.c, the function sd_ble_gattc_write returns NRF_ERROR_RESOURCES.

    Can you work with this information?

    Best regards,
    Andi

  • Hello again, Andi

    Andi_Frueh said:
    In the debug build I get the following error message:

    Thank you for clarifying.

    There is an issue with the logging of the UART communication error in the default error handler. It assumes that the error code is a standard nRF error code, but the error code you get with an APP_UART_COMMUNICATION_ERROR is actually the content of the ERRORSRC register. So the value 1 is not NRF_ERROR_SVC_HANDLER_MISSING, but an overrun error: the least significant bit of ERRORSRC is the OVERRUN field ("A start bit is received while the previous data still lies in RXD").

    It would seem that for some reason you are not able to process the data fast enough in some cases. Perhaps you get an interrupt at a "bad time", or something else. The best way to handle this is probably to use flow control, if possible.
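
To make the ERRORSRC interpretation above concrete, here is a minimal sketch that turns the value delivered with APP_UART_COMMUNICATION_ERROR into readable flag names instead of treating it as an nRF error code. The function name is hypothetical. Only the OVERRUN bit (bit 0) is confirmed in this thread; the PARITY, FRAMING and BREAK bit positions are assumptions and should be checked against the product specification of the chip in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ERRORSRC_OVERRUN (1u << 0) /* confirmed in this thread */
#define ERRORSRC_PARITY  (1u << 1) /* assumed bit position */
#define ERRORSRC_FRAMING (1u << 2) /* assumed bit position */
#define ERRORSRC_BREAK   (1u << 3) /* assumed bit position */

/* Writes the names of all set flags into buf, separated by spaces, and
 * returns buf. An empty string means that no known flag is set. */
const char *uart_errorsrc_to_str(uint32_t errorsrc, char *buf, size_t len)
{
    static const struct { uint32_t mask; const char *name; } flags[] = {
        { ERRORSRC_OVERRUN, "OVERRUN" },
        { ERRORSRC_PARITY,  "PARITY"  },
        { ERRORSRC_FRAMING, "FRAMING" },
        { ERRORSRC_BREAK,   "BREAK"   },
    };
    size_t used = 0;

    if (len == 0) {
        return buf;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if ((errorsrc & flags[i].mask) == 0) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break; /* output truncated: stop appending */
        }
        used += (size_t)n;
    }
    return buf;
}
```

With the value 1 reported in this thread, the buffer would hold "OVERRUN", which could be logged in the APP_UART_COMMUNICATION_ERROR case before deciding how to react.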

    Andi_Frueh said:

    In nrf_ble_gq.c, the function sd_ble_gattc_write returns NRF_ERROR_RESOURCES.

    Can you work with this information?

    You can look up a returned error code in the function's API documentation to see why that particular error was returned.
    In the case of sd_ble_gattc_write, the description for the NRF_ERROR_RESOURCES reads:

    NRF_ERROR_RESOURCES Too many writes without responses queued. Wait for a BLE_GATTC_EVT_WRITE_CMD_TX_COMPLETE event and retry.

    It seems to me that you are queueing the writes faster than they are being written, leading to an overflow of the queued writes buffer.
    Please read the note section of the function's API documentation to see how you can use the CMD_TX_COMPLETE event to queue more writes without response.
    How fast are you queueing writes, compared to how fast they are being sent? What connection interval are you using?

    Best regards,
    Karl

  • Hello again,

    APP_UART_COMMUNICATION_ERROR

    When this error occurs, I can see in the call stack that the application's initialization has not finished yet. The application is still executing ble_stack_init(). Within this function, nrf_sdh_enable_request() is called, where a critical section is entered and exited. When this critical section is exited, UARTE0_UART0_IRQHandler() is called immediately, which ends in the APP_UART_COMMUNICATION_ERROR.

    Even if the dedicated STM32 sends its 250ms heartbeat via UART while the Nordic chip is starting its firmware (start debug session): how can it be that an overflow occurs immediately at this point every time? How can I prevent this overflow?

    It seems to me that you are queueing the writes faster than they are being written, leading to an overflow of the queued writes buffer.
    This is quite possible. The central needs approx. 4 seconds to detect that the connection to the peripheral has been broken. During this time the heartbeat continues to be sent every 250 ms.
    Is it possible that the interplay between the
    • heartbeat cycle (250ms),
    • the duration for detecting the disconnection (approx. 4 Sec.) and
    • the size of the message queue (?)

    has been selected as "unfavorable"? Can I adjust disconnection time?
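
The interplay asked about above can be put into numbers with a small worst-case estimate: how many heartbeats arrive while the link is already dead but the supervision timeout has not yet expired. This is only a sketch; the function names are hypothetical, and the queue depth is left as a parameter because the thread does not state its size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of heartbeats produced between the moment the link breaks and the
 * moment the supervision timeout expires (worst case, rounded up).
 * A period of 0 is treated as "no heartbeats" to avoid a division by zero. */
uint32_t heartbeats_before_disconnect(uint32_t supervision_timeout_ms,
                                      uint32_t heartbeat_period_ms)
{
    if (heartbeat_period_ms == 0u) {
        return 0u;
    }
    return (supervision_timeout_ms + heartbeat_period_ms - 1u) / heartbeat_period_ms;
}

/* True if more heartbeats can pile up than the write queue has slots.
 * queue_slots is an assumption supplied by the caller. */
bool queue_can_overflow(uint32_t supervision_timeout_ms,
                        uint32_t heartbeat_period_ms,
                        uint32_t queue_slots)
{
    return heartbeats_before_disconnect(supervision_timeout_ms,
                                        heartbeat_period_ms) > queue_slots;
}
```

With the values from this thread, a 4000 ms timeout and a 250 ms heartbeat give 16 heartbeats that cannot leave the device, while a 1000 ms timeout gives 4.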

  • Hi Karl,

    we ran some tests to understand the interaction of

    • heartbeat cycle
    • disconnection time and
    • queue size.
    We found the place where the disconnection time can be set. If we reduce this time (from 4000 ms to 1000 ms), the disconnection is recognized before the queue overflows.
    Unfortunately, we have not found the place where the size of the queue can be adjusted. Can this also be found in sdk_config.h?
  • Hello again,

    the other issue, the APP_UART_COMMUNICATION_ERROR during initialization, also seems to be fixed (by a workaround): we changed the order of the initialization sequence and moved uart_init() after ble_stack_init().

    With this change, we no longer get an APP_UART_COMMUNICATION_ERROR.

    During initialization, the UART interrupts are already enabled although the BLE stack has not yet been initialized. This apparently leads to the error.

    Is there a better solution than just changing the initialization order?

  • Hello again Andi,

    Thank you for your patience.

    Andi_Frueh said:
    We found the place where the disconnection time can be set. If we reduce this time (from 4000 ms to 1000 ms), the disconnection is recognized before the queue overflows.

    Yes, the connection timeout can be adjusted to fit the specific application.
    You should also consider the environment the device will be working in when configuring this, because packet loss and corruption happen more frequently in a noisy 2.4 GHz environment, or when the peripheral and central move behind obstacles and further away from each other. What connection interval are you using currently?
    The connection interval determines how often communication is scheduled between the two devices. So if, for example, you need your central to stop a machine as fast as possible when a heartbeat is not received, you can likely set the connection timeout even lower if you are using a short connection interval, such as 7.5 ms.

    Andi_Frueh said:
    Unfortunately we have not found the place where to adjust the size of the queue. Can this also be found in the sdk_config?

    By the size of the queue, do you mean the UARTE RX buffer or the hvn_tx_queue_size?

    Andi_Frueh said:
    Is there a better solution than just changing the initialization order?

    Is it possible for you to use flow control?
    This should eliminate the issue altogether.
    Alternatively, perhaps your STM32 could wait until it receives some kind of go-ahead signal before it starts sending the heartbeats. This could, for example, be a start command sent over UART.

    Best regards,
    Karl

  • Good Morning Karl,

    What connection interval are you using currently?

    If I'm right, the connection interval is defined in sdk_config.h as 7.5...30 ms. So we can decrease the disconnection timeout, as you suggested.

    #define NRF_BLE_SCAN_MIN_CONNECTION_INTERVAL   7.5
    #define NRF_BLE_SCAN_MAX_CONNECTION_INTERVAL    30
    #define NRF_BLE_SCAN_SUPERVISION_TIMEOUT      4000

    By the size of the queue, do you mean the UARTE RX buffer or the hvn_tx_queue_size?

    In main.c the UART buffer size is defined as 256 bytes. Increasing or decreasing this size has no effect on the disconnection detection. If the connection is broken, it takes around 4 seconds, and the central runs into the error handler before it detects the disconnection.

    Is it possible for you to use flow control?

    Using flow control is not desired. Is there a way for the application to disable/enable the UART interrupts without changing SDK code? If so, we could disable the UART IRQ during the initialization phase.

    You could implement specific error handling for specific errors.

    What would be adequate error handling for

    1. APP_UART_COMMUNICATION_ERROR during initialization
    2. NRF_ERROR_RESOURCES while sending cyclic data over a broken connection

    Regards,
    Andi

  • Hello Andi,

    Andi_Frueh said:
    If I'm right, the connection interval is defined in sdk_config.h as 7.5...30 ms. So we can decrease the disconnection timeout, as you suggested.

    Great! Yes, you could also set the two values to be equal, which makes that value the only possible connection interval. As long as this is fine with the peer (the central always decides, but the peripheral may be configured to disconnect if it does not support the interval), this will always be the connection interval used. It also gives you better predictability of how often the messages will come in.
    It also seems to me that you are still using the 4000 ms connection supervision timeout here. Is this intentional, or will you be lowering it later on?

    Andi_Frueh said:
    In main.c the UART buffer size is defined as 256 bytes. Increasing or decreasing this size has no effect on the disconnection detection. If the connection is broken, it takes around 4 seconds, and the central runs into the error handler before it detects the disconnection.

    Does this mean that it takes around 4 seconds for the UART buffer to overflow?
    I ask because the 4 s connection timeout by itself will not cause the application to enter the error handler: a disconnection (for whatever reason) is not an application-layer error. It is expected that a link may be broken at any time.
    If you decrease the connection supervision timeout to something much smaller, say 200 ms (with a 7.5 ms connection interval this still gives you ~26 connection events in which to get your packets through before disconnecting), you would get the disconnected event much sooner.
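
The arithmetic behind the "~26 connection events" figure can be sketched as follows. The helper names are hypothetical. The unit sizes (1.25 ms per connection interval unit, 10 ms per supervision timeout unit), the 100 ms to 32 s timeout range and the rule that the timeout must exceed (1 + latency) * interval * 2 are stated here from my understanding of the Bluetooth Core Specification and should be verified against it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Connection interval in 1.25 ms units; the input is in microseconds so that
 * 7.5 ms can be expressed without floating point. */
uint16_t conn_interval_us_to_units(uint32_t interval_us)
{
    return (uint16_t)(interval_us / 1250u);
}

/* Supervision timeout in 10 ms units. */
uint16_t sup_timeout_ms_to_units(uint32_t timeout_ms)
{
    return (uint16_t)(timeout_ms / 10u);
}

/* Whole connection events that fit into the supervision timeout. */
uint32_t conn_events_in_timeout(uint32_t timeout_ms, uint32_t interval_us)
{
    if (interval_us == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)timeout_ms * 1000u) / interval_us);
}

/* Assumed validity rule: 100 ms <= timeout <= 32 s, and the timeout must be
 * larger than (1 + latency) * max interval * 2. */
bool sup_timeout_is_valid(uint32_t timeout_ms, uint32_t interval_max_us,
                          uint16_t slave_latency)
{
    uint64_t min_us = (uint64_t)(1u + slave_latency) * interval_max_us * 2u;

    return timeout_ms >= 100u && timeout_ms <= 32000u &&
           (uint64_t)timeout_ms * 1000u > min_us;
}
```

For a 200 ms timeout, a 7.5 ms interval gives 26 whole connection events, whereas the 30 ms upper bound from sdk_config.h would leave only 6, which is one reason to fix the interval before lowering the timeout.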

    Andi_Frueh said:

    What would be adequate error handling for

    1. APP_UART_COMMUNICATION_ERROR during initialization

    This depends a little, but you could likely just ignore it during initialization, unless this means that you are missing out on some critical or unique information. For example, you could ignore it until the UART peripheral is ready, clear the RX buffer, and then begin normal operation.

    Do your heartbeats contain any information, such as a timestamp or status?
    If they do not, you could, for example, just discard whatever is received during startup and wait for the next heartbeat.
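
One way to picture the "ignore it until the UART is ready" suggestion is a small gate in front of the UART event handling. This is a sketch under assumptions: the event and action enums are local stand-ins rather than the SDK's app_uart types, and uart_mark_ready() is a hypothetical function that the application would call once initialization has finished and the RX buffer has been cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the UART event types (not the SDK definitions). */
typedef enum {
    UART_EVT_DATA_READY,
    UART_EVT_COMM_ERROR,
    UART_EVT_FIFO_ERROR
} uart_evt_t;

/* What the handler should do with an event (hypothetical names). */
typedef enum {
    UART_ACT_PROCESS, /* normal data handling */
    UART_ACT_DISCARD, /* startup phase: drop data and errors silently */
    UART_ACT_REPORT   /* log or handle the error, without a reset */
} uart_action_t;

/* Set once initialization is complete; volatile because it would be read
 * from interrupt context in real firmware. */
static volatile bool m_uart_ready = false;

void uart_mark_ready(void)
{
    m_uart_ready = true;
}

uart_action_t uart_event_action(uart_evt_t evt)
{
    if (!m_uart_ready) {
        return UART_ACT_DISCARD;
    }
    switch (evt) {
        case UART_EVT_DATA_READY:
            return UART_ACT_PROCESS;
        case UART_EVT_COMM_ERROR: /* fall through */
        case UART_EVT_FIFO_ERROR:
        default:
            return UART_ACT_REPORT;
    }
}
```

This only fits if, as asked above, a heartbeat lost during startup carries no unique information.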

    Andi_Frueh said:
    NRF_ERROR_RESOURCES while sending cyclic data with a broken connection

    This could also just be ignored in the case that you do not have any additional information contained in the heartbeat signal.
    For example, if your hvn tx queue size is 1, only a single heartbeat can be queued for sending at any one time. If the heartbeats come in every 250 ms and you have a supervision timeout of < 250 ms, you will keep trying to send the same heartbeat notification until either it goes through (and you are then ready for the next heartbeat) or the connection is declared broken when the supervision timeout runs out.
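
On the application side, the same "only one heartbeat in flight" idea could be modelled as a single-slot buffer in which a newer heartbeat overwrites an unsent older one, so heartbeats cannot accumulate and be delivered late. The sketch below is an illustration only: all names and the payload limit are hypothetical, the caller is expected to do the actual send (for the central in this thread, a write followed by the BLE_GATTC_EVT_WRITE_CMD_TX_COMPLETE event mentioned earlier), and real firmware would also have to consider which interrupt priorities access the slot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEARTBEAT_MAX_LEN 20u /* hypothetical payload limit */

typedef struct {
    uint8_t  data[HEARTBEAT_MAX_LEN];
    uint16_t len;
    bool     pending; /* true while a heartbeat is waiting to be sent */
} heartbeat_slot_t;

/* Store the newest heartbeat, overwriting an older one that was never sent.
 * Returns false if the payload does not fit. */
bool heartbeat_slot_put(heartbeat_slot_t *slot, const uint8_t *data, uint16_t len)
{
    if (len > HEARTBEAT_MAX_LEN) {
        return false;
    }
    memcpy(slot->data, data, len);
    slot->len = len;
    slot->pending = true;
    return true;
}

/* Returns the pending heartbeat and its length, or NULL if nothing is
 * pending. Call after a put and again on every TX-complete event. */
const uint8_t *heartbeat_slot_peek(const heartbeat_slot_t *slot, uint16_t *len)
{
    if (!slot->pending) {
        return NULL;
    }
    *len = slot->len;
    return slot->data;
}

/* Call once the send was accepted, or on disconnect so that a stale
 * heartbeat is never sent after a reconnect. */
void heartbeat_slot_mark_sent(heartbeat_slot_t *slot)
{
    slot->pending = false;
    slot->len = 0u;
}
```

Because the slot is overwritten rather than appended to, at most one heartbeat, and always the newest one, waits for transmission, which addresses the concern about delayed shutdown through accumulated messages.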

    The important part for the error handling in both of these cases is that you do not use the default error handling, since this is to reset the device, which here would terminate the link and do no good.

    Best regards,
    Karl
