
BLE multilink ATT timeout problem occurs when the number of connections increases

I have recently been using BLE in Zephyr to implement a multilink central.
However, none of the references I found implement a similar function, including the Nordic Zephyr code:
https://github.com/nrfconnect/sdk-nrf  <- This repository only has examples of multilink peripherals.
So I wrote my own. My goal is one central connected to 30 peripherals.
With 4 peripherals the connections are fully stable, but when I connect more peripherals, the following errors appear:

<err> bt_att: ATT Timeout
<wrn> bt_att: No ATT channel for MTU 5
<wrn> bt_att: No pending ATT request

The picture below shows the error when connecting 20 peripherals

So far I know that changing the connection interval does help, but it only makes <err> bt_att: ATT Timeout happen later.

How can I avoid this problem so that I can connect 30 peripherals stably?

My code is attached below, or you can download it from https://github.com/mfinmuch/zephyr-ble-mulrilink-test

3225.multilink central.rar

Thanks,

Poyi

  • Hello
    I changed the following parameters to:

    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
    CONFIG_MAIN_STACK_SIZE=4096
    CONFIG_BT_RX_STACK_SIZE=4096

    Then the error below appeared. What is the cause of this?
    <wrn> bt_conn: Disconnected while allocating context

    After changing these three parameters to 8192, the above error no longer appeared.

    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=8192
    CONFIG_MAIN_STACK_SIZE=8192
    CONFIG_BT_RX_STACK_SIZE=8192

    But now there is a Tx Buffer Overflow error, as shown below:

    [00:36:55.371,032] <err> bt_ctlr_hci: Tx Buffer Overflow
    [00:36:55.371,063] <wrn> bt_hci_core: Data buffer overflow (link type 0x01)
    [00:36:55.371,063] <err> bt_conn: Unable to send to driver (err -55)

    It did not stop my data transmission, but it does not look right.
    For my TX buffer settings, I used the officially documented maximum values, as follows:

    CONFIG_BT_CONN_TX_MAX=18
    CONFIG_BT_L2CAP_TX_BUF_COUNT=18
    CONFIG_BT_CTLR_RX_BUFFERS=18
    CONFIG_BT_CTLR_TX_BUFFERS=18
    CONFIG_BT_L2CAP_TX_MTU=247
    CONFIG_BT_L2CAP_RX_MTU=247
    CONFIG_BT_CTLR_TX_BUFFER_SIZE=251
    CONFIG_BT_CTLR_DATA_LENGTH_MAX=251
    CONFIG_BT_RX_BUF_LEN=258

    How can I modify it to prevent the Data buffer overflow error from happening again?

    Thanks.

    Poyi

  • Hi Poyi, 
    Please try increasing one stack size at a time.
    I would suggest increasing CONFIG_BT_RX_STACK_SIZE first. The current default value in your project is 1024, correct? Please try increasing it to 2048 first.

    Next, try CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE. Please try 4096. I don't think CONFIG_MAIN_STACK_SIZE needs to be increased.
    Please make sure you make all the BLE API calls from a work queue or from the main thread.

    If you don't plan to send large data packets, please set CONFIG_BT_L2CAP_RX_MTU and CONFIG_BT_CTLR_DATA_LENGTH_MAX to match your packet size.

    Please let me know at how many peripherals you start to see the "ATT Timeout" error. What is the data traffic? How often do the peripherals send notifications? What is the data size of each notification?

    Please also try testing with a larger connection interval and with slave latency.
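
    To make the last suggestion concrete, here is a minimal sketch in plain C (not Zephyr code) of how the connection interval, slave latency and supervision timeout constrain each other. The struct and function names are hypothetical. The units (1.25 ms for the interval, 10 ms for the timeout) and the rule that the supervision timeout must exceed (1 + latency) * interval * 2 are taken from the Bluetooth LE specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for connection parameters, in link-layer units. */
struct conn_params {
    uint16_t interval_max; /* units of 1.25 ms, valid range 6..3200 */
    uint16_t latency;      /* connection events the peripheral may skip, 0..499 */
    uint16_t timeout;      /* supervision timeout, units of 10 ms, 10..3200 */
};

/* Convert a connection interval from milliseconds to 1.25 ms units. */
static uint16_t interval_ms_to_units(uint32_t ms)
{
    assert(ms <= 4000U);
    return (uint16_t)((ms * 4U) / 5U);
}

/* Return true if the parameters are in range and mutually consistent. */
static bool conn_params_valid(const struct conn_params *p)
{
    if (p->interval_max < 6U || p->interval_max > 3200U) {
        return false;
    }
    if (p->latency > 499U) {
        return false;
    }
    if (p->timeout < 10U || p->timeout > 3200U) {
        return false;
    }
    /*
     * timeout_ms > (1 + latency) * interval_ms * 2
     * With timeout_ms = timeout * 10 and interval_ms = interval * 1.25
     * this reduces to: timeout * 4 > (1 + latency) * interval.
     */
    return ((uint32_t)p->timeout * 4U) >
           ((1U + (uint32_t)p->latency) * (uint32_t)p->interval_max);
}
```

    For example, a 500 ms interval with a slave latency of 4 needs a supervision timeout above 5 s, so { interval_ms_to_units(500), 4, 400 } (a 4 s timeout) is rejected, while the same parameters with a 6 s timeout (600) pass.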

  • Do you mean that I should first set

    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096

    and then set
    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
    CONFIG_BT_RX_STACK_SIZE=4096

    Is that right?

    In addition, these are currently the smallest values I can set:

    CONFIG_BT_CTLR_DATA_LENGTH_MAX=32
    CONFIG_BT_L2CAP_TX_MTU=65
    CONFIG_BT_L2CAP_RX_MTU=65

    I don't know whether this matches my packet size.

    Currently my central only uses bt_gatt_write_without_response,
    which is called in cmd_write.
    I moved it to a work queue as follows:

    /* Work handler: calls cmd_write() from the system work queue. */
    void write_work_handler(struct k_work *work)
    {
        printk("test now_conn %d\n", now_conn);
        cmd_write(service_handle, now_conn);
    }
    K_WORK_DEFINE(write_work, write_work_handler);

    ATT Timeout appears some time after all peripherals are connected, and the time at which it appears is random.

    I don't really understand the data traffic you mentioned. I thought that after the central announces the connection interval, the peripherals would automatically pick an interval value that determines how often they send.

    Five of my peripherals send every 100 ms and the rest send every 1 second; each transmission is 8 bytes.

    static uint8_t test[8];
    
    rc = bt_gatt_notify(NULL, &hrs_svc.attrs[1], &test, sizeof(test));

    test is the 8-byte array that my peripheral wants to send
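
    As a rough estimate of the total traffic this produces at the central, here is a small plain C sketch. The struct and function names are hypothetical, and it only counts notification payload, ignoring ATT, L2CAP and link-layer overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Aggregate notification load seen by the central (payload only). */
struct traffic {
    uint32_t notifications_per_s;
    uint32_t payload_bytes_per_s;
};

/*
 * Estimate the load when `fast_peripherals` send every `fast_period_ms`
 * and the remaining peripherals send every `slow_period_ms`.
 */
static struct traffic estimate_traffic(uint32_t total_peripherals,
                                       uint32_t fast_peripherals,
                                       uint32_t fast_period_ms,
                                       uint32_t slow_period_ms,
                                       uint32_t payload_bytes)
{
    assert(fast_peripherals <= total_peripherals);
    assert(fast_period_ms > 0U && slow_period_ms > 0U);

    uint32_t slow_peripherals = total_peripherals - fast_peripherals;
    struct traffic t;

    t.notifications_per_s = fast_peripherals * 1000U / fast_period_ms
                          + slow_peripherals * 1000U / slow_period_ms;
    t.payload_bytes_per_s = t.notifications_per_s * payload_bytes;
    return t;
}
```

    With 20 peripherals, estimate_traffic(20, 5, 100, 1000, 8) gives 65 notifications and 520 payload bytes per second; with 30 peripherals it gives 75 notifications and 600 bytes per second.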

    Also, is there a good solution to the Tx Buffer Overflow error mentioned above?

    [00:36:55.371,032] <err> bt_ctlr_hci: Tx Buffer Overflow
    [00:36:55.371,063] <wrn> bt_hci_core: Data buffer overflow (link type 0x01)
    [00:36:55.371,063] <err> bt_conn: Unable to send to driver (err -55)

    Or is it related to my CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE and CONFIG_BT_RX_STACK_SIZE?

    Thanks,

    Poyi

  • Hi Poyi, 

    Please clarify: do you still see the error when you change these to:

    CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
    CONFIG_BT_RX_STACK_SIZE=4096

    Note that this suggestion was based on the MPU (memory protection unit) fault, which suggested there was an issue with the memory/stack.

    You may consider increasing CONFIG_BT_L2CAP_TX_BUF_COUNT. If you have 20 peripherals and you try to send a write command to all 20 of them at once, you may pass the number of 18 TX buffers, that is, need more buffers than are available.
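
    A simplified way to picture this is a fixed pool of buffers where each queued write holds one buffer until it completes. The plain C model below is only a sketch under that assumption; the names are hypothetical and it is not how the Zephyr stack is implemented internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a fixed pool of TX buffers. */
struct tx_pool {
    uint32_t capacity; /* e.g. 18, as in the configuration above */
    uint32_t in_use;   /* buffers held by writes not yet completed */
};

/* Try to reserve one buffer for a write; fails when the pool is empty. */
static bool tx_pool_take(struct tx_pool *pool)
{
    if (pool->in_use >= pool->capacity) {
        return false;
    }
    pool->in_use++;
    return true;
}

/* Return one buffer to the pool once its write has completed. */
static void tx_pool_give(struct tx_pool *pool)
{
    assert(pool->in_use > 0U);
    pool->in_use--;
}

/* Queue `writes` writes back to back; return how many found no buffer. */
static uint32_t send_burst(struct tx_pool *pool, uint32_t writes)
{
    uint32_t refused = 0U;

    for (uint32_t i = 0U; i < writes; i++) {
        if (!tx_pool_take(pool)) {
            refused++;
        }
    }
    return refused;
}
```

    With a capacity of 18, a burst of 20 writes leaves 2 without a buffer until earlier writes complete. In an application, the equivalent of tx_pool_give() would presumably go in the completion callback passed to bt_gatt_write_without_response_cb(), so that new writes are only queued when a buffer is free; that mapping is an assumption and has not been tested here.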

    Your issue seems similar to this ongoing report: https://github.com/zephyrproject-rtos/zephyr/issues/30378

    It could be the same issue.

    Do you have a sniffer? If you can sniff the whole activity, we can check exactly which action caused the timeout.

    My concern is that when you have 20 connections and each connection has an interval of 90 ms, there is only 4.5 ms available for each connection. If the scheduler cannot schedule all the connections well enough, you will have packet drops. I would suggest switching the devices that send a packet every second to a 1 second interval instead of 90 ms, or at least to a 500 ms interval.
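
    The arithmetic behind the 4.5 ms figure can be sketched in plain C as follows. The function names are hypothetical, and the model simply divides the interval evenly between connections, which is an approximation of what a real scheduler does.

```c
#include <assert.h>
#include <stdint.h>

/* Time available per connection, in microseconds, if the connection
 * interval is shared evenly between all connections. */
static uint32_t slot_budget_us(uint32_t interval_ms, uint32_t connections)
{
    assert(connections > 0U);
    return interval_ms * 1000U / connections;
}

/* Smallest interval, in milliseconds (rounded up), that still gives
 * every connection at least `budget_us` microseconds. */
static uint32_t min_interval_ms(uint32_t connections, uint32_t budget_us)
{
    return (connections * budget_us + 999U) / 1000U;
}
```

    slot_budget_us(90, 20) gives 4500 us and slot_budget_us(90, 30) gives only 3000 us, while slot_budget_us(500, 20) gives 25000 us. Keeping the same 4.5 ms per connection with 30 peripherals would need min_interval_ms(30, 4500), which is 135 ms.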

  • Hello

    I don't quite understand what you mean by:

    you may pass the number of 18 tx buffer . 

    It is true that my problem is very similar to that report, but no one has provided a good solution there yet, which troubles me a lot.
    The problem now is that I keep sending, and after a while Tx Buffer Overflow appears on the central.

    <err> bt_ctlr_hci: Tx Buffer Overflow

    When this error occurs, my central continues to send data, but some peripherals no longer receive the bt_gatt_write_without_response sent by the central, although they stay connected to it.

    In this case, it seems that there is no clear solution

    Is 500ms your estimated value, or is there an algorithm?

    I will try 500 ms and 1 s, but in my situation, the faster I can transmit and receive data, the better.

    Thanks,

    Poyi
