Choice of controller configurations in BLE Throughput sample

I am referring to the ncs\v3.1.0\nrf\samples\bluetooth\throughput\sysbuild\ipc_radio\prj.conf for the nRF5340 and extrapolating that to a central application with multiple peripheral connections. In that case, RAM use on the network core becomes a concern. 

1. Why does the sample have such a large heap size (CONFIG_HEAP_MEM_POOL_SIZE=8192)? I am not aware of any k_malloc() use in the Throughput sample and ipc_radio.

* Yes, we can use a "minimal.conf", but why put such a large default of 8192 in the Throughput prj.conf?

2. Why doesn't the sample suggest adjusting other configurations, notably the CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT, CONFIG_BT_CTLR_SDC_RX_PACKET_COUNT, and CONFIG_BT_BUF_ACL_TX_COUNT? Couldn't these influence throughput?

* See my case 296354, where increasing CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT influenced keeping the More Data (MD) bit set.

3. In that case 296354, there was a statement:
"There is no need to maintain any ratio or relationship between CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT and CONFIG_BT_BUF_ACL_TX_COUNT. Also, the latter is only used by the Zephyr LL. (Generally, the difference between BT_BUF_ACL_TX_COUNT and BT_CTLR_SDC_TX_PACKET_COUNT is that the latter is per connection while the former is shared. This does not matter for a single connection, though.)"

That was for an old NCS, 2.0. I believe CONFIG_BT_BUF_ACL_TX_COUNT is indeed used by the SDC in newer NCS versions, such as NCS 2.6 and beyond. Is that correct?

  • Hello, I think I can answer your questions.

    1. Because the IPC/RPMsg/virtqueue system on the nRF5340 network core uses dynamic allocation, even if the throughput sample itself never calls k_malloc().
    The 8 KB heap is simply a safe default to ensure that RPMsg, libmetal, and virtqueue buffers always have enough memory.

    2. Because the sample is meant to run out-of-the-box and demonstrate throughput, not serve as a tuning guide.
    However, those parameters absolutely do affect throughput and memory usage, especially for multiple connections, and they should be tuned in real applications.

    3. Yes, in newer NCS releases, BT_BUF_ACL_TX_COUNT is also used when the SoftDevice Controller is enabled.
    The old statement (“only used by Zephyr LL”) is no longer correct for modern NCS.

  • Thanks for your reply.

    I spent quite a bit of time looking into #1 thanks to what you pointed out. However, I found that the system heap needs no more than CONFIG_HEAP_MEM_POOL_SIZE=704 on the network core, which covers the worst case for either the central or the peripheral role in the Throughput sample. In my opinion, the default 8 KB heap is overkill and wasteful, at least for the Throughput sample, when that RAM could instead go toward larger BLE buffer sizes and counts for multiple connections.

    I used NCS 2.6.4 to investigate. I enabled CONFIG_SYS_HEAP_RUNTIME_STATS on both cores and added several debug printk() calls.

    One thing I noticed is that two heaps are in use on both cores.

    1. The _SYSTEM_HEAP of size CONFIG_HEAP_MEM_POOL_SIZE in ncs\v2.6.4\zephyr\kernel\mempool.c.

    K_HEAP_DEFINE(_system_heap, CONFIG_HEAP_MEM_POOL_SIZE);
    #define _SYSTEM_HEAP (&_system_heap)
    

    2. The z_malloc_heap of size HEAP_SIZE in ncs\v2.6.4\zephyr\lib\libc\common\source\stdlib\malloc.c.

    For our case of z_malloc_heap:

    #   define USED_RAM_END_ADDR   POINTER_TO_UINT(&_end)
    /*
     * No partition, heap can just start wherever _end is, with
     * suitable alignment
     */
    #   define HEAP_BASE	ROUND_UP(USED_RAM_END_ADDR, HEAP_ALIGN)
    
    #   define HEAP_SIZE	ROUND_DOWN((RAM_SIZE -	\
    		((size_t) HEAP_BASE - (size_t) RAM_ADDR)), HEAP_ALIGN)
    


    That is, the malloc heap begins wherever the used RAM ends (rounded up to HEAP_ALIGN), and it extends to the end of RAM.
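To make the HEAP_BASE/HEAP_SIZE arithmetic above concrete, here is a minimal sketch in plain C. It only mirrors the rounding logic of the macros; the function names and any addresses you pass in are hypothetical and are not taken from the Zephyr sources or from a real linker map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
uintptr_t round_up(uintptr_t x, uintptr_t align)
{
	assert(align != 0U);
	return ((x + align - 1U) / align) * align;
}

/* Round x down to the previous multiple of align (align must be non-zero). */
uintptr_t round_down(uintptr_t x, uintptr_t align)
{
	assert(align != 0U);
	return (x / align) * align;
}

/*
 * Size of the malloc arena in the "no partition" case: it starts at the
 * aligned end of used RAM and runs to the end of RAM.
 * All parameters are illustrative inputs, not values read from a build.
 */
size_t malloc_arena_size(uintptr_t ram_addr, size_t ram_size,
			 uintptr_t used_ram_end, uintptr_t heap_align)
{
	uintptr_t heap_base = round_up(used_ram_end, heap_align);

	assert(heap_base >= ram_addr);
	assert(heap_base - ram_addr <= ram_size);

	return (size_t)round_down(ram_size - (heap_base - ram_addr), heap_align);
}
```

For example, with a hypothetical 64 KB RAM region starting at 0x21000000, used RAM ending at 0x21008123, and 8-byte alignment, the arena would start at 0x21008128 and span 32472 bytes. The point is that this heap takes whatever RAM is left over, so it does not compete with CONFIG_HEAP_MEM_POOL_SIZE for a fixed reservation.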

    k_malloc() uses the _SYSTEM_HEAP, while malloc() uses the z_malloc_heap.

    For any heap allocation, both heaps eventually call sys_heap_alloc() in C:\ncs\v2.6.4\zephyr\lib\os\heap.c, which calls increase_allocated_bytes() if CONFIG_SYS_HEAP_RUNTIME_STATS=y.

    In increase_allocated_bytes(), I added this printk() call at the end:

    	printk("increase_allocated_bytes(): allocated %zu, free %zu, max allocated %zu, pHeapStruct=%p\n",
    		h->allocated_bytes, h->free_bytes,
    		h->max_allocated_bytes, (void*)h);

    The pHeapStruct above tells me which heap is being used for the allocation. I recorded the address of both heap structures at startup in mempool.c/k_thread_system_pool_assign() and malloc.c/malloc_prepare(). One of those two heap structures will show up in the print of pHeapStruct when increase_allocated_bytes() is called.

    At bootup, the network core in the Throughput sample makes just two allocations from the _SYSTEM_HEAP, each of size 312 bytes (316 bytes aligned): one for the rx vring and one for the tx vring.

    static int vq_setup(struct ipc_static_vrings *vr, unsigned int role)
    {
    	vr->vq[RPMSG_VQ_0] = virtqueue_allocate(vr->vring_size);
    	if (vr->vq[RPMSG_VQ_0] == NULL) {
    		return -ENOMEM;
    	}
    
    	vr->vq[RPMSG_VQ_1] = virtqueue_allocate(vr->vring_size);
    	if (vr->vq[RPMSG_VQ_1] == NULL) {
    		return -ENOMEM;
    	}

    	/* ... (rest of vq_setup() not shown in this post) ... */
    }

    No further allocations occur on the network core when you run the throughput test, and malloc() allocations never occur.

    Those two allocations use 316*2=632 bytes. I verified that IPC init fails with CONFIG_HEAP_MEM_POOL_SIZE=512 (<err> hci_ipc: IPC service instance initialization failed: -12), and that the sample runs the test successfully on both sides with CONFIG_HEAP_MEM_POOL_SIZE=704. It can probably go even lower. Regardless, 8192 seems to be overkill.
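The sizing argument above can be written down as a small check. This is only a sketch of the arithmetic from this post (two allocations of 316 bytes each); the function names are hypothetical. It gives a lower bound only, because the heap also spends some bytes on its own bookkeeping, which is why a pool of exactly 632 bytes is not guaranteed to work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Total payload bytes requested from the heap by `count` equal allocations. */
size_t heap_payload_demand(size_t alloc_bytes, size_t count)
{
	return alloc_bytes * count;
}

/*
 * Necessary (not sufficient) condition for a pool size to work:
 * the pool must at least cover the payload demand. Heap metadata
 * overhead is deliberately not modelled here.
 */
bool pool_covers_payload(size_t pool_size, size_t payload_demand)
{
	return pool_size >= payload_demand;
}
```

With the numbers measured above, the demand is 632 bytes: a 512-byte pool cannot cover it, which is consistent with the -12 (ENOMEM) failure, while 704 leaves a 72-byte margin for heap bookkeeping.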

    As an aside, the app core also made those same two vring allocations from the system heap. The central app core is the only image that made an allocation other than the two vrings. That occurred in user_data_alloc() in v2.6.4\nrf\subsys\bluetooth\gatt_dm.c, which called k_calloc() to take an additional 124 bytes from the _SYSTEM_HEAP, and that's it. As on the network core, no malloc() calls used the z_malloc_heap.

    2. Follow-up:
    a. What benefit do you get if you set the network core CONFIG_BT_BUF_ACL_RX_SIZE to greater than 251 if the app core has CONFIG_BT_BUF_ACL_RX_SIZE=500 (the MTU size)? That is, won't the network core receive at most 251 bytes per packet over the air, which would then be sent to the app core with the larger 500 size for L2CAP reassembly? How would the network core make use of CONFIG_BT_BUF_ACL_RX_SIZE>251?

    b. How can you determine if the CONFIG_BT_CTLR_SDC_RX_PACKET_COUNT default of 2 is too small? Does the SDC nak a packet if it doesn't have enough buffers for a particular connection?  As for the CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT, my understanding is that if there are not enough tx buffers, the More Data bit will not be set if it could be set. Is that right?

    c. What is the relation of the CONFIG_BT_BUF_ACL_TX_COUNT for the app core and network core?

  • Hi,

    variant said:
    a. What benefit do you get if you set the network core CONFIG_BT_BUF_ACL_RX_SIZE to greater than 251 if the app core has CONFIG_BT_BUF_ACL_RX_SIZE=500 (the MTU size)?  That is, won't the network core receive max 251 bytes per packet over the air, which would then be sent to the app core with the larger 500 size for L2CAP reassembly? How would the network core make use of CONFIG_BT_BUF_ACL_RX_SIZE>251?

    For BLE, the maximum LL data length on air is 251 bytes, so setting CONFIG_BT_BUF_ACL_RX_SIZE above 251 does not increase the over-the-air PDU size. The app core’s larger CONFIG_BT_BUF_ACL_RX_SIZE simply means its host stack can hold a larger reassembled PDU. You can set CONFIG_BT_BUF_ACL_RX_SIZE to 251 in the prj.conf, as the DevAcademy course does: https://academy.nordicsemi.com/courses/bluetooth-low-energy-fundamentals/lessons/lesson-3-bluetooth-le-connections/topic/blefund-lesson-3-exercise-2/
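As a rough illustration of the point about the 251-byte limit, the sketch below counts how many link-layer data packets are needed to carry one larger L2CAP PDU that the host then reassembles. The function name is hypothetical, and the model ignores everything except payload sizes.

```c
#include <assert.h>

/*
 * Number of LL data packets needed to carry one L2CAP PDU of
 * l2cap_pdu_len bytes when each LL packet holds at most
 * ll_payload_max bytes (251 with Data Length Extension at its maximum).
 */
unsigned int ll_fragments(unsigned int l2cap_pdu_len, unsigned int ll_payload_max)
{
	assert(ll_payload_max > 0U);
	return (l2cap_pdu_len + ll_payload_max - 1U) / ll_payload_max;
}
```

Under this model, a 500-byte L2CAP PDU arrives as two LL packets of at most 251 bytes each. The controller side only ever handles the individual fragments, and the full 500 bytes exist in one piece only after reassembly on the host.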

    variant said:
    b. How can you determine if the CONFIG_BT_CTLR_SDC_RX_PACKET_COUNT default of 2 is too small? Does the SDC nak a packet if it doesn't have enough buffers for a particular connection? 

    With the default count, the application is expected to be able to empty the buffers during a connection event. That is, non-default values (>2) should only be used when CPU utilization is so high that the application cannot read data fast enough during connection events. A value of 1 can be used to save memory when reduced throughput is acceptable.

    variant said:
    As for the CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT, my understanding is that if there are not enough tx buffers, the More Data bit will not be set if it could be set. Is that right?

    That can happen when the controller (LL) does not get data fast enough from the host. You can try increasing CONFIG_BT_CTLR_SDC_TX_PACKET_COUNT.

    variant said:
    c. What is the relation of the CONFIG_BT_BUF_ACL_TX_COUNT for the app core and network core?

    On the nRF5340, the Bluetooth LE stack is split across the two cores: the Bluetooth LE Controller runs on the network core, and the host part of the stack runs on the application core together with the application logic.

    CONFIG_BT_BUF_ACL_TX_COUNT is the number of outgoing ACL data buffers sent from the Host to the Controller. This determines the maximum number of data packets the Host can have queued in the Controller before waiting for the Controller to notify the Host, with the Number of Completed Packets event, that more packets can be queued. The buffers are shared between all of the connections, and the Host determines how to divide the buffers between the connections. The Controller returns this value in the HCI LE Read Buffer Size command response.
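The flow control described above behaves like a shared credit counter. The following is a minimal model of that idea in plain C, not the Zephyr host implementation: the struct and function names are hypothetical, and `total` merely stands in for the buffer count the Controller reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Shared pool of ACL TX credits, common to all connections. */
struct acl_tx_credits {
	unsigned int total;     /* stands in for CONFIG_BT_BUF_ACL_TX_COUNT */
	unsigned int in_flight; /* packets queued in the Controller, not yet completed */
};

/* Host side: queue one packet if a credit is free; otherwise the Host must wait. */
bool host_try_send(struct acl_tx_credits *c)
{
	if (c->in_flight >= c->total) {
		return false;
	}
	c->in_flight++;
	return true;
}

/* Models a Number of Completed Packets event returning `completed` credits. */
void host_on_num_completed(struct acl_tx_credits *c, unsigned int completed)
{
	assert(completed <= c->in_flight);
	c->in_flight -= completed;
}
```

With `total` set to 3, a fourth send attempt fails until a completion event returns at least one credit. Because the pool is shared, one busy connection can hold all credits, which is why this count matters more as the number of connections grows.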

    -Amanda H.
