Beware that this post is related to an SDK in maintenance mode
More Info: Consider nRF Connect SDK for new designs

Speeding up application to send 24bit I2S data at 16 kHz via BLE

Hi,

I'm trying to send I2S data (~ 16 kHz, 24 bits) via BLE using the nRF52 DK and the nRF5 SDK 17.1.0. After I get the data from the I2S driver (by specification, they are in 32-bit format), I add them to a very simple ring buffer. The problem is that this ring buffer fills up too quickly and overwrites older data that has not been sent yet, which results in data loss.

Is there a way to speed up my application to avoid this, or to give the application higher priority so that it can process the data faster? (I guess the main problem is my loop, which casts the 32-bit data into 24-bit samples.)

Here is my application's logic together with the respective code:

global variables:

/*----------------- I2S -----------------*/
#define I2S_DATA_BLOCK_WORDS    32

static bool i2s_active = false;
static bool i2s_transfer = false;

/*----------------- BUFFERS -----------------*/
#define RNG_BUF_SIZE            (I2S_DATA_BLOCK_WORDS*4*60)
static uint32_t rng_buffer[RNG_BUF_SIZE];
static uint16_t rb_write_ptr = 0;
static uint16_t rb_read_ptr = 0;

static uint8_t m_array[243] = {0};

main():

int main(void)
{
    bool erase_bonds;
    uint32_t err_code;

    // Initialize.
    log_init();
    timers_init();
    buttons_leds_init(&erase_bonds);
    power_management_init();

    // Initialize TWI/ I2C & Setup ADC
    twi_adc_configuration();

    // Initialize i2s
    err_code = i2s_init();
    if (err_code == NRF_SUCCESS)
    {
        NRF_LOG_INFO("I2S successfully initialized.");
    }
    else
    {
        NRF_LOG_INFO("Error initializing I2S.");
    }

    // Initialize BLE
    ble_stack_init();
    gap_params_init();
    gatt_init();
    services_init();
    advertising_init();
    conn_params_init();
    peer_manager_init();

    // Start advertising
    advertising_start(erase_bonds);

    for (;;)
    {
        if (i2s_transfer)
        {
            if (i2s_active != true)
            {
                err_code = start_i2s();

                if (err_code == NRF_SUCCESS)
                {
                    i2s_active = true;
                    NRF_LOG_INFO("I2S started.\n");
                } 
                else
                {
                    NRF_LOG_INFO("Error starting I2S.\n");
                }
            }

            // set back err_code
            err_code = NRF_SUCCESS;
            if (rb_read_ptr < rb_write_ptr)
            {
                // fill up an array, which is later sent via BLE
                // 81 samples * 3 bytes = 243 bytes, so valid indices are 0..242
                for (int i = 0; i < 243; i+=3)
                {
                    if (rb_read_ptr >= RNG_BUF_SIZE)
                    {
                        rb_read_ptr = 0;
                        read_cnt++;
                        NRF_LOG_INFO("reading finished %d", read_cnt);
                    }

                    uint32_t const * p_word = NULL;
                    uint8_t sample[3];
      
                    // cast 32bit sample to 24bit sample
                    p_word = &rng_buffer[rb_read_ptr];
                    sample[0] = ((uint8_t const *)p_word)[2];
                    sample[1] = ((uint8_t const *)p_word)[1];
                    sample[2] = ((uint8_t const *)p_word)[0];

                    rb_read_ptr++;

                    // copy 24bit sample to the array
                    memcpy(&m_array[i], &sample, sizeof(sample));
                }

                // when m_array is full, send data via BLE
                err_code = ble_aas_value_update(&m_aas, &m_array);
                if (err_code == NRF_SUCCESS)
                {
                    pkg_cnt++;
                    sample_no+=81;
                }
                if ((err_code != NRF_SUCCESS) &&
                    (err_code != NRF_ERROR_INVALID_STATE) &&
                    (err_code != NRF_ERROR_RESOURCES) &&
                    (err_code != NRF_ERROR_BUSY) &&
                    (err_code != BLE_ERROR_GATTS_SYS_ATTR_MISSING)
                   )
                {
                    APP_ERROR_CHECK(err_code);
                }
                else if (err_code == NRF_ERROR_RESOURCES)
                {
                    // if ble stack is full, wait until more space is available
                    // ble_ready will become true, when BLE_GATTS_EVT_HVN_TX_COMPLETE event is received
                    ble_ready = false;
                    while (!ble_ready)
                    {
                        idle_state_handle();
                    }
                    
                    // retry
                    err_code = ble_aas_value_update(&m_aas, &m_array);
                    if (err_code == NRF_SUCCESS)
                    {
                        pkg_cnt++;
                        sample_no+=81;
                    }
                    if ((err_code != NRF_SUCCESS) &&
                        (err_code != NRF_ERROR_INVALID_STATE) &&
                        (err_code != NRF_ERROR_RESOURCES) &&
                        (err_code != NRF_ERROR_BUSY) &&
                        (err_code != BLE_ERROR_GATTS_SYS_ATTR_MISSING)
                       )
                    {
                        APP_ERROR_CHECK(err_code);
                    }
                }
            }
        }

        if ((!i2s_transfer) &&
            (i2s_active)
           )
        {
            nrf_drv_i2s_stop();
            i2s_active = false;
            NRF_LOG_INFO("package added to ble stack: %d", pkg_cnt);
            NRF_LOG_INFO("samples sent: %d", sample_no);
        }

        idle_state_handle();
        NRF_LOG_FLUSH();
    }

    bsp_board_leds_off();
}

The I2S data handler and the function that adds the received 32-bit samples to the ring buffer:

static void write_rng_buffer(uint32_t const * p_block)
{
    if ((rb_write_ptr+I2S_DATA_BLOCK_WORDS) > RNG_BUF_SIZE)
    {
        rb_write_ptr = 0;
        write_cnt++;
        NRF_LOG_INFO("writing finished %d", write_cnt);
    }

    memcpy(&rng_buffer[rb_write_ptr], p_block, (I2S_DATA_BLOCK_WORDS*4));
    rb_write_ptr+=I2S_DATA_BLOCK_WORDS;
}

static void data_handler(nrf_drv_i2s_buffers_t const * p_released,
                         uint32_t                      status)
{
    // 'nrf_drv_i2s_next_buffers_set' is called directly from the handler
    // each time next buffers are requested, so data corruption is not
    // expected.
    ASSERT(p_released);

    // When the handler is called after the transfer has been stopped
    // (no next buffers are needed, only the used buffers are to be
    // released), there is nothing to do.
    if (!(status & NRFX_I2S_STATUS_NEXT_BUFFERS_NEEDED))
    {
        return;
    }

    // First call of this handler occurs right after the transfer is started.
    // No data has been transferred yet at this point, so there is nothing to
    // check. Only the buffers for the next part of the transfer should be
    // provided.
    if (!p_released->p_rx_buffer)
    {
        // .p_tx_buffer = m_buffer_tx[1] changed to .p_tx_buffer = NULL, since we only receive data
        nrf_drv_i2s_buffers_t const next_buffers = {
            .p_rx_buffer = m_buffer_rx[1],
            .p_tx_buffer = NULL,
        };
        APP_ERROR_CHECK(nrf_drv_i2s_next_buffers_set(&next_buffers));

    }
    else
    {
        write_rng_buffer(p_released->p_rx_buffer);

        // The driver has just finished accessing the buffers pointed by
        // 'p_released'. They can be used for the next part of the transfer
        // that will be scheduled now.
        APP_ERROR_CHECK(nrf_drv_i2s_next_buffers_set(p_released));
    }
}

Any help or hint is appreciated!

  • Hello,

    I don't think the bottleneck is the 32->24 bit casting.

    I didn't look too much at the logic in your casting, and my I2S is a bit rusty, since there is no sound HW on the DKs. But is it correct that the p_released->p_rx_buffer in the data_handler() callback has samples of 32 bits, where the last 8 bits are blank (0x00 or 0xFF)? I.e. 0xAAAAAA00, and the next sample has 0xBBBBBB00, and 0xCCCCCC00 and so on?

    If so, after you cast it to 24 bits, are you left with this in the buffer?

    0xAAAAAABB 0xBBBBCCCC 0xCCDDDDDD 0x... and so on? Or what does it look like after you cast it?
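To make the question about the packed layout concrete, here is a minimal sketch of the 32-bit to 24-bit packing. It assumes the sample sits in the lower 24 bits of each word and writes the most significant byte first, which is the byte order the loop in the question produces on a little-endian core. The name `pack_24bit` is hypothetical and not part of the SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack the lower 24 bits of each 32-bit word into 3 bytes, most significant
 * byte first. Assumption: the sample occupies bits 0..23 of every word and
 * bits 24..31 carry no information (padding or sign extension).
 * Returns the number of bytes written to dst, which is 3 * count. */
static size_t pack_24bit(uint8_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
    {
        dst[3 * i + 0] = (uint8_t)(src[i] >> 16); /* bits 16..23 */
        dst[3 * i + 1] = (uint8_t)(src[i] >> 8);  /* bits  8..15 */
        dst[3 * i + 2] = (uint8_t)(src[i]);       /* bits  0..7  */
    }
    return 3 * count;
}
```

With the inputs 0x00AAAAAA, 0x00BBBBBB, 0x00CCCCCC the output bytes are AA AA AA BB BB BB CC CC CC, i.e. the samples end up back to back with no padding. If the samples were left-aligned instead (0xAAAAAA00), the shifts would have to be 24, 16 and 8.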

    I assume that your bottleneck is currently the BLE, and not the CPU or I2S. 

    I see that some parts of the application are missing, such as the implementation of ble_aas_value_update(). What does it look like? Is this always sending a large buffer? The trick in BLE is to always send large buffers (to reduce the overhead/payload ratio). Then, you need to keep the buffers full at all times for maximum throughput. 

    If this is satisfied, the throughput is up to the rest of your connection parameters. This isn't straightforward, but you need an MTU that is large enough (looks good from your log). What is your connection interval, connection event length, and PHY?

    To play around with the connection parameters, I suggest you take a look at the SDK\examples\ble_central_and_peripheral\experimental\ble_app_att_mtu_throughput

    I believe maxing the event length, using around 50ms connection interval, and using 2MBPS PHY will give the highest throughput.
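As a rough check of what the link has to carry, the required payload rate can be worked out from the numbers in the question (16 kHz, 3 bytes per sample, 243-byte notifications). The sketch below is plain arithmetic; the function names are hypothetical, and protocol overhead is ignored.

```c
#include <stdint.h>

/* Payload bytes per second needed for an uncompressed sample stream. */
static uint32_t required_bytes_per_sec(uint32_t sample_rate_hz,
                                       uint32_t bytes_per_sample)
{
    return sample_rate_hz * bytes_per_sample;
}

/* Notifications that must fit into one connection interval, rounded up.
 * payload_bytes is the application payload per notification. */
static uint32_t notifs_per_interval(uint32_t bytes_per_sec,
                                    uint32_t payload_bytes,
                                    uint32_t conn_interval_ms)
{
    /* Bytes produced during one interval, rounded up to a whole byte. */
    uint32_t bytes_per_interval =
        (uint32_t)(((uint64_t)bytes_per_sec * conn_interval_ms + 999u) / 1000u);

    /* Ceiling division by the payload size. */
    return (bytes_per_interval + payload_bytes - 1u) / payload_bytes;
}
```

For 16000 samples/s at 3 bytes each this gives 48000 bytes/s (384 kbps of payload). With a 50 ms connection interval and 243-byte notifications, about 10 notifications have to go out in every interval just to keep up.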

    Best regards,

    Edvin

  • Good morning,

    After my answer from yesterday I checked my connection parameters again and, in fact, I was using 7.5 ms as the default connection interval. I have now changed that to 50 ms by changing

    #define CONN_INTERVAL_DEFAULT           (uint16_t)(MSEC_TO_UNITS(7.5, UNIT_1_25_MS)) 
    #define MIN_CONN_INTERVAL               MSEC_TO_UNITS(20, UNIT_1_25_MS)
    #define MAX_CONN_INTERVAL               MSEC_TO_UNITS(50, UNIT_1_25_MS) 

    to

    #define CONN_INTERVAL_DEFAULT           (uint16_t)(MSEC_TO_UNITS(50, UNIT_1_25_MS)) 
    #define MIN_CONN_INTERVAL               MSEC_TO_UNITS(50, UNIT_1_25_MS)        /**< Minimum acceptable connection interval (50 ms). */
    #define MAX_CONN_INTERVAL               MSEC_TO_UNITS(50, UNIT_1_25_MS) 

    and according to nRF Connect for Desktop, my connection parameters should now allow maximum throughput (2 M PHY and optimal connection parameters). I'm attaching my log file from nRF Connect:

    2022-08-17T06_13_35.488Z-log.txt

    However, this did not improve things much: I'm still receiving fewer samples than I would expect when recording for, e.g., 3 seconds (~ 48.000 samples expected). So I looked again at the maximum throughput example. One significant difference in the example's sdk_config.h is

    // <o> NRF_SDH_BLE_GAP_EVENT_LENGTH - GAP event length. 
    // <i> The time set aside for this connection on every connection interval in 1.25 ms units.
    
    #ifndef NRF_SDH_BLE_GAP_EVENT_LENGTH
    #define NRF_SDH_BLE_GAP_EVENT_LENGTH 400
    #endif

    This value is 6 in my application. I tried setting the value first to 400 and then to 40 (to achieve 50 ms), but I got the following warnings:

    <warning> nrf_sdh_ble: Insufficient RAM allocated for the SoftDevice.
    <warning> nrf_sdh_ble: Change the RAM start location from 0x20002AD8 to 0x20002BE0.
    <warning> nrf_sdh_ble: Maximum RAM size for application is 0xD420.
    <error> nrf_sdh_ble: sd_ble_enable() returned NRF_ERROR_NO_MEM.
    

    I changed the RAM start location as I was told, but since then my application stops working after roughly 1 second without a clear error message. I can only see an arrow in the 'Disassembly' window of Segger emStudio pointing to

    00000A60    4B01    ldr r3,
    

    I haven't been able to find out what that means and how to address this issue. Can you help me out?

  • Moritz_S said:
    I haven't been able to find out what that means and how to address this issue. Can you help me out?

    That is a hardfault. These are a bit tricky to debug. 

    The log (from nRF Connect) only states that the peripheral disappeared, and hence you got a timeout disconnection.

    If you send the application you are using, will I be able to replicate the issue (hardfault) without an external microphone attached?

    In fact, I think you should try to reduce the application to only send dummy data, so that we can exclude the I2S operations for now. Just to see whether the I2S + the data processing is slowing down things at all.

    If you strip it down, and send it so that I can try to run it on a DK, I can see what your connection parameters look like.

    In case you are interested, I'll attach a couple of modified examples (ble_app_uart and ble_app_uart_c) which will just send dummy data from one device to the other based on some button presses.

    0825.nus_throughput.zip

    It runs in SDK16.0.0, so please download and test it in SDK16.0.0. Just unzip it next to the ble_app_uart example in the folder structure. Flash one DK with the peripheral example and another with the central, and press button 1 (I think) on one of them, and monitor the RTT log output. 

    Best regards,

    Edvin

  • Hi Edvin,

    thanks for your response, I have tested two things now:

    1. Measuring throughput while sending dummy data

    I have removed the I2S code from the application (I2S still enabled in sdk_config.h) and implemented sending dummy data as shown in the ble_app_uart example into my application and tested it on an NRF52 DK as sender and an NRF52 Dongle as the receiver together with NRF Connect for Desktop Bluetooth 4.0 (on Ubuntu 20.04 LTS). The hard fault is now gone.

    The result was: The byte count reported on the sender side (534.114 bytes) was significantly higher than on the receiving side (277.020 bytes). When calculating throughput from the log file, it was mostly between 30 and 39 KBps.

    I counted the bytes on the receiver's side from the log file of NRF Connect for Desktop. I noticed that in most cases the bytes counted on the receiver's side are roughly 50 % of the bytes counted on the sender's side.

    Here is my project with the code for sending dummy data (using SDK 17.0.1) and the log file related to the case described above.

    example_logfile.txt

    surag_sense.zip

    2. Measuring throughput with original I2S code

    I have used the same approach to count the bytes that are sent using my original application including I2S.

    Now the byte count on the sender's and the receiving side are equal to each other (42.768 bytes) but throughput reduced significantly to ~ 8 KBps.

    My Conclusions:

    - it seems that I2S in fact has an impact on the achievable throughput?

    - NRF Connect for Desktop seems to have problem with logging higher data rates?

    What do you think about these conclusions? Can you suggest any next steps?

  • Moritz_S said:
    The result was: The byte count reported on the sender side (534.114 bytes) was significantly higher than on the receiving side (277.020 bytes).

    That should not be possible. If so, there is something wrong with the way you are counting either received or sent bytes. 

    Notes from looking at your surag_sense.zip file:

    Line 1133 in main.c:

    err_code = ble_aas_value_update(&m_aas, &data);

    should be:

    err_code = ble_aas_value_update(&m_aas, data);
    // or
    err_code = ble_aas_value_update(&m_aas, &data[0]);

    The same goes for m_array in 1034:

    err_code = ble_aas_value_update(&m_aas, m_array);

    Then go to ble_aas.c line 74, and change it to:

    const ble_gatts_evt_write_t * p_evt_write = &p_ble_evt->evt.gatts_evt.params.write;

    Only after making these changes could I compile your application without warnings. In most cases you should treat warnings as errors, because they may point to undefined behavior. The only exception I can think of is when variables are "set but not used" while you are adding and removing things for debugging.
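The difference between `data` and `&data` is easy to miss because both evaluate to the same address. A minimal sketch of why the compiler still warns, assuming the function expects a `uint8_t *` for its data argument (the helper names below are hypothetical stand-ins, not the real `ble_aas_value_update()`):

```c
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_LEN 243

static uint8_t data[PAYLOAD_LEN] = { 0x2A };

/* Hypothetical stand-in for an API that expects a pointer to the first byte. */
static uint8_t first_byte(const uint8_t *p_data)
{
    return p_data[0];
}

/* Bytes skipped when a plain byte pointer is advanced by one element. */
static size_t step_of_byte_pointer(void)
{
    uint8_t *p = data;            /* 'data' decays to &data[0] */
    return (size_t)((p + 1) - p); /* 1 */
}

/* Bytes skipped when a pointer to the whole array is advanced by one element. */
static size_t step_of_array_pointer(void)
{
    uint8_t (*p)[PAYLOAD_LEN] = &data; /* type: uint8_t (*)[243] */
    return sizeof(*p);                 /* 243 */
}
```

Passing `data` or `&data[0]` to `first_byte()` compiles cleanly. Passing `&data` hands over a pointer to a 243-byte array, which is an incompatible pointer type, hence the warning.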

    Moritz_S said:
    The result was: The byte count reported on the sender side (534.114 bytes) was significantly higher than on the receiving side (277.020 bytes). When calculating throughput from the log file, it was mostly between 30 and 39 KBps.

    If this is based on the log file from nRF Connect for Desktop, I would not trust that. I suggest you write a central application that can count the received bytes properly. The reason for this is that at least the GUI in nRF Connect for Desktop is not able to keep up with the throughput, and I believe that the log reflects the GUI, meaning it will not receive all the bytes. 

    Counting the sent_bytes from your peripheral (I used an app timer that triggers every second), I saw that it had sent 3 058 884 bytes over 29 seconds, which is roughly 105 kB per second, or a bit over 800 kbps. This is roughly how much you can get on 1MBPS PHY in BLE.
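The conversion from a byte counter to a throughput figure can be sketched as follows. This is plain arithmetic on the numbers quoted above; the function names are hypothetical, and the periodic timer that would sample the counter on the device is not shown.

```c
#include <stdint.h>

/* Average payload rate in bytes per second over a measurement window. */
static uint32_t bytes_per_second(uint64_t total_bytes, uint32_t seconds)
{
    return (uint32_t)(total_bytes / seconds);
}

/* Average payload rate in kilobits per second (1 kbit = 1000 bits),
 * rounded down. */
static uint32_t throughput_kbps(uint64_t total_bytes, uint32_t seconds)
{
    return (uint32_t)((total_bytes * 8u) / ((uint64_t)seconds * 1000u));
}
```

For 3 058 884 bytes in 29 seconds this gives 105478 bytes/s and 843 kbps.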

    Checking by setting the connection interval to 500ms, I get slightly above 800kbps (noisy environment in the office), and that is above what is possible with 1MBPS.

    Looking at the log from nRF Connect for desktop, I sometimes see:

    1:17:52.043	Attribute value changed, handle: 0x10, value (0x): 03-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
    11:17:52.043	Attribute value changed, handle: 0x10, value (0x): 17-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
    
    Meaning that all the packets between 0x03 and 0x17 are not printed in the log. They are received if the peripheral can queue them successfully, because the only way that packets are not acked is if the devices disconnect.
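A receiver that counts bytes itself, instead of relying on the desktop log, could look roughly like the sketch below. It assumes, as in the dummy-data test above, that the first payload byte is a counter that increments by one per notification. The type and function names are hypothetical; in a real central application the update function would be called from the handler that receives the notification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint64_t bytes_received;   /* total payload bytes seen            */
    uint32_t packets_received; /* notifications seen                  */
    uint32_t packets_missing;  /* gaps inferred from the counter byte */
    uint8_t  last_seq;         /* counter byte of the previous packet */
    bool     have_last;        /* false until the first packet        */
} rx_stats_t;

/* Update the statistics for one received notification. */
static void rx_stats_update(rx_stats_t *s, const uint8_t *data, size_t len)
{
    if (len == 0)
    {
        return;
    }

    uint8_t seq = data[0];
    if (s->have_last)
    {
        /* Modulo-256 distance to the previous packet; 1 means no gap.
         * A gap of 256 or more packets cannot be detected this way. */
        uint8_t step = (uint8_t)(seq - s->last_seq);
        s->packets_missing += (uint8_t)(step - 1u);
    }

    s->last_seq = seq;
    s->have_last = true;
    s->packets_received++;
    s->bytes_received += len;
}
```

Fed with the two log lines above (counter bytes 0x03 and 0x17), this reports 19 packets missing from the log between them.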

    Best regards,

    Edvin

  • Hi Edvin,

    thanks for your answer and suggestions. I implemented the suggested changes in both the application sending dummy data and the original I2S application. They compile without warnings now.

    For both I'm using 2MBPS PHY and 50 ms connection interval.

    When sending dummy data, I can achieve ~ 1060 kbps, which is a little less than the optimum, but that's okay for now. I will continue testing whether fewer 2.4 GHz devices in the environment increase the throughput.

    When enabling I2S, however, throughput goes down to ~ 60 kbps, which is not sufficient at all. Then I realized that when using I2S, my NRF_SDH_BLE_GAP_EVENT_LENGTH was set to 6 (I was using two different projects ^^). Increasing that value to 40 (as it is in the project sending dummy data) also increased my throughput to a number that seems to be close to what I need for sending my I2S data. I still lose some data, but that is due to my buffer and not the BLE, so I just need to improve my implementation of the buffer.
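One possible direction for the buffer, as a sketch only: a single-producer/single-consumer ring buffer with free-running indices, where the writer refuses a block instead of overwriting unread data and the reader can cross the wrap point. The capacity, type and function names are hypothetical; the sketch assumes one writer (the I2S handler) and one reader (the main loop) on a single core.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_CAPACITY 2048u /* in 32-bit words; must be a power of two */

typedef struct
{
    uint32_t          data[RB_CAPACITY];
    volatile uint32_t head; /* total words written, only the producer updates it */
    volatile uint32_t tail; /* total words read, only the consumer updates it    */
} ring_buffer_t;

/* Number of unread words; unsigned wrap-around keeps this correct. */
static uint32_t rb_count(const ring_buffer_t *rb)
{
    return rb->head - rb->tail;
}

/* Append n words. Returns false (and writes nothing) if that would
 * overwrite data that has not been read yet. */
static bool rb_write_block(ring_buffer_t *rb, const uint32_t *src, uint32_t n)
{
    if ((RB_CAPACITY - rb_count(rb)) < n)
    {
        return false;
    }
    uint32_t head = rb->head;
    for (uint32_t i = 0; i < n; i++)
    {
        rb->data[(head + i) & (RB_CAPACITY - 1u)] = src[i];
    }
    rb->head = head + n; /* publish only after the data is in place */
    return true;
}

/* Remove n words. Returns false if fewer than n words are available. */
static bool rb_read_block(ring_buffer_t *rb, uint32_t *dst, uint32_t n)
{
    if (rb_count(rb) < n)
    {
        return false;
    }
    uint32_t tail = rb->tail;
    for (uint32_t i = 0; i < n; i++)
    {
        dst[i] = rb->data[(tail + i) & (RB_CAPACITY - 1u)];
    }
    rb->tail = tail + n;
    return true;
}
```

The main loop would read 81 words only when `rb_count()` reports at least 81, and a `false` return from `rb_write_block()` in the I2S handler can be counted as an overflow rather than silently losing samples.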

    For anybody interested: I also tried increasing NRF_SDH_BLE_GAP_EVENT_LENGTH further (e.g. to 400 as it is in the ATT Max. Throughput example), but this did not increase throughput further.

    Overall, it seems that my problem is solved. But what impact does the NRF_SDH_BLE_GAP_EVENT_LENGTH parameter have on what is happening in the code? I mean, the connection interval is still 50 ms, no matter how high NRF_SDH_BLE_GAP_EVENT_LENGTH is.

  • Moritz_S said:
    I also tried increasing NRF_SDH_BLE_GAP_EVENT_LENGTH further (e.g. to 400 as it is in the ATT Max. Throughput example), but this did not increase throughput further.

    Increasing this will only give a higher throughput if it was the parameter that was blocking. Increasing it to more than the radio would have time to use every connection interval will not have any effect.
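A simplified way to picture this, using the 1.25 ms units from sdk_config.h: the event length is the time reserved for the connection in each interval, so it cannot usefully exceed the interval itself. The sketch below is only a model of that relationship, not how the SoftDevice scheduler is implemented, and the function names are hypothetical.

```c
#include <stdint.h>

/* Both arguments are in 1.25 ms units, as NRF_SDH_BLE_GAP_EVENT_LENGTH
 * and the connection interval are. Time beyond the connection interval
 * cannot be used, so the usable part is the smaller of the two. */
static uint32_t usable_event_units(uint32_t event_len_units,
                                   uint32_t conn_interval_units)
{
    return (event_len_units < conn_interval_units) ? event_len_units
                                                   : conn_interval_units;
}

/* Share of each connection interval that is reserved for the link,
 * in percent, rounded down. */
static uint32_t reserved_percent(uint32_t event_len_units,
                                 uint32_t conn_interval_units)
{
    return (100u * usable_event_units(event_len_units, conn_interval_units))
           / conn_interval_units;
}
```

With a 50 ms connection interval (40 units), an event length of 6 reserves 7.5 ms, or 15 % of each interval. A value of 40 reserves the full interval, and 400 gives the same result as 40 in this model, which matches the observation that going beyond 40 did not help.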

    Glad to hear that you found something that seems to be working.

    Best regards,

    Edvin
