TWI write to IC registers fails when triggered by an incoming BLE command [NRF Internal Error 3]

System details:

  • Board: Custom
  • Chip: nRF52832
  • PCA Number: PCA10040
  • SoftDevice: S132
  • SDK: Version 16.0.0

Until recently, I have had no issues using the TWI module to read from and write to the various sensor ICs on our custom board. I initialize and configure each sensor over TWI, and this behaves as expected. After initialization/configuration, my application begins to advertise, and once it connects to our Desktop app it streams data from each sensor every 50ms, driven by a repeated timer. This timer makes use of the App Scheduler by calling app_sched_event_put(NULL, 0, dataSamplingTimerHandler) every 50ms, and dataSamplingTimerHandler reads from each sensor IC using TWI before sending each byte of data using ble_nus_data_send(). This timer and BLE-send functionality also works as expected, and we receive the data without issues.

Now that this behavior is stable, I have set up some code to receive incoming BLE commands in nus_data_handler(), which also makes use of the App Scheduler and app_sched_event_put(), so that I can read/write sensor registers directly from the Desktop app. My goal is to lean on the Scheduler to safely queue and serialize the TWI accesses coming from the internal timer and from these asynchronous incoming BLE commands, so that there is no resource conflict.

The code I wrote (included below) works without fail when I am simply reading from a register. However, it only works some of the time when I want to write to a register. When it works, I can successfully verify the data was written by reading from the register.

When it fails, it throws NRF_ERROR_INTERNAL (error 3) pretty much every single time. A single BLE-triggered register write is much more likely to succeed than a batch of several register writes. I have tried changing the I2C frequency from 400kHz to 100kHz, and this doesn't help. Do you have any ideas as to why the register write fails only when it is triggered by an incoming BLE message? I have also included a photo of the stack when the failure occurs. Please let me know if there is anything else (other code?) that I can add to this ticket to help debug this. Many thanks in advance!

/**
 * @brief Helper function to schedule RegWrite tasks
 */
void regWriteHandlerHelper(ble_nus_evt_t * p_evt)
{
    uint32_t err_code;
    uint16_t msg_len;
    uint8_t end = 0x1;
    msg_len = (p_evt)->params.rx_data.length;
    uint8_t msg[msg_len];
    strcpy(msg, (p_evt)->params.rx_data.p_data);

    NRF_LOG_INFO("Received %d bytes of data from BLE NUS: %s", msg_len, msg);
    NRF_LOG_FLUSH();

    char *token;
    char *delim = ",";
    token = strtok(msg, delim);
    /* Access the first argument in RegWrite:(addr,reg,data) */
    token = strtok(NULL, delim);
    int address = (int)strtol(token, NULL, 16);
    NRF_LOG_INFO("RegWrite Address: 0x%X\n", address);
    /* Access the second argument in RegWrite:(addr,reg,data) */
    token = strtok(NULL, delim);
    int reg = (int)strtol(token, NULL, 16);
    NRF_LOG_INFO("RegWrite Register: 0x%X\n", reg);
    /* Access the third argument in RegWrite:(addr,reg,data) */
    token = strtok(NULL, delim);
    int d = (int)strtol(token, NULL, 16);
    NRF_LOG_INFO("RegWrite Data: 0x%X\n", d);

    const uint8_t dd[] = { (uint8_t) d};
    if (address == (uint8_t) IQS572_ADDR) {
        uint8_t buf[1];
        uint16_t regs = (uint16_t) reg;
        uint8_t addrs = (uint8_t) address;
        twiRegRead16(addrs, regs, buf, 1); // NOTE: Dummy Read for expected NACK due to forced comms
        err_code = twiRegWrite16(addrs, regs, dd, 1);
    } else {
        uint8_t add = (uint8_t) address;
        uint8_t rgstr = (uint8_t) reg;
        err_code = twiRegWrite(add, rgstr, dd, 1);
    }
    APP_ERROR_CHECK(err_code);

    // End Communication window
    err_code = twiRegWrite16(IQS572_ADDR, IQS572_END_COMMS_WINDOW, &end, 1);
    APP_ERROR_CHECK(err_code);
}


/**
 * @brief Function to handle incoming BLE requests to write data to a register of a sensor IC
 */
void regWriteHandler(void * p_evt, uint16_t size)
{
    regWriteHandlerHelper((ble_nus_evt_t *)p_evt);
}


/**
 * @brief Function to handle incoming BLE requests to retrieve sensor IC configuration data
 */
static void readConfigsHandler(void)
{
    /* Send all config data <c,config_data_bytes> */
    uint32_t err_code;
    uint8_t dataPacket[BLE_CHARACTERISTIC_MAX_LENGTH];
    uint8_t dataIndex = 0;
    uint8_t id = 'c';

    memcpy(dataPacket + dataIndex, &id, 1);
    dataIndex = dataIndex + 1;

    uint8_t sensitivity_configs[43] = {0};
    // 2 Bytes for ATI Trackpad Target Value
    // 1 Byte for Global ATI C value
    // 40 Bytes for individual channel ATI adjustment values
    capTouchSensitivityConfigs(sensitivity_configs);

    memcpy(dataPacket + dataIndex, sensitivity_configs, 43);
    dataIndex = dataIndex + 43;

    err_code = nus_data_send(dataPacket, dataIndex);
    if ((err_code != NRF_ERROR_INVALID_STATE) &&
        (err_code != NRF_ERROR_RESOURCES) &&
        (err_code != NRF_ERROR_NOT_FOUND))
    {
        APP_ERROR_CHECK(err_code);
    }
}


/**@brief Function for handling the data from the Nordic UART Service.
 *
 * @details This function will process the data received from the Nordic UART BLE Service and
 *          schedule the requested register write or configuration read.
 *
 * @param[in] p_evt       Nordic UART Service event.
 */
/**@snippet [Handling the data received over BLE] */
static void nus_data_handler(ble_nus_evt_t * p_evt)
{
    if (p_evt->type == BLE_NUS_EVT_RX_DATA)
    {
        uint32_t err_code;
        uint16_t msg_len;
        msg_len = p_evt->params.rx_data.length;
        uint8_t msg[msg_len];
        strcpy(msg, p_evt->params.rx_data.p_data);

        NRF_LOG_INFO("Received %d bytes of data from BLE NUS: %s", msg_len, msg);
        NRF_LOG_FLUSH();

        if(strstr(msg, "RegWrite")) 
        {
            app_sched_event_put(p_evt, sizeof(ble_nus_evt_t), regWriteHandler);
        }
        else if(strstr(msg, "ReadConfigs")) 
        {
            app_sched_event_put(NULL, 0, (app_sched_event_handler_t)readConfigsHandler);
        }
        NRF_LOG_FLUSH();
    }
}


/**@brief Function for application main entry.
 */
int main(void)
{
    bool erase_bonds;

    /* Initialize all modules */
    log_init();

    /* Initialize the async SVCI interface to bootloader before any interrupts are enabled. */
    APP_ERROR_CHECK(ble_dfu_buttonless_async_svci_init());

    timers_init();
    scheduler_init();
    power_management_init();
    bleInit();
    twiInit(PIN_I2C_SDA, PIN_I2C_SCL);
    //gpio_init();
    batteryInit();
    dataSamplingInit();
    capacitiveTouchInit();
    imuInit(IMU_SECONDARY);
    ptInit(LPS22HB_PRIM_ADDR);
    erase_bonds = false;

    /* Start advertising device via BLE */
    advertising_start(erase_bonds);

    NRF_LOG_INFO("Entering main loop ...");
    NRF_LOG_FLUSH();

    /* Enter main loop. */
    for(;;)
    {
        app_sched_execute(); // Execute any scheduled events.

        if(NRF_LOG_PROCESS() == false)
        {
            nrf_pwr_mgmt_run();
        }
    }
}
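
A note on the parsing in regWriteHandlerHelper(): strtok() returns NULL when a field is missing, and passing NULL to strtol() is undefined behaviour, so a malformed command can fault before any TWI traffic happens. Below is a minimal sketch of a checked parser. It assumes the command arrives as NUL-terminated text of the form "RegWrite,<addr>,<reg>,<data>" with hex fields (the exact wire format is not shown in the post). reg_write_cmd_t, parse_hex_field() and reg_write_parse() are hypothetical names, not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed RegWrite command. */
typedef struct {
    uint8_t  addr;  /* 7-bit TWI slave address */
    uint16_t reg;   /* register address (8 or 16 bit) */
    uint8_t  data;  /* single data byte to write */
} reg_write_cmd_t;

/* Fetch the next comma-separated token and parse it as hex.
 * Fails on a missing token, on non-hex characters, or on a value above max. */
static bool parse_hex_field(unsigned long max, unsigned long *out)
{
    char *end;
    char *token = strtok(NULL, ",");

    if (token == NULL) {
        return false;
    }
    *out = strtoul(token, &end, 16);
    return (end != token) && (*end == '\0') && (*out <= max);
}

/* Parse "RegWrite,<addr>,<reg>,<data>" in place (msg is modified by strtok). */
bool reg_write_parse(char *msg, reg_write_cmd_t *cmd)
{
    unsigned long addr, reg, data;

    if (msg == NULL || cmd == NULL) {
        return false;
    }
    if (strtok(msg, ",") == NULL) {      /* skip the command name */
        return false;
    }
    if (!parse_hex_field(0x7F, &addr) ||
        !parse_hex_field(0xFFFF, &reg) ||
        !parse_hex_field(0xFF, &data)) {
        return false;
    }
    cmd->addr = (uint8_t)addr;
    cmd->reg  = (uint16_t)reg;
    cmd->data = (uint8_t)data;
    return true;
}
```

With a parser like this, the handler can reject a bad command and log it instead of reaching APP_ERROR_CHECK() with garbage values.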



  • Hi,

    What function throws the NRF_ERROR_INTERNAL? If it's the TWI driver then it's a hardware error. This can happen for example if the timing requirements are not respected. Do you have external pullups connected to the SDA or SCL? 

    regards

    Jared 

  • This may be due to not considering multiple threads, which is an issue I have seen on other projects. regWriteHandlerHelper() is (probably) not re-entrant, so an error is generated when it is used by two threads simultaneously. Why would that happen? Consider a read or write in progress to the sensor from some function invoked from main or other part of the code when a BLE packet is received asynchronously which invokes nus_data_handler(). The problem is that this BLE request is within an interrupt context, and so initiates a sensor transfer interrupting the transfer already in progress with no handling to detect this.

    A simple test to see whether this is the issue: instead of having nus_data_handler() invoke regWriteHandlerHelper() directly, set a volatile bool there and invoke regWriteHandlerHelper() from main().

    If you have already considered this please ignore, but looking at the code presented in the original post it looks like this is the issue.
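
The test suggested above could look roughly like the following. This is only a sketch: on_ble_command(), do_reg_write() and main_loop_iteration() are hypothetical stand-ins for nus_data_handler(), regWriteHandlerHelper() and one pass of the for(;;) loop in main(). A single flag holds one pending request, so a second command that arrives before the first is serviced is merged with it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set in interrupt context, cleared in the main loop. */
static volatile bool m_reg_write_pending = false;

/* Counts serviced requests; stands in for the real TWI write. */
static uint32_t m_writes_done = 0;

/* Stand-in for nus_data_handler(): only records that work is pending. */
void on_ble_command(void)
{
    m_reg_write_pending = true;
}

/* Stand-in for regWriteHandlerHelper(): runs in main context only. */
static void do_reg_write(void)
{
    m_writes_done++;
}

/* One pass of the main loop: service the request if one is pending. */
void main_loop_iteration(void)
{
    if (m_reg_write_pending) {
        m_reg_write_pending = false;  /* clear first so a new request is not lost */
        do_reg_write();
    }
}

/* Accessor so the effect can be observed from outside this file. */
uint32_t writes_done(void)
{
    return m_writes_done;
}
```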

  • Correct me if I am wrong here, but I am specifically using the app Scheduler to deal with this issue of multiple threads. My understanding of the Scheduler could be incorrect, but from the main() function in my project, the application is in an infinite loop that solely waits for events and calls app_sched_execute() to run any tasks that have been queued via app_sched_event_put().

    By only scheduling (not running) the register write task in nus_data_handler(), I thought that I was effectively handing off the execution of each regWrite task to the main() loop. The only other reads/writes made to the sensors are also done via the same app_sched_event_put() function, so my hope was that the Scheduler would process these tasks sequentially, without conflict and without being affected by asynchronous interrupts.

    Is this logic incorrect?
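
One thing worth checking in this scheme: app_sched_event_put() copies only the bytes it is handed into the scheduler queue. A ble_nus_evt_t carries params.rx_data.p_data as a pointer, so the queued copy still points at a buffer that was valid during the BLE event and may no longer hold the command by the time app_sched_execute() runs the handler. The received bytes are also not NUL-terminated, so strcpy() into a msg_len-sized array can read and write past the end. Below is a minimal sketch of a self-contained event that carries the bytes themselves, assuming commands fit in 64 bytes. reg_write_msg_t, REG_WRITE_MSG_MAX and reg_write_msg_fill() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG_WRITE_MSG_MAX 64u  /* assumed upper bound on one command's length */

/* Hypothetical scheduler event: holds the command bytes, not a pointer to them. */
typedef struct {
    uint16_t length;
    char     text[REG_WRITE_MSG_MAX + 1u];  /* +1 for the terminating NUL */
} reg_write_msg_t;

/* Copy a non-terminated payload into the event and terminate it.
 * Returns 0 on success, -1 if an argument is NULL or the payload is too long. */
int reg_write_msg_fill(reg_write_msg_t *out, const uint8_t *data, uint16_t len)
{
    if (out == NULL || data == NULL || len > REG_WRITE_MSG_MAX) {
        return -1;
    }
    memcpy(out->text, data, len);
    out->text[len] = '\0';
    out->length = len;
    return 0;
}
```

In nus_data_handler() the filled struct would then be queued with app_sched_event_put(&msg, sizeof(msg), regWriteHandler). That requires the scheduler's maximum event size (the first argument to APP_SCHED_INIT) to be at least sizeof(reg_write_msg_t), and the return value of app_sched_event_put() should be checked.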

  • Hi Jared,

    I spoke with the EE who designed the board and YES there are external pullups in place (2.2kΩ). And, yes the error is coming from the TWI driver, specifically from this function:

    uint32_t twiRegWrite16(uint8_t slaveAddr, uint16_t regAddr, uint8_t const * pdata, uint16_t length)
    {
        uint32_t ret;
        uint8_t tx_buff[length + 2];
        uint8_t upperAddr = (regAddr >> 8) & 0x00FF;
        uint8_t lowerAddr = regAddr & 0x00FF;
        tx_buff[0] = upperAddr;
        tx_buff[1] = lowerAddr;
    
        memcpy(tx_buff + 2, pdata, length);
        nrf_drv_twi_xfer_desc_t xfer = NRF_DRV_TWI_XFER_DESC_TX(slaveAddr, tx_buff, length + 2);
    
        ret = nrf_drv_twi_xfer(&m_twi, &xfer, 0);
    
        if (NRF_SUCCESS != ret)
        {
            return ret;
        }
    
        // Wait for response for 5 ms
        for(uint32_t k = 0; k <= 5000; k++)
        {
            if(m_xfer_status != 0)
            {
                nrf_delay_ms(1);
                break;
            }
            nrf_delay_us(1);
        }
    
        if(m_xfer_status == TWI_XFER_STATUS_SUCCESS)
        {
            return NRF_SUCCESS;
        }
        else
        {
            return NRF_ERROR_INTERNAL;
        }
    }
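
Note that the NRF_ERROR_INTERNAL (value 3) seen here is returned by this wrapper itself for anything other than TWI_XFER_STATUS_SUCCESS, so a NACK from the sensor and a timeout are reported identically. Below is a sketch of a wait helper that keeps the two apart. It assumes m_xfer_status is written by the TWI event handler, is declared volatile, and is reset to a "pending" value before each nrf_drv_twi_xfer() call (none of which is visible in the post). The enum names and wait_for_xfer() are hypothetical, and the delay is passed in as a callback so that the sketch compiles off-target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transfer states, written by the TWI event handler. */
typedef enum { XFER_PENDING = 0, XFER_DONE, XFER_NACK } xfer_status_t;

/* Outcome of waiting for a transfer to finish. */
typedef enum { WAIT_OK, WAIT_NACK, WAIT_TIMEOUT } wait_result_t;

/* Poll *status up to max_polls times, calling delay_1us between polls.
 * Returns as soon as the status leaves XFER_PENDING. */
wait_result_t wait_for_xfer(volatile const xfer_status_t *status,
                            uint32_t max_polls,
                            void (*delay_1us)(void))
{
    for (uint32_t k = 0; k < max_polls; k++) {
        xfer_status_t s = *status;

        if (s == XFER_DONE) {
            return WAIT_OK;
        }
        if (s != XFER_PENDING) {
            return WAIT_NACK;      /* the device answered, but with an error */
        }
        if (delay_1us != NULL) {
            delay_1us();           /* e.g. a wrapper around nrf_delay_us(1) */
        }
    }
    return WAIT_TIMEOUT;           /* the event handler never reported back */
}
```

Mapping WAIT_NACK and WAIT_TIMEOUT to different error codes would show whether the failing writes are being rejected by the sensor or are never completing.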

    First, I tried disabling the internal pullups, but that didn't help: the error occurred again when relying on the external pullups alone.

    Next, we removed the external pullups, which you can see in the schematic that I am attaching for reference. However, now we can no longer communicate with ANY of our sensor ICs (I double checked that the internal PullUps are enabled in nrf_drv_twi.c in the SCL/SDA_PIN_INIT_CONF defines). Any ideas as to why this would happen when just using the internal PullUps?

    Also, I am curious as to why I never see NRF_ERROR_INTERNAL (error 3) thrown during the initialization stage of the application. When I initialize all the sensors, I perform many register write operations in a row, and never once has it thrown an error. The only other register writes in this project are the ones triggered from the Desktop app that I am working on here, and they are buggy for some reason. The only other running code in this application is the 50ms repeated timer, which performs many register reads and never fails. I suppose those repeated register reads could be conflicting with these asynchronous register writes (but oddly not with asynchronous register reads, which always work). And we don't see this failure during configuration because the timer hasn't been started yet.

    Thank you.

  • You are correct, I re-read the code more closely and see that I'd missed that.

    The TWI code always had an issue (a MISRA and CERT violation) with using an unchecked length to size an array which is subsequently filled by an unchecked block copy/move. If length is (say) 0, bad things happen ... edit: ignore that, on reading again I see there is a +2. Is pdata range-checked? memcpy of a zero-length value is also bad ... and the error shown seems to be in memcpy ...
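
For reference, memcpy() with a length of 0 is well defined in C as long as both pointers are valid. The cases to guard against are a NULL pdata and a length large enough for the variable-length tx_buff to overflow the stack. Below is a sketch of a checked version of the buffer-building step, using a fixed-size buffer. TWI_MAX_PAYLOAD (32 bytes here, an arbitrary assumption) and twi_build_write16() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TWI_MAX_PAYLOAD 32u  /* assumed largest payload for one register write */

/* Build "16-bit register address + payload" into tx.
 * tx must have room for TWI_MAX_PAYLOAD + 2 bytes.
 * Returns the number of bytes to transmit, or 0 if the arguments are rejected. */
size_t twi_build_write16(uint8_t *tx, uint16_t reg_addr,
                         const uint8_t *pdata, size_t length)
{
    if (tx == NULL || length > TWI_MAX_PAYLOAD) {
        return 0;
    }
    if (pdata == NULL && length != 0u) {
        return 0;
    }
    tx[0] = (uint8_t)(reg_addr >> 8);     /* upper address byte first */
    tx[1] = (uint8_t)(reg_addr & 0xFFu);  /* then the lower byte */
    if (length != 0u) {
        memcpy(&tx[2], pdata, length);
    }
    return length + 2u;
}
```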
