app_twi gets stuck in while(internal_transaction_in_progress)

Platform: Keil, SDK13.1, nRF52832

I'm investigating a series of seemingly random hangs in which app_twi gets permanently stuck in the following loop in app_twi.c:

while (p_app_twi->p_app_twi_cb->internal_transaction_in_progress)
{
    if (user_function)
    {
        user_function();
    }
}
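Because this loop calls user_function on every pass, one way to at least detect the hang is to pass a small spin-counting guard as the last argument of app_twi_perform() instead of 0. The sketch below is only an illustration: twi_wait_guard, twi_wait_guard_arm, TWI_WAIT_LIMIT and simulate_wait are hypothetical names, not part of app_twi, and simulate_wait is a host-side stand-in for the loop so the guard can be tried without hardware. Note that in the real loop the guard cannot end the wait by itself; it can only record, log, or trigger a reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: number of wait-loop passes tolerated before
 * the transfer is considered hung. Tune for the real clock speed. */
#define TWI_WAIT_LIMIT 100000u

static volatile uint32_t m_twi_wait_spins;
static volatile bool     m_twi_wait_expired;

/* Call before each blocking transfer to reset the guard state. */
static void twi_wait_guard_arm(void)
{
    m_twi_wait_spins   = 0;
    m_twi_wait_expired = false;
}

/* Has the void (*)(void) shape expected for user_function.
 * On target, the 'else' branch is where a breakpoint, a log call
 * or a register dump would go (assumption, not app_twi behaviour). */
static void twi_wait_guard(void)
{
    if (m_twi_wait_spins < TWI_WAIT_LIMIT)
    {
        m_twi_wait_spins++;
    }
    else
    {
        m_twi_wait_expired = true;
    }
}

/* Host-side stand-in for the wait loop quoted above. The transfer
 * "completes" after spins_until_done passes. Unlike the real loop,
 * it gives up once the guard has expired, so it always terminates.
 * Returns true on completion, false if the guard expired first. */
static bool simulate_wait(uint32_t spins_until_done,
                          void (*user_function)(void))
{
    volatile bool in_progress = true;
    uint32_t      passes      = 0;

    while (in_progress)
    {
        if (user_function)
        {
            user_function();
        }
        if (m_twi_wait_expired)
        {
            return false;
        }
        if (++passes >= spins_until_done)
        {
            in_progress = false;
        }
    }
    return true;
}
```

On target, the idea would be to call twi_wait_guard_arm() and then pass twi_wait_guard as the user_function argument of app_twi_perform(), checking m_twi_wait_expired from a debugger or a log statement.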

When this happens, the SCL line is permanently pulled low. I thought it might be due to clock stretching and checked whether one of the slave devices was misbehaving, but I wasn't able to find anything wrong on the slave side.

One of my slave devices (an Atmel ATtiny MCU) can report whether it is holding the SCL line. During one such lockup I was able to confirm that SCL was being held by the master (as far as the ATtiny register values can be trusted). I also reset the slave, and this did not release SCL. Resetting the nRF52 (master), however, did release SCL.

So I think something is wrong with the app_twi master implementation.

This code snippet reliably reproduces the symptoms described above:

#include <stdint.h>
#include <string.h>   //memcpy
#include "nrf_log.h"
#include "app_error.h"
#include "app_twi.h"
#include "sm_pin_mapper.h"

//TWI instance handle
APP_TWI_DEF(m_app_twi, APP_TWI_QUEUE_SIZE, 0);

//TWI slave address
#define TWI_SLAVE_ADDR 0x1D

//how many bytes to read from slave
#define TWI_SLAVE_REG_LEN 7

static ret_code_t twi_init(void)
{
    ret_code_t err_code;
    nrf_drv_twi_config_t twi_config = NRF_DRV_TWI_DEFAULT_CONFIG;

    twi_config.sda = SM_PIN_TWI_SDA;
    twi_config.scl = SM_PIN_TWI_SCL;

    err_code = app_twi_init(&m_app_twi, &twi_config);

    APP_ERROR_CHECK(err_code);

    return err_code;
}

//write the register address without STOP, then read len bytes (blocking)
static void read_reg(const uint8_t reg_addr, uint8_t* const p_buff, const uint8_t len)
{
    ret_code_t err_code;

    static uint8_t tx_buff[1];
    tx_buff[0] = reg_addr;
    app_twi_transfer_t const twi_transfers[] = {
        APP_TWI_WRITE(TWI_SLAVE_ADDR, tx_buff, sizeof(tx_buff), APP_TWI_NO_STOP),
        APP_TWI_READ(TWI_SLAVE_ADDR, p_buff, len, 0)
    };

    err_code = app_twi_perform(&m_app_twi, twi_transfers, sizeof(twi_transfers) / sizeof(twi_transfers[0]), NULL);
    APP_ERROR_CHECK(err_code);
}

//write the register address followed by len data bytes (blocking)
//stop==1 ends the transfer with a STOP condition, anything else omits it
//len must not exceed TWI_SLAVE_REG_LEN
static void write_reg(const uint8_t reg_addr, uint8_t* const p_buff, const uint8_t len, const uint8_t stop)
{
    ret_code_t err_code;
    uint8_t tx_buff[TWI_SLAVE_REG_LEN + 1] = { 0 };
    tx_buff[0] = reg_addr;
    memcpy(&tx_buff[1], p_buff, len);

    app_twi_transfer_t const twi_transfers[] = {
        APP_TWI_WRITE(TWI_SLAVE_ADDR, tx_buff, len + 1, (stop == 1) ? 0 : APP_TWI_NO_STOP)
    };

    err_code = app_twi_perform(&m_app_twi, twi_transfers, sizeof(twi_transfers) / sizeof(twi_transfers[0]), NULL);
    APP_ERROR_CHECK(err_code);
}

static void test_case(uint8_t tx_stop)
{
    uint8_t rx_buff[TWI_SLAVE_REG_LEN] = { 0 };
    uint8_t tx_buff[] = { 0 };

    write_reg(0, tx_buff, sizeof(tx_buff), tx_stop);
    read_reg(0, rx_buff, sizeof(rx_buff));

    NRF_LOG_RAW_HEXDUMP_DEBUG(rx_buff, sizeof(rx_buff));
}

void twi_slave_tiny_main_unwrap(void)
{
    APP_ERROR_CHECK(twi_init());

    //this will complete just fine
    test_case(1);

    //this will pull SCL low and get stuck forever
    test_case(0);
}

test_case(1) works as expected, while test_case(0) reliably causes app_twi to get stuck, as shown in the attached Saleae trace /cfs-file/__key/support-attachments/beef5d1b77644c448dabff31668f3a47-010a9164018145d884c8f96862005e33/twi_5F00_hang.logicdata and the register view screenshot:

https://devzone.nordicsemi.com/resized-image/__size/1920x1080/__key/support-attachments/beef5d1b77644c448dabff31668f3a47-010a9164018145d884c8f96862005e33/twi_5F00_hang.png 

While this happens, SCL is permanently pulled low. Resetting the nRF52 gets communication working again.

This is just one example. The hangs I'm experiencing occur in various combinations of TWI read/write operations.

Am I doing something wrong?

What can be done to debug this?

I'm observing similar problems in devices already deployed in the field, and I'd appreciate any pointers towards solving this.

  • If you can't update to a new SDK, then the only workaround I see is the one suggested earlier: "That is indeed correct, it seems to be an issue with the twi manager and easydma, so the workaround for now is to disable easydma (when using twi manager) or use the twi driver directly. I have reported it internally."

    There has been no update to the case I reported internally for SDKv13, but I do know there have been changes to both the TWI driver and the TWI manager in SDKv15 that may already have addressed your issue.

  • Does this mean that, for future projects, there is a change in SDK15 that addresses this problem? If so, please provide a pointer. We are currently considering a permanent switch to a competitor's platform because of the problems caused by this bug.

  • There has been no update to the case I reported internally for SDKv13, but I do know there have been changes to both the TWI driver and the TWI manager in SDKv15 that may already have addressed your issue.

  • Could you please check whether those changes actually fix this case? The way this support case was handled (no reply for 2 months, not even a 'won't fix') had significant consequences for my company. Working I2C is a top priority in our projects. There were also other bugs (i.e. in app_timer and nrf_log), and judging by some posts on this forum the nrf_log bug is still present in SDK15. I need to be 200% sure that the problem is solved before I consider your platform again.

  • Please understand that we handle several hundred cases every week, and our main focus is to find a workaround when a customer reports an issue; in this case two have been identified. The next step is to gather information on whether the same issue is seen in many customer designs, and to prioritize accordingly. For the TWI there have actually been very few reports, even though it is used in a large number of designs. This indicates that the underlying TWI driver works as intended, but that the TWI manager in this specific SDK release may have an issue with specific usage and configuration.
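
The first workaround quoted in the replies above (disabling EasyDMA while using the TWI manager) would typically be a configuration change. The fragment below is only a sketch, assuming the project's SDK 13 sdk_config.h exposes the per-instance TWI0_USE_EASY_DMA switch and that instance 0 is the one in use, as in the APP_TWI_DEF(m_app_twi, APP_TWI_QUEUE_SIZE, 0) line of the original post. Check your own sdk_config.h for the exact names before relying on it.

```c
/* sdk_config.h fragment (assumed SDK 13 layout, verify locally).
 * 0 is expected to make the TWI driver use the legacy TWI
 * peripheral for instance 0 instead of the EasyDMA-based TWIM. */
#ifndef TWI0_USE_EASY_DMA
#define TWI0_USE_EASY_DMA 0
#endif
```

After changing the setting, a full rebuild and a re-run of test_case(0) from the original post would show whether the hang still reproduces.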
