MCU Stops Responding

Hi Everyone,

I am working on a project that involves reading data from an accelerometer (MC3672) using an nRF52832.

Currently, sampling is driven by the data-ready pin: the GPIOTE module senses it and sets a flag, which is then processed in the main loop.
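For readers unfamiliar with this pattern, here is a minimal sketch of the flag handoff described above. All names are hypothetical stand-ins (none are from the nRF5 SDK or the mCube driver), and it is written so it can be compiled and exercised on a host rather than on the target. The one design choice it makes is to clear the flag before the slow work, on the assumption that an interrupt arriving during handling should be kept for the next pass.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from the nRF5 SDK. */

/* Set from interrupt context, consumed in the main loop. 'volatile' keeps the
 * compiler from caching the value across loop iterations. */
static volatile bool g_data_ready = false;

/* Counts how many times the main loop serviced the sensor. */
static uint32_t g_samples_handled = 0;

/* Stand-in for the GPIOTE event handler: do the minimum and return. */
void data_ready_isr(void)
{
    g_data_ready = true;
}

/* Stand-in for the SPI read of the accelerometer FIFO. */
static void read_sensor(void)
{
    g_samples_handled++;
}

/* One pass of the main loop. Returns true if a sample was handled. */
bool main_loop_poll(void)
{
    if (!g_data_ready) {
        return false;
    }
    /* Clear the flag before the slow work, so an interrupt that fires while
     * read_sensor() is running leaves the flag set for the next pass. */
    g_data_ready = false;
    read_sensor();
    return true;
}

/* Accessor used only to observe the sketch's behaviour. */
uint32_t samples_handled(void)
{
    return g_samples_handled;
}
```

Calling data_ready_isr() once and then main_loop_poll() twice handles exactly one sample; the second poll finds the flag already cleared.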

The program works fine for a few minutes, then the MCU suddenly stops responding; this shows up as an increase in current consumption from 2 mA to 9 mA. Initially I thought it was stuck in an endless loop. After some debugging I have narrowed it down to the line where it stops, but I have no idea what is causing this.

Code for reading data from the accelerometer in the main loop:

The MCU stops at the M_DRV_MC36XX_HandleINT(&accel_int) line, i.e. I see Handle Int printed but not Handle Int Done in the RTT viewer.

if(interrupt_flag)
{
    uint8_t n;
    spi_change(SPIM0_SS_ACCEL_PIN);
    s_m_drv_mc36xx_int_t accel_int;
    NRF_LOG_INFO("Handle Int\r\n");
    NRF_LOG_FLUSH();
    M_DRV_MC36XX_HandleINT(&accel_int);
    NRF_LOG_INFO("Handle Int Done\r\n");
    NRF_LOG_FLUSH();
    n = M_DRV_MC36XX_ReadDataBurst(accel,ACCEL_FIFO_THRESHOLD);
    spi_change(SPIM0_SS_NAND_PIN);
    interrupt_flag = false;
}

M_DRV_MC36XX_HandleINT(&accel_int) function 

Execution stops at _M_DRV_MC36XX_REG_WRITE(E_M_DRV_MC36XX_REG_STATUS_2, &_bRegStatus2, 1);

E.g. I see Accel H printed but not Accel E.

int M_DRV_MC36XX_HandleINT(s_m_drv_mc36xx_int_t *ptINT_Event)
{
    uint_dev _bRegStatus2 = 0;
    _M_DRV_MC36XX_REG_READ(E_M_DRV_MC36XX_REG_STATUS_2, &_bRegStatus2, 1);

    ptINT_Event->bWAKE =                                                    \
        _M_DRV_MC36XX_REG_STATUS_2_INT_WAKE(_bRegStatus2);
    ptINT_Event->bACQ =                                                     \
        _M_DRV_MC36XX_REG_STATUS_2_INT_ACQ(_bRegStatus2);
    ptINT_Event->bFIFO_EMPTY =                                              \
        _M_DRV_MC36XX_REG_STATUS_2_INT_FIFO_EMPTY(_bRegStatus2);
    ptINT_Event->bFIFO_FULL =                                               \
        _M_DRV_MC36XX_REG_STATUS_2_INT_FIFO_FULL(_bRegStatus2);
    ptINT_Event->bFIFO_THRESHOLD =                                          \
        _M_DRV_MC36XX_REG_STATUS_2_INT_FIFO_THRESH(_bRegStatus2);
    ptINT_Event->bSWAKE_SNIFF =                                             \
        _M_DRV_MC36XX_REG_STATUS_2_INT_SWAKE_SNIFF(_bRegStatus2);

    NRF_LOG_INFO("Accel H\r\n");
    NRF_LOG_FLUSH();
    /** clear interrupt flag */
#ifdef    M_DRV_MC36XX_CFG_BUS_SPI
    _M_DRV_MC36XX_REG_WRITE(E_M_DRV_MC36XX_REG_STATUS_2, &_bRegStatus2, 1);
#endif
    NRF_LOG_INFO("Accel E\r\n");
    NRF_LOG_FLUSH();
    return (M_DRV_MC36XX_RETCODE_SUCCESS);
}

The function calls the SPI function to transfer the data 

#define _M_DRV_MC36XX_REG_WRITE(bRegAddr, pbDataBuf, bLength) \
mcube_write_regs(0, 0, bRegAddr, pbDataBuf, bLength)

This is the bit I don't understand: all of the code executes fine, and I can see Accel done printed before the MCU crashes.

int8_t mcube_write_regs(bool bSpi, uint8_t chip_select, uint8_t reg,       \
                         uint8_t *value, uint8_t size)
{
    /** Please implement I2C/SPI write function from platform SDK */
    /** 0 = SPI, 1 = I2C */
    uint8_t CMD[1+size];
    CMD[0] = reg;
    memcpy(&CMD[1], value, size);
    if(!bSpi) {
        CMD[0] &= ~(1UL << 7);
        CMD[0] |= 1UL << 6;
        /** SPI write function */
        Serialize_SPI(CMD,1+size,NULL,0,true);
        NRF_LOG_INFO("Accel done\r\n");
        NRF_LOG_FLUSH();
//      Serialize_SPI_Write(CMD,1+size,true);
    } else {
        /** I2C write function */
    }

    return 0;
}

I am still trying to find a way to analyze this with a Saleae logic analyser, since they don't support repeat triggering and this problem occurs only once, at a random time.

Thanks for the help.

Rgs,

Bryan Hsieh

  • Hello Bryan,

    I see that you have added NRF_LOG_FLUSH() inside M_DRV_MC36XX_HandleINT().

    I don't know where your first snippet is from (an interrupt or the main() loop), but you should not add NRF_LOG_FLUSH() or NRF_LOG_PROCESS() anywhere unless you mean to shut down immediately after. They should only be used in the main() loop and in error handlers, because they are not what we call thread-safe / thread-aware functions. What could happen (and is probably happening) in your case is that NRF_LOG_FLUSH() or NRF_LOG_PROCESS() is called regularly from your main loop, either directly or indirectly from a function called something like "idle_state_handle()". If an interrupt arrives while that NRF_LOG_PROCESS() call is halfway through, and you then call NRF_LOG_PROCESS() again from the interrupt, your nRF52832 will crash, which is probably what you are seeing.

    If you are having problems with too much log output for the log module to handle (which I guess is the reason you added NRF_LOG_FLUSH() in another location), I suggest you either increase the log buffer size or turn off deferred logging. Both are done from sdk_config.h; look for NRF_LOG_BUFSIZE and NRF_LOG_DEFERRED.
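    To illustrate the split described above, here is a rough host-side sketch of deferred logging: producers only queue entries, and a single context drains them. The demo_* names are hypothetical stand-ins for NRF_LOG_INFO, NRF_LOG_PROCESS and NRF_LOG_FLUSH; this is not the nrf_log implementation, and the push shown here is single-threaded, so a real version would need it to be interrupt-safe.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SDK logger; not the nrf_log implementation. */
#define DEMO_LOG_CAPACITY 8u

static const char *g_queue[DEMO_LOG_CAPACITY];
static size_t g_head = 0;     /* index of the oldest queued entry */
static size_t g_count = 0;    /* number of entries waiting */
static size_t g_emitted = 0;  /* number of entries drained so far */

/* Plays the role of a deferred NRF_LOG_INFO: it only queues the message.
 * Returns false if the buffer is full and the entry was dropped. */
bool demo_log_push(const char *msg)
{
    if (g_count == DEMO_LOG_CAPACITY) {
        return false;
    }
    g_queue[(g_head + g_count) % DEMO_LOG_CAPACITY] = msg;
    g_count++;
    return true;
}

/* Plays the role of NRF_LOG_PROCESS: drains one entry.
 * Returns true if more entries remain, false otherwise. */
bool demo_log_process(void)
{
    if (g_count == 0) {
        return false;
    }
    /* A real backend would write g_queue[g_head] to RTT or UART here. */
    g_head = (g_head + 1) % DEMO_LOG_CAPACITY;
    g_count--;
    g_emitted++;
    return g_count > 0;
}

/* Plays the role of NRF_LOG_FLUSH: process until the queue is empty.
 * Intended to be called from one context only (main loop or error handler). */
void demo_log_flush(void)
{
    while (demo_log_process()) {
        /* keep draining */
    }
}

/* Accessor used only to observe the sketch's behaviour. */
size_t demo_log_emitted(void)
{
    return g_emitted;
}
```

    In this arrangement an interrupt handler would only ever call demo_log_push(), so the draining code is never re-entered halfway through.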

    Best regards,

    Edvin

  • I tried to figure out where it says that you shouldn't call NRF_LOG_PROCESS() from elsewhere (NRF_LOG_FLUSH() is just a repeated call to NRF_LOG_PROCESS()), but I only found this in nrf_log_ctrl.h:

    /**@brief Macro for processing a single log entry from a queue of deferred logs.
     *
     * You can call this macro from the main context or from the error handler to process
     * log entries one by one.
     *
     * @note If logs are not deferred, this call has no use and is defined as 'false'.
     *
     * @retval true    There are more logs to process in the buffer.
     * @retval false   No more logs in the buffer.
     */
    #define NRF_LOG_PROCESS()    NRF_LOG_INTERNAL_PROCESS()

  • Hi Edvin,

    Thanks for the prompt response. I do not think the error is related to the log processing, as those calls were added after I noticed the MCU would stop advertising randomly. But thanks for the best-practice advice on logging.

  • Ah, maybe I misunderstood then.

    So even if you remove the additional NRF_LOG_FLUSH() you are still stuck somewhere, right? Have you tried debugging and stepping through mcube_write_regs()? Where does it stop?

    Is your SPI set up with an event handler or without? And in what context is mcube_write_regs() called? Is it from inside an interrupt, or from main() (in the main()'s while loop?)

    I suspect that you are facing an interrupt priority blocking issue.
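    One way to make a priority-blocking problem visible is to bound any wait on a completion flag, so that a handler that never runs shows up as a timeout instead of a silent hang. A minimal sketch, assuming the SPI wrapper waits on a flag set by the event handler; wait_for_flag and its parameters are hypothetical, and how Serialize_SPI actually waits is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: spin on a completion flag, but give up after
 * max_spins polls. Returns true if the flag was seen set, false on timeout.
 * The flag is 'volatile' because an interrupt handler is expected to set it. */
bool wait_for_flag(const volatile bool *done, uint32_t max_spins)
{
    for (uint32_t i = 0; i < max_spins; i++) {
        if (*done) {
            return true;
        }
    }
    /* One last check so a zero spin budget still reports a set flag. */
    return *done;
}
```

    On a timeout the caller could log the failure or hit a breakpoint, which narrows down whether the handler is being blocked.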

    Best regards,

    Edvin

  • To Edvin,

    Hi, sorry for the late reply.

    I thought the problem was resolved but upon testing the device recently, the same CPU locking behaviour occurred again. 

    The SPI is set up with an event handler and the mcube_write_regs is called from the main loop. 

    The issue is not interrupt related, as the CPU locking occurs only after the SPI transfer has finished, i.e. after the SPI_done flag has been set to true by the SPI interrupt handler. The CPU is therefore not stuck in a while loop; it crashes after the Serialize_SPI(CMD,1+size,NULL,0,true) call, which is what is confusing.

  • Using the parameter size to define a stack allocation and (worse) a memcpy() would be flagged as an error by MISRA. If size is ever 0 there will be a crash or something horrible; fix this by using a predetermined maximum buffer size (not an issue here due to the +1), or by skipping operations like memcpy() when size == 0. This is possibly not the issue, but it is worth addressing, as such errors are often very random and hard to catch.
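    A sketch of what that could look like, with the frame built into a caller-supplied fixed-size buffer and the length checked first. build_write_frame and DEMO_MAX_WRITE_LEN are hypothetical (the 16-byte bound is an assumption, not a value from the mCube driver); the address-byte handling copies the posted code (clear bit 7, set bit 6).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed upper bound on payload length; not a value from the driver. */
#define DEMO_MAX_WRITE_LEN 16u

/* Builds an SPI write frame: [address byte][payload bytes].
 * Returns the frame length, or -1 if the arguments are invalid or the
 * frame would not fit in the caller's buffer. */
int build_write_frame(uint8_t reg, const uint8_t *value, size_t size,
                      uint8_t *frame, size_t frame_cap)
{
    if (frame == NULL || size > DEMO_MAX_WRITE_LEN || frame_cap < size + 1u) {
        return -1;
    }
    if (size > 0u && value == NULL) {
        return -1;
    }

    /* Same address-byte handling as the posted code: clear bit 7, set bit 6. */
    frame[0] = (uint8_t)((reg & 0x7Fu) | 0x40u);

    /* Skip the copy entirely when there is no payload. */
    if (size > 0u) {
        memcpy(&frame[1], value, size);
    }
    return (int)(size + 1u);
}
```

    The caller would declare something like uint8_t frame[1 + DEMO_MAX_WRITE_LEN] once and pass the returned length on to the SPI transfer, so the stack usage no longer depends on a run-time value.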

  • Thanks, but unfortunately the problem is unrelated to that: as you said, the buffer size always includes the +1. Also, _M_DRV_MC36XX_REG_WRITE (the call after which the program fails) always writes with a size of 1, yet the MCU still freezes randomly after a few minutes.

  • Hello,

    Sorry, I was out of office, and I will be next week as well. Unfortunately, due to limited staffing during the summer, I am not certain my colleagues will have the time to look into unhandled "old" cases like this one. 

    bryanhsieh said:
    Therefore the CPU is not stuck in a while loop, it crashes after the Serialize_SPI(CMD,1+size,NULL,0,true) function, which is what is confusing

    This is not our function. What does it do? Did you try debugging it to see what's going on? Are there places inside (or after, I didn't understand where you meant) that the application can be stuck?

    Best regards,

    Edvin
