Serial comm stops after 8-10h.

I have developed a custom board with an nRF52810 (Rigado), a standalone CAN controller (500 kbaud) and serial communication (57600 baud).


The SW is based on ble_app_uart from SDK 15.3.0 with SoftDevice S112, modified with SPI support. The custom board is flashed using the nRF52832 DK and Segger.

The nRF52810 runs without an external LF crystal (RC only).
The CAN controller uses SPI at 4 MHz with EasyDMA. SPI_DEFAULT_CONFIG_IRQ_PRIORITY is set to 3 (default is 6).
The serial communication receives 10 Hz GPS data.

The data from the SPI and the serial comm are handled by interrupt service routines (one for SPI and one for serial comm), which put the data into global static structs.

The CAN data (from SPI) is transmitted over BLE every 40 ms, and the data from the serial comm is transmitted over BLE every 100 ms (app_timers read the global static structs). Each payload is approx. 100 bytes.

Everything is running fine for approx. 8-10h.

However, after 8-10h the serial communication stops (it no longer receives). The CAN/SPI communication and the transmission over BLE continue to run as normal.

Power cycling my custom board does not help. Sometimes a reflash is needed to get the serial comm up and running again. Sometimes it is enough to just attach the nRF52832 DK's VDD, GND, SWD IO and SWD CLK pins to the corresponding pins on my custom board.


Any suggestions as to why my serial comm stops after 8-10h?

Any suggestions as to why power cycling does not result in a clean boot that resets the serial comm?

  • Hi,

    Do you get any error codes in the application when the serial interface stops working? Are you using app_uart_fifo, like it is used in ble_app_uart, or do you use another library/driver for UART?

    Can you provide the state of the registers in the UART(E) peripheral when the device is in the error state? Have you checked the UART lines with a logic analyzer, before and after the issue occurs, to see if there are any changes to the communication between the devices?

    Best regards,
    Jørgen

  • Thanks for your response.

    UART config as in ble_app_uart using app_uart_fifo:

    static volatile ret_code_t err_code;
    static app_uart_comm_params_t comm_params =
    {
        .rx_pin_no    = RX_PIN_NUMBER,
        .tx_pin_no    = TX_PIN_NUMBER,
        .rts_pin_no   = RTS_PIN_NUMBER,
        .cts_pin_no   = CTS_PIN_NUMBER,
        .flow_control = APP_UART_FLOW_CONTROL_DISABLED,  // no RTS/CTS towards the GPS
        .use_parity   = false,
    };

    comm_params.baud_rate = useBaudRate;  // baud rate is filled in separately
    APP_UART_FIFO_INIT(&comm_params,
                       UART_RX_BUF_SIZE,
                       UART_TX_BUF_SIZE,
                       uart_event_handle,
                       APP_IRQ_PRIORITY_LOWEST,
                       err_code);
    APP_ERROR_CHECK(err_code);

    UART event handler:

    void uart_event_handle(app_uart_evt_t * p_event)
    {
    //    static uint8_t data_array[BLE_NUS_MAX_DATA_LEN];  //BLE_NUS_MAX_DATA_LEN ~244 bytes
        static uint8_t data_array[128];
        static uint8_t index = 0;
        
        uint8_t  indexLastReceived = 0;
        uint8_t  status            = _EMPTY;
        
        switch (p_event->evt_type)
        {
            case APP_UART_DATA_READY:
                UNUSED_VARIABLE(app_uart_get(&data_array[index])); //typecast to void
                indexLastReceived = index;  //save current received
                index++;                    //prepare next char
                if (index >= sizeof(data_array))
                {
                  index=0;
                }
    
                if (data_array[indexLastReceived] == '\n')  
                {
                     //NRF_LOG_DEBUG("GPS data received");
                     index = 0; //prepare next char
                     data_array[indexLastReceived] = '\0'; //replace \n with \0 to make a NUL-terminated message
                     if (sd_mutex_acquire(&p_GPSmutex)==NRF_SUCCESS) //allocate struct
                     {
                        //Read GPS data
                        status = gps_location(&gpsDataReader, data_array); //decode GPS data and parse to the shared static struct gpsDataReader
                        if (status == NMEA_GPGGA)
                        {
                            gpsDataUpdated = true;
                            timeoutCounter = 0;
                        }
                        sd_mutex_release(&p_GPSmutex);
                     } //end mutex
                } //endif data_array[indexLastReceived]
                break;
    
            case APP_UART_COMMUNICATION_ERROR:
                //APP_ERROR_HANDLER(p_event->data.error_communication);
                app_uart_flush();
                index = 0;  //=> skip all received chars
                serialCommErrCounter++;
                break;
    
            case APP_UART_FIFO_ERROR:
                //APP_ERROR_HANDLER(p_event->data.error_code);
                app_uart_flush();
                index = 0; //=> skip all received chars
                break;
    
            default:
                break;
        }
    }
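
    The line-assembly part of the handler above can also be tried off-target. The following is only a minimal sketch, not the code from this project: line_asm_t, line_asm_reset() and line_asm_feed() are hypothetical names, and unlike the handler above it discards an over-long line instead of wrapping the index. On the target, line_asm_feed() would be called with each byte returned by app_uart_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BUF_SIZE 128u  /* same size as data_array in the handler */

/* Hypothetical helper type: collects bytes until '\n' is seen. */
typedef struct
{
    uint8_t buf[LINE_BUF_SIZE];
    size_t  len;       /* number of bytes stored so far */
    bool    overflow;  /* true if the current line did not fit in buf */
} line_asm_t;

/* Clear the assembler, e.g. at start-up or after a UART error event. */
static void line_asm_reset(line_asm_t *la)
{
    assert(la != NULL);
    memset(la, 0, sizeof(*la));
}

/* Feed one received byte.
 * Returns true when buf holds a complete, NUL-terminated line
 * (without the trailing "\r\n"); the next call starts a new line. */
static bool line_asm_feed(line_asm_t *la, uint8_t byte)
{
    assert(la != NULL);

    if (byte == '\n')
    {
        /* NMEA lines end in "\r\n": strip the '\r' if present. */
        if (la->len > 0 && la->buf[la->len - 1] == '\r')
        {
            la->len--;
        }
        la->buf[la->len] = '\0';

        bool complete = !la->overflow && la->len > 0;
        la->len      = 0;
        la->overflow = false;
        return complete;
    }

    if (la->len < LINE_BUF_SIZE - 1u)
    {
        la->buf[la->len++] = byte;  /* keep one byte free for the NUL */
    }
    else
    {
        la->overflow = true;        /* line too long: discard it at '\n' */
    }
    return false;
}
```

    With this shape, the APP_UART_COMMUNICATION_ERROR and APP_UART_FIFO_ERROR cases would call line_asm_reset() where they now set index = 0.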

    On error events, I (temporarily) call app_uart_flush() instead of app_error_handler(). When debugging (using SES) with a breakpoint at APP_UART_COMMUNICATION_ERROR, I get too many overruns. I guess that is because the debugger has problems keeping up. The GPS device is continuously pushing data to the serial port (no flow control)...

    I have not yet checked with a scope whether the GPS device stops pushing data, but I doubt it does.

    I still need to capture the UART(E) registers when the serial comm stops.




  • Update: I experienced a lot of APP_UART_COMMUNICATION_ERROR events, hundreds per minute. Reason: "start bit received while previous data still lies in RXD" (i.e. an overrun error).

    I changed UART_DEFAULT_CONFIG_IRQ_PRIORITY from 6 to 2, and the IRQ priority passed to APP_UART_FIFO_INIT() from APP_IRQ_PRIORITY_LOWEST to APP_IRQ_PRIORITY_LOW. That took me from hundreds of APP_UART_COMMUNICATION_ERROR events per minute to ~1 per minute.

    I then reconfigured the GPS to send only the two NMEA sentences in use, and could then reduce the baud rate from 57600 to 38400. Rock steady now, with zero APP_UART_COMMUNICATION_ERROR events.

    If I understand ble_app_uart correctly, app_uart_fifo uses EasyDMA with a one-byte buffer only. I had to use a baud rate of 57600 to receive all NMEA sentences from the GPS within 100 ms. That resulted in an interrupt every ~174 µs and almost 100% duty cycle. By reducing to 38400 and reconfiguring the GPS I got a duty cycle of 40% and an interrupt every ~260 µs.
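
    The intervals above follow from the frame length. A minimal sketch of the arithmetic, assuming 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits per byte on the wire, in line with use_parity = false); the function names are made up for illustration:

```c
#include <assert.h>

#define BITS_PER_BYTE_ON_WIRE 10.0  /* assumption: 8N1 = start + 8 data + stop */

/* Time between two back-to-back bytes, in microseconds. */
static double byte_interval_us(double baud)
{
    assert(baud > 0.0);
    return BITS_PER_BYTE_ON_WIRE * 1e6 / baud;
}

/* Largest number of bytes that fit on the line in one period. */
static double max_bytes_per_period(double baud, double period_ms)
{
    return period_ms * 1000.0 / byte_interval_us(baud);
}

/* Fraction of the period during which the line is busy (0.0 to 1.0). */
static double line_duty_cycle(double baud, double bytes_per_period, double period_ms)
{
    return bytes_per_period / max_bytes_per_period(baud, period_ms);
}
```

    byte_interval_us(57600) gives about 173.6 µs and byte_interval_us(38400) about 260.4 µs, which matches the interrupt intervals above. max_bytes_per_period(38400, 100) is 384, so a 40% duty cycle corresponds to roughly 154 bytes per 100 ms GPS update.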

    I am not sure whether these performance issues were the reason why the serial comm stopped. I have to let it run for ~10 hours to see...

  • Preliminary conclusion: I changed UART_DEFAULT_CONFIG_IRQ_PRIORITY back to the default (6) and have tested several runs now. The application runs almost rock steady: 5 comm errors in ~9 million bytes.

    It seems clear that the issues I experienced were due to the application not being able to handle the UART communication load at speeds above 38400 baud. I realize there are many tickets regarding UART communication and the need for EasyDMA, but, as I see it, there are no good solutions yet.

    I have considered the experimental solution (libuarte), but I am not sure it will meet my need to run at 115200 baud with the SoftDevice (it needs 9 PPI channels and one timer on the nRF52810).

    I have decided to add a MAX3107 UART to my hardware. The MAX3107 has an SPI interface, a 256 byte FIFO, and can be configured to raise an interrupt on LF (line feed) as well as on a timeout since the last byte received.
