Communication breakage between nrf52840 chip and host (imx6ull) on stress testing

When the Thread-layer communication is stress tested for a period of time (around 30 minutes to 1 hour), the UART link (115200 baud) between the host (imx6ull-based SoC) and the nrf52840 chip gets disconnected, and it does not recover until otbr-agent is restarted.

To reproduce the issue, commands were sent continuously from the controller to the development kit.

1. 868b8d791fac9752d154ef0f0614ca15019056a9 - otbr-agent commit id (github.com/.../ot-br-posix)
2. IEEE 802.15.4 hardware platform - nrf52840
3. Build was done using Yocto recipe.
4. Network topology - Star topology (direct communication between controller and development kit)

Expected behaviour is no communication between the nrf52840 chip and the host(imx6ull based SoC).

Should we raise the transmit buffer size, and if so, how can it be done?
Or is there a way to increase the TX timeout of a packet to a higher value, and would that help resolve the issue?

  • The expected behaviour is that communication breakage should not happen between the nrf52840 chip and the host (imx6ull-based SoC).
    Sorry for the confusion.

  • Hello,

    What is programmed on the nRF52840? Is it the RCP or NCP sample? Did you do any changes to the application before flashing it?

    What HW are you running on (on the nRF)? Is it a DK or a custom board?

    Did you try to analyze the UART wires? Does the otbr actually send data that the nRF doesn't respond to, or is it dead? (using a logic analyzer, such as the saleae logic analyzer).

    Best regards,

    Edvin

  • Hi Edvin,

    As suggested, the above two configurations were enabled in the prj.conf. The host is able to communicate properly with the NCP module; however, only a few debug messages were seen in the RTT terminal, as captured in the excerpt below:

    SEGGER J-Link V7.86d - Real time terminal output
    SEGGER J-Link V9.7, SN=59701277
    Process: JLinkExe
    [00:00:00.002,349] <inf> ieee802154_nrf5: nRF5 802154 radio initialized
    [00:00:00.002,532] <err> qspi_nor: JEDEC id [c2 28 16] expect [c2 28 17]
    *** Booting Zephyr OS build v3.2.99-ncs2 ***
    [00:00:00.004,211] <inf> coprocessor_sample:

    =========================================================
    OpenThread Coprocessor application is now running on NCS
    =========================================================
     


    Are we missing any other configurations?

    In addition, we would like to point out that PTA configurations were also enabled as part of the RCP firmware. Is this HandleRcpTimeout() issue due to enabling PTA? Any input would be very helpful.

  • At least according to the log, there are no crashes then. 

    Vignesh Ravi said:
    Is this HandleRcpTimeout() issue due to enabling PTA?

    Not sure. I have not tested it before. Does it crash if you disable it?

    Did you try to analyse the UART pins? Do you use flow control on the UART?

    BR,

    Edvin

  • Hi Edvin,

    >> At least according to the log, there are no crashes then.    
    For this particular test run, we did not wait until the host got disconnected from the RCP. However, we were expecting runtime logs to be captured continuously in the RTT terminal, but only a few lines were captured and then the logs stopped. Are we missing any other configurations needed to see the runtime and crash-related logs in the RTT terminal? Your support would be much appreciated.
       



    >> Did you try to analyse the UART pins? Do you use flow control on the UART?
       
    We are yet to start analysing the UART pins. However, the HW flow control is enabled.

  • Hello,

    I spoke with our Thread Team. They wanted to ask you a few questions regarding this issue:

    1: Is it possible to change the UART baudrate from 115200 to 1000000? At a baudrate of 115200, the radio is capable of delivering more data than the UART can handle, so the issue could be that this is overflowing. In that case, flow control will not help, because data is lost due to a UART TX buffer overflow on the nRF.

    2: Regardless of whether it is possible to change to 1M baudrate on UART, where/how did you set the baudrate on the nRF's UART?

    3: They also asked for more details on what you are building with Yocto, and how. We assume that it is the border router itself (not the nRF) that is built using Yocto. Can you please verify that? And can you share some more details on this?

    4: And can you share some details on how the nRF and the imx6ull are connected? Are they connected directly via UART, or do you use some UART -> USB bridge? A voltage level shifter? Anything else in between, or is it just PCB traces carrying the UART signal?

    Best regards,

    Edvin
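
The baud-rate concern in question 1 above can be put into rough numbers. The following is only a back-of-the-envelope sketch in Python, not something taken from the thread: it assumes 8N1 UART framing (10 bits on the wire per byte) and uses the 250 kbit/s raw bit rate of the 2.4 GHz IEEE 802.15.4 PHY. It ignores radio-side overhead (preamble, CSMA backoff, ACKs) and UART-side framing overhead, so the real margins will differ. The function names are illustrative.

```python
# Rough comparison of raw radio throughput and UART throughput.
# Assumption (not stated in the thread): the UART uses 8N1 framing.

RADIO_BITRATE_BPS = 250_000  # raw bit rate of the 2.4 GHz IEEE 802.15.4 PHY
UART_BITS_PER_BYTE = 10      # 1 start + 8 data + 1 stop bit (assumed 8N1)


def uart_bytes_per_second(baud: int) -> float:
    """Upper bound on payload bytes per second the UART can carry."""
    return baud / UART_BITS_PER_BYTE


def radio_bytes_per_second() -> float:
    """Upper bound on bytes per second the radio can deliver."""
    return RADIO_BITRATE_BPS / 8


def uart_keeps_up(baud: int) -> bool:
    """True if the UART's upper bound is at least the radio's upper bound."""
    return uart_bytes_per_second(baud) >= radio_bytes_per_second()


if __name__ == "__main__":
    for baud in (115_200, 1_000_000):
        print(
            f"{baud:>9} baud: UART {uart_bytes_per_second(baud):.0f} B/s, "
            f"radio {radio_bytes_per_second():.0f} B/s, "
            f"UART keeps up: {uart_keeps_up(baud)}"
        )
```

Under these assumptions the UART at 115200 baud tops out at 11520 B/s against a radio ceiling of 31250 B/s, while 1000000 baud gives 100000 B/s. Hardware flow control only paces the link; it does not add capacity, which is consistent with the overflow scenario described in question 1.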

  • Hi Edvin,

    Thanks for discussing with the Thread team. We shall compile our response for all your queries and get back as quickly as possible.
