Central uart on thingy91:nrf52840: unstable ble_data_received

Hi,

I am running the central_uart example on thingy91:nrf52840. It is paired with a Thingy52 running the peripheral_uart example.

Everything seems to run fine; I send and receive a message every 5 seconds.
My problem is that ble_data_received alternates between triggering when the buffer is full (40 bytes) and triggering at the end of a message, which is normally 15-20 bytes.

How can I control this? Is there a timing setting or an end-of-message character I can use?

Rune

/*
* Copyright (c) 2018 Nordic Semiconductor ASA
* Modified by rv
* SPDX-License-Identifier: LicenseRef-Nordic-5-Clause
*/
/** @file
* @brief Nordic UART Service Client sample
*/
#include <errno.h>
#include <zephyr/kernel.h>
#include <zephyr/device.h>
#include <zephyr/devicetree.h>
#include <zephyr/sys/byteorder.h>
#include <zephyr/sys/printk.h>
#include <zephyr/bluetooth/bluetooth.h>
#include <zephyr/bluetooth/hci.h>
#include <zephyr/bluetooth/conn.h>
#include <zephyr/bluetooth/uuid.h>

  • Hei Rune,

    The ble_data_received() callback should be invoked each time you receive a BLE notification packet from the peripheral regardless of what the packet size is. How do you determine that the last 15-20 byte message is not received, are you looking at the messages relayed to the uart interface, or at the debug log messages on RTT (those printed with the LOG_* macros)?

    Best regards,

    Vidar

  • Hi Vidar,

    I am looking at the log messages on RTT. The peripheral receives a message on bt_nus from the central unit and passes it through to the UART. The reply from the UART is then sent to the central UART through bt_nus (a standard UART bridge).
    I can see on RTT that the peripheral unit is sending and receiving correctly. On the central unit, ble_data_received() triggers when the buffer is full (40 bytes) and the data content is 2 or 3 messages.
    This is not constant: it can suddenly work as expected and trigger on one message, and then go back to triggering on buffer full.

    Br

    Rune
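
A minimal sketch of the end-of-message idea raised in the question, in plain C so it can be tried off-target. It assumes the sender terminates every message with a newline, which the thread does not confirm. The names `msg_assembler`, `msg_assembler_feed`, `msg_log` and `log_message`, and the 64-byte limit, are hypothetical and not part of the sample. The bytes handed to ble_data_received() would be fed in as they arrive, so it no longer matters whether one callback carries part of a message or several messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_MAX   64    /* assumed upper bound for one message */
#define MSG_DELIM '\n'  /* assumed end-of-message character */

/* Accumulates bytes across chunks until MSG_DELIM is seen. */
struct msg_assembler {
	uint8_t buf[MSG_MAX];
	size_t len;
	bool overflow;  /* true while an overlong message is being discarded */
};

typedef void (*msg_handler_t)(const uint8_t *msg, size_t len, void *user);

/* Feed one received chunk; handler runs once per complete message. */
static void msg_assembler_feed(struct msg_assembler *a, const uint8_t *data,
			       size_t len, msg_handler_t handler, void *user)
{
	assert(a != NULL && data != NULL && handler != NULL);

	for (size_t i = 0; i < len; i++) {
		if (data[i] == MSG_DELIM) {
			if (!a->overflow) {
				handler(a->buf, a->len, user);
			}
			a->len = 0;
			a->overflow = false;
		} else if (a->overflow) {
			/* Still discarding until the next delimiter. */
		} else if (a->len < MSG_MAX) {
			a->buf[a->len++] = data[i];
		} else {
			/* Message too long: drop it and resynchronise. */
			a->len = 0;
			a->overflow = true;
		}
	}
}

/* Example handler: counts messages and keeps a copy of the last one. */
struct msg_log {
	unsigned int count;
	char last[MSG_MAX + 1];
};

static void log_message(const uint8_t *msg, size_t len, void *user)
{
	struct msg_log *log = user;

	memcpy(log->last, msg, len);
	log->last[len] = '\0';
	log->count++;
}
```

With this approach a 40-byte chunk holding two and a half messages produces two handler calls, and the remaining half is completed by the next chunk.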

  • Hi Rune,

    The log shows you have a stack overflow in your main thread. This can be fixed by increasing the stack size (CONFIG_MAIN_STACK_SIZE). But it does not make sense that you get this on one board and not the other if you are running the same FW build and the same boards. 

    I notice you have debug options enabled now; this will lead to some increase in stack usage. Do you also get a stack overflow if you uncheck the "enable debug options" checkbox in your project configuration?

    With regards to the original problem you reported, I am still waiting for the developers to have a look at it and find a root cause.

    Best regards,

    Vidar

  • Thanks Vidar,

    Disabling the debug option made the error disappear.

    Regards

    Rune

  • Presently using nRF5340, NCS 2.2.0

    I am seeing this UART crash relating to uart_rx_disable() / uart_rx_enable(), even when using NCS 2.2.0

    This was mentioned above as a fix in NCS 2.1.3


    I have a shell running on our USB CDC.

    UART0 is enabled for async API, and UART is communicating to an external AT modem. (115K2, N, 8, 1, no flow control)

    In our application, when I perform a uart_rx_disable(), I will subsequently call uart_rx_enable() about 1.5 seconds later.

    After the uart_rx_enable(), the crash will occur only after some subsequent transmit / receive activity on UART0. I can send the AT commands via shell commands. It only takes 5-20 AT/OK commands for the spontaneous crash to occur.

    If I comment any code relating to uart_rx_disable() / uart_rx_enable(), then all UART0 AT communications remain flawless forever.


    I would be curious if you have any insights into `I suspect the issue may be related to the UART async adapter which is enabled when USB CDC is used`


    I suspect it is related to the USB CDC.

    If I disable the "debug option" during my build, this crash still occurs.

  • What I recall from debugging this is that I sometimes did not receive the UART_RX_BUF_RELEASED event from the async adapter. But I did not really debug it much further. Instead, I reported it as a bug internally to let the developers continue the investigation. 

    Below are the commits that fixed the issue mentioned in the v2.1.3 release notes (they are also included in v2.3.0):

    https://github.com/nrfconnect/sdk-zephyr/commit/595fbf9b975103eede31c2b5c68a392517f252f6 

    https://github.com/zephyrproject-rtos/zephyr/commit/c984a343afdb06abb7cf8cca03c00eacdf88675e 

    https://github.com/zephyrproject-rtos/zephyr/pull/52340/commits/4a9c18f806914a910163e65da417a34becb92531 

    And based on the commit messages, it does not sound like they are addressing the problems OP reported here.

    mtsunstrum said:

    I suspect it is related to the USB CDC.

    If I disable the "debug option" during my build, this crash still occurs.

    Is it an option to try v2.3.0 to see if it fixes the problem you experienced?

    Thanks

    Vidar

  • Thanks for the response.

    I won't yet try out NCS 2.3.0, but I found a workaround that may highlight where the issue may reside.

    In my application, I have an external AT modem where at times I need to reset it.

    This causes the AT modem to send a UART break signal to the nRF5340, which lasts about 600ms.

    Before resetting the AT modem, I would call uart_rx_disable() first, so as not to trigger UART framing/break errors. I then subsequently call uart_rx_enable() about 1500ms later, after the AT modem has stabilized.

    After uart_rx_enable() via USB shell interface, I would send simple "AT" messages, and get an "OK" back from the modem. But after about 2-8 attempts, I would get the random crash.


    To work around the issue, what I did was not call uart_rx_disable().

    In this scenario, once we get the UART break signal from the AT modem, our UART callback handler gets about 4 UART_RX_STOPPED events with reason codes 4 and 8 (framing error and break signal), followed by a UART_RX_DISABLED event.

    Upon receiving these events, I take no action. In this scenario, the UART driver takes care of disabling UART receive.

    In my code, after resetting the AT modem, I simply sleep for 1500ms and then call uart_rx_enable().

    AT command operations were flawless after this.


    This makes me suspect that the programmatic uart_rx_disable() call is the problem. If I just let UART receive disable itself upon receiving a UART break condition as described above, it works.

  • It sounds like you have a good workaround in place. I still think what you describe may be a result of the UART driver bug I mentioned. The fix was included in v2.1.3 and v2.3.0, but not in v2.2.0, which is the oldest release of the three.

    mtsunstrum said:
    This causes the AT modem to send a UART break signal to the nRF5340, which lasts about 600ms.

    Assuming the modem is not driving the signal low, it should be possible to avoid the break signal by enabling the internal pull-up on the RX line, as we are doing in our board files: https://github.com/nrfconnect/sdk-zephyr/blob/main/boards/arm/nrf52840dk_nrf52840/nrf52840dk_nrf52840-pinctrl.dtsi#L15

  • Hi, thanks for your reply.

    I thought I had it under control, and it worked most of the time (maybe 80% fewer lockups), so likely I was on the right path.

    Unfortunately, the only way out of the situation was for my firmware to do a cold reset, after a UART break had been detected. Fortunately, my application logic allowed me to use this "hacky" workaround.

    You mentioned the fix not being in v2.2.0, although it was in the earlier v2.1.3 and the later v2.3.0. Is it correct that v2.2.0 did not have the fix?

    Yes, in my situation, the modem is driving the break signal, so usage of UART RX pullup would not help me out.

    Cheers

  • Hi,

    Yes, v2.2.0 does not have the fix, which may seem a bit counterintuitive. The thing is that v2.1.3 was a minor release made after v2.2.0 had been tagged, with fixes backported from v2.2.0 and the main branch. Ideally there should have been a 2.2.1 version to address this problem, but it was not prioritized.

    https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/releases/release-notes-2.1.3.html 

    Cheers

    Vidar

  • Aaah .... made the move to NCS 2.3.0 and it all seems good now.

    Within about a millisecond or so, the 115K2 UART ISR detects the break signal and disables UART receive. When I re-enable UART receive later, it works properly.

  • Thanks for confirming. I am glad to hear that it seems to work now.