
BLE APP CLI example (CLI over BLE UART): poor reliability

I'm trying to implement a CLI over BLE UART for one of our company projects. I started from the experimental ble_app_cli example in the SDK. I'm using SDK 15.3.0 on the Nordic PCA10056 DK board. I keep running into instability problems of various kinds.

In order to narrow down the problem, I decided to simply use the example as is, and see if I can reproduce the issues. For the most part, I end up facing the same problems. 

Steps to reproduce:

  • start the example in debug mode, so that I can see what's happening.
  • connect to the device using an Android BLE Terminal emulator (I use this one https://play.google.com/store/apps/details?id=de.kai_morich.serial_bluetooth_terminal&hl=en&gl=US)
  • see the CLI prompt, send led on and led off commands, it all works
  • walk away so that the smartphone loses BLE connection
  • at least 50% of the time, the PCA10056 device doesn't get the disconnect event, and the device gets into a state where the BLE stack is no longer communicating. The same state can also be reached using, for example, the UART functionality of nRF Toolbox and sending bogus commands
  • I added NRF_LOG_INFO("BLE Cli enabled"); and NRF_LOG_INFO("BLE Cli disabled"); after the nrf_cli_init() and nrf_cli_uninit() calls for the BLE CLI in ble_evt_handler(), and I can see that the BLE CLI gets initialized but is never uninitialized, leaving the stack connected even though the phone has disconnected
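To make the last step concrete, here is a minimal host-side sketch of the connect/disconnect branches described above. It is not the SDK code: the `sim_*` names, the `LOG_INFO` macro, and the two stub functions are hypothetical stand-ins for `ble_evt_handler()`, `NRF_LOG_INFO()`, `nrf_cli_init()`, and `nrf_cli_uninit()`, so that the logic can be compiled without the SoftDevice headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the SoftDevice event IDs BLE_GAP_EVT_CONNECTED
 * and BLE_GAP_EVT_DISCONNECTED; on target the real IDs come from the SDK. */
typedef enum {
    SIM_EVT_CONNECTED,
    SIM_EVT_DISCONNECTED,
    SIM_EVT_OTHER
} sim_evt_id_t;

/* Tracks whether the BLE CLI instance is currently initialized. */
static bool m_cli_enabled = false;

/* Stand-in for NRF_LOG_INFO() so the sketch runs on a host machine. */
#define LOG_INFO(msg) printf("<info> app: %s\n", (msg))

/* Stand-ins for nrf_cli_init() / nrf_cli_uninit() on the BLE CLI instance. */
static void cli_init_stub(void)
{
    assert(!m_cli_enabled); /* init must not run twice without an uninit */
    m_cli_enabled = true;
}

static void cli_uninit_stub(void)
{
    m_cli_enabled = false;
}

/* Same shape as the connect/disconnect branches of ble_evt_handler(). */
void sim_ble_evt_handler(sim_evt_id_t evt_id)
{
    switch (evt_id) {
    case SIM_EVT_CONNECTED:
        cli_init_stub();
        LOG_INFO("BLE Cli enabled");
        break;
    case SIM_EVT_DISCONNECTED:
        cli_uninit_stub();
        LOG_INFO("BLE Cli disabled");
        break;
    default:
        /* Any other event leaves the CLI state untouched. */
        break;
    }
}

/* Lets a caller check whether the CLI is still considered active. */
bool sim_cli_enabled(void)
{
    return m_cli_enabled;
}
```

In this model the CLI can only be torn down by the disconnect event, which matches the observed failure: if that event never arrives, the "enabled" line is logged and the "disabled" line never follows.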

Is anyone using the CLI over BLE UART? I know I could implement some sort of watchdog, and that the example is only meant as an example, but the reliability I'm seeing so far makes the use of the CLI over the air a bit questionable. Usually the Nordic examples are a lot more robust than this
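For context, the kind of watchdog mentioned above could look roughly like the following application-level idle timer. This is only a sketch under assumptions: the `link_wd_*` names and the 10-second `LINK_IDLE_TIMEOUT_MS` value are hypothetical, and the millisecond timestamps are assumed to come from whatever tick source the firmware already has. When the timer expires, the caller would be expected to drop the link and uninitialize the CLI itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical idle limit; pick a value that suits the product. */
#define LINK_IDLE_TIMEOUT_MS 10000u

/* Per-connection bookkeeping for the idle watchdog. */
typedef struct {
    bool     connected;        /* true between connect and disconnect */
    uint32_t last_activity_ms; /* timestamp of the last RX/TX or connect */
} link_watchdog_t;

/* Call from the "connected" branch of the BLE event handler. */
void link_wd_on_connect(link_watchdog_t *wd, uint32_t now_ms)
{
    wd->connected = true;
    wd->last_activity_ms = now_ms;
}

/* Call whenever data is received from or sent to the peer. */
void link_wd_on_activity(link_watchdog_t *wd, uint32_t now_ms)
{
    wd->last_activity_ms = now_ms;
}

/* Call from the "disconnected" branch, or after a forced teardown. */
void link_wd_on_disconnect(link_watchdog_t *wd)
{
    wd->connected = false;
}

/* Returns true when a connected link has been idle for too long.
 * Unsigned subtraction keeps the comparison valid across tick wraparound. */
bool link_wd_expired(const link_watchdog_t *wd, uint32_t now_ms)
{
    return wd->connected &&
           (uint32_t)(now_ms - wd->last_activity_ms) >= LINK_IDLE_TIMEOUT_MS;
}
```

The trade-off is that a user who stays connected but types nothing for longer than the timeout is also disconnected, so this is a fallback rather than a fix for the missing disconnect event.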

What am I doing wrong?

  • just a thought:

    Note that there is no standard for "UART over BLE" - they're all proprietary, manufacturer-specific services.

    So how do we know that this app has a reliable implementation of NUS ... ?

  • You are right, we don't... but that app works with every other BLE serial implementation. And, as long as I never lose connectivity, the app works just fine. The connectivity loss is not something the app handles anyway

    And, as I said, I can get the example into the same state by simply using the Nordic-provided UART applet in nRF Toolbox. In a few cases, even just using nRF Connect. So that Serial Terminal app is not the real problem, even assuming that it is not 100% compatible

  • at least 50% of the times, the PCA10056 device doesn't get the disconnect event, and the device gets into a state where the BLE stack is not communicating anymore.

     Have you tried debugging this issue? Does the log from the nRF52840 say anything at this point? 

    The example is located in the experimental folder, so it may not be properly tested.

    BR,

    Edvin

  • I spent 2 days debugging before creating this thread, yes

    Enabling logs completely locks the device when it connects, failing somewhere inside nrf_sdh_ble_evts_poll(). The only way to recover is to pause ("break") in the debugger, and the code ends up in the HardFault handler with no stack trace. I even tried putting a breakpoint in the HardFault handler, but it never reaches it before I hit

    If I enable fewer logs, I get a lot of "Lost logs - increase log backend queue size", but everything I try to do to fix that (following various suggestions from other threads here) only causes more problems

    I tried to understand why the log engine fails, but I ended up in a maze that keeps leading to the SoftDevice dead end. I tried adding breakpoints (using Monitor Mode Debugging to keep the BLE stack alive), but even that didn't lead to any meaningful finding: I always get to a point where a call into the SoftDevice fails, with no extra information

    I would appreciate it if you could find a way to enable logs on that example that makes it possible to understand why ble_evt_handler() in main.c is never called with a BLE_GAP_EVT_DISCONNECTED when a connection is lost. As I said, I added a couple of NRF_LOG_INFO() calls for connected and disconnected in that handler, disabled the other logs, and I can see that many times the disconnected event is never reached. I could not find a combination of meaningful logs to enable that doesn't lead to a hardfault in the code (which is not to say it's impossible, just that I could not figure it out).

    I do understand it's under experimental, but considering it's been "experimental" for at least 2 years, it seems more abandoned than just experimental. In a way, that's what makes me wonder how prudent it would be to base our OTA configuration strategy on the BLE UART CLI, given that it seems to be an unsupported component. If even an example as simple as that one fails so badly (unrecoverable without a reset), adding the CLI to a much more complex project with multiple VS UUIDs seems to be asking for trouble

