How to wait for the BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP event?

Dear Nordic Support,

My project is split into two parts: a firmware application based on the Generic Level Server model example (from the Mesh SDK) and an Android app written in Flutter. Both work well in about 99% of cases.
However, in some rare and hard-to-reproduce cases, connecting from the Flutter/Android app to the firmware fails. When it fails, Android logs either error 0x13 or error 0x85. Here's an example:

17:23:20.371 D/BleMeshManager: [BLE LOG] Connected to ED:B0:A7:E1:F7:84
17:23:20.374 D/BleMeshManager: [BLE LOG] wait(300)
17:23:20.677 D/BleMeshManager: [BLE LOG] Discovering services...
17:23:20.677 D/BleMeshManager: [BLE LOG] gatt.discoverServices()
17:23:21.133 D/BleMeshManager: [BLE LOG] Connection parameters updated (interval: 247.5ms, latency: 0, timeout: 4000ms)
17:23:22.865 D/BleMeshManager: [BLE LOG] Connection parameters updated (interval: 7.5ms, latency: 0, timeout: 3000ms)
17:23:22.896 D/BleMeshManager: [BLE LOG] [Callback] Connection state changed with status: 19 and new state: 0 (DISCONNECTED)
17:23:22.897 D/BleMeshManager: [BLE LOG] Error: (0x13): GATT CONN TERMINATE PEER USER
17:23:22.898 D/BleMeshManager: [BLE LOG] Disconnected

Note: The firmware application doesn't log anything suspicious when this error happens.


SoftDevice Known Issue:

The following is from the s132_nrf52_7.3.0 release notes:
"The ble_gattc_service_t::uuid field is incorrectly populated in the BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP event if sd_ble_gattc_primary_services_discover() or sd_ble_gattc_read() is called when a Primary Service Discovery by Service UUID is already ongoing. When the application has called sd_ble_gattc_primary_services_discover(), it should wait for the BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP event before calling sd_ble_gattc_primary_services_discover() or sd_ble_gattc_read() (DRGN-11300)."

Looking at the above log messages, could the 0x13 or 0x85 error (and hence the failed connection) be caused by this issue?
Have you ever seen something similar in the past? (It would be great if you could ask around among your teammates as well.)

Now, considering that my firmware application is based on the Generic Level Server model (from the Mesh SDK examples):
How can you wait for the BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP event before calling sd_ble_gattc_primary_services_discover() or sd_ble_gattc_read()? How would you implement this? Right now, the firmware application neither handles BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP nor calls sd_ble_gattc_primary_services_discover() or sd_ble_gattc_read(). So I'm a little lost as to how to implement a workaround for this issue, and whether the known issue is actually related to the failed connection attempts. Your advice on this is very much appreciated.

Thanks for your help,
Michael.

  • Looking at the above log messages, could the 0x13 or 0x85 error (and hence the failed connection) be caused by this issue?

    I don't think so, no. (I believe this is already handled by the discovery module in any case.)
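
    For reference, the waiting pattern from the release note can be sketched as a simple guard flag. This only matters for firmware that acts as a GATT client and calls the sd_ble_gattc_* functions itself. Everything below is a host-buildable sketch: the names guarded_discovery_start(), guarded_discovery_on_response(), GUARD_OK, GUARD_BUSY and the stub are hypothetical and not part of the SDK.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical return codes for this sketch (not SoftDevice error codes). */
    #define GUARD_OK   0u
    #define GUARD_BUSY 1u

    /* True while a primary service discovery is waiting for its response. */
    static bool m_discovery_pending = false;

    /* Hypothetical stub. Real code would call
       sd_ble_gattc_primary_services_discover() here and return its result. */
    static uint32_t discovery_request_stub(uint16_t conn_handle)
    {
        (void)conn_handle;
        return GUARD_OK;
    }

    /* Start a discovery only if no earlier request is still outstanding. */
    uint32_t guarded_discovery_start(uint16_t conn_handle)
    {
        if (m_discovery_pending)
        {
            return GUARD_BUSY; /* caller retries after the response event */
        }

        uint32_t err = discovery_request_stub(conn_handle);
        if (err == GUARD_OK)
        {
            m_discovery_pending = true;
        }
        return err;
    }

    /* Call this from the BLE event handler when
       BLE_GATTC_EVT_PRIM_SRVC_DISC_RSP arrives, and also on disconnect. */
    void guarded_discovery_on_response(void)
    {
        m_discovery_pending = false;
    }
    ```

    A read request would need the same check before it is issued, since the release note names sd_ble_gattc_read() as well.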

    Based on the "GATT CONN TERMINATE PEER USER" disconnect reason, it looks like the peer (the firmware application) is calling sd_ble_gap_disconnect() with the BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION disconnect reason. Can you set a breakpoint or add debug logging to find where this is called?
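
    If attaching a debugger is impractical, one option is to route every disconnect through a small wrapper that records the call site. The following is only a sketch that builds on a host: traced_disconnect(), TRACED_DISCONNECT, m_last_disconnect and the stub are hypothetical names, and on target the stub would be replaced by the real sd_ble_gap_disconnect() and the printf by your logging backend.

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* 0x13 is the value of BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION. */
    #define REASON_REMOTE_USER_TERMINATED 0x13u

    /* Hypothetical record of the most recent disconnect request. */
    typedef struct
    {
        const char *file;
        int         line;
        uint16_t    conn_handle;
        uint8_t     reason;
    } disconnect_trace_t;

    static disconnect_trace_t m_last_disconnect;

    /* Hypothetical stand-in for sd_ble_gap_disconnect() so the sketch
       compiles without the SoftDevice headers. */
    static uint32_t gap_disconnect_stub(uint16_t conn_handle, uint8_t reason)
    {
        (void)conn_handle;
        (void)reason;
        return 0;
    }

    /* Record and print the caller, then forward the request. */
    static uint32_t traced_disconnect(uint16_t conn_handle, uint8_t reason,
                                      const char *file, int line)
    {
        m_last_disconnect.file        = file;
        m_last_disconnect.line        = line;
        m_last_disconnect.conn_handle = conn_handle;
        m_last_disconnect.reason      = reason;

        printf("disconnect requested at %s:%d (conn 0x%04X, reason 0x%02X)\n",
               file, line, (unsigned)conn_handle, (unsigned)reason);

        return gap_disconnect_stub(conn_handle, reason);
    }

    /* Use this in place of direct disconnect calls to capture the call site. */
    #define TRACED_DISCONNECT(conn, reason) \
        traced_disconnect((conn), (reason), __FILE__, __LINE__)
    ```

    Calling TRACED_DISCONNECT(conn_handle, REASON_REMOTE_USER_TERMINATED) at each call site then shows in the log which one fired.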

    Do you also have a log of the 0x85 error code? I think that one is different.

    Kenneth

  • Hi Kenneth,

    Thank you very much for your help!

    The firmware application calls sd_ble_gap_disconnect() with BLE_HCI_REMOTE_USER_TERMINATED_CONNECTION in the DFU event handler, just before entering the bootloader.

    In the Android app, the 0x13 error happens right after provisioning, when connecting to the new node for the first time, just before the app key is assigned. That is a time at which it really shouldn't enter bootloader mode...
    The question now is: why does the firmware application enter bootloader mode after provisioning? What could cause ble_dfu_evt_handler() to receive a BLE_DFU_EVT_BOOTLOADER_ENTER_PREPARE event?

    The 0x13 error happens only once (twice, if I'm lucky) while bulk provisioning 25 mesh nodes. The chance that the debugger is connected to the right node is rather small, so this is a little difficult to debug... Do you know of any common pitfalls when adding DFU to a firmware application? Any ideas or advice, regardless of how unlikely they might seem, are very much welcome.

    About the 0x85 error:
    It is just a timeout that happens when the connection to the firmware application isn't established within 30 seconds. It happens occasionally, but a simple reconnect "fixes" the issue.

    Again, thanks for your help,
    Michael.

  • Hi Michael,
    I'm taking over the case as it's related to DFU.

    It's quite strange that your application has a DFU service. Could you tell me more about your application? Did you implement the buttonless DFU service in the mesh application?
    Did you flash the bootloader?


    The Buttonless DFU service will disconnect and switch to bootloader mode if the indication is set and the characteristic is written with 0x01. Read more about it here.
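
    The trigger condition described above can be modelled as a small predicate. This is a sketch only: dfu_write_triggers_bootloader() and DFU_OP_ENTER_BOOTLOADER are hypothetical names, and treating the first byte of the write as the opcode is an assumption of the sketch, not taken from the SDK source.

    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* The value 0x01 is the one named in the reply above. */
    #define DFU_OP_ENTER_BOOTLOADER 0x01u

    /* Returns true when a write to the buttonless DFU characteristic would
       make the device disconnect and enter bootloader mode.
       Assumption: the opcode is the first byte of the written data. */
    bool dfu_write_triggers_bootloader(bool indications_enabled,
                                       const uint8_t *data, size_t len)
    {
        if (!indications_enabled)
        {
            return false; /* indication must be enabled first */
        }
        if (data == NULL || len == 0)
        {
            return false; /* nothing was written */
        }
        return data[0] == DFU_OP_ENTER_BOOTLOADER;
    }
    ```

    Both conditions have to hold, so a single stray write is not enough unless the indication was enabled beforehand.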


    I would suggest capturing a sniffer trace so we can see what happens over the air.

  • Hello Hung Bui,
     
    Thanks for your help! The device is an industrial dimmer that's plugged into a light. The firmware application is quite simple: 

    • It's based on the generic level server model (from the Mesh SDK) to which I've added the code from the buttonless DFU example.
    • PWM is used to adjust the brightness of the light.
    • nrfx_gpiote is used to set some digital output pins (LEDs) and to receive events if a DI pin changes state (like when a button is pushed).  

    Considering the whole application is rather simple stuff in just 610 lines of code (comments and empty lines omitted), I assume that I made a (most likely pretty obvious) mistake here.
     
    The (secure) bootloader is flashed only during production, together with the SoftDevice (7.3.0), a matching settings page and the application. Later on, when a DFU is done from within the Android app, only the firmware application is programmed. DFU, by the way, is working reliably: more than half a dozen runs, each bulk DFU'ing 25+ devices in sequence, all worked without a single issue.
     
    Speaking of DFU: Why is it strange to have a DFU service in a mesh application?
     
    I'll try to get some Wireshark logs. It will take a few days, because the issue is only reproducible at my client's test installation, and well, things are a little complicated right now... It's a 150 km ride, so any ideas besides the sniffer logs would be great.
     
    I assume that I made an obvious mistake here, most likely something I've either overlooked or forgotten. Is there something like a list of "the most common pitfalls when adding DFU"? What's the first thing you would look at (besides the sniffer logs)?

    Any ideas are welcome,
    Again, thanks for your help,
    Michael.

  • Hi Michael, 

    Thanks for the explanation. The reason I mentioned that it's strange to have buttonless DFU in a mesh application is that for mesh applications we usually implement Mesh DFU. It's a different protocol from BLE DFU and it supports updating a large number of nodes at the same time.

    There is a potential issue that I can see here. It's in how you initialize the buttonless DFU service after provisioning and before configuration. I assume that's when you receive error 0x13.

    The way PB-GATT (provisioning with the phone) works now is that the device is first initialized with the Mesh Provisioning Service (0x1827). After provisioning it needs to switch to the Mesh Proxy Service (0x1828). The SoftDevice cannot switch services while it is running, so we need to disable the SoftDevice, enable it again, and initialize the 0x1828 service.


    I don't know whether you already handle that (check the NRF_SDH_EVT_STATE_ENABLED event inside sd_state_evt_handler() in mesh_provisionee.c). I suspect that the phone somehow confused the location of the DFU Buttonless service and, instead of writing to the Mesh Proxy Service, wrote to the DFU Buttonless service and put the device into DFU mode.
    The error can come from either side: either the phone wrote to the wrong service, or the firmware on the nRF52 didn't handle it properly.
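
    As a rough illustration of what handling the re-enable could look like, here is a host-buildable model of a state handler that adds the services again after every SoftDevice restart. All names (sd_state_evt_sketch(), the SD_STATE_* values, the stubs and flags) are hypothetical stand-ins, and the sketch assumes that the attribute table does not survive a SoftDevice disable. It is not the code from mesh_provisionee.c.

    ```c
    #include <stdbool.h>

    /* Stand-ins for the SDK state events, so the sketch builds on a host. */
    typedef enum
    {
        SD_STATE_ENABLED,  /* plays the role of NRF_SDH_EVT_STATE_ENABLED  */
        SD_STATE_DISABLED  /* plays the role of NRF_SDH_EVT_STATE_DISABLED */
    } sd_state_evt_t;

    /* Hypothetical bookkeeping: which services are currently registered. */
    static bool     m_proxy_registered;
    static bool     m_dfu_registered;
    static unsigned m_enable_count;

    /* Hypothetical stubs. Real code would add the Mesh Proxy Service and
       initialize the buttonless DFU service here. */
    static void proxy_service_add_stub(void) { m_proxy_registered = true; }
    static void dfu_service_add_stub(void)   { m_dfu_registered = true; }

    void sd_state_evt_sketch(sd_state_evt_t evt)
    {
        switch (evt)
        {
            case SD_STATE_DISABLED:
                /* Assumption: all registered services are gone after disable. */
                m_proxy_registered = false;
                m_dfu_registered   = false;
                break;

            case SD_STATE_ENABLED:
                /* Add every service again, always in the same order. */
                m_enable_count++;
                proxy_service_add_stub();
                dfu_service_add_stub();
                break;
        }
    }
    ```

    Keeping the order fixed is a design choice of the sketch, so that the service layout after the restart is at least the same on every node.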
