bt_gatt_notify seems to block when the central does not process the notifications

Hello,

I'm working on an application with a rather unusual/complicated setup, and I found something I cannot really understand.

For context, I'm working with two nodes (A and B): a peripheral device (A) that sends periodic notifications, and a central+peripheral device (B) that subscribes to the notification characteristic. In addition, a smartphone subscribes to notifications from both A and B.

I found an unexpected behaviour with the `bt_gatt_notify` and `bt_gatt_notify_cb` functions. My understanding is that these two functions should never block: they pass the notification data to the Bluetooth stack, which takes care of the rest. However, I have found a situation where this does not seem to be the case.

This happened due to a bug in my central+peripheral device (B). I accidentally created a situation where as soon as the first notification was received, the app CPU would get stuck in an infinite loop, leaving no resources for the workqueues dedicated to the BLE (see figure). 

What I found strange is that, shortly after this condition occurred on device B, device A would block in a call to `bt_gatt_notify`. I verified the same behaviour with `bt_gatt_notify_cb`. This confuses me because I think these functions should not block, and their implementation seems to confirm that.
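
One way to picture how a "shortly after" stall could arise is a bounded pool of transmit buffers on the sender that is only refilled when the peer acknowledges data. This is an assumption for illustration, not something confirmed in this thread or taken from the stack's source. The sketch below is a plain C model, not Zephyr code: `TX_POOL_SIZE`, `model_notify`, `model_peer_ack` and `model_run` are hypothetical names, and the model returns `false` at the point where a real stack might instead wait for a free buffer.

```c
#include <stdbool.h>

#define TX_POOL_SIZE 3 /* hypothetical number of TX buffers on the sender (A) */

struct tx_pool {
    int free_bufs; /* buffers currently available for new notifications */
};

/* Hypothetical stand-in for queuing one notification: it consumes one
 * buffer. Returns false when the pool is empty, which is the point where
 * a real stack might wait instead of returning. */
static bool model_notify(struct tx_pool *pool)
{
    if (pool->free_bufs == 0) {
        return false;
    }
    pool->free_bufs--;
    return true;
}

/* Hypothetical stand-in for the peer (B) acknowledging one packet,
 * which releases one buffer back to the sender's pool. */
static void model_peer_ack(struct tx_pool *pool)
{
    if (pool->free_bufs < TX_POOL_SIZE) {
        pool->free_bufs++;
    }
}

/* Attempt `attempts` notifications and return how many were accepted.
 * A responsive peer acknowledges after every attempt; a stalled peer
 * never does, so the pool is never refilled. */
int model_run(int attempts, bool peer_responsive)
{
    struct tx_pool pool = { .free_bufs = TX_POOL_SIZE };
    int accepted = 0;

    for (int i = 0; i < attempts; i++) {
        if (model_notify(&pool)) {
            accepted++;
        }
        if (peer_responsive) {
            model_peer_ack(&pool);
        }
    }
    return accepted;
}
```

In this model, `model_run(10, true)` accepts all 10 notifications, while `model_run(10, false)` accepts only the first `TX_POOL_SIZE` and then stalls. That would match the observation that A keeps working for a short while after B stops processing, but whether the real stack behaves this way depends on its buffer configuration and version.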

Am I missing something about how these functions are supposed to work? The documentation is a bit lacking in this regard (likely because it's implementation-dependent?).

Thank you!

  • Hi Andrea,

    This happened due to a bug in my central+peripheral device (B). I accidentally created a situation where as soon as the first notification was received, the app CPU would get stuck in an infinite loop, leaving no resources for the workqueues dedicated to the BLE (see figure). 

    I am trying to understand the problem here. You mentioned that the blocking you are seeing is triggered by a bug you already know about, but I am not sure how to map the stall on device B to the suspected blocking nature of bt_gatt_notify and its callback variant. Could you please post some code snippets showing how you think the block on device B relates to bt_gatt_notify or its callback? I could not see that from your SystemView image; all I can see is that main is now looping infinitely.
