Sporadic Notification Bursts

Hi everybody,

we have a remote control based on the nRF52832 and are experiencing a weird issue with notifications.

We are using a custom keymap characteristic with a 6-byte payload in a custom service, which uses notifications to send keypresses to the host. Usually this works fine. However, when the remote is left idle for a couple of minutes, the next few notifications are sporadically sent several times in a burst. This is not an issue of our code calling the sd_ble_gatts_hvx function too often (we tested guarding it with app timer checks). When the issue occurs, the whole timeslot is filled with identical notifications, with the more data flag set. Sometimes the issue is resolved with the next notification (key release), sometimes it continues for the next few button presses too, and sometimes the release code is lost.

You can see one of these bursts as captured by a PCA10056 sniffer in Wireshark. Note that I have never seen one of the bursts before a LL_PING_REQ, but a ping does not reliably trigger this behaviour (in fact there are several pings, followed by normal notifications just before in this log).

Our initial project is based on SDK 13.0.0 and the Bluetooth Developer Studio example using SoftDevice 4.0.5. For debugging the issue, the project was ported to SDK 14.2.0 with SD 5.1.0. We are using GNU Tools ARM Embedded/5.4 2016q2 on Windows 10 x64, but have tried the default GNU Tools ARM Embedded/4.9 2015q3 as well. Neither the SDK/SD switch nor the compiler switch changed the behaviour. It occurs when connected to our custom central (nRF52832) as well as when connected to an Android 9 smartphone running nRF Connect. It does not seem to be a signal strength issue, since it occurs even with a central-reported RSSI of -58 dBm.

We do use a few non-standard SoftDevice settings for longer MTUs and event lengths due to other data-heavy services, but since the problem occurs even without them, they should not interfere here.

I have attached a minimal project that shows this behaviour on the PCA10040 board. The only services used are the battery service and the remote control service. It needs a plain SDK 14.2.0 and SD 5.1.0. Place the projects dir in the SDK root and replace the SDK bsp.h with the one from the archive (it adds custom events).

Connect to the "Remote PCA10040" device, enable notifications for characteristic 0xdb61 of service 0xdb60, and press the buttons. On push, each button transmits a 6-byte bitmap that resembles the one our remote control sends. This message is repeated every 100 ms until the button is released, at which point an empty bitmap is transmitted.
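
To make it easier to spot a burst in a sniffer log, the expected traffic for one key press can be modelled in a few lines. This is only a sketch and not the code from the attached project: the bit layout (button n sets bit n % 8 of byte n / 8), the function names, and the assumption that a repeat is sent for every full 100 ms the button is held are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define KEYMAP_LEN         6u    /* payload size stated in the post */
#define REPEAT_INTERVAL_MS 100u  /* repeat period stated in the post */

/* Clear the bitmap; an all-zero bitmap is the "release" message. */
static void keymap_clear(uint8_t keymap[KEYMAP_LEN])
{
    memset(keymap, 0, KEYMAP_LEN);
}

/* Hypothetical layout: button n sets bit (n % 8) of byte (n / 8).
 * Returns 0 on success, -1 if the button index does not fit. */
static int keymap_set(uint8_t keymap[KEYMAP_LEN], unsigned button)
{
    if (button >= KEYMAP_LEN * 8u) {
        return -1;
    }
    keymap[button / 8u] |= (uint8_t)(1u << (button % 8u));
    return 0;
}

/* Expected notification count for one press held for hold_ms:
 * one on push, one per full repeat interval, one on release.
 * Assumption: a repeat fires exactly when a full interval has elapsed. */
static unsigned expected_notifications(unsigned hold_ms)
{
    return 1u + hold_ms / REPEAT_INTERVAL_MS + 1u;
}
```

Under these assumptions, a button held for 250 ms should produce 4 notifications (push, two repeats, release). A capture showing many more identical packets within one connection event would point to the burst described above.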

demo_minimal.zip

Could you please help us fix this weird issue? We are really running out of ideas about what might cause this.

Thank you!

Jann

  • I've now switched the app timer to the scheduler based version and here it gets kind of interesting: Normally this works without problems, but when the issue occurs, the timeout function is called endlessly (with the non-scheduler approach the timeout callback is called a few times and then stops on its own, presumably when the queue is full). Eventually triggering a timer start or stop by pressing a button stops it, but not always. 

    Suffice to say I am pretty confused by now :)

    Edit: So it's not the queue being filled up. I turned the scheduler off again, built the demo with the profiler on, and printed the app timer queue utilization with every timeout; it never rises above 3. I also built the app timer with the keep RTC active flag, but no dice.

  • Hello Jann,

    Sorry for the late reply. I was out of office on Friday. 

    I am having some trouble replicating the issue. I did manage to reproduce it a couple of times with the project that you sent me last week, but not with the new one.

    However, I recall that there was a bug in the app_timer at some point in time. I remembered it when you said "the timeout function is called endlessly". I will look for more details regarding this bug, but can you check one thing for me:

    I see that you have (by default) only one app_timer running while you are not pressing any buttons, which is the battery timer, is that correct?

    The bug was present while there was only one active timer, and that timer had a long timeout (longer than half of the app_timer maximum count value, which is 24 bits = 0xFFFFFF = 16 777 215).

    I believe the app_timer is capable of handling timeouts longer than the maximum count value (it wraps around a couple of times if it needs to), but the bug is related to the fact that there is only one active timer with a timeout of over half of this.

    So, to work around this, you have two options. Either you can increase the prescaler for the app_timer, so that the BATTERY_LEVEL_MEAS_INTERVAL will be less than 0xFFFFFF/2 ticks, or you can create a dummy app_timer instance: a timer that times out every minute or so and doesn't do anything. It just has an empty timeout handler.

    Can you test this for me, and see if that solves the issue?

    Best regards,

    Edvin

  • Hi Edvin,

    not a problem, I've been on vacation the last week and just got back into the office today myself - hence my late reply.

    Yes, there is only the battery measurement timer running permanently, all other timers are stopped in idle mode.

    Thank you for the hint with the app_timer bug. I've now added a continuous idle timer at 0x7fff00 ticks and so far the issue has not occurred again in the test setup. If this does the trick here I'll add the timer to our main project and see if that fixes the issue for good. I'll let you know how it went!

    Thank you & best wishes,

    Jann

  • Hello Jann,

    That is good news. 

    Yes. I believe this bug has been present in some SDK versions, but it is a bit difficult to remember exactly which. I believe it came, was fixed, and came back again at some point before being fixed again ...

    Let me know if it doesn't solve the issue. 

    Best regards, 

    Edvin
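
The half-range condition Edvin describes can be checked with a little arithmetic. The following is a minimal sketch, not SDK code: it assumes the 32.768 kHz low-frequency clock, the 24-bit counter mentioned above, and a tick rate of 32768 / (prescaler + 1). The function names are hypothetical, the conversion truncates (the SDK's own tick macro may round differently), and the thread does not state the actual value of BATTERY_LEVEL_MEAS_INTERVAL.

```c
#include <stdint.h>

#define RTC_CLOCK_HZ   32768u              /* assumed LFCLK frequency */
#define RTC_MAX_COUNT  0xFFFFFFu           /* 24-bit counter, per the thread */
#define RTC_HALF_COUNT (RTC_MAX_COUNT / 2u)
#define RTC_MAX_PRESC  0xFFFu              /* RTC prescaler is 12 bits wide */

/* Convert milliseconds to RTC ticks for a given prescaler (truncating). */
static uint64_t ms_to_ticks(uint64_t ms, uint32_t prescaler)
{
    return (ms * RTC_CLOCK_HZ) / (1000u * ((uint64_t)prescaler + 1u));
}

/* Non-zero if a timeout is in the range the reported bug affects. */
static int exceeds_half_range(uint64_t ticks)
{
    return ticks > RTC_HALF_COUNT;
}

/* Smallest prescaler that keeps the interval at or below half range,
 * or -1 if no 12-bit prescaler is large enough. */
static int32_t min_prescaler_for(uint64_t ms)
{
    for (uint32_t p = 0; p <= RTC_MAX_PRESC; p++) {
        if (!exceeds_half_range(ms_to_ticks(ms, p))) {
            return (int32_t)p;
        }
    }
    return -1;
}
```

Under these assumptions, a hypothetical 300 000 ms interval gives 9 830 400 ticks at prescaler 0, which is above the half range of 8 388 607, while prescaler 1 brings it down to 4 915 200. Jann's idle timer of 0x7fff00 ticks (8 388 352) sits just below the limit.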
