nRF52840 app not sleeping, SoftDevice seems involved

Hello-

I'm seeing the nRF52840 stay awake when I don't want it to. I'm calling sd_app_evt_wait() every turn through the main loop with SEVONPEND disabled, but it's waking up and falling through.

I'm capturing a histogram of the entire IRQ set, now including the 16 core exceptions, and I found a "smoking gun"- There's almost a 1:1 correspondence between my main loop turns and the core SVCALL_IRQn exception interrupt:

Reading the nRFSDK / SoftDevice headers, it looks like there's a syscall-style interface- when calling from the app into SoftDevice, it issues the SVC with the call id encoded in the instruction.

My hunch about what's happening here is that I've managed to get my app into a state where it's calling into SoftDevice (maybe to test something?) once per main loop turn. That issues an SVC, which sets the Event register. Then, when I call into sd_app_evt_wait() at the end of my loop, it clears the Event register and simply continues. The cycle repeats again, and my app never sleeps.

So, here are some questions I was hoping an expert could give me some advice + insight into please!

1. Is my hypothesis reasonable? Could this be happening? Does SVC set the Event register?

2. Is there anything in the nRFSDK that lets me audit SoftDevice supervisor calls in an automated way? (I think I can build this into my IRQ capture-and-forward shim)

3. Do you have any other ideas or suggestions?

Sorry to file a relatively open-ended question, but time is critical for us so I'd love any thoughts you all might have!

Best,

Charles

Parents
  • Ah, sorry, I'm going to give readers some whiplash here-

    I think my hypothesis was incorrect! If I'm seeing an almost exactly 1:1 relationship between SVCALL_IRQn calls and main loop turns and sleep calls, that means that things are working ok because sd_app_evt_wait() is run via a SVCALL.

    So I want to explore a different track, perhaps it's a GPIO issue. I see in section 6.9 of the nRF52840 Product Specification:

    I see that "Wake-up from high or low level triggers on all pins" is a separate line item from "Trigger interrupt on state changes on any pin".

    Is it possible to configure the GPIO and/or GPIOTE subsystems so that the level of a pin will cause a wake-up, and/or keep the system from sleeping? We are using PPI to drive a wifi/ble coexistence pin and it's tied to PA/LNA, does the PPI system have level-based wakes anywhere?

    Note that my IRQ histogram shows that I am not getting GPIOTE_IRQn 100,000+ times per minute, so it's not a strobing pin:

    I'll keep studying the data sheet, but if anyone has any information or quick answers about pin levels being able to keep the nRF52840 from entering WFE sleep, that would help me greatly Slight smile

    Thanks,

    Charles

  • Hi Charles, 

    The main purpose of PPI is to not get the CPU involved in the automation of event and task. So PPI wouldn't wake the CPU up. But the event/interrupt can.
    I think what important here is that if you have an interrupt that trigger and even tyou clear the event after you handle the interrupt. If not the CPU won't be able to enter sleep mode if the event has not been cleared. 

    Have you tried to disable the GPIO interrupt to check if it has something to do with waking up the CPU ? 


    The softdevice only trigger an event when there is BLE activity that require the application to handle. So It's not normal that the application get waking up on every sd_app_evt_wait(). Note that sd_app_evt_wait() will not exist when the softdevice is active to handle BLE event. It only exist when there is application interrupt or when there is BLE event.

    I would also suggest to use the Power Profiler Kit to monitor the activity of the CPU so that we have better view of when the CPU waking up or if the CPU enters sleep mode at all. 

  • Thanks for the response, I appreciate it. Unfortunately this only happens sometimes, and I can't get it to happen at my bench, so I can't use local setups like the Power Profiler Kit.

    I do have real-time battery statistics I can watch that we collect for this purpose, though, and this issue is happening very rarely.

    Since writing this post, I reverse-engineered the SoftDevice binary, and disassembled the functions on the path from the SVCall ISR into the implementation of sd_app_evt_wait(), I see that it loops back into the wfe call after every softdevice interrupt and only returns when a non-softdevice interrupt happens. So, I don't think it's SoftDevice's fault here!

    I am curious what you mean by 

    "I think what important here is that if you have an interrupt that trigger an event you clear the event after you handle the interrupt. If not the CPU won't be able to enter sleep mode if the event has not been cleared."

    Do you meant the PPI event here? I am using the PPI task/event system to tie the PA + LNA pins to a WiFi coex pin on a separate chip on our board. Could I ask you for more information about "clear the event" please?

    Thank you again,

    Charles

Reply
  • Thanks for the response, I appreciate it. Unfortunately this only happens sometimes, and I can't get it to happen at my bench, so I can't use local setups like the Power Profiler Kit.

    I do have real-time battery statistics I can watch that we collect for this purpose, though, and this issue is happening very rarely.

    Since writing this post, I reverse-engineered the SoftDevice binary, and disassembled the functions on the path from the SVCall ISR into the implementation of sd_app_evt_wait(), I see that it loops back into the wfe call after every softdevice interrupt and only returns when a non-softdevice interrupt happens. So, I don't think it's SoftDevice's fault here!

    I am curious what you mean by 

    "I think what important here is that if you have an interrupt that trigger an event you clear the event after you handle the interrupt. If not the CPU won't be able to enter sleep mode if the event has not been cleared."

    Do you meant the PPI event here? I am using the PPI task/event system to tie the PA + LNA pins to a WiFi coex pin on a separate chip on our board. Could I ask you for more information about "clear the event" please?

    Thank you again,

    Charles

Children
  • Sorry- to be more complete here, I'm doing this:

    void ble_pa_lna_enable(void) {
      if (s_util.pa_lna_enabled) {
        return;
      }
    
      APP_ERROR_CHECK_BOOL(s_util.radio_gpios.notification_gpio != NULL);
      APP_ERROR_CHECK_BOOL(s_util.radio_gpios.lna_gpio != NULL);
      APP_ERROR_CHECK_BOOL(s_util.radio_gpios.pa_gpio != NULL);
    
      s_util.pa_lna_enabled = true;
      nrf_ppi_channel_t ch_set, ch_clr;
      APP_ERROR_CHECK(nrfx_ppi_channel_alloc(&ch_set));
      APP_ERROR_CHECK(nrfx_ppi_channel_alloc(&ch_clr));
      APP_ERROR_CHECK(sd_ble_opt_set(BLE_COMMON_OPT_PA_LNA,
                                     &(ble_opt_t) {
                                       .common_opt = {
                                         .pa_lna = {
                                           .pa_cfg = {
                                             .enable = 1,
                                             .active_high = 1,
                                             .gpio_pin = s_util.radio_gpios.pa_gpio->pin,
                                           },
                                           .lna_cfg = {
                                             .enable = 1,
                                             .active_high = 1,
                                             .gpio_pin = s_util.radio_gpios.lna_gpio->pin,
                                           },
                                           .ppi_ch_id_set = ch_set,
                                           .ppi_ch_id_clr = ch_clr,
                                           .gpiote_ch_id = PA_LNA_GPIOTE_CHANNEL_ID,
                                         },
                                       },
                                     }));
    
      nrf_gpiote_task_configure(RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID,
                                s_util.radio_gpios.notification_gpio->pin,
                                (nrf_gpiote_polarity_t)GPIOTE_CONFIG_POLARITY_None,
                                NRF_GPIOTE_INITIAL_VALUE_LOW);
    
      APP_ERROR_CHECK(nrfx_ppi_channel_fork_assign(
        ch_set,
        nrf_gpiote_task_addr_get(
          nrf_gpiote_set_task_get(RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID))));
    
      APP_ERROR_CHECK(nrfx_ppi_channel_fork_assign(
        ch_clr,
        nrf_gpiote_task_addr_get(
          nrf_gpiote_clr_task_get(RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID))));
    }
    
    void ble_radio_notification_enable(void) {
      if (s_util.enabled) {
        return;
      }
      s_util.enabled = true;
      APP_ERROR_CHECK_BOOL(s_util.pa_lna_enabled);
      nrf_gpiote_task_enable(RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID);
    }
    
    void ble_radio_notification_disable(void) {
      if (!s_util.enabled) {
        return;
      }
      s_util.enabled = false;
      APP_ERROR_CHECK_BOOL(s_util.pa_lna_enabled);
      nrf_gpiote_task_disable(RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID);
    }
    

    This code is doing two things-

    1. It's loading the desired PA/LNA pin behavior into SoftDevice using channel 0 (PA_LNA_GPIOTE_CHANNEL_ID) 

    2. It's forking the set and clear events into a new task using channel 1 (RADIO_NOTIFICATION_GPIOTE_CHANNEL_ID)

    This way the PA set event sets my WiFi coex pin, and the LNA clear event clears my WiFi coex pin.

    I don't have the WiFi coex pin set up to trigger an interrupt, because my nRF52840 firmware doesn't need to react to it; I just need the PA + LNA tasks to also set and clear this WiFi coex pin.

    Note that my "disable" call just calls nrf_gpiote_task_disable here- I'm not ever explicitly clearing the PA/LNA events. Is it possible that a poorly-timed call to my ble_radio_notification_disable function is leaving the event in a state where it's preventing the nRF52840 from sleeping in sd_app_evt_wait()?

    In general, is this approach reasonable and sound?

    Thanks again for reading!

  • Hi Charles, 

    What I meant by "clear the event" is to set an event register to 0 when you handle the event. For example like this: 


    But I was wrong when thinking that not clearing it may keep the CPU to not entering sleep mode. The interrupt handler should be able to clear the interrupt source. 
    I don't see any problem in the code you provided. I couldn't think of anything else but an application interrupt that keep waking the CPU up. 

    If you can reproduce the issue and can capture a power profile it should be possible to see if the CPU could enter sleep mode at all. 
    And if possible to reproduce, I would suggest to test on one of our example (with additional code to control PA LNA and coex) to see if you can see the same problem. 

  • Thanks for the response. Because this is is very hard to reproduce, I'm looking for ways to gather more clues from units in the field.

    Another question- if I have a BLE central connect and hold a very fast connection interval, 7.5ms, could that cause behavior like this? sd_app_evt_wait() seems like it doesn't return to my code from radio/timer0/rtc0 interrupts, but I'm curious if it's possible.

    I'm seeing ~100k+ main loop runs per minute, is the only option left that I actually have an interrupt firing that many times (kilohertz range)? I'd expect to see it on my IRQ histogram (see initial post) but I don't.

  • Hi Charles, 
    Is there any chance that it's locked up somewhere and couldn't enter sd_app_evt_wait() ? 

    Short interval of 7.5ms shouldn't be a problem. The BLE activity only takes from 2ms for an connection event. And if there is no BLE activity that requires the app involvement sd_app_evt_wait() will not return because of the BLE activity. 


     Could you give me some more info about the softdevice ? Which version are you using ? 

  • I'm incrementing a counter every time I turn through my main loop, and I'm incrementing a separate counter every time I call sd_app_evt_wait(), and I'm incrementing a counter per-IRQn (all of them, via an assembly VTOR shim that intercepts all interrupts), so I can see all of the softdevice interrupts as well.

    Note that the number of SVCALL interrupts is almost identical to the number of calls to sd_app_evt_wait(), which is also almost identical to the number of main loop turns.

    This data tells me that I am regularly calling my main loop, and that I am also regularly calling sd_app_evt_wait(), and I'm not blasting SoftDevice with other calls.

    Please look to my first and second posts in this thread to observe the data that I've captured; perhaps something will jump out at you? 

    If I had some other high-frequency interrupt occurring, I'd expect to see it in my table above.

    I'm using nRFSDK 17.0.2, SD140 7.0.1, nRF52840.

    Also; I'm explicitly _clearing_ SEVONPEND, is it possible that disabled-but-pending interrupts are somehow contributing to wake?

Related