app_sched_execute() has no event_handler, SDK 12.1.0

Question

When debugging my application, I'm finding that gdb blocks at app_scheduler.c:253 like this:

(gdb) bt full
#0  0xfffffffe in ?? ()
No symbol table info available.
#1  <signal handler called>
No symbol table info available.
#2  0x00000000 in ?? ()
No symbol table info available.
#3  0x000258e2 in app_sched_execute () at /Users/Eliot/dev/nRF5_SDK_12.1.0_0d23e2a/components/libraries/scheduler/app_scheduler.c:253
        event_index = 2
        p_event_data = 0x20002614 <APP_SCHED_BUF.9106+48>
        event_data_size = 0
        event_handler = 0x0
#4  0x000223ea in main () at main.c:158
        err_code = 0
        main_loop_count = 5

Here is app_sched_execute() in SDK 12.1.0:

void app_sched_execute(void)
{
    while (!is_app_sched_paused() && !APP_SCHED_QUEUE_EMPTY())
    {
        // Since this function is only called from the main loop, there is no
        // need for a critical region here, however a special care must be taken
        // regarding update of the queue start index (see the end of the loop).
        uint16_t event_index = m_queue_start_index;

        void * p_event_data;
        uint16_t event_data_size;
        app_sched_event_handler_t event_handler;

        p_event_data = &m_queue_event_data[event_index * m_queue_event_size];
        event_data_size = m_queue_event_headers[event_index].event_data_size;
        event_handler   = m_queue_event_headers[event_index].handler;

        event_handler(p_event_data, event_data_size); // L253

        // Event processed, now it is safe to move the queue start index,
        // so the queue entry occupied by this event can be used to store
        // a next one.
        m_queue_start_index = next_index(m_queue_start_index);
    }
}

I can't see how my handler can ever be 0x0. I've been through my application and there are eight places where I schedule something with app_sched_event_put(). They all look fine. I check the return code each time anyway.

app_sched_execute() is safe to call when nothing has been scheduled, right?

My main loop is pretty typical:

while (true)
{
	main_loop_count++;

	// Do some stuff

	app_sched_execute();

	err_code = sd_app_evt_wait();
	APP_ERROR_CHECK(err_code);
}

Edit 1

I added NRF_LOG statements to app_scheduler.c on both app_sched_event_put() and app_sched_execute() to show the address of the handler. I also modified app_sched_execute() to just continue when the handler was NULL (not a good fix, but helped me debug).

Here's a grep of the log:

:INFO:app_sched_event_put(): Handler: 0x24005
:INFO:app_sched_execute(): Handler: 0x24005
:INFO:app_sched_event_put(): Handler: 0x24109
:INFO:app_sched_execute(): Handler: 0x24109
:INFO:app_sched_event_put(): Handler: 0x24369
:INFO:app_sched_execute(): Handler: 0x24369
:INFO:app_sched_event_put(): Handler: 0x20fb9
:INFO:app_sched_execute(): Handler: 0x20fb9
:INFO:app_sched_event_put(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0

So it's all going fine until 0x23f49 comes along and gets executed twice and then screws everything up. I looked in my map file for that address and didn't find it exactly, but found 0x234f8 instead:

 .text.state_event_modem_on
                0x0000000000023f48       0x84 _build/state.o
                0x0000000000023f48                state_event_modem_on

state_event_modem_on() is indeed the function I expect to be scheduled at this point, but it's at 0x23f48, NOT 0x23f49. Ignoring that for a second, if I look into what state_event_modem_on() does, it eventually calls a function that sits waiting for 6 seconds for our modem to respond to an AT command over UART. While it's waiting, it calls app_sched_execute().

Presumably, this is bad. I presume a function that has itself been scheduled can not run the scheduler queue? Therefore I should not schedule anything that's long-running. But then I also don't want to run long-running things in the main loop. So what do I do?

Susheel Nuguru · Accepted Answer

There is some conflict in what you do.
app_sched_execute()->state_event_modem_on()->after sometime->app_sched_execute() 
 If you look at the app_Sched_execute() function it shows you that the queue is modified after your event_handler returns. So what is happening is that your event handler is getting called multiple times even though it was put in the queue only once. 
 Yes you are right, cannot call app_sched_execute() from event_handler that will be called by app_sched_execute() causing loops and undefined behavior. 
 I am not sure what you mean by long running things .. why can't you move your waiting functionality wait in main and set a flag in state_event_modem_on() for the wait to exit.

app_sched_execute() has no event_handler, SDK 12.1.0

Top Replies