This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

app_sched_execute() has no event_handler, SDK 12.1.0

When debugging my application, I'm finding that gdb blocks at app_scheduler.c:253 like this:

(gdb) bt full
#0  0xfffffffe in ?? ()
No symbol table info available.
#1  <signal handler called>
No symbol table info available.
#2  0x00000000 in ?? ()
No symbol table info available.
#3  0x000258e2 in app_sched_execute () at /Users/Eliot/dev/nRF5_SDK_12.1.0_0d23e2a/components/libraries/scheduler/app_scheduler.c:253
        event_index = 2
        p_event_data = 0x20002614 <APP_SCHED_BUF.9106+48>
        event_data_size = 0
        event_handler = 0x0
#4  0x000223ea in main () at main.c:158
        err_code = 0
        main_loop_count = 5

Here is app_sched_execute() in SDK 12.1.0:

void app_sched_execute(void)
{
    while (!is_app_sched_paused() && !APP_SCHED_QUEUE_EMPTY())
    {
        // Since this function is only called from the main loop, there is no
        // need for a critical region here, however a special care must be taken
        // regarding update of the queue start index (see the end of the loop).
        uint16_t event_index = m_queue_start_index;

        void * p_event_data;
        uint16_t event_data_size;
        app_sched_event_handler_t event_handler;

        p_event_data = &m_queue_event_data[event_index * m_queue_event_size];
        event_data_size = m_queue_event_headers[event_index].event_data_size;
        event_handler   = m_queue_event_headers[event_index].handler;

        event_handler(p_event_data, event_data_size); // L253

        // Event processed, now it is safe to move the queue start index,
        // so the queue entry occupied by this event can be used to store
        // a next one.
        m_queue_start_index = next_index(m_queue_start_index);
    }
}

I can't see how my handler can ever be 0x0. I've been through my application and there are eight places where I schedule something with app_sched_event_put(). They all look fine. I check the return code each time anyway.

app_sched_execute() is safe to call when nothing has been scheduled, right?

My main loop is pretty typical:

while (true)
{
	main_loop_count++;

	// Do some stuff

	app_sched_execute();

	err_code = sd_app_evt_wait();
	APP_ERROR_CHECK(err_code);
}

Edit 1

I added NRF_LOG statements to app_scheduler.c on both app_sched_event_put() and app_sched_execute() to show the address of the handler. I also modified app_sched_execute() to just continue when the handler was NULL (not a good fix, but helped me debug).

Here's a grep of the log:

:INFO:app_sched_event_put(): Handler: 0x24005
:INFO:app_sched_execute(): Handler: 0x24005
:INFO:app_sched_event_put(): Handler: 0x24109
:INFO:app_sched_execute(): Handler: 0x24109
:INFO:app_sched_event_put(): Handler: 0x24369
:INFO:app_sched_execute(): Handler: 0x24369
:INFO:app_sched_event_put(): Handler: 0x20fb9
:INFO:app_sched_execute(): Handler: 0x20fb9
:INFO:app_sched_event_put(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0

So it's all going fine until 0x23f49 comes along and gets executed twice and then screws everything up. I looked in my map file for that address and didn't find it exactly, but found 0x234f8 instead:

 .text.state_event_modem_on
                0x0000000000023f48       0x84 _build/state.o
                0x0000000000023f48                state_event_modem_on

state_event_modem_on() is indeed the function I expect to be scheduled at this point, but it's at 0x23f48, NOT 0x23f49. Ignoring that for a second, if I look into what state_event_modem_on() does, it eventually calls a function that sits waiting for 6 seconds for our modem to respond to an AT command over UART. While it's waiting, it calls app_sched_execute().

Presumably, this is bad. I presume a function that has itself been scheduled can not run the scheduler queue? Therefore I should not schedule anything that's long-running. But then I also don't want to run long-running things in the main loop. So what do I do?

Related