When debugging my application, I'm finding that gdb blocks at app_scheduler.c:253 like this:
(gdb) bt full
#0 0xfffffffe in ?? ()
No symbol table info available.
#1 <signal handler called>
No symbol table info available.
#2 0x00000000 in ?? ()
No symbol table info available.
#3 0x000258e2 in app_sched_execute () at /Users/Eliot/dev/nRF5_SDK_12.1.0_0d23e2a/components/libraries/scheduler/app_scheduler.c:253
event_index = 2
p_event_data = 0x20002614 <APP_SCHED_BUF.9106+48>
event_data_size = 0
event_handler = 0x0
#4 0x000223ea in main () at main.c:158
err_code = 0
main_loop_count = 5
Here is app_sched_execute()
in SDK 12.1.0:
void app_sched_execute(void)
{
while (!is_app_sched_paused() && !APP_SCHED_QUEUE_EMPTY())
{
// Since this function is only called from the main loop, there is no
// need for a critical region here, however a special care must be taken
// regarding update of the queue start index (see the end of the loop).
uint16_t event_index = m_queue_start_index;
void * p_event_data;
uint16_t event_data_size;
app_sched_event_handler_t event_handler;
p_event_data = &m_queue_event_data[event_index * m_queue_event_size];
event_data_size = m_queue_event_headers[event_index].event_data_size;
event_handler = m_queue_event_headers[event_index].handler;
event_handler(p_event_data, event_data_size); // L253
// Event processed, now it is safe to move the queue start index,
// so the queue entry occupied by this event can be used to store
// a next one.
m_queue_start_index = next_index(m_queue_start_index);
}
}
I can't see how my handler can ever be 0x0. I've been through my application and there are eight places where I schedule something with app_sched_event_put()
. They all look fine. I check the return code each time anyway.
app_sched_execute()
is safe to call when nothing has been scheduled, right?
My main loop is pretty typical:
while (true)
{
main_loop_count++;
// Do some stuff
app_sched_execute();
err_code = sd_app_evt_wait();
APP_ERROR_CHECK(err_code);
}
Edit 1
I added NRF_LOG
statements to app_scheduler.c on both app_sched_event_put()
and app_sched_execute()
to show the address of the handler. I also modified app_sched_execute()
to just continue when the handler was NULL (not a good fix, but helped me debug).
Here's a grep of the log:
:INFO:app_sched_event_put(): Handler: 0x24005
:INFO:app_sched_execute(): Handler: 0x24005
:INFO:app_sched_event_put(): Handler: 0x24109
:INFO:app_sched_execute(): Handler: 0x24109
:INFO:app_sched_event_put(): Handler: 0x24369
:INFO:app_sched_execute(): Handler: 0x24369
:INFO:app_sched_event_put(): Handler: 0x20fb9
:INFO:app_sched_execute(): Handler: 0x20fb9
:INFO:app_sched_event_put(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x23f49
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
:INFO:app_sched_execute(): Handler: 0x0
So it's all going fine until 0x23f49 comes along and gets executed twice and then screws everything up. I looked in my map file for that address and didn't find it exactly, but found 0x234f8 instead:
.text.state_event_modem_on
0x0000000000023f48 0x84 _build/state.o
0x0000000000023f48 state_event_modem_on
state_event_modem_on()
is indeed the function I expect to be scheduled at this point, but it's at 0x23f48, NOT 0x23f49. Ignoring that for a second, if I look into what state_event_modem_on()
does, it eventually calls a function that sits waiting for 6 seconds for our modem to respond to an AT command over UART. While it's waiting, it calls app_sched_execute()
.
Presumably, this is bad. I presume a function that has itself been scheduled can not run the scheduler queue? Therefore I should not schedule anything that's long-running. But then I also don't want to run long-running things in the main loop. So what do I do?