No heap space for incoming notifications

We have an application running on an nRF9160 development board (shortly to be ported to a production board) that listens on a serial link for sensor data and then sends it via UDP/DTLS over NB-IoT.

The development board is connected to a serial terminal for diagnostics.

After several messages have been sent there's a warning message printed out on the console:

"W: No heap space for incoming notification: +CSCON: 0"

"W: No heap space for incoming notification: +CSCON: 1"

I've tried doubling the heap space and also the system workqueue stack size in prj.conf:

# Heap and stacks
CONFIG_HEAP_MEM_POOL_SIZE=4096
CONFIG_MAIN_STACK_SIZE=4096
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096

However this has made no difference.

There is no apparent impact on the application itself, but I would of course prefer to properly handle whatever is causing the warning.

Help with this would be appreciated. Thanks.

+1 Hakon over 2 years ago

Hello,

Are you using the system workqueue in your application? It's possible that the system workqueue is running tasks that block the AT monitor from running, so the AT notification FIFO never gets cleared out. Have you checked this?
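
The mechanism described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the SDK's implementation: the pool size `POOL_SLOTS`, the `notif_pool` type and the `simulate` helper are hypothetical names made up for illustration. The model assumes notifications are stored in a small bounded pool until a drain job, which shares the work queue thread with the application, frees them.

```c
#include <stdbool.h>
#include <stdio.h>

#define POOL_SLOTS 4 /* hypothetical capacity of the notification pool */

/* Model of a bounded pool holding notifications until a drain job frees them. */
struct notif_pool {
    int used;    /* slots holding undelivered notifications */
    int dropped; /* notifications rejected because the pool was full */
};

/* Called when the modem reports something, e.g. "+CSCON: 1". */
static bool notif_arrive(struct notif_pool *p, const char *text)
{
    if (p->used == POOL_SLOTS) {
        p->dropped++;
        printf("W: No heap space for incoming notification: %s\n", text);
        return false;
    }
    p->used++;
    return true;
}

/* The drain job: dispatches every stored notification and frees its slot. */
static void drain_job(struct notif_pool *p)
{
    p->used = 0;
}

/*
 * One notification arrives per tick. The drain job only gets to run on ticks
 * where the shared queue thread is not stuck inside a blocked handler.
 * Returns the number of dropped notifications.
 */
static int simulate(int ticks, bool queue_thread_blocked)
{
    struct notif_pool pool = { 0, 0 };

    for (int t = 0; t < ticks; t++) {
        notif_arrive(&pool, (t % 2) ? "+CSCON: 1" : "+CSCON: 0");
        if (!queue_thread_blocked) {
            drain_job(&pool);
        }
    }
    return pool.dropped;
}
```

In this model, `simulate(10, false)` drops nothing, while `simulate(10, true)` fills the four slots and then drops the remaining six notifications. It would also suggest why a larger pool only postpones the warning while the queue thread stays blocked.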

0 Ron Segal over 2 years ago in reply to Hakon

Thanks for the reply, appreciate your picking this up.

Apart from the work task waiting for messages to arrive in a message queue:

k_msgq_get(&receive_event_msq, &rxevt, K_FOREVER);

There are no other tasks that have been started by me to run on the system work queue.

New to Zephyr, I've been assuming that waiting on k_msgq_get would automatically yield to allow other tasks on the system workqueue to run. Is that incorrect?

0 Ron Segal over 2 years ago in reply to Achim Kraus
Hi Achim,

server_transmission_worker_init();          // initialise the udp worker
baseuart_connect_async(baseuart);           // Also creates dma buffer storage
//## Kick off main transmission worker here
k_work_submit(&server_transmission_work);   // uses system workthread

Yes, in main() a system workqueue worker task that transmits and receives data over DTLS/NB-IoT is initialised, a UART connection to another MCU is also initialised, and then the transmission worker task is started on the system workqueue. The worker waits on k_msgq_get for messages. These are created by the (simple) UART DMA interrupt routine, which marshals the bytes arriving across the UART link into a message and pushes it onto the message queue. The messages are then sent via DTLS/NB-IoT to a dtls2mqtt gateway, with responses being sent back across the UART link.

Am quite open to modifying this design if there is a better approach, or if it is preferable that the work is done in an application workqueue rather than the system one.

0 Achim Kraus over 2 years ago in reply to Ron Segal

I'm mainly a Java developer, so I'm not that used to Zephyr.

As far as I understand Zephyr and the idea of a job queue, it's not good practice to wait inside such a job.

But you may wait in your main thread, or you may use a thread of your own, which is then able to wait.

0 Ron Segal over 2 years ago in reply to Achim Kraus

Thanks, I was aware that the system workqueue thread isn't to be blocked for any significant length of time. I'd assumed that k_msgq_get would implement an automatic yield, but maybe this is wrong. Apart from using a different thread, another solution might be to create an 'automatic yield' by waiting on the message queue with k_msgq_get using a short timeout (rather than waiting forever), then calling yield, then looping back to k_msgq_get and so on, only exiting the loop when a message is received. Will perhaps try that simple change anyway and see what happens.
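
One caveat about the short-timeout idea: as far as the workqueue model goes, a handler that loops and yields is still inside the handler, so the workqueue thread cannot start any other work item. A yield lets other threads run, not other items on the same queue. For other items to get a turn, the handler would have to return and resubmit itself (for instance with k_work_submit). The host-side model below sketches that pattern. It is not Zephyr code, and `submit`, `run`, `transmit_fn` and `drain_fn` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 8 /* hypothetical capacity of the model queue */

typedef void (*work_fn)(void);

/* Model of a work queue: one thread runs queued handlers strictly in order. */
static work_fn work_items[QUEUE_LEN];
static size_t head, count;

static bool submit(work_fn fn)
{
    if (count == QUEUE_LEN) {
        return false;
    }
    work_items[(head + count) % QUEUE_LEN] = fn;
    count++;
    return true;
}

/* Run up to max_items handlers; each must return before the next starts. */
static int run(int max_items)
{
    int ran = 0;

    while (count > 0 && ran < max_items) {
        work_fn fn = work_items[head];

        head = (head + 1) % QUEUE_LEN;
        count--;
        fn();
        ran++;
    }
    return ran;
}

static int pending_msgs; /* messages waiting in the modelled message queue */
static int handled_msgs; /* messages the transmission handler has processed */
static int drain_runs;   /* times the notification drain handler got to run */

static void drain_fn(void)
{
    drain_runs++;
}

/* Non-blocking variant: take at most one message, resubmit, then return. */
static void transmit_fn(void)
{
    if (pending_msgs > 0) {
        pending_msgs--;
        handled_msgs++;
    }
    (void)submit(transmit_fn);
}
```

With two pending messages and both handlers submitted, `run(5)` lets `drain_fn` run between two `transmit_fn` invocations. A handler that never returned would leave `drain_fn` queued indefinitely, which is the situation a dedicated thread or a separate workqueue avoids.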

0 Achim Kraus over 2 years ago in reply to Ron Segal

A short timeout in k_msgq_get amounts to polling. It may work.

There are also some more sophisticated mechanisms (e.g. events).

A thread is not that complicated, and changing that job into a thread should not take too long.

Anyway, it's up to you to decide.

0 Ron Segal over 2 years ago in reply to Achim Kraus

Can't see a downside to polling in this case as nothing else needs to be done in user land and it isn't possible to miss a message. However, I may try using a different thread at least to accumulate more practical experience with Zephyr. Anyway, thanks, this has been really helpful. Will report back later on results. Cheers Ron.

0 Achim Kraus over 2 years ago in reply to Ron Segal

Polling consumes energy; a blocking wait does not.

But yes, try a short polling interval and we will see if that helps.

0 Ron Segal over 2 years ago in reply to Achim Kraus

In the end, it was as you indicated Achim, ridiculously simple, in about 4 lines of code, to create another workqueue thread and assign the work task to that with no other changes. Since doing that a few hours ago the application has been almost continuously running with no warnings. Will see what happens overnight, then all being well close this question .. again! Cheers Ron.

0 Achim Kraus over 2 years ago in reply to Ron Segal

Great news!

0 Ron Segal over 2 years ago in reply to Achim Kraus

Running for more than 12 hours now with no warning messages. Clearly the problem is solved.

Am including code here in case it helps others.

#define TRANSMISSION_STACK_SIZE 1024
#define TRANSMISSION_PRIORITY 5

K_THREAD_STACK_DEFINE(transmission_stack_area, TRANSMISSION_STACK_SIZE); // define memory for application workqueue

struct k_work_q transmission_work_q; // application workqueue

static struct k_work server_transmission_work; // A work Q element - infinite loop that receives, parses and acts on message events

...


k_work_queue_init(&transmission_work_q); // initialise workqueue

// start workqueue
k_work_queue_start(&transmission_work_q, transmission_stack_area,
                   K_THREAD_STACK_SIZEOF(transmission_stack_area), TRANSMISSION_PRIORITY,
                   NULL);

k_work_init(&server_transmission_work, server_transmission_work_fn);  // initialise worker task - points to function that does the work

// k_work_submit(&server_transmission_work);  // uses system workqueue
k_work_submit_to_queue(&transmission_work_q, &server_transmission_work); // worker task uses application workqueue

Initially when the stack size of the workqueue was set to 512 the device panicked. At 1024 it is running perfectly.

Thanks again. Cheers, Ron.
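
On the stack size: the panic at 512 bytes may well have been a stack overflow, in which case 1024 was found by trial and error. A hedged sketch of prj.conf options that could be used to measure the actual stack usage is shown below. It assumes the Zephyr thread analyzer is available in the SDK version in use, and the 60 second reporting interval is an arbitrary choice for illustration.

```
# Assumption: the Zephyr thread analyzer is available in this SDK version
CONFIG_THREAD_ANALYZER=y
# Periodically print per-thread stack usage
CONFIG_THREAD_ANALYZER_AUTO=y
# Reporting interval in seconds (arbitrary value)
CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=60
# Show thread names in the report
CONFIG_THREAD_NAME=y
```

The reported usage for the workqueue thread would indicate how much headroom the 1024-byte stack leaves.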