ZigBee Application crash - ERROR 29 [RET_OVERFLOW] at [...]zigbee_app_utils.c:849

Application crashes with the following log

[06:45:22.727,691] <err> zigbee_app_utils: ERROR 29 [RET_OVERFLOW] at WEST_TOPDIR/nrf/subsys/zigbee/lib/zigbee_app_utils/zigbee_app_utils.c:849
[06:45:22.727,691] <err> zboss_osif: ZBOSS fatal error occurred
The relevant lines from zigbee_app_utils
 
#if defined CONFIG_ZIGBEE_ROLE_END_DEVICE
wait_for_user_input = false;
is_rejoin_start_scheduled = false;

zb_ret_t zb_err_code = ZB_SCHEDULE_APP_ALARM(
stop_network_rejoin,
ZB_TRUE,
ZB_MILLISECONDS_TO_BEACON_INTERVAL(
ZB_DEV_REJOIN_TIMEOUT_MS));
ZB_ERROR_CHECK(zb_err_code);
#endif
It seems that the Alarm cannot be scheduled due to RET_OVERFLOW.

- The implementation of the signal handler is here https://github.com/Rogger/schneggi-sensor/blob/64b2408f1053a15622c618249d887ca7694eb250/src/main.c#L704, in there is as well the call to app utils.
Context:
SDK & toolchain in use 2.5.0
Any help or guidance is highly appreciated, thank you in advance
Best Regards
Michael
Parents
  • Hi,

    Could you get me the network key for this trace?

    Regards,

    Elfving

  • Hi,

    Sure, the key: ffb12ed4228949b66f1a72143a263808 (normal byte order).

    Thanks!

  • Hi Elfving,

    I implemented the workaround KRKNWK-12017 and increased the Q_SIZE to 128.

    The rejoin functionality seems to work now correctly, but unfortunately the fatal error still exists. It might take longer to crash than before, in my case about 33 hours.

    [33:38:13.675,354] <dbg> zboss_osif: zigbee_event_poll: Received new Zigbee event: 0x01
    [33:38:13.750,335] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 67 23  7c 00 39 03 1f 00 00 00 |......g# |.9.....
                                         31 53 03 00 3a a8 78 00  00 00 00 00             |1S..:.x. ....    
    --- 117 messages dropped ---
    [33:38:13.750,396] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 68 23  7c 00 39 03 36 00 00 00 |......h# |.9.6...
                                         31 53 03 00 41 a8 78 00  00 00 00 00             |1S..A.x. ....    
    [33:38:13.750,488] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 69 23  7c 00 39 03 5e 00 00 00 |......i# |.9.^...
                                         31 53 03 00 48 a8 78 00  00 00 00 00             |1S..H.x. ....    
    [33:38:13.750,549] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6a 23  7c 00 39 03 21 00 00 00 |......j# |.9.!...
                                         31 53 03 00 4f a8 78 00  00 00 00 00             |1S..O.x. ....    
    [33:38:13.750,640] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6b 23  7c 00 39 03 64 00 00 00 |......k# |.9.d...
                                         31 53 03 00 57 a8 78 00  00 00 00 00             |1S..W.x. ....    
    [33:38:13.750,732] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6c 23  7c 00 39 03 0b 00 00 00 |......l# |.9.....
                                         31 53 03 00 5e a8 78 00  00 00 00 00             |1S..^.x. ....    
    [33:38:13.750,793] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6d 23  7c 00 39 03 51 00 00 00 |......m# |.9.Q...
                                         31 53 03 00 65 a8 78 00  00 00 00 00             |1S..e.x. ....    
    [33:38:13.750,885] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6e 23  7c 00 39 03 2d 00 00 00 |......n# |.9.-...
                                         31 53 03 00 6c a8 78 00  00 00 00 00             |1S..l.x. ....    
    [33:38:13.750,946] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 6f 23  7c 00 39 03 01 00 00 00 |......o# |.9.....
                                         31 53 03 00 74 a8 78 00  00 00 00 00             |1S..t.x. ....    
    [33:38:13.751,037] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 70 23  7c 00 39 03 15 00 00 00 |......p# |.9.....
                                         31 53 03 00 7a a8 78 00  00 00 00 00             |1S..z.x. ....    
    [33:38:13.751,129] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 71 23  7c 00 39 03 35 00 00 00 |......q# |.9.5...
                                         31 53 03 00 82 a8 78 00  00 00 00 00             |1S....x. ....    
    [33:38:13.751,190] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 72 23  7c 00 39 03 7c 00 00 00 |......r# |.9.|...
                                         31 53 03 00 89 a8 78 00  00 00 00 00             |1S....x. ....    
    [33:38:13.751,281] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 73 23  7c 00 39 03 00 00 00 00 |......s# |.9.....
                                         01 ef 03 00 9c d9 78 00  00 00 00 00             |......x. ....    
    [33:38:13.751,342] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 1a 02 13 a5 74 23  7c 00 39 03 05 00 00 00 |......t# |.9.....
                                         29 2a 02 00 40 78 79 00  00 00 00 00             |)*..@xy. ....    
    [33:38:13.751,434] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 0a 02 13 a5 75 23  7c 00 98 03             |......u# |...    
    [33:38:13.751,495] <dbg> zboss_osif: zb_trace_msg_port_do: 
                                         de ad 12 02 13 a5 76 23  60 08 4c 00 7c 00 00 00 |......v# `.L.|...
                                         99 03 00 00                                      |....             
    [33:38:13.751,525] <err> zboss_osif: ZBOSS fatal error occurred

    I attached the Wireshark capture. Search for the "rejoin request", this is when the device restarted after fatal error.
    290124.zip

    What else could I try?

    Thank you very much
    MR

  • Hi Rogger, 

    I struggle with similar issue (still without solution) but I took a while to take a look at your code. 

    Modify your sensor_loop callback as follows: 

    static void sensor_loop(zb_bufid_t bufid){
    	static uint16_t cycles = 0;
    
    	LOG_INF("Loop");
    	update_sensor_values();
    	LOG_DBG("%d", cycles % BATTERY_SLEEP_CYCLES);
    	if (cycles % BATTERY_SLEEP_CYCLES == 0)
    	{
    		update_battery();
    		cycles = 0;
    	}
    	cycles++;
    	LOG_INF("Sleep for %d seconds", SLEEP_INTERVAL_SECONDS);
    	
    	/* ADDON START */
        ZB_SCHEDULE_APP_ALARM_CANCEL(sensor_loop, ZB_ALARM_ANY_PARAM); 	
        /* ADDON END */
        
        ZB_SCHEDULE_APP_ALARM(sensor_loop, bufid, 
            ZB_MILLISECONDS_TO_BEACON_INTERVAL(
              SLEEP_INTERVAL_SECONDS * 1000));
    }



    I do not know if it will help, but I suggest to give it a try.
    Please share the results afterwards ;) 

  • Hello pwpot,

    Have you created another DevZone case for this issue yourself?

    Regards,

    Elfving

  • Hi, Elfving

    I am going to but at this moment I cannot provide any detailed info about the issue as device is already on client site.

    I conclude based on some LED blinks and on my own experience.
    Some time ago I made a mistake in code which resulted in ZBOSS Queue overflow. 
    I was scheduling one of the callbacks multiple times without proper cancellation.
    As a result of that ZBOSS scheduler stopped working and the ZBOSS Queue did not recover from such state.

    In my case the problem is also related with rejoin and it happens less often then how it is in Rogger code, its days, sometimes weeks, not hours. 
    I assume this is because Rogger is scheduling ZBOSS callback to use it as a main while loop (very often) where I am doing it mostly during rejoin.

    I am not sure if ZBOSS deadlocks or sth and if can it be detected by classic watchdog. 

    Is it possible to increase ZB_CONFIG_SCHEDULER_Q_SIZE and still be FOTA compatible ? 
    Developing with ZBOSS guide says that: 
    The memory configuration must not be changed between firmware versions if the firmware is going to be upgraded using the OTA DFU.

    I think it is very generic info and maybe ZB_CONFIG_SCHEDULER_Q_SIZE is not the part of that.

  • Hi Pwpot,

    Thanks for the tip, unfortunately this did not help.

Reply Children
Related