Crash issue on SDK v2.9.0

SoC: nRF52840

SDK: v2.9.0

The device is a central connected to 16 peripherals. Each peripheral powers on for 90 seconds and then powers off for 10 seconds (one cycle).

After ten days of operation, the central crashes.

Has anyone run into this problem? How did you solve it? Thank you!

Crash 1, reason is K_ERR_KERNEL_OOPS=3.

Crash 2, reason is K_ERR_ARM_BUS_IMPRECISE_DATA_BUS=26.

CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=768

Logging thread max stack usage: 744 B (97%)

 

  • Hi Leo,

    Since you started monitoring stack size in the second attempt, I take it you suspect a stack overflow.

    That would be my guess too. I am not sure whether the stack monitoring can report all the way up to the crash. What I mean is: if the stack overflows and the system crashes, can the monitor catch that, or is the event lost?

    Generally, I think if the maximum usage goes over 90%, it is a good idea to budget more.

    Could you please give the log thread more stack? Please also check the other threads to see if any of them has a similar issue.

    Another thing to watch for is whether the stack usage increases over time, which would indicate a memory leak somewhere.

    Hieu
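
    The suggestions above could be tried with a prj.conf fragment along these lines. This is only a sketch: the 1024-byte size, the 60-second reporting interval, and the choice to enable Zephyr's thread analyzer and hardware stack protection are assumptions for illustration, not settings confirmed in this thread.

    ```
    # Give the logging thread more headroom (it peaked at 744 of 768 bytes).
    # 1024 is an illustrative value; size it from the measured usage.
    CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=1024

    # Periodically report stack usage for every thread, so other
    # near-full stacks and usage that grows over time become visible.
    CONFIG_THREAD_ANALYZER=y
    CONFIG_THREAD_ANALYZER_USE_LOG=y
    CONFIG_THREAD_ANALYZER_AUTO=y
    CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=60
    CONFIG_THREAD_NAME=y

    # Ask the MPU to fault when a thread overruns its stack, so an
    # overflow is reported as such instead of silently corrupting memory.
    CONFIG_HW_STACK_PROTECTION=y
    ```

    With the analyzer enabled, the periodic report can be compared across days of runtime: a thread whose usage stays above roughly 90%, or keeps climbing, is a candidate for a larger stack.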

  • Hi Hieu,

    Thank you! 

    I will increase the stack for these threads.

    Leo

  • No problem. I hope that resolves the issue.

  • I increased CONFIG_LOG_PROCESS_THREAD_STACK_SIZE to 1024, but this error still appears:

    bt_sdc_hci_driver: SoftDevice controller ASSERT: 33,748

    A new crash has also appeared:

    [20:26:54.767,669] <err> bt_sdc_hci_driver: SoftDevice Controller ASSERT: 53, 295
    [20:26:54.767,700] <err> os: ***** HARD FAULT *****
    [20:26:54.767,700] <err> os:   Fault escalation (see below)
    [20:26:54.767,700] <err> os: ARCH_EXCEPT with reason 3

     

    [20:26:54.767,730] <err> os: r0/a1:  0x00000003  r1/a2:  0x00000004  r2/a3:  0x00000003
    [20:26:54.767,730] <err> os: r3/a4:  0x00000003 r12/ip:  0x20025860 r14/lr:  0x00058087
    [20:26:54.767,761] <err> os:  xpsr:  0x01000011
    [20:26:54.767,761] <err> os: Faulting instruction address (r15/pc): 0x0004cdb0
    [20:26:54.767,822] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
    [20:26:54.767,822] <err> os: Fault during interrupt handling

     

    [20:26:54.767,852] <err> os: Current thread: 0x2001aff8 (idle)
    [20:26:55.171,417] <err> os: Halting system
