
Random Zephyr hangs due to RNG / Entropy

On an nRF52840, I keep getting random hangs after power-up. I am having trouble pinning it down, but every time I hit the issue it appears to be related to the RNG: if I attach gdb, it is typically stuck in 'CC_PalWaitInterruptRND'. I thought I could set 'CONFIG_TEST_RANDOM_GENERATOR=y', but CMake complains that it is overridden by ENTROPY_HAS_DRIVER... I am nowhere near enough of a CMake expert to figure out whether it is even possible to disable the Nordic entropy driver for a test.

I use OpenThread, which selects a whole string of entropy requirements in the Kconfig tree. I even tried hand-modifying some Kconfig source files, to no avail.

Here is one such backtrace, but please note that it is not just the Zephyr CoAP stack; it also happens when OpenThread initializes (which means it generates random numbers for its own CoAP stack). I reduced the occurrence of the hangs by replacing sys_rand32_get() with my own call. That stopped the Zephyr CoAP hangs shown below, and the hangs moved into the OpenThread stack, which has its own calls to generate random numbers for CoAP tokens.

#0 0x0002e2ae in CC_PalWaitInterruptRND ()
#1 0x0002eb00 in LLF_RND_WaitRngInterrupt ()
#2 0x0002e8e6 in getTrngSource ()
#3 0x0002ea7a in LLF_RND_GetTrngSource ()
#4 0x0002de14 in mbedtls_hardware_poll ()
#5 0x00057460 in entropy_cc3xx_rng_get_entropy (dev=<optimized out>, buffer=<optimized out>, length=<optimized out>)
at /home/user/NORDIC_CONNECT_SDK/ncs/nrf/drivers/entropy/entropy_cc310.c:67
#6 0x0001ee38 in z_impl_entropy_get_entropy (length=4, buffer=0x20013564 <dfu_task_stack+3972> "\354\062\001 ܛ", dev=<optimized out>)
at /home/user/NORDIC_CONNECT_SDK/ncs/zephyr/include/drivers/entropy.h:77
#7 entropy_get_entropy (length=4, buffer=0x20013564 <dfu_task_stack+3972> "\354\062\001 ܛ", dev=<optimized out>)
at zephyr/include/generated/syscalls/entropy.h:33
#8 z_impl_sys_rand32_get () at /home/user/NORDIC_CONNECT_SDK/ncs/zephyr/subsys/random/rand32_entropy_device.c:33
#9 0x00016e82 in sys_rand32_get () at zephyr/include/generated/syscalls/rand32.h:33
#10 coap_next_token () at /home/user/NORDIC_CONNECT_SDK/ncs/zephyr/subsys/net/lib/coap/coap.c:339
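
The replacement call mentioned above is not shown in this post. As a minimal sketch of what one could look like, assuming a non-cryptographic software PRNG is acceptable for CoAP tokens in a test build, a xorshift32 generator avoids the entropy driver entirely. The names app_rand_seed() and app_rand32_get() are hypothetical (they are not Zephyr or nRF Connect SDK APIs), and the output is predictable, so this is a debugging aid rather than a fix:

```c
#include <stdint.h>

/* Hypothetical names: app_rand_seed() and app_rand32_get() are NOT part of
 * Zephyr or the nRF Connect SDK. Assumption: predictable tokens are
 * acceptable while debugging the hang. */
static uint32_t app_rand_state = 0x6d2b79f5u; /* any non-zero start value */

void app_rand_seed(uint32_t seed)
{
    /* xorshift32 stays at zero forever once its state is zero,
     * so fall back to a fixed non-zero constant for a zero seed. */
    app_rand_state = (seed != 0u) ? seed : 0x6d2b79f5u;
}

uint32_t app_rand32_get(void)
{
    /* Marsaglia xorshift32 with the 13/17/5 shift triple. */
    uint32_t x = app_rand_state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    app_rand_state = x;
    return x;
}
```

Calling app_rand32_get() at the point where coap_next_token() calls sys_rand32_get() would keep that path out of CC_PalWaitInterruptRND. The state is not protected against concurrent callers, so a real build would need a lock or per-thread state.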

  • I wonder if this bug is still present? I'm running an nRF52840 devkit with a Zephyr 2.9.1 application that sends as many BLE mesh messages as possible. It usually crashes within an hour:

    00> [01:06:12.418,426] <err> os: ***** MPU FAULT *****
    00> [01:06:12.418,426] <err> os: Stacking error (context area might be not valid)
    00> [01:06:12.418,426] <err> os: r0/a1: 0x00000000 r1/a2: 0x00000000 r2/a3: 0x00000000
    00> [01:06:12.418,457] <err> os: r3/a4: 0x00000000 r12/ip: 0x2000eda8 r14/lr: 0x2000ed80
    00> [01:06:12.418,457] <err> os: xpsr: 0x41000200
    00> [01:06:12.418,487] <err> os: Faulting instruction address (r15/pc): 0x0004bb66
    00> [01:06:12.418,518] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
    00> [01:06:12.418,548] <err> os: Current thread: 0x200043b8 (unknown)
    00> [01:06:12.686,706] <err> os: Halting system

    I looked up address 0x0004bb66 in the map file:

    .text.CC_PalWaitInterruptRND
    0x0004bb60 0x2c /opt/ncs/v2.9.1/nrfxlib/crypto/nrf_cc310_platform/lib/cortex-m4/soft-float/no-interrupts/libnrf_cc310_platform_0.9.19.a(cc_pal_interrupt_ctrl.c.obj)
    0x0004bb60 CC_PalWaitInterruptRND
    .text.CC_PalWaitInterrupt
    0x0004bb8c 0x18 /opt/ncs/v2.9.1/nrfxlib/crypto/nrf_cc310_platform/lib/cortex-m4/soft-float/no-interrupts/libnrf_cc310_platform_0.9.19.a(cc_pal_interrupt_ctrl.c.obj)
    0x0004bb8c CC_PalWaitInterrupt

  • Hi, can you create a new ticket for this and link to this ticket in the new case?

    Regards,
    Jonathan

  • Apparently the bug is still present, as my problem was fixed by the workaround presented above.
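
The map-file lookup in the reply above comes down to a range check: a section that starts at 0x0004bb60 with size 0x2c covers addresses up to, but not including, 0x0004bb8c, so the faulting PC 0x0004bb66 sits 6 bytes into CC_PalWaitInterruptRND. A minimal sketch of that check, with addr_in_section() as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of any SDK: returns true if addr lies
 * inside the half-open range [start, start + size). */
bool addr_in_section(uint32_t addr, uint32_t start, uint32_t size)
{
    /* The first comparison guards the subtraction against wrap-around. */
    return (addr >= start) && ((addr - start) < size);
}
```

With the values from the map file, addr_in_section(0x0004bb66, 0x0004bb60, 0x2c) is true, while the same address checked against CC_PalWaitInterrupt (start 0x0004bb8c, size 0x18) is false.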
