OpenThread crashes on 5340

Basic OpenThread application built for 5340 crashes in net app.

The same application works fine on 21540 (52840 + 21540)

The crash seems to happen when a Thread packet is received. If I turn off all other Thread transmitters, the crash doesn't happen.

Details:

I: [N] Mle-----------: Role detached -> child
I:   Local IPv6 = Origin:SLACC * fd11:22:0:0:8f99:b944:f28f:617b
E: No response within timeout 500
ASSERTION FAIL @ WEST_TOPDIR/zephyr/drivers/ieee802154/ieee802154_nrf5.c:1149
    802.15.4 serialization error
E: r0/a1:  0x00000004  r1/a2:  0x0000047d  r2/a3:  0x00000001
E: r3/a4:  0x00011cc9 r12/ip:  0xa0000000 r14/lr:  0x000180af
E:  xpsr:  0x41000000
E: Faulting instruction address (r15/pc): 0x00058b1e
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Current thread: 0x200035b8 (openthread)
E: Resetting

  • Actually, it turns out the samples straight out of the SDK crash as well. They just don't have logging enabled, so the crash is quieter.

    Just build the Nordic openthread/cli sample for nrf5340dk_nrf5340_cpuapp, flash it, and bring up a Thread network. The board crashes as soon as it receives any Thread packet on the network.

  • Hi,

    I am not able to reproduce the issue. I tried both with CLI and CoAP Server in v2.1.1

    Are you debugging when you get this error? Have you made changes to any SDK files? Do you have another nRF5340 DK to test with?

    Best regards,

    Marte

  • Hi,

    I tested with one nRF5340 DK and one nRF52840 DK. It should not matter what the sender is, since the issue is that the nRF5340 crashes when receiving Thread packets. I also tested with v2.0.1, and I could not get it to crash there either.

    Please answer my previous questions.

    Marte Myrvold said:
    Are you debugging when you get this error?
    Marte Myrvold said:
    Have you made changes to any SDK files?

    Best regards,

    Marte

  • I have been trying to update to NCS 2.1.2 and can't build. I get:

    /usr/local/GNU-Tools-ARM-Embedded/gcc-arm-none-eabi-10.3-2021.10/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld.bfd: modules/openthread/platform/libopenthread_platform.a(logging.c.obj): in function `z_log_msg_runtime_vcreate':
    ./build_5340/zephyr/include/generated/syscalls/log_msg.h:75: undefined reference to `z_impl_z_log_msg_runtime_vcreate'
    collect2: error: ld returned 1 exit status

    I have to use full (non-minimal) logging to get the build to work. It seems the OpenThread platform ignores CONFIG_LOG_MODE_MINIMAL=y, but this used to work.

    At NCS v2.1.2 I am still seeing the exact same crash on the 5340. I made only minor changes to the Zephyr SDK, related to some device drivers that are not involved with the Nordic radio or board. Everything else is directly from the sdk-nrf repo and your sdk-zephyr fork at v2.1.2.

    ***

    I have gotten the CLI sample to work on the 5340. It seems something in flash differs between the sample and my build. If I flash the sample using west flash --recover, it works, and if I then flash my application (nrfjprog merged.hex), my application works too. If I use west flash --recover with my application, it goes back to crashing. If I then flash the sample (nrfjprog merged.hex), it still crashes.

    My application is using mcuboot now, but it was crashing before we started using the bootloader.

  • Hi,

    Sorry for the delay, I have been out of office.

    So if I understand you correctly, the only time it works is if you start with an erased board (or erase it during the first flashing), program the CLI sample with the west flash command, and then program your application without erasing flash first? The file merged.hex only has the firmware for one core. If you program with the west flash command, it will program both cores using merged_domains.hex. Can you check whether you have a merged_domains.hex file in your build folder, or only merged.hex? The issue might be that the network core is not programmed correctly.

    Best regards,

    Marte
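
A quick way to answer the question above is to list which merged hex files the build produced. The sketch below is only an illustration: `list_merged_hex` is a hypothetical helper, the build directory argument is a placeholder, and the `zephyr/` subdirectory and the `merged_domains.hex` name are taken from the `west flash` log posted later in this thread.

```shell
# List which merged hex files a build produced.
# $1 is the build directory (placeholder; substitute your own path).
list_merged_hex() {
    build_dir="$1"
    for name in merged.hex merged_domains.hex; do
        if [ -f "$build_dir/zephyr/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}
```

For example, `list_merged_hex build_5340` would report on the build directory used in this thread.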

  • Flashing merged_domains.hex to the 5340 is exactly when I see the failure. See the log below for the *failing* case. When I flash merged_domains.hex from the CLI sample and then flash just the application core with my application, things work OK. There must be some difference in the network core application? How do I find that?

    -- west flash: using runner nrfjprog
    Using board 1050007064
    -- runners.nrfjprog: Recovering and erasing flash memory for both the network and application cores.
    Recovering device. This operation might take 30s.
    Writing image to disable ap protect.
    Erasing user code and UICR flash areas.
    Recovering device. This operation might take 30s.
    Writing image to disable ap protect.
    Erasing user code and UICR flash areas.
    -- runners.nrfjprog: Flashing file: /Users/bdodge/Mead/threadcpu/build_5340/zephyr/merged_domains.hex
    -- runners.nrfjprog: /Users/bdodge/Mead/threadcpu/build_5340/zephyr/merged_domains.hex targets both nRF53 coprocessors; splitting it into: /Users/bdodge/Mead/threadcpu/build_5340/zephyr/GENERATED_CP_NETWORK_merged_domains.hex and /Users/bdodge/Mead/threadcpu/build_5340/zephyr/GENERATED_CP_APPLICATION_merged_domains.hex
    Parsing image file.
    Verifying programming.
    Verified OK.
    Parsing image file.
    Verifying programming.
    Verified OK.
    Applying pin reset.
    -- runners.nrfjprog: Board with serial number 1050007064 flashed successfully.

  • Hi,

    Is it possible for you to upload your project here so I can test on my side? If you do not want to share it publicly I can make the ticket private. In that case let me know.

    If you look in your build directory you should have a directory named "802154_rpmsg". This is the build directory for the network core application. One thing you can do is compare the 802154_rpmsg/zephyr/.config file in both projects to check whether the same configurations are set. Another is to do a pristine build, find where the network core application is built, and see whether the same files are used in both builds. You should be able to find it by looking for the following line in the build output:

    === child image 802154_rpmsg - CPUNET begin ===

    Best regards,

    Marte
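
The .config comparison suggested above can be sketched as a small shell helper. This is a minimal illustration, not an official tool: `kconfig_diff` is a hypothetical name, and both arguments are placeholder build directories (for instance your application's build and the CLI sample's build). Only the `802154_rpmsg/zephyr/.config` path comes from the post itself.

```shell
# Compare the network-core Kconfig output of two builds.
# $1 and $2 are build directories (placeholders; substitute your own).
kconfig_diff() {
    a="$1/802154_rpmsg/zephyr/.config"
    b="$2/802154_rpmsg/zephyr/.config"
    # Lines starting with "<" come from the first build, ">" from the second.
    # diff exits with status 1 when the files differ.
    diff "$a" "$b"
}
```

Running it with your own build first and the sample build second puts your settings on the `<` lines and the sample's on the `>` lines.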

  • Thanks. It turned out there was one critical difference in the .config:

    < # CONFIG_NRF_802154_ENCRYPTION is not set
    ---
    > CONFIG_NRF_802154_ENCRYPTION=y

    I haven't figured out why it is not set in my build.  In the build output for CPUNET I see

    NRF_802154_ENCRYPTION_ENABLED=1

    but that is not the same thing?

    I can force it by adding

        -D802154_rpmsg_CONFIG_NRF_802154_ENCRYPTION=y

    to my build command line. That resolves the diff in the .config, and my build now works on the nRF5340 DK.

    I will try to figure out what in my prj.conf is turning that off, but at first look there is nothing in there I don't need for my project.
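
To confirm that the override reached the network core build, one option is to grep the child image's .config after building. A minimal sketch, assuming the `802154_rpmsg/zephyr/.config` location mentioned earlier in the thread; `check_encryption` is a hypothetical helper name and the build directory argument is a placeholder.

```shell
# Print the encryption setting from the network-core child image's .config.
# $1 is the build directory (placeholder; substitute your own path).
# Exits non-zero when the option is not enabled.
check_encryption() {
    grep -x 'CONFIG_NRF_802154_ENCRYPTION=y' "$1/802154_rpmsg/zephyr/.config"
}
```

For example, `check_encryption build_5340 || echo "encryption not enabled"`.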

  • Hi,

    In the CLI sample this is set in the configuration for the child image. If you look at the sample you will see there is a directory called child_image with the file 802154_rpmsg.conf where this is set. You can add the child image directory and 802154_rpmsg.conf with CONFIG_NRF_802154_ENCRYPTION=y in your project as well.

    Best regards,

    Marte
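
For reference, the child image configuration described above would look roughly like this in an application. The directory and file names are the ones named in the post; placing the `child_image` directory at the top level of the application, next to prj.conf, is an assumption based on how the sample is described.

```
# child_image/802154_rpmsg.conf
# Applied to the 802154_rpmsg (network core) child image only.
CONFIG_NRF_802154_ENCRYPTION=y
```

With this file in place, the `-D802154_rpmsg_CONFIG_NRF_802154_ENCRYPTION=y` argument on the build command line should no longer be needed.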
