nRF52833 custom board Hard Fault when using Bluetooth

Hello,

I am trying to get into Zephyr development, and for my project I have created a custom board based on the nRF52833 QDAA (40 pin). I have successfully verified the board's functionality with the nRF5 SDK, using some of the examples provided by Nordic.

Unfortunately I am running into issues when trying to use nRF Connect SDK (version 2.5.0).

The board seems to work well otherwise, but when I enable Bluetooth (for example, simple advertising), the board starts up, sends out one advertisement packet, and then crashes:

[00:00:00.017,547] <err> mpsl_init: MPSL ASSERT: 112, 2185
[00:00:00.017,578] <err> os: ***** HARD FAULT *****
[00:00:00.017,578] <err> os:   Fault escalation (see below)
[00:00:00.017,608] <err> os: ARCH_EXCEPT with reason 3

[00:00:00.017,608] <err> os: r0/a1:  0x00000003  r1/a2:  0x00000000  r2/a3:  0x00000009
[00:00:00.017,639] <err> os: r3/a4:  0x20001090 r12/ip:  0x20000970 r14/lr:  0x0000bbe5
[00:00:00.017,639] <err> os:  xpsr:  0x41000018
[00:00:00.017,669] <err> os: Faulting instruction address (r15/pc): 0x000108b8
[00:00:00.017,700] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
[00:00:00.017,730] <err> os: Fault during interrupt handling

[00:00:00.017,761] <err> os: Current thread: 0x20001090 (logging)
[00:00:00.478,393] <err> os: Halting system

From the zephyr.map I have found out that the faulting instruction is part of zephyr/arch/arch/arm/core/aarch32/libarch__arm__core__aarch32.a(irq_manage.c.obj).

I am stuck on this issue, as I have no idea what could be causing it or how to resolve it. When I build the same firmware for the nRF52833 DK and flash it to the DK, it runs fine.

My prj.conf:

# Bluetooth configuration
CONFIG_BT=y
CONFIG_BT_DEVICE_NAME="BugPack"

# Logging configuration
CONFIG_LOG=y

To be honest, I do not know how to debug this, as I have only just started working with Zephyr. Any help would be greatly appreciated.

I will also attach the board schematic and Zephyr board files.

Thank you very much in advance!

Matej


Attachments: bugpack.zip, Schematic_BugPack_v2.pdf

  • Hello Matej,

    Please try to measure the startup time of your 32 MHz crystal as I suggested in this post: RE: MPSL ASSERT: 112, 2134 if using multiple BLE communication. The MPSL assertion indicates that the crystal oscillator was not active when it was supposed to be.

    Best regards,

    Vidar 

  • First of all, thank you very much for your response.

    It seems like the crystal is not starting at all? The code gets stuck in the while loop. I have tried replacing the crystal as well as capacitors, but no luck so far.

    I am using the following crystal, which I suppose should be alright? https://www.lcsc.com/product-detail/Crystals_HD-7E032000F01_C655046.html I originally tried starting it up with 8 pF load capacitors and later with 12 pF capacitors, without any success.

    I have also figured out that I can start the device and run BT on it when I enable CONFIG_BT_LL_SW_SPLIT=y  in the prj.conf

    Do you by any chance have any other ideas? I might as well try to assemble one more prototype board just to make sure it is not some weird hardware issue caused by repeated re-soldering of some components on the current board...

    UPDATE: The issue is unfortunately present on both prototypes. I don't have an oscilloscope at home to check what is going on, but I will have access to one during the week.

  • HFCLKSTAT SRC is 0, meaning the device is probably using the internal oscillator?

  • Yes, '0' means the CPU is running off the internal oscillator. However, this makes it even more strange that your program got stuck waiting for the 'EVENTS_HFCLKSTARTED' event. The crystal is clearly able to ramp up when using the old SDK and with the Zephyr controller.
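For reference, the two HFCLKSTAT fields being discussed can be decoded as in the sketch below. The bit positions (SRC at bit 0, STATE at bit 16) follow the CLOCK register description in the nRF52833 product specification; the helper names are hypothetical and not part of any Nordic API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* CLOCK.HFCLKSTAT fields: SRC is bit 0, STATE is bit 16. */
#define HFCLKSTAT_SRC_MASK   (UINT32_C(1) << 0)  /* 0 = internal RC, 1 = crystal (HFXO) */
#define HFCLKSTAT_STATE_MASK (UINT32_C(1) << 16) /* 0 = not running, 1 = running */

/* Hypothetical helper: true only when the HFCLK is running from the crystal. */
bool hfclk_is_xtal_running(uint32_t hfclkstat)
{
    return (hfclkstat & HFCLKSTAT_SRC_MASK) != 0 &&
           (hfclkstat & HFCLKSTAT_STATE_MASK) != 0;
}

/* Hypothetical helper: write a readable summary of a raw register value
 * into buf and return the snprintf() result. */
int hfclkstat_describe(uint32_t hfclkstat, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "SRC=%s STATE=%s",
                    (hfclkstat & HFCLKSTAT_SRC_MASK) ? "Xtal (HFXO)" : "RC (internal)",
                    (hfclkstat & HFCLKSTAT_STATE_MASK) ? "running" : "not running");
}
```

On the target, the raw value would come from NRF_CLOCK->HFCLKSTAT. A reading of 0x00010000, for example, has STATE set and SRC clear: the HFCLK is running, but from the internal oscillator rather than the crystal.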

    From the schematic, I noticed that you have not mounted the optional inductors for the DCDC regulator. Please confirm that the DCDC is not enabled by checking that CONFIG_BOARD_ENABLE_DCDC=y is not set in the generated .config file in build/zephyr/.
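Alongside inspecting build/zephyr/.config, a stale DCDC setting could also be caught at compile time. The sketch below assumes Zephyr's usual convention that an enabled bool Kconfig symbol is defined as 1 in the generated autoconf.h and is left undefined otherwise. BOARD_HAS_DCDC_INDUCTORS and board_dcdc_requested() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdbool.h>

/* Map the Kconfig symbol to a plain 0/1 value. */
#ifdef CONFIG_BOARD_ENABLE_DCDC
#define BOARD_DCDC_REQUESTED 1
#else
#define BOARD_DCDC_REQUESTED 0
#endif

/* Hypothetical board-level macro: set to 1 only if the DCDC inductors
 * are actually populated on the PCB. */
#ifndef BOARD_HAS_DCDC_INDUCTORS
#define BOARD_HAS_DCDC_INDUCTORS 0
#endif

/* Fail the build if the DCDC is requested on a board without inductors. */
static_assert(!(BOARD_DCDC_REQUESTED && !BOARD_HAS_DCDC_INDUCTORS),
              "CONFIG_BOARD_ENABLE_DCDC=y but the board has no DCDC inductors");

/* Reports whether this build requests the DCDC regulator. */
bool board_dcdc_requested(void)
{
    return BOARD_DCDC_REQUESTED != 0;
}
```

With a guard like this in the board or application sources, a Kconfig default copied from the DK board files would stop the build instead of reaching the hardware.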

  • The DCDC was in fact set to y. I made a mistake when copying the Kconfig files from the nRF52833 DK board.

    Unfortunately, fixing it did not help. Even with the DCDC disabled, HFCLKSTAT SRC is still 0 and the crystal does not start.

    The schematic is based on the circuit configuration no. 6 example for QFN40 (Product specification page 594).

  • Please try calling mpsl_clock_hfclk_latency_set() with MPSL_CLOCK_HF_LATENCY_WORST_CASE and see if you encounter the same assert. I'm still confused as to why the SoftDevice and the Zephyr controller are seemingly able to use the HFXO while you are not able to start the crystal 'manually' with my code snippet. Also, when testing this, make sure you don't have any breakpoints set: the CPU must not be halted while the Bluetooth stack is enabled, as this breaks the real-time requirements of the controller and can lead to random asserts.

  • I do not have any breakpoints set, and I am using only printk() for debug.

    mpsl_clock_hfclk_latency_set() seems to run fine. This debug code:

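    int main(void)
    {
        /* to_binary() and hfclkstat are helpers defined elsewhere (not shown) */
        mpsl_clock_hfclk_latency_set(MPSL_CLOCK_HF_LATENCY_WORST_CASE);
        printk("HFCLKSTAT: %s\n", to_binary(*hfclkstat));

        /* TIMER1 keeps its default 1 MHz tick rate, so one count is 1 us */
        NRF_TIMER1->BITMODE = TIMER_BITMODE_BITMODE_32Bit << TIMER_BITMODE_BITMODE_Pos;
        NRF_TIMER1->TASKS_CLEAR = 1;
        NRF_CLOCK->EVENTS_HFCLKSTARTED = 0;
        NRF_TIMER1->TASKS_START = 1;
        NRF_CLOCK->TASKS_HFCLKSTART = 1;

        /* Bounded wait so the test cannot hang if the crystal never starts */
        int counter = 1000000;
        while (NRF_CLOCK->EVENTS_HFCLKSTARTED == 0 && counter-- > 0) {
        }
        /* Original unbounded wait, which never returns on this board: */
        //    while(NRF_CLOCK->EVENTS_HFCLKSTARTED == 0);
        NRF_TIMER1->TASKS_CAPTURE[0] = 1;

        /* Note: this message is also printed when the bounded wait times out */
        printk("HF Clock has started. Startup time: %u uS\n", (unsigned int)NRF_TIMER1->CC[0]);
        return 0;
    }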

    is able to get to the end and prints out:

    *** Booting nRF Connect SDK v2.5.0 ***
    HFCLKSTAT: 00000000000000010000000000000000
    HF Clock has started. Startup time: 250000 uS

    ... so the wait is simply timing out at the counter limit when trying to start the crystal. Without the counter, it gets stuck in the while loop.
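Because the bounded loop exits on either condition, the final printk reports a startup time even when the crystal never started. A minimal sketch of how the two outcomes could be told apart is shown below; wait_for_event() is a hypothetical helper, not an nrfx or Zephyr API, and the poll count is an iteration limit rather than a calibrated time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: poll an event register until it reads non-zero or
 * max_polls iterations have elapsed. Returns true only if the event fired. */
bool wait_for_event(const volatile uint32_t *event, uint32_t max_polls)
{
    assert(event != NULL);

    while (max_polls-- > 0) {
        if (*event != 0) {
            return true;
        }
    }
    return false;
}

/*
 * Possible use on the target (register names as in the debug code above):
 *
 *   if (wait_for_event(&NRF_CLOCK->EVENTS_HFCLKSTARTED, 1000000)) {
 *       NRF_TIMER1->TASKS_CAPTURE[0] = 1;
 *       printk("HFXO started after %u us\n", (unsigned int)NRF_TIMER1->CC[0]);
 *   } else {
 *       printk("HFXO did not start before the poll limit\n");
 *   }
 */
```

Returning a flag keeps the timeout case from being reported as a successful start, which makes logs from a marginal board easier to compare with those from a working one.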



    Today I was lucky enough to get my hands on a board very similar to mine, also with an nRF52833 QDAA and without DCDC. I ran the exact same code on it and everything worked fine, with the crystal starting after 361 us. I therefore believe the issue is related to the physical layout of my board: I have probably made a mistake that lets it barely work under some circumstances but not under others. (I have also tested my board more thoroughly and found that, even though it somewhat works with CONFIG_BT_LL_SW_SPLIT=y, there still seems to be fairly large packet loss.)

    As I am still in high school (and not one focused on EE), I make quite a lot of errors, and I believe my current board needs a slight redesign. It was my first attempt at designing a tiny low-power board on a flex PCB, and it may simply be on the edge of a working state...

    Thank you very much, your time and help were very valuable to me! I will add a response to this thread once I get my new prototype board, hopefully with a better design.
