Bluetooth Mesh MPU fault with Logging V2

Hi,

Currently, I am building an application with Bluetooth mesh and I tried to use Zephyr's logging v2 modules. However, when I build it I get MPU-fault errors at boot. For verification, I tried it also with a BT mesh sample project. For the test, I used the light fixture example and added the following configs. Each config gave the same result.

CONFIG_LOG_MODE_MINIMAL=n
CONFIG_LOG2_MODE_DEFERRED=y
CONFIG_LOG_BACKEND_UART=y
CONFIG_LOG_MODE_MINIMAL=n
CONFIG_LOG2_MODE_IMMEDIATE=y
CONFIG_LOG_BACKEND_UART=y

I also tried this with RTT backend instead of UART, but also same result.

CONFIG_LOG_BACKEND_RTT=n
CONFIG_LOG_BACKEND_UART=y

Log ouput:

▒*** Booting Zephyr OS build v2.7.99-ncs1-1  ***system
Initializing...
[00:00:00.012,298] <err> os: ***** HARD FAULT *****
[00:00:00.012,329] <err> os:   Fault escalation (see below)
[00:00:00.012,329] <err> os: ***** MPU FAULT *****
[00:00:00.012,329] <err> os:   Data Access Violation
[00:00:00.012,359] <err> os:   MMFAR Address: 0x20006eac
[00:00:00.012,359] <err> os: r0/a1:  0x00000008  r1/a2:  0x00000195  r2/a3:  0x00000000
[00:00:00.012,390] <err> os: r3/a4:  0x00000004 r12/ip:  0x0000019d r14/lr:  0x00000000
[00:00:00.012,390] <err> os:  xpsr:  0x61000000
[00:00:00.012,390] <err> os: Faulting instruction address (r15/pc): 0x0000128c
[00:00:00.012,451] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:00.012,481] <err> os: Current thread: 0x20003708 (main)

Using log minimal or logging v1 does work. Does anyone know how I can get this to work with logging v2?

Setup:

  • nRF52833DK
  • nRF Connect SDK v1.9.1
  • Example: Bluetooth mesh Light fixture

Parents
  • Hi, small update. I found this Q&A on Nordic devzone which had a similar issue. I increased the main stack size `CONFIG_MAIN_STACK_SIZE` and flashed it to the DK. In the Bluetooth mesh light fixture example, this solved it for both log modes. For my own application, this method solved the issue for deferred logging mode. But for log mode immediate, I got a slightly different error. I increased the main stack size up to 8192.

    *** Booting Zephyr OS build v2.7.99-ncs1-1  ***
    [00:00:00.010,681] <inf> App: Initializing...
    [00:00:00.022,430] <inf> fs_nvs: 6 Sectors of 4096 bytes
    [00:00:00.028,503] <inf> fs_nvs: alloc wra: 2, c40
    [00:00:00.034,027] <inf> fs_nvs: data wra: 2, 1d4
    [00:00:00.104,309] <inf> sdc_hci_driver: SoftDevice Controller build revision:
                                             0e e7 c5 66 67 18 3c ac  b3 d2 cc 81 a3 dc f1 c0 |...fg.<. ........
                                             c0 36 02 22                                      |.6."
    [00:00:00.137,664] <inf> bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
    [00:00:00.160,888] <inf> bt_hci_core: HW Variant: nRF52x (0x0002)
    [00:00:00.182,739] <inf> bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 14.50663 Build 1008
    [00:00:00.209,838] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:00:00.233,184] <inf> App: Bluetooth initialized
    [00:00:00.274,322] <inf> bt_hci_core: Identity: C9:C6:04:AF:B8:52 (random)
    [00:00:00.296,997] <inf> bt_hci_core: HCI: version 5.2 (0x0b) revision 0x12fe, manufacturer 0x0059
    [00:00:00.321,838] <inf> bt_hci_core: LMP: version 5.2 (0x0b) subver 0x12fe
    [00:00:00.344,848] <inf> bt_mesh_prov_device: Device UUID: 00000000-0000-0080-f744-ea62d0b23649
    [00:00:00.370,727] <inf> App: Mesh initialized
    [00:00:00.391,113] <inf> App: Boot count: 786
    [00:00:00.455,902] <err> os: ***** MPU FAULT *****
    [00:00:00.476,623] <err> os:   Stacking error (context area might be not valid)
    [00:00:00.499,938] <err> os:   Data Access Violation
    [00:00:00.520,874] <err> os:   MMFAR Address: 0x20005ad8
    [00:00:00.542,175] <err> os: r0/a1:  0x3a579300  r1/a2:  0xbc1a9c0b  r2/a3:  0xe8bc7378
    [00:00:00.566,345] <err> os: r3/a4:  0x8565b192 r12/ip:  0x58d21214 r14/lr:  0x5fdb17e0
    [00:00:00.590,515] <err> os:  xpsr:  0x21000000
    [00:00:00.611,022] <err> os: Faulting instruction address (r15/pc): 0x0004086c
    [00:00:00.634,307] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
    [00:00:00.657,623] <err> os: Current thread: 0x20002cf0 (BT Mesh adv)
    [00:00:00.700,103] <err> fatal_error: Resetting system
    

    For me, deferred logging v2 is enough. But I'm curious as to why it still doesn't work with log mode immediate v2 in my own application?

  • It seems I made a mistake. Everything seemed to work, but when the device gets provisioned it still causes an MPU fault. I use the nRF mesh app (iOS version) to provision the device with the BT mesh light fixture example. 

    ▒*** Booting Zephyr OS build v2.7.99-ncs1-1  ***
    Initializing...
    [Bluetooth initialized
    Mesh initialized
    Successfully enabled LC server
    00:00:00.012,176] <inf> fs_nvs: 6 Sectors of 4096 bytes
    [00:00:00.012,176] <inf> fs_nvs: alloc wra: 0, f90
    [00:00:00.012,207] <inf> fs_nvs: data wra: 0, 44
    [00:00:00.012,390] <inf> sdc_hci_driver: SoftDevice Controller build revision:
                                             0e e7 c5 66 67 18 3c ac  b3 d2 cc 81 a3 dc f1 c0 |...fg.<. ........
                                             c0 36 02 22                                      |.6."
    [00:00:00.017,517] <inf> bt_hci_core: No ID address. App must call settings_load()
    [00:03:07.352,600] <wrn> bt_l2cap: Ignoring data for unknown channel ID 0x003a
    [00:03:09.456,726] <err> os: ***** MPU FAULT *****
    [00:03:09.456,756] <err> os:   Stacking error (context area might be not valid)
    [00:03:09.456,756] <err> os:   Data Access Violation
    [00:03:09.456,756] <err> os:   MMFAR Address: 0x20005518
    [00:03:09.456,787] <err> os: r0/a1:  0xd8043b82  r1/a2:  0x5caeff8e  r2/a3:  0xef0cac29
    [00:03:09.456,787] <err> os: r3/a4:  0xe4e1b764 r12/ip:  0xad304278 r14/lr:  0x9bb478d9
    [00:03:09.456,787] <err> os:  xpsr:  0x21000000
    [00:03:09.456,817] <err> os: Faulting instruction address (r15/pc): 0x0004d6fe
    [00:03:09.456,848] <err> os: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
    [00:03:09.456,878] <err> os: Current thread: 0x200029c0 (BT RX)
    ▒*** Booting Zephyr OS build v2.7.99-ncs1-1  ***system
    

  • Hi.

    Sorry about the delayed response.

    Did you manage to solve this, or do you still need assistance with this issue?

    Br,
    Joakim

  • Hi Joakim,

    Thank you for your response.


    No, this issue is still not solved. Everything works with deferred logging module v2 until I provision the device. Then it creates an MPU fault.

    Using arm-none-eabi-addr2line I also found the following things:

    • Fault instruction address (r15/pc): "fixed_data_alloc" ".../ncs/v1.9.1/zephyr/subsys/net/buf.c:146"
    • r14/lr: usually returns where the function is called from. But now it returned nothing (??:0)
    • Thread is rx_thread (from ".../ncs/v1.9.1/zephyr/subsys/bluetooth/host/hci_core.c:65")

    Because the error is generated in the Bluetooth Rx thread (and it is a stack overflow in the net buffer), I tried to increase Bluetooth Rx stack size, but this would create an assertion fail loop for z_spin_lock_valid(l)).

    Added settings:

    CONFIG_MAIN_STACK_SIZE=4096
    CONFIG_BT_RX_STACK_SIZE=4096
    CONFIG_LOG_MODE_MINIMAL=n
    CONFIG_LOG2_MODE_DEFERRED=y
    CONFIG_LOG_BACKEND_UART=y

    Logs:

    *** Booting Zephyr OS build v2.7.99-ncs1-1 ***
    Initializing...
    [Bluetooth initialized
    ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/spinlock.h:129
    Recursive spinlock 0x20004cf8
    0[00:00:00.011,779] <inf> fs_nvs: alloc wra: 0, f58
    ASSERTION FAIL [z_spin_lock_valid(l)] @ WEST_TOPDIR/zephyr/include/spinlock.h:129

    Do you know what goes wrong here?

  • Hi.

    Seems it failed for BT Mesh adv thread as well. Could you try increasing BT_MESH_ADV_STACK_SIZE?

    Br,
    Joakim

  • Hi Joakim,

    This solved the issue! Thanks for your support.

    To summarize the issue:

    You have to increase "BT_MESH_ADV_STACK_SIZE" and "BT_RX_STACK_SIZE" and "MAIN_STACK_SIZE".


    With kind regards
    Tim

Reply Children
No Data
Related