Need Help: Combining TinyML Inference with BLE DFU on nRF7002DK – Crashing at Runtime

Hi everyone,

I’m working on a project involving the nRF7002DK, where I’m trying to combine two separately working applications:

  1. A TinyML CIFAR-10 image classification project using TensorFlow Lite Micro
  2. A BLE DFU-capable peripheral application (6153.peripheral_lbs_ble_fota.zip) that allows firmware updates over Bluetooth via the nRF Connect mobile app

I want to enable on-device ML inference while also supporting Bluetooth firmware/model updates via DFU.

Issue:

Individually, both applications work perfectly. However, once I integrate them, the merged application builds and flashes successfully but crashes at runtime during the model setup phase. The crash is caused by a usage fault related to unaligned memory access, which occurs during MicroInterpreter initialization and tensor_arena usage.

What I’ve Tried:

  • Replaced dynamic memory with a statically allocated tensor_arena using alignas(16)
  • Tried placing tensor_arena in .noinit memory section
  • Increased stack sizes in prj.conf
  • Verified all include paths and CMake configurations
  • Moved TinyML logic to src and called setup() and loop() after BLE initialization
  • Attempted to reduce tensor arena size and print memory layout

Despite all this, the crash persists. BLE functionality remains intact, but the inference setup always causes a Zephyr fatal error and system reset.
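
For reference, the static-arena attempt from the list above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the arena size and the helper names is_aligned and buffers_aligned are placeholders. It also checks the model buffer, on the assumption (not confirmed anywhere in this thread) that the alignment of the model array could matter as much as that of the arena.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Placeholder size, not the value used in the original project.
constexpr std::size_t kTensorArenaSize = 64 * 1024;

// Statically allocated, 16-byte-aligned arena, as described in the list above.
alignas(16) static std::uint8_t tensor_arena[kTensorArenaSize];

// Returns true when ptr is a multiple of `alignment` (must be a power of two).
inline bool is_aligned(const void *ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical helper: checks both buffers that are handed to
// TensorFlow Lite Micro, the model data and the tensor arena.
inline bool buffers_aligned(const void *model_data, const void *arena) {
    return is_aligned(model_data, 16) && is_aligned(arena, 16);
}
```

On target, buffers_aligned(model_array, tensor_arena) could be printed just before the MicroInterpreter is constructed, where model_array stands for whatever symbol holds the model data in the project.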

Request: If anyone has experience combining ML inference and BLE DFU on Nordic devices, or specifically on the nRF7002DK, I would really appreciate any guidance or working examples. Potential issues I’m considering include:

  • Flash/RAM layout conflicts
  • SPI/QSPI usage overlap
  • Memory section misconfigurations
  • Conflicts with TF-M or secure/non-secure regions

This integration is an important part of my university research project. I’ve included links to both the BLE DFU sample and my TinyML repo:

I’ve also attached a screenshot of the crash output in the serial monitor. Any help or suggestions would mean a lot. Happy to share logs or a zipped version of the project if needed.

Thanks in advance!


  • Hi,

    Potential issues I’m considering include:

    • Flash/RAM layout conflicts
    • SPI/QSPI usage overlap
    • Memory section misconfigurations
    • Conflicts with TF-M or secure/non-secure regions

    I think your ideas on what could be the issues sound reasonable. Let me suggest some things for each:

    Conflicts with TF-M or secure/non-secure regions

    Can you try to just not use TF-M?

    Flash/RAM layout conflicts

    MCUboot and the application do not share RAM. TF-M is another story, but see above.

    Also, perhaps try changing the size of MCUboot using CONFIG_PM_PARTITION_SIZE_MCUBOOT to see if that somehow aligns things by chance.

    Happy to share logs or a zipped version of the project if needed.

    The image is kinda low quality. Can you insert the logs as a .txt file?

    Regards,
    Sigurd Hellesvik

  • Hi Sigurd,

    Thanks a lot for getting back to me.

    Based on what you suggested, here’s what I’m planning to do and a few things I wanted to check with you:

    For TF-M, I’ll try building the project without it, so everything just runs in the non-secure domain. Hopefully that will help with the crash during model setup. I’ll test it and let you know if the unaligned memory issue still shows up.

    About the flash and RAM layout, thanks for explaining that MCUboot and the app don’t share RAM. That makes sense now. I’ll also try tweaking the MCUboot partition size using CONFIG_PM_PARTITION_SIZE_MCUBOOT like you said, and see if adjusting things helps avoid any memory conflicts.

    Regarding the logs, yeah, I realize the screenshot wasn’t super clear. I uploaded it as a .txt file so it’s easier to go through.

    Also, if it helps, I’ve included the links above where you can download both the BLE DFU sample and my TinyML CIFAR-10 project. It would be great if you could try testing them on your side as well. One suggestion: before building the TinyML CIFAR-10 application, please run the command "west config manifest.project-filter -- +tflite-micro" in the nRF Terminal to make sure the TensorFlow Lite Micro modules are properly pulled in. I found this step necessary to avoid module issues while building.

    Thanks again for helping me out with this, I’ll try the changes you mentioned and post back with updates soon.

    Best,
    Gowtham Raj

    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    I: Starting bootloader
    I: Image index: 0, Swap type: none
    I: Image index: 1, Swap type: none
    I: Bootloader chainload address offset: 0xc000
    I: Jumping to the first image slot
    [00:00:00.002,929] <inf> spi_nor: mx25r6435f@0: 8 MiBy flash
    *** Booting My Application v1.2.3 - unknown commit ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***
    *** Using Zephyr OS v3.7.99-1f8f3dc29142 ***
    Starting Bluetooth Peripheral LBS with TinyML example
    [00:00:00.010,803] <inf> fs_nvs: 2 Sectors of 4096 bytes
    [00:00:00.010,803] <inf> fs_nvs: alloc wra:
    [00:00:00.010,833] <inf> fs_nvs: data wra: 0, 1c
    [00:00:00.031,585] <inf> bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
    [00:00:00.031,616] <inf> bt_hci_core: HW Variant: nRF53x (0x0003)
    [00:00:00.031,646] <inf> bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 45.41337 Build 3074452168
    [00:00:00.066,284] <inf> bt_hci_core: No ID address. App must call settings_load()
    Bluetooth initialized
    [00:00:00.067,687] <inf> bt_hci_core: Identity: F1:8C:5E:8D:93:D2 (random)
    [00:00:00.067,718] <inf> bt_hci_core: HCI: version 6.0 (0x0e) revision 0x206b, manufacturer 0x0059
    [00:00:00.067,749] <inf> bt_hci_core: LMP: version 6.0 (0x0e) subver 0x206b
    Advertising successfully started
    Firmware version 2.0 - TinyML integrated
    Initializing TinyML model...
    [00:00:00.073,760] <err> os: ***** USAGE FAULT *****
    [00:00:00.073,760] <err> os:   Unaligned memory access
    [00:00:00.073,791] <err> os: r0/a1:  0x2002538c  r1/a2:  0x20023c68  r2/a3:  0x200176fc
    [00:00:00.073,791] <err> os: r3/a4:  0x20012d19 r12/ip:  0x00000018 r14/lr:  0x00025abd
    [00:00:00.073,791] <err> os:  xpsr:  0x29000000
    [00:00:00.073,822] <err> os: Faulting instruction address (r15/pc): 0x0003ebae
    [00:00:00.073,852] <err> os: >>> ZEPHYR FATAL ERROR 31: Unknown error on CPU 0
    [00:00:00.073,883] <err> os: Current thread: 0x200251c0 (unknown)
    [00:00:00.204,040] <err> fatal_error: Resetting system
    *** Booting MCUboot v2.1.0-dev-12e5ee106034 ***
    *** Using nRF Connect SDK v2.9.0-7787b2649840 ***

  • Gowtham Raj said:
    For TF-M, I’ll try building the project without it, so everything just runs in the non-secure domain. Hopefully that will help with the crash during model setup. I’ll test it and let you know if the unaligned memory issue still shows up.

    Without TF-M, everything will run in the Secure domain! This does not make it more secure, since it is the separation that provides the security. See Security by separation if you want to learn more.

    Gowtham Raj said:
    It would be great if you could try testing them on your side as well.

    Perhaps later on. First I will try and help you solve this yourself.

    From the logs:
    The fault message reports the current thread as (unknown). If you enable CONFIG_THREAD_NAME, it will instead report the name of the thread where the fault occurs. This can sometimes be useful.
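
The register dump in the same fault message can also be checked for alignment. The sketch below takes r0 to r3 exactly as printed in the log and reports the first value that is not a multiple of four. The struct and function names are made up for illustration. An odd register value does not by itself prove that this register held the address used by the faulting instruction; resolving the pc value 0x0003ebae with addr2line against the application's ELF file would be needed to confirm that. lr is left out on purpose, since on Cortex-M its lowest bit is normally set.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// One register as printed in the Zephyr fault dump (illustrative type).
struct LoggedRegister {
    const char *name;
    std::uint32_t value;
};

// Values copied from the usage fault in the posted log.
constexpr LoggedRegister kFaultRegisters[] = {
    {"r0", 0x2002538c},
    {"r1", 0x20023c68},
    {"r2", 0x200176fc},
    {"r3", 0x20012d19},
};

// True when the value is a multiple of 4 (word aligned).
inline bool is_word_aligned(std::uint32_t value) {
    return (value & 0x3u) == 0;
}

// Returns the name of the first logged register that is not word aligned,
// or nullptr when all of them are aligned.
inline const char *first_unaligned_register() {
    for (const LoggedRegister &reg : kFaultRegisters) {
        if (!is_word_aligned(reg.value)) {
            return reg.name;
        }
    }
    return nullptr;
}
```

With the values from this log, first_unaligned_register() returns "r3" (0x20012d19), which is only a starting point for the investigation.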

    Also, perhaps try to debug or add more prints after the "Initializing TinyML model...", so we can see exactly which function is called when the fault happens.
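
One way to add those prints is a small checkpoint helper called before each step of the model setup. This is a minimal sketch, assuming printf is routed to the serial console (on Zephyr, printk could be swapped in). checkpoint and last_checkpoint are hypothetical names, and the step labels in the trailing comment are only examples of what a TensorFlow Lite Micro setup() typically does.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Label of the most recent step announced before a possible fault.
static const char *g_last_step = "none";

// Prints the step label and remembers it. The explicit flush matters on a
// buffered host console; a fault right after the call should not lose it.
void checkpoint(const char *step) {
    assert(step != nullptr);
    g_last_step = step;
    std::printf("[tinyml] reached: %s\n", step);
    std::fflush(stdout);
}

// Returns the label passed to the most recent checkpoint() call.
const char *last_checkpoint() {
    return g_last_step;
}

// Example placement inside setup(); the calls are shown as comments because
// they depend on the project's own model and op resolver:
//   checkpoint("GetModel");          // before tflite::GetModel(...)
//   checkpoint("MicroInterpreter");  // before constructing the interpreter
//   checkpoint("AllocateTensors");   // before interpreter->AllocateTensors()
```

The last label that appears in the serial output before the usage fault then narrows down which call triggers it.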

    Oh and one more idea:
    You say that the sample crashes when you add BLE DFU. => Is it the BLE or the DFU?
    Try to add only MCUboot to the project and not BLE. Does that cause the fault?
    Try to add only BLE to the project and not DFU. Does that cause the fault?
