performance issue with multi threadn in ncs with nrf52832

hello Nordic

i am working with nrf52832, with zephyr, ncs v1.7.1,

the device has a sensor that sample at 26khz each sample is 3 channels of 2 bytes data. the sensor use spi1 and has high thread priority (cooperative) the samples are then stream via a double buffer to the flash which uses spi2.

when tying to write a page (256 bytes) to the flash at a test in the beginning of the app it take 1ms to write that page but when trying to write within the app, while the other sensor is sampling, it takes 3ms and also i get a BUS FAULT. here is the log:

[00000018] <err> os: ***** BUS FAULT *****
[00000018] <err> os:   Unstacking error
[00000018] <err> os: ***** HARD FAULT *****
[00000018] [1;31m<err> os:   Fault escalation (see below)
[00000018] <err> os: ***** BUS FAULT *****
[00000018] <err> os:   Precise data bus error
[00000018] <err> os:   BFAR Address: 0xfe80fec0
[00000018] <err> os: r0/a1:  0x2000bef0  r1/a2:  0xfe80fec0  r2/a3:  0xfe80ff28
[00000018] <err> os: r3/a4:  0x2000beef r12/ip:  0xfa000000 r14/lr:  0x00016d79
[00000018] <err> os:  xpsr:  0x81000005
[00000018] <err> os: s[ 0]:  0x00000000  s[ 1]:  0x00000000  s[ 2]:  0x00000000  s[ 3]:  0x00000000
[00000018] <err> os: s[ 4]:  0x00000000  s[ 5]:  0x00000000  s[ 6]:  0x00000000  s[ 7]:  0x00000000
[00000018] <err> os: s[ 8]:  0x00000000  s[ 9]:  0x00000000  s[10]:  0x00000000  s[11]:  0x00000000
[00000018] <err> os: s[12]:  0x00000000  s[13]:  0x00000000  s[14]:  0x00000000  s[15]:  0x00000000
[00000018] <err> os: fpscr:  0x20001d78
[00000018] <err> os: Faulting instruction address (r15/pc): 0x0003ef10
[00000018] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00000018] <err> os: Fault during interrupt handling

[00000018] <err> os: Current thread: 0x20003210 (unknown)
[00000018] <err> fatal_error: Resetting system
*** Booting Zephyr OS build v2.6.99-ncs1-1  ***

now i do have in the app another driver that uses spi (at the same time (getting samples and writing samples to flash), but it uses a different instance of spi so i don't see why it shell create such a delay .. here is my complete .dts file:

// Copyright (c) 2022 Nordic Semiconductor ASA
// SPDX-License-Identifier: Apache-2.0

/dts-v1/;
#include <nordic/nrf52832_qfaa.dtsi>

/ {
	model = "halo_ep2";
	compatible = "halo-ep2";

	chosen {
		zephyr,sram = &sram0;
		zephyr,flash = &flash0;
		zephyr,code-partition = &slot0_partition;
		nordic,pm-ext-flash = &augu_flash_is25lp128;
	};

	sensors_on: sensors_on_node {
		compatible = "augury,gpio-power";
		power_gpios = <&gpio0 22 GPIO_ACTIVE_LOW>;
		label = "SENSORS_ON";
		status = "okay";
		#power_resource-cells = <0>;
	};

	flash_on: flash_on_node {
		compatible = "augury,gpio-power";
		power_gpios = <&gpio0 23 (GPIO_PULL_UP | GPIO_ACTIVE_LOW)>;
		label = "FLASH_ON";
		status = "okay";
		#power_resource-cells = <0>;
	};	

    leds {
        compatible = "gpio-leds";
        led0_red: led_0 {
            gpios = <&gpio0 25 GPIO_ACTIVE_HIGH>;
            label = "RED_LED_0";
        };
        led1_orange: led_1 {
            gpios = <&gpio0 26 GPIO_ACTIVE_HIGH>;
            label = "ORANGE_LED_1";
        };
        led2_green: led_2 {
            gpios = <&gpio0 27 GPIO_ACTIVE_HIGH>;
            label = "GREEN_LED_2";
        };							
    };

	aliases {
		led0-red = &led0_red;
		led1-orange = &led1_orange;
		led2-green = &led2_green;

	};
};

	&adc {
		status ="okay";
	};

	&gpiote {
		status ="okay";
	};

	&gpio0 {
		status ="okay";
	};


	&spi1 {
		compatible = "nordic,nrf-spim";
		status = "okay";
		sck-pin = <20>;
		mosi-pin = <14>;    // p0.14
		miso-pin = <16>;    // p0.16
		cs-gpios = <&gpio0 13 GPIO_ACTIVE_LOW>;
		iis3dwb@0 {
			compatible = "st,iis3dwb";
			reg = <0>;
			int1_gpios = <&gpio0 11 (GPIO_ACTIVE_HIGH | GPIO_PULL_DOWN)>;
			int2_gpios = <&gpio0 12 (GPIO_ACTIVE_HIGH | GPIO_PULL_DOWN)>;
			label = "IIS3DWB";
			power_resources = <&sensors_on>;
			spi-max-frequency = <8000000>;
		};
	};

	&spi2 {
		status = "okay";
		sck-pin = <5>; // gpio 0 pin 5
		mosi-pin = <4>; // gpio 0 pin 4
		miso-pin = <2>; // gpio 0 pin 2
		compatible = "nordic,nrf-spi";
		cs-gpios = <&gpio0 3 (GPIO_PULL_UP | GPIO_ACTIVE_LOW)>;
		augu_flash_is25lp128: augu_flash_is25lp128@0 {
			compatible = "issi,augu_flash_is25lp128";
			reg = <0>;
			label = "AUGU_FLASH_IS25LP128";
			power_resources = <&flash_on>;
			jedec_id = [9d 60 18];
			spi-max-frequency = <8000000>; //<133000000>;
			page_size = <256>;
			block_size = <65536>;
			size = <0x8000000>;
		};
	};
	&flash0 {
		partitions {
			compatible = "fixed-partitions";
			#address-cells = <1>;
			#size-cells = <1>;

			mbr_partition: partition@0 {
				label = "mbr";
				reg = <0x00000000 0x00001000>;
			};
			boot_partition: partition@1000 {
				label = "mcuboot";
				reg = <0x1000 0xc000>;
			};
			slot0_partition: partition@d000 {
				label = "image-0";
				reg = <0xd000 0x6c800>;
			};
			slot1_partition: partition@79800 {
				label = "image-1";
				reg = <0x79800 0x800>;
			};
			storage_partition: partition@7a000 {
				label = "storage";
				reg = <0x7a000 0x6000>;
			};
		};
	};

(running the app on another board with nrd52840, with nand flash that writes 2k each time, works well, so i don't know if it is a performance issue or something else.. i will also mention that the RAM take on the nrf52832 with zephyr and all is high:

[567/572] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
           FLASH:      253604 B     419328 B     60.48%
            SRAM:       60282 B        64 KB     91.98%
        IDT_LIST:          0 GB         2 KB      0.00%
[568/572] Generating zephyr/mcuboot_primary_app.hex
[569/572] Generating zephyr/mcuboot_primary.hex
[570/572] Generating ../../zephyr/app_update.bin, ../../zephyr/app_signed.hex, ../../zephyr/app_test_update.hex, ../../zephyr/app_moved_test_update.hex
[571/572] Generating ../../zephyr/dfu_application.zip
[572/572] Generating zephyr/merged.hex

any idea why i get the BUS FAULT ? 

p.s. i don't know if it has any affect but the page data is saved in a heap section previously allocated like so

K_HEAP_DEFINE(my_heap, 3072);   // 256*3*2 = 1536 + safe buffer

int mem_heap_alloc(int size, struct k_mem_block *block)
{
    block->data = NULL;
    block->data = k_heap_alloc(&my_heap, size, K_NO_WAIT);
    if (block->data == NULL)
    {
        return -ENOMEM;
    }

    return 0;
}

void mem_heap_free(struct k_mem_block *block)
{
    k_heap_free(&my_heap, block->data);
}

and the call is for 

mem_heap_alloc(with 2 page size) 
like a double buffer for steaming the 
data collected to flash, one page gets 
fill with samples while the other is written to flash, in theaory 

the other question i will leave for now it is also regarding a try to improve the current spi_nor_write function but anyhow it will not solve the 3ms issue and if it will remain on 1ms per page write as seen in the first test then we will be ok 

we think it is a performance issue (which we don't see when the same code runs on nrf52840 with a nand flash, which also writes 2k each time and not 256 pages)

one of the possible solutions is to change the api of the sensor driver so instead of using many time fetch and get, to get one api for fetching a pointer to a buffer of the samples stored in the driver, another idea is to have the driver use easy dma to put the samples to a buffer in memory with an interrupt after reaching 256 bytes, and then write to flash from this buffer. any idea if there is some support for one of this operations in zephyr/ncs ?,

any ideas as for which is better ?

any ideas of beast way to do preferred one ?

hope to read from you soon

best regards

Ziv

  • hi Håkon

    sorry for the delay..

    in build-folder/zephyr/zephyr.map, and find which thread is running on "0x20003200", and then increase the stack for this specific thread.

    i found this 

    .bss.logging_thread
                    0x0000000020003200       0xd8 zephyr/libzephyr.a(log_core.c.obj)
                    0x0000000020003200                logging_thread

    not sure which LOG, why should log thread fail and how should i deal with it, given i already at 92% of ram usage (so i can not just increase stacks how ever i want i assume

    this are some of the sizes in my .config file: 

    CONFIG_LOG_PROCESS_THREAD_STACK_SIZE=768
    CONFIG_IDLE_STACK_SIZE=320
    CONFIG_ISR_STACK_SIZE=2048
    CONFIG_THREAD_STACK_INFO=y
     
    To backtrack a fault, you can use arm-none-eabi-addr2line with the faulting address as an input:

    i am not sure how/where to use the line you added using the nrf Connect extension hope you can guide me a little bit on that

     

    hope to read you soon

    best regards

    Ziv

  • Hi Ziv,

     

    ziv123 said:

    i am not sure how/where to use the line you added using the nrf Connect extension hope you can guide me a little bit on that

    Unfortunately, the vscode plugin cannot do a reverse lookup for you.

    You'll need to use the command line prompt, or the internal terminal in vscode for this, here's an example on how to quickly open a build catalog from vs code:

     

    Then you run:

    c:\path\to\gnuarmemb\bin\arm-none-eabi-addr2line -e zephyr\zephyr.elf 0xFAULTING_ADDRESSES

     

    However, in this case it seems that your logging thread is running out of memory, so you'll need to adjust the stack size for it, which is done through this config that you posted:

    ziv123 said:
    CONFIG_LOG_PROCESS_THREAD_STACK_SIZE

    Try adjusting it to 1024 (or higher) to see if the fault disappears.

     

    Kind regards,

    Håkon

  • hi Håkon

    Try adjusting it to 1024 (or higher) to see if the fault disappears.

    though increasing it to 1024, seems to help at first look, i still wonder cause we have another board with the same code base that runs on nrf52840 (has larger RAM and FLASH), it has the same value in the CONFIG_LOG_PROCESS_THREAD_STACK_SIZE and we don't get the MPU FAULTS with it so it makes me wonder if that is the solution 

    this is my path for the compiler - /opt/gcc-arm-none-eabi-10.3-2021.10/ and under it there is no file or fir named "arm-none-eabi-addr2line" so again, not sure what is the command

    hope to read you soon

    best regards

    Ziv

  • Hi,

     

    ziv123 said:
    though increasing it to 1024, seems to help at first look, i still wonder cause we have another board with the same code base that runs on nrf52840 (has larger RAM and FLASH), it has the same value in the CONFIG_LOG_PROCESS_THREAD_STACK_SIZE and we don't get the MPU FAULTS with it so it makes me wonder if that is the solution 

    Logging is dynamic, ie. based on what is happening in real-time. A compiled image for 52832 and nrf52840 will differ, and timing on the bluetooth link (if running) also have some randomness thrown into the mix here. 

    ziv123 said:
    this is my path for the compiler - /opt/gcc-arm-none-eabi-10.3-2021.10/ and under it there is no file or fir named "arm-none-eabi-addr2line" so again, not sure what is the command

    I have the exact same version installed, and it's there at my end:

    /opt/gcc-arm-none-eabi-10.3-2021.10/bin/arm-none-eabi-addr2line

     

    If you have gcc (x86/x86-64) installed on your system, you can technically just call "addr2line" instead.

    Kind regards,

    Håkon

  • hi Håkon

    /opt/gcc-arm-none-eabi-10.3-2021.10/bin/arm-none-eabi-addr2line

    any idea whats wrong ?

Related