I modified the central UART example to work with a demo I'm writing. I removed the hardware UART part and only send messages to a peripheral. I measure different values from sensors over I2C, then send the values over BT UART. The issue is that I send 4 strings in a row and then go back to measuring, but the example always faults and restarts after it sends, with the following output:
[00:00:11.341,430] <err> mpsl_init: MPSL ASSERT: 112, 2152
00> [00:00:11.341,888] <err> os: ***** HARD FAULT *****
00> [00:00:11.342,285] <err> os: Fault escalation (see below)
00> [00:00:11.342,742] <err> os: ARCH_EXCEPT with reason 3
00> [00:00:11.343,200] <err> os: r0/a1: 0x00000003 r1/a2: 0x20005388 r2/a3: 0x0003c878
00> [00:00:11.343,841] <err> os: r3/a4: 0x0003527d r12/ip: 0x20014b14 r14/lr: 0x000168a1
00> [00:00:11.344,482] <err> os: xpsr: 0x41000018
00> [00:00:11.344,909] <err> os: Faulting instruction address (r15/pc): 0x0002942c
00> [00:00:11.345,458] <err> os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0
00> [00:00:11.346,008] <err> os: Fault during interrupt handling
I read in an issue raised by someone else that this error is related to HFCLK, and I tried the fix posted by NRF support,
adding this :
before calling
and my output was:
00>
00> HF Clock has started. Startup time: 344 uS
I read that a bad startup time is 1.5 ms or more, so this seems good?
I also get this error sometimes:
00> [00:05:50.469,024] <err> os: ***** BUS FAULT *****
00> [00:05:50.469,421] <err> os: Precise data bus error
00> [00:05:50.469,848] <err> os: BFAR Address: 0x302d3b3e
00> [00:05:50.470,336] <err> os: r0/a1: 0x302d3b36 r1/a2: 0x20005388 r2/a3: 0x00000000
00> [00:05:50.471,008] <err> os: r3/a4: 0x2000530c r12/ip: 0x0003d579 r14/lr: 0x0001382d
00> [00:05:50.471,649] <err> os: xpsr: 0x81000000
00> [00:05:50.472,106] <err> os: Faulting instruction address (r15/pc): 0x00013a2a
00> [00:05:50.472,656] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
00> [00:05:50.473,205] <err> os: Current thread: 0x200020a0 (main)
00> [00:05:50.478,698] <err> fatal_error: Resetting system
00> *** Booting Zephyr OS build v3.0.99-ncs1 ***
I tried NCS 2.0.0 and 2.1.0 and the issue occurs in both. I also read other issues reporting the same errors.
I tried to step through with the debugger and breakpoints after sending, and every step over or step into crashes too.
Here are my changes from the central UART example:
Instead of main, I moved it to this function:
BLE connects fine, so I doubt that there is an issue here. All the callback functions are unchanged from the example except for the following:
I am searching by name instead of by service.
All the HW UART code is removed, and when I receive data over BLE I just ignore it, since I only want to send.
My send message is the following:
It is the same as in the example, but instead of sending from the UART I call this with a char * payload.
I manage to send data correctly and I receive it correctly on the other side. Sending multiple values in a row is fine; it only breaks after sending. The first thing called after sending all the strings is a delay statement (k_busy_wait() or k_msleep()), and the device crashes afterwards.
I would really appreciate your help because I am completely lost.
Thank you in advance.
Edit: I managed to get more details after moving to Windows in the hope of solving it. I was using macOS before.
I ran the debugger and set the breakpoint at fatal errors. The result is the following:
Thread 7 hit Breakpoint 2, k_sys_fatal_error_handler (reason=0, esf=0x20021544 <z_interrupt_stacks+32708>) at C:/ncs/v2.0.0/nrf/lib/fatal_error/fatal_error.c:23
Call stack:
The [00:00:11.341,430] <err> mpsl_init: MPSL ASSERT: 112, 2152 errors stopped coming up after moving to Windows, but now I only get this error:
00> [00:01:20.850,524] <err> os: ***** BUS FAULT *****
00> [00:01:20.850,555] <err> os: Precise data bus error
00> [00:01:20.850,555] <err> os: BFAR Address: 0x322d3b3d
00> [00:01:20.850,585] <err> os: r0/a1: 0x322d3b35 r1/a2: 0x00000000 r2/a3: 0x00000000
00> [00:01:20.850,616] <err> os: r3/a4: 0x00000000 r12/ip: 0x00000000 r14/lr: 0x00014639
00> [00:01:20.850,616] <err> os: xpsr: 0x41000000
00> [00:01:20.850,646] <err> os: Faulting instruction address (r15/pc): 0x00014836
00> [00:01:20.850,646] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
00> [00:01:20.850,677] <err> os: Current thread: 0x200025a8 (main)
00> [00:01:21.098,541] <err> fatal_error: Resetting system
I also want to clarify how the error happens. I have a loop that runs 5 times, and each iteration calls bt_central_send. What happens in the loop is:
It uses k_malloc to allocate memory for a char buffer, then uses strcpy and strcat from string.h to form the string. It then calls the send function with the string, and after the function is done it calls k_free on the buffer. This repeats until the 5 strings are sent. The receiver gets all the strings and they are all correct. The send function returns to the main thread, and the next statement in the main thread is a k_msleep. I hope this info is enough to help trace the error.
Adding also: sorry, bear with me, I am new to this.
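For reference, the allocate / build / send / free pattern described above can be written so that the string can never outgrow its buffer. This is only a minimal sketch in plain C, not my actual code: build_message and the "label:value" format are hypothetical, and malloc/free stand in for k_malloc/k_free. The idea is to let snprintf report the required length first, instead of guessing a size for strcpy and strcat.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "<label>:<value>" in a heap buffer that is
 * sized from the formatted length, so the write cannot overflow.
 * Returns NULL on allocation or formatting failure; caller frees the result.
 * On Zephyr, malloc/free would be replaced by k_malloc/k_free (assumption). */
char *build_message(const char *label, const char *value)
{
    assert(label != NULL && value != NULL);

    /* With a zero size, snprintf writes nothing and only returns the number
     * of characters the full string needs (excluding the terminating NUL). */
    int needed = snprintf(NULL, 0, "%s:%s", label, value);
    if (needed < 0) {
        return NULL;
    }

    size_t size = (size_t)needed + 1; /* +1 for the terminating NUL */
    char *buf = malloc(size);
    if (buf == NULL) {
        return NULL;
    }

    /* Second pass writes at most `size` bytes, including the NUL. */
    snprintf(buf, size, "%s:%s", label, value);
    return buf;
}
```

With this shape, the caller would pass the returned string to the send function and free it afterwards, and a NULL return is the only failure case to handle.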
I back-traced the error by doing this in the debug terminal:
-exec set $exc_frame = ($lr & 0x4) ? $psp : $msp
-exec set $stacked_xpsr = ((uint32_t *)$exc_frame)[7]
-exec set $exc_frame_len = 32 + (($stacked_xpsr & (1 << 9)) ? 0x4 : 0x0) + (($lr & 0x10) ? 0 : 72)
-exec set $sp =($exc_frame + $exc_frame_len)
-exec set $lr =((uint32_t *)$exc_frame)[5]
-exec set $pc =((uint32_t *)$exc_frame)[6]
-exec backtrace
The output was the following:
#0 z_impl_sensor_sample_fetch (dev=0xfffffeb0) at C:/ncs/v2.0.0/zephyr/include/zephyr/drivers/sensor.h:510
#1 sensor_sample_fetch (dev=0xfffffeb0) at zephyr/include/generated/syscalls/sensor.h:85
#2 pressure_sensor_read () at ../src/sensor_app.c:130
#3 0x0001475c in main () at ../src/main.c:64
which is right after the delay I mentioned before, so it goes like this:
Read pressure & temp sensor ->
Read Humidity and temp sensor ->
Read accel sensor ->
Call function to check values and send ->
BT sends the 5 strings correctly ->
k_msleep(for some time) ->
Read pressure & temp sensor -> (Only breaks if BT Send was used before, this is main():64)
It goes to this:
which also goes to:
This is in the Zephyr sensor.h driver header; I did not change it.
Finally I reached:
which is confirmed by looking at the error output:
00> [00:02:31.014,923] <err> os: Faulting instruction address (r15/pc): 0x0001495a //This instruction address is persistent now
If I look at the zephyr.lst file I see:
00014954 <pressure_sensor_read>:
int pressure_sensor_read(){
14954: b538 push {r3, r4, r5, lr}
int rc = sensor_sample_fetch(pressure_sensor);
14956: 4b13 ldr r3, [pc, #76] ; (149a4 <pressure_sensor_read+0x50>)
14958: 6818 ldr r0, [r3, #0]
const struct sensor_driver_api *api =
1495a: 6883 ldr r3, [r0, #8]
return api->sample_fetch(dev, SENSOR_CHAN_ALL);
The sensor driver works fine as long as I don't send any BT messages. The same section of code is called over and over without issues; I can do thousands of reads, print to the terminal, and do all sorts of other things, but if I use the BLE central UART and send, this happens.
The sensor driver I'm using is
which is shipped with the framework without any edits from me.
I would really appreciate any help.
Edit:
I managed to fix it!!
What I was doing was calling device_get_binding("device name") once at the start of the application,
so I had a sensor init function that included this:
Without BLE, calling this once was fine, but once I started using BLE, for some reason I had to call it every time before I called
which is weird, at least to me.
My code works fine now, but I am still wondering why the original code didn't.
I hope I get some explanation of what is going on.
I found it and managed to return my code to normal, and also managed to feel stupid afterward.
I had a buffer that stored the strings before writing them to BLE, and I was exceeding the buffer size before sending. It had nothing to do with BLE: a single bad index value was writing to the wrong location, which did not trigger a fault right away. I figured this out using the debug console. Maybe what I am missing is that I need to take a break.
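For anyone who lands here with a similar fault: in hindsight, the r0 values in the two bus fault dumps above (0x302d3b36 and 0x322d3b35) are made of printable ASCII bytes, and in both dumps BFAR is exactly r0 + 8. That would fit string data having been written over a pointer that was later dereferenced, although I have not verified that this is exactly what happened. Below is a minimal sketch in plain C of that check; word_to_ascii is a hypothetical helper name, not part of any SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: renders the four bytes of a 32-bit register value in
 * little-endian order (the order they sit in memory on Cortex-M), replacing
 * non-printable bytes with '.'. `out` must have room for 5 bytes. */
void word_to_ascii(uint32_t value, char out[5])
{
    assert(out != NULL);

    for (int i = 0; i < 4; i++) {
        /* Byte 0 is the least significant byte, i.e. the lowest address. */
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xFFu);
        out[i] = (byte >= 0x20 && byte <= 0x7E) ? (char)byte : '.';
    }
    out[4] = '\0';
}
```

Passing 0x302d3b36 gives "6;-0" and 0x322d3b35 gives "5;-2", which look like fragments of text rather than addresses. A pointer value that decodes to readable characters is a hint to look for an overflowing string or array write near that pointer.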
Thanks NRF for existing I guess? sorry to bother you.