Nordic NRF SDK v1.6: Issue with CIVETWEB on NRF9160

 Hello

I'm testing release v1.6 of the nRF Connect SDK with our application. On nRF Connect SDK v1.5.1 we got the following setup working:

- nRF9160-DK with Ethernet shield from Phytec attached to it.

- Civetweb (CONFIG_CIVETWEB=y) --> note that this only works with Minimal LIBC

Now that I have updated to nRF Connect SDK v1.6.0, civetweb crashes when running its init function (mg_start). However, this happens only on the board nrf9160dk_nrf9160ns and not in QEMU.

I get the following console output:

[00:00:01.906,188] <inf> net_config: Interface 1 (0x20019880) coming up
[00:00:01.914,367] : Running dhcpv4 client...
[00:00:01.972,351] <err> os: Exception occurred in Secure State
[00:00:01.994,232] <err> os: ***** HARD FAULT *****
[00:00:02.015,014] <err> os:   Fault escalation (see below)
[00:00:02.036,529] <err> os: ***** BUS FAULT *****
[00:00:02.057,220] <err> os:   Precise data bus error
[00:00:02.078,186] <err> os:   BFAR Address: 0x50008158
[00:00:02.099,395] <err> os: r0/a1:  0x00000000  r1/a2:  0x00000000  r2/a3:  0xffffffff
[00:00:02.123,443] <err> os: r3/a4:  0x0000f3f8 r12/ip:  0xee6b2800 r14/lr:  0x0001a60b
[00:00:02.147,552] <err> os:  xpsr:  0x61040000
[00:00:02.168,029] <err> os: Faulting instruction address (r15/pc): 0x0001a614
[00:00:02.191,284] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:02.214,447] <err> os: Current thread: 0x2001a600 (main)
[00:00:02.256,256] <err> os: Halting system

I attached the debugger and figured out that the crash happens in the following function:

Could you help me to narrow down the issue? How can I use the exception information to figure out more about the issue?

I already tried increasing the stack size of civetweb, which didn't help. I also tried increasing the stack size of the thread that calls mg_start. Unfortunately, in my experience the maintainer of civetweb is not very active in the Zephyr community, so I don't expect much luck posting the issue on the Zephyr GitHub.

Best regards,

Michael

  • Hi,

    The thread that is failing looks to be your main thread. Did you try adjusting CONFIG_MAIN_STACK_SIZE?

    "Could you help me to narrow down the issue? How can I use the exception information to figure out more about the issue?"

    You can use arm-none-eabi-addr2line -e build/zephyr/zephyr.elf 0xFAULTING_ADDR to see if it gives a hint on where in your source the fault occurred.

    The PC and LR values both point into flash, so you can look up those addresses, for example:

    arm-none-eabi-addr2line -e build/zephyr/zephyr.elf 0x1a60b

     

    Kind regards,

    Håkon
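
A note on the addresses in the fault dump: on Cortex-M, a return address held in LR normally has bit 0 set to mark Thumb state, which is why LR (0x0001a60b) is odd while the stacked PC (0x0001a614) is even. If a symbol lookup on the raw LR value gives nothing useful, clearing that bit first is worth a try. A minimal sketch; the helper name thumb_code_address is made up for illustration and is not part of Zephyr or the SDK:

```c
#include <stdint.h>

/* Hypothetical helper: strip the Thumb state bit (bit 0) from a register
 * value so that it refers to the actual instruction address. */
uint32_t thumb_code_address(uint32_t reg_value)
{
    return reg_value & ~UINT32_C(1);
}
```

For the dump above, thumb_code_address(0x0001a60b) gives 0x0001a60a, while the stacked PC 0x0001a614 comes back unchanged.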

  • Hi Håkon

    Thanks for the help. Increasing the main stack size didn't help; I went up to 16 kB. Resolving the address pointed me to pthread_key.c:96:

    /opt/toolchains/gcc-arm-none-eabi-10-2020-q4-major/bin/arm-none-eabi-addr2line -e build/zephyr/zephyr.elf 0x0001a684
    /opt/zephyrproject/zephyr/lib/posix/pthread_key.c:96
    

    This is in line with what the debugger was showing me.

    Do you have any idea why the error happens on the target but not in QEMU? It would also help if we could figure out which change between v1.5.1 and v1.6.0 introduced the error.

    Best regards,

    Michael

  • Hi Michael,

     

    My apologies, this was also shown in your initial picture, which I unfortunately overlooked. This line seems to be causing issues for you:

    https://github.com/nrfconnect/sdk-zephyr/blob/master/lib/posix/pthread_key.c#L96

     

    In the debugger's stack frame view, you should be able to right-click and select this specific frame, which lets you inspect the variables at the time of the fault. The value of thread_spec_data (and its .key member) is of particular interest here.

     

    Kind regards,

    Håkon

  • Hi Håkon

    To me it looks like one entry in thread->key_list is invalid. This is the state when the debugger hits pthread_setspecific for the first time:

    These are the individual iterations of the for loop:

    I guess 0x6f727245 is an invalid address. Is that correct? Could this issue have anything to do with THREAD_LOCAL_STORAGE support? I noticed that it is disabled on my target because the toolchain does not support it.

    Best regards,

    Michael 
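
One quick check for a suspicious pointer such as 0x6f727245 is to read its bytes as text. On a little-endian core like the Cortex-M33 the bytes lie in memory lowest byte first, and here all four happen to be printable ASCII. That suggests the slot holds string data rather than a pointer, though it is only a hint and not proof. A small sketch; the helper name word_to_ascii is invented for this example:

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper: render a 32-bit word as the 4 characters it would
 * occupy in little-endian memory. Non-printable bytes are shown as '.'.
 * 'out' must have room for 5 chars (4 characters plus terminator). */
void word_to_ascii(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char byte = (unsigned char)((word >> (8 * i)) & 0xffu);
        out[i] = isprint(byte) ? (char)byte : '.';
    }
    out[4] = '\0';
}
```

For 0x6f727245 this yields the string "Erro".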

  • Hi Håkon

    I'm one step further. I guess that the thread calling mg_start (the one that starts the civetweb web server) needs to be a POSIX thread. It looks like civetweb is expecting this because it uses

    struct posix_thread *thread = (struct posix_thread *)pthread_self();

    I guess then it's more or less a coincidence that it works in QEMU and worked on SDK v1.5.1. I'm still working on getting it running, and I will keep you posted.

    Best regards,

    Michael
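
If the guess above is right, one way to test it is to move the server start-up into a thread created with pthread_create, so that pthread_self() returns a genuine POSIX thread. The following is a sketch only: webserver_entry and start_webserver_on_pthread are made-up names, and the body merely marks the place where mg_start would be called. It passes NULL attributes, which works with a desktop pthread implementation; Zephyr's POSIX layer may instead require an attribute object with an explicitly assigned stack (pthread_attr_setstack), so check the documentation for your Zephyr version.

```c
#include <pthread.h>
#include <stddef.h>

/* Placeholder for the real server start-up. In the application this is
 * where mg_start() would be called; here it only records that it ran. */
static void *webserver_entry(void *arg)
{
    int *started = arg;

    *started = 1;
    return NULL;
}

/* Run webserver_entry() on a newly created POSIX thread and wait for it.
 * Returns 0 on success, otherwise the pthread error code. */
int start_webserver_on_pthread(int *started)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, webserver_entry, started);

    if (err != 0) {
        return err;
    }
    return pthread_join(tid, NULL);
}
```

In a real application the thread would typically keep running for the lifetime of the server rather than being joined immediately.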

  • Hi Michael,

     

    The line numbers in civetweb.c indicate that you're on a newer version than the one tagged for ncs v1.6.0. Have you manually updated this?

    If you revert to the git hash that was used in ncs v1.6.0 (which is the same as the git hash in ncs v1.5.1), do you experience the same issues?

     

    Kind regards,

    Håkon
