Running into a SIGTRAP, backtrace shows only main(), apparently happening in libgloss/arm/crt0.S

I'm using SDK v16 on an nRF52832 and occasionally run into a SIGTRAP in my custom application, which mainly forwards BLE traffic to and from UART.

It works fine at first, but after a number of interactions (anywhere from 3 to 30), initiated by a BLE UART client running on Android, it stops responding.

The attached debugger (Black Magic Probe) gives me the following output:

Starting program: /data/src/nrf5x-sdk-vanilla/projects/[..]/s132/armgcc/_build/nrf52832_xxaa.out

Program received signal SIGTRAP, Trace/breakpoint trap.
warning: while parsing target memory map (at line 1): Required element <memory> is missing
0x0002be5c in main ()
(gdb) l
1    ../../../../../../../../../libgloss/arm/crt0.S: No such file or directory.
(gdb) bt
#0  0x0002be5c in main ()
(gdb)

I'd be grateful for any hint or idea. My guess would be arbitrary memory corruption. However, I'm puzzled by the SIGTRAP (rather than SIGSEGV), by the location in libgloss/arm/crt0.S (no user code), and by the fact that it consistently ends up in this very state.

  • Compiling with -DDEBUG, -g3 and -O0 reveals some more:

    Program received signal SIGTRAP, Trace/breakpoint trap.
    warning: while parsing target memory map (at line 1): Required element <memory> is missing
    0x0002ce36 in app_error_fault_handler (id=16385, pc=225711, info=536936400)
        at ../../../../../../components/libraries/util/app_error_weak.c:100
    100	    NRF_BREAKPOINT_COND;
    (gdb) bt
    #0  0x0002ce36 in app_error_fault_handler (id=16385, pc=225711, info=536936400)
        at ../../../../../../components/libraries/util/app_error_weak.c:100
    #1  0x0002ccc4 in app_error_handler (error_code=16385, line_num=225711, 
        p_file_name=0x4001 "\211\240\201hh\200\211\340\201\233\346\020&O\360#\b")
        at ../../../../../../components/libraries/util/app_error_handler_gcc.c:49
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)
    (gdb)
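
The backtrace above halts at NRF_BREAKPOINT_COND inside app_error_fault_handler, and GDB reports a possibly corrupt stack right after it. One way to make the details of the failing check easier to inspect is to copy them into a global before halting. The following is a minimal sketch of that idea, not SDK code: `fault_snapshot_t`, `g_last_fault` and `fault_snapshot_capture` are hypothetical names, and the `error_info_t` stand-in assumes the field layout declared in the SDK's app_error.h (line number, file name pointer, error code).

```c
#include <stddef.h>
#include <stdint.h>

/* Value seen as `id` in the trace; info is only an error_info_t for this ID. */
#define FAULT_ID_SDK_ERROR 0x4001u

/* Stand-in for the SDK's error_info_t (assumed layout, see lead-in). */
typedef struct {
    uint32_t       line_num;
    uint8_t const *p_file_name;
    uint32_t       err_code;
} error_info_t;

/* Hypothetical snapshot type; not part of the SDK. */
typedef struct {
    uint32_t id;
    uint32_t pc;
    uint32_t line_num;
    uint32_t err_code;
    char     file_name[48];
} fault_snapshot_t;

/* Global rather than stack-allocated, so it stays readable if the stack is suspect. */
fault_snapshot_t g_last_fault;

/* Copy the fault details out of the error_info_t that `info` points to. */
void fault_snapshot_capture(uint32_t id, uint32_t pc, uintptr_t info)
{
    const error_info_t *p_info = (const error_info_t *)info;
    size_t i = 0;

    g_last_fault.id       = id;
    g_last_fault.pc       = pc;
    g_last_fault.line_num = 0;
    g_last_fault.err_code = 0;

    /* Only dereference info when the fault ID says it is an error_info_t. */
    if (id == FAULT_ID_SDK_ERROR && p_info != NULL) {
        g_last_fault.line_num = p_info->line_num;
        g_last_fault.err_code = p_info->err_code;
        if (p_info->p_file_name != NULL) {
            /* Bounded copy, leaving room for the terminator. */
            while (i + 1 < sizeof g_last_fault.file_name &&
                   p_info->p_file_name[i] != '\0') {
                g_last_fault.file_name[i] = (char)p_info->p_file_name[i];
                i++;
            }
        }
    }
    g_last_fault.file_name[i] = '\0';
}
```

On the target this would be called from an application-provided app_error_fault_handler (the SDK's default in app_error_weak.c is declared weak, so it can be overridden), passing `info` through unchanged. After the trap, `p g_last_fault` in GDB should then show the line number and error code without relying on the stack.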

  • So error_code=16385 is 0x4001 which according to components/libraries/util/app_error.h is

    (NRF_FAULT_ID_SDK_RANGE_START + 1) /**< An error stemming from a call to @ref APP_ERROR_CHECK or @ref APP_ERROR_CHECK_BOOL. The info parameter is a pointer to an @ref error_info_t variable. */

    That already brings me closer: the trap is the result of an APP_ERROR_CHECK() call (although that does not yet explain the corrupted stack). Now I'm trying to figure out which APP_ERROR_CHECK() call.

    Unfortunately, the info appears to be broken. According to the comment on the define above, info=536936400 (0x2000ffd0) is supposed to be a pointer to an instance of struct error_info_t, containing the information I'm looking for. Trying to access it via GDB, however, results in:

    (gdb) p *((error_info_t*)(info))
    Cannot access memory at address 0x2000ffd0

    Besides, I wonder about p_file_name=0x4001. How did the error_code value end up as the p_file_name argument, which should actually contain a pointer?
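
As a cross-check of the numbers in the trace above, the sketch below converts them to hex and classifies them. The fault ID constants are assumed to match app_error.h in SDK v16, and the RAM bounds assume the 64 KB of the nRF52832 xxAA variant named in the build output. `fault_id_name` and `addr_in_ram` are hypothetical helpers, written as host-side C rather than firmware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the definitions in the SDK's app_error.h. */
#define NRF_FAULT_ID_SDK_RANGE_START 0x00004000u
#define NRF_FAULT_ID_SDK_ERROR       (NRF_FAULT_ID_SDK_RANGE_START + 1u)
#define NRF_FAULT_ID_SDK_ASSERT      (NRF_FAULT_ID_SDK_RANGE_START + 2u)

/* Assumption: nRF52832 xxAA with 64 KB of RAM starting at 0x20000000. */
#define RAM_START 0x20000000u
#define RAM_SIZE  0x00010000u

/* Map a fault ID as passed to app_error_fault_handler to a readable name. */
const char *fault_id_name(uint32_t id)
{
    switch (id) {
    case NRF_FAULT_ID_SDK_ERROR:  return "NRF_FAULT_ID_SDK_ERROR";
    case NRF_FAULT_ID_SDK_ASSERT: return "NRF_FAULT_ID_SDK_ASSERT";
    default:                      return "unknown";
    }
}

/* True if addr falls inside the assumed RAM window. */
bool addr_in_ram(uint32_t addr)
{
    return addr >= RAM_START && (addr - RAM_START) < RAM_SIZE;
}
```

Under these assumptions, 16385 maps to NRF_FAULT_ID_SDK_ERROR, and 536936400 (0x2000FFD0) lies inside RAM, 48 bytes below its end, which is where a stack placed at the top of RAM would be. That would suggest the pointer itself is plausible, and that GDB's refusal to read it may be related to the memory map warning rather than to a bad address.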

  • a) Are you sure ret is actually 0x01 in this case? Here are the error codes that can be returned by sd_ble_gatts_hvx

    b) This will happen because the timing in the SoftDevice gets messed up when you halt at a breakpoint, i.e. the timers keep running and the event manager loses track.

  • Re a) I don't know what else to conclude from the GDB output, so yes, fairly sure.

    Re b) It's the same corrupted stack trace I get /without/ the breakpoint. See the initial post (corrupted stack trace with no breakpoints set); in the run with breakpoints, the trap occurs only a line after the breakpoint in noop() is hit. It's the same corrupted backtrace within APP_ERROR_CHECK(). That doesn't look like a coincidence or a GDB/breakpoint-related timing issue to me.
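
Since question a) turns on which numeric value the call returned, a small lookup from code to name can help when reading raw values out of GDB. This sketch covers only a handful of the generic codes and assumes the values defined in the SoftDevice header nrf_error.h. `nrf_error_name` is a hypothetical helper, not an SDK function, and the table is not a list of what sd_ble_gatts_hvx can return.

```c
#include <stdint.h>

/* Translate a generic SoftDevice error code into its symbolic name.
 * Values are assumed to match nrf_error.h; only a subset is covered. */
const char *nrf_error_name(uint32_t err_code)
{
    switch (err_code) {
    case 0u:  return "NRF_SUCCESS";
    case 1u:  return "NRF_ERROR_SVC_HANDLER_MISSING";
    case 2u:  return "NRF_ERROR_SOFTDEVICE_NOT_ENABLED";
    case 3u:  return "NRF_ERROR_INTERNAL";
    case 4u:  return "NRF_ERROR_NO_MEM";
    case 5u:  return "NRF_ERROR_NOT_FOUND";
    case 7u:  return "NRF_ERROR_INVALID_PARAM";
    case 8u:  return "NRF_ERROR_INVALID_STATE";
    case 17u: return "NRF_ERROR_BUSY";
    case 19u: return "NRF_ERROR_RESOURCES";
    default:  return "unknown";
    }
}
```

Note that 0x4001 in the backtrace is the fault ID passed to app_error_fault_handler, not the return value of the failing call. Going by the app_error.h comment quoted earlier, the return value would be expected inside the error_info_t that info points to.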

  • Are you able to reproduce this issue on a Nordic DK? And what hardware are you currently running your code on?

  • Hardware is an nRF52832.

    I have now ordered a PCA10040 and will try to reproduce it there. Can you elaborate on why you think this might be hardware-specific?

  • Sorry for the late reply, just back from vacation.

    daten said:
    Hardware is an nRF52832.

    OK, so a custom board then, I suppose. I do not necessarily think it is a hardware issue, but it could be related to clock source/timing. I'm asking in case we want to try to recreate the issue here.
