Zephyr crashing when I connect to a device

  has been helping me connect to a device and I am finally able to do that, but as soon as I do, Zephyr crashes.

This is with a lightly modified central_uart project

[00:00:00.024,810] <0x1b>[0m<inf> central_uart: Scan module initialized<0x1b>[0m␍␊

[00:00:00.024,810] <0x1b>[0m<inf> central_uart: NUS Client module initialized<0x1b>[0m␍␊

[00:00:00.029,846] <0x1b>[0m<inf> central_uart: Scanning successfully started<0x1b>[0m␍␊

[00:00:00.093,719] <0x1b>[0m<inf> central_uart: Filters matched. Address: E8:A0:DB:24:21:62 (random) connectable: 1<0x1b>[0m␍␊

[00:00:00.289,886] <0x1b>[0m<inf> central_uart: Connected: E8:A0:DB:24:21:62 (random)<0x1b>[0m␍␊

[00:00:00.290,039] <0x1b>[1;33m<wrn> central_uart: Failed to set security: -12<0x1b>[0m␍␊

[00:00:00.290,100] <0x1b>[1;31m<err> central_uart: Stop LE scan failed (err 0)<0x1b>[0m␍␊

[00:00:00.441,497] <0x1b>[0m<inf> central_uart: MTU exchange done<0x1b>[0m␍␊

ASSERTION FAIL [conn] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/gatt.c:5433␍␊

⇥ invalid parameters␍␊

␍␊

[00:00:01.791,351] <0x1b>[0m<inf> central_uart: Service discovery completed<0x1b>[0m␍␊

[00:00:01.800,994] <0x1b>[1;31m<err> os: r0/a1: 0x00000004 r1/a2: 0x00001539 r2/a3: 0x20000b60<0x1b>[0m␍␊

[00:00:01.800,994] <0x1b>[1;31m<err> os: r3/a4: 0x00000004 r12/ip: 0x00000000 r14/lr: 0x000217df<0x1b>[0m␍␊

[00:00:01.801,025] <0x1b>[1;31m<err> os: xpsr: 0x21000000<0x1b>[0m␍␊

[00:00:01.801,025] <0x1b>[1;31m<err> os: Faulting instruction address (r15/pc): 0x00031072<0x1b>[0m␍␊

[00:00:01.801,086] <0x1b>[1;31m<err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0<0x1b>[0m␍␊

[00:00:01.801,116] <0x1b>[1;31m<err> os: Current thread: 0x20002568 (BT RX WQ)<0x1b>[0m␍␊

[00:00:01.869,415] <0x1b>[1;31m<err> os: Halting system<0x1b>[0m␍␊

The program crashes here:

 

return snprintk(str, len, "%02X:%02X:%02X:%02X:%02X:%02X (%s)",
addr->a.val[5], addr->a.val[4], addr->a.val[3],
addr->a.val[2], addr->a.val[1], addr->a.val[0], type);

Which is in
addr.h (sic) Quite what code is doing in a header file is beyond me.

The prototype for the function is this:

int bt_addr_le_to_str(const bt_addr_le_t *addr, char *str, size_t len);

The values passed look reasonable to me.
addr->a.val[0]-[5] have the correct MAC address in it
str has (in decimal) 23 0 ... in it so that looks like it's just got a single character and anyway,
it shouldn't make a difference because, surely, that's being written into.

len = 30

type ends up with "random" in it.

I am assuming
snprintk is based on
snprint and so shouldn't copy more than len bytes anyway.

The string being used is that is passed into bt_addr_le_to_str is:
char addr[BT_ADDR_LE_STR_LEN];
and BT_ADDR_LE_STR_LEN is set to 30 so that all stacks up nicely.
Can anyone give me some pointers as to what might be wrong here?
BTW   that little project you sent me does not crash like this, but I can't
see where you set up the characteristics for that project, there is no reference to
BT_GATT_SERVICE_DEFINE anywhere in it. I would happily just mod that if I could work out how you've set up
its service and characteristics.
Thanks


  • DiBosco said:
    I can only assume that when modifying the code to allow for this new subscription I've messed something up.

    I’m not sure what could have triggered the unaligned memory access. The first thing I would do is look up the faulting instruction address to see which line of code it maps to. You can do this either by using arm-zephyr-eabi-addr2line or by placing a breakpoint at the address.

    With addr2line:

    1. Open a terminal in Vs code with the toolchain env set:

    2. Enter the following command from the "nRF connect" terminal

    In my case:

    DiBosco said:
    Would be great to have one of those take the traffic and send it to Cutecom to see what's going on at the same time I'm trying to use the BLE "UART"

    Unfortunately, the second COM port on the nRF52840 DK is not connected to the nRF chip on this board. Otherwise, you could have created a board overlay to direct logs to the second UARTE instance.

    DiBosco said:
    I do, of course, realise this and I also know that coding style varies massively from person to person. I do appreciate that some people even like K&R style indentation.

    I don’t have strong opinions on this to be honest, but I think it makes sense for us to follow the same style as Zephyr for consistency, even though there are always things that could be improved.

  • I've solved the crashing issue, that trick worked. 

    Am currently getting

    #define ENOTCONN 128 /* Socket is not connected */
    Error when I try to write to this:

    err = bt_custom_client_send(&custom_client, data_ptr, len);
    Even though, the system's telling me it's connected to the custom unit.

    Unfortunately, the second COM port on the nRF52840 DK is not connected to the nRF chip on this board. Otherwise, you could have created a board overlay to direct logs to the second UARTE instance.


    I don't understand what you're saying here.

    First question is:

    Assuming you have more than one UART, can you route the

    LOG_WRN & LOG_ERR
    traffic to that other UART? If so, how. How and where do you
    specify *which* UART
    is being used for the debug information?

    Also, I assume you can change between spitting out
    LOG_ERR and LOG_WRN and both? If so, how?


    Second question

    Assuming the above is possible:

    a. This second UART: what do you mean it's not connected to the nrf chip? Surely the second UART is a peripheral IN the nrf? Also, I thought this was a funky chip where you could connect any pin to any peripheral? So as long as the pins are accessible on the SIL headers I can connected up an FTDI  3V3 cable to the second UART just as I am to the first to get UART traffic in and out.


    b. There are clearly some virtual com ports on this DK, when the board is powered up via the USB  connection to the board, I get:

    crw-rw-rw- 1 root dialout 166, 0 Dec  4 19:50 /dev/ttyACM0
    crw-rw-rw- 1 root dialout 166, 1 Dec  4 19:50 /dev/ttyACM1


    Now, I know that the debug information comes out of this because at one point I somehow managed to open some sort of terminal in VSC and when

    CONFIG_LOG_BACKEND_UART=y

    Was in prj.conf

    I got the same serial info into VSC as I currently get into Cutecom.

    So is it not possible to use that VCP in Cutecom
    as the BACKEND UART? This would be the easiest and best way to
    do it.
    Just in case it matters, in the interest of readability I have split
    all the serial stuff into a new file, so when I not get a packet
    in over the UART in serial.c, I call this function in main.c

    void send_ble_data_to_xlrt(uint8_t *data_ptr, uint16_t len)
    {
    int err;
    err = bt_custom_client_send(&custom_client, data_ptr, len);
    if (err != NO_ERROR)
    {
    LOG_WRN("CUSTOM send timeout");
    }

    }
    This is where I get the error -128 not connected

    Eventually I'll put all the BLE stuff in another file
    so main is as it should be, a very basic file. Baby steps though.
  • Glad to hear you found the reason for the usage fault. Regarding the -ENOTCONN, could you check if it is returned here: 

    ? The 'conn' object should be assigned at the end of the bt_custom_handles_assign() function.

    Also, not directly related to this issue, but while reviewing this function again I noticed that I assigned the RX handle instead of the TX handle here:

    DiBosco said:
    I don't understand what you're saying here.

    Sorry for not being more clear. What I meant to say is that the second COM port (/dev/ttyACM1) exposed by the interface MCU isn't available on this board as there are no nRF pins routed to this UART - USB bridge on the board. You can of course hook up an external FTDI device to the pin headers, or simply use the nRF USB.

    From the nRF52840 DK user guide:

    (Ref. https://docs.nordicsemi.com/bundle/ug_nrf52840_dk/page/UG/dk/vir_com_port.html )

    To enable logging over the nRF USB port, you can try adding the following 2 overlays to the boards folder in your project source directory and do a pristine build.

    <PROJ_DIR>/boards/
    ├── nrf52840dk_nrf52840.conf
    └── nrf52840dk_nrf52840.overlay

    4555.nrf52840dk_nrf52840.overlay

    0636.nrf52840dk_nrf52840.conf

    DiBosco said:

    Assuming you have more than one UART, can you route the

    LOG_WRN & LOG_ERR
    traffic to that other UART? If so, how. How and where do you
    specify *which* UART

    The UART backend does not support individual filtering per device. I.e., you can't define one log level for one UART/USB instance and a different level for another.

  • Glad to hear you found the reason for the usage fault. Regarding the -ENOTCONN, could you check if it is returned here: 

    This morning  it's not giving me the same error and it's allegedly sending messages out to the remote unit. I haven't changed anything, so goodness knows what was going on last night. I do often find with this ecosystem is very flaky. I often have to clean and recompile because it sometimes just doesn't work after a normal compile and I often have to program twice because it fails the first time, so maybe it was the end of a long fruitless day and had I cleaned and rebuilt it would have been OK.

    Today, I have found out, from a colleague, that sending "status<CR><LF>" into the main RX characteristic puts the unit  I'm trying to communicate with into ATE mode.

    Previously, I've been trying to get this  this really complicated binary comms system it normally uses going. With this binary method the master is constantly sending out frames and expecting responses back and I had to use a specific program. So I was going:

    Program using PC_UART <-> nrf52840 bLE <-> Remote Unit BLE

    Now I can go:

    Cutecom on PC <->nrf52840 bLE <-> Remote Unit BLE

    And I don't have to worry about frames, it can all be ASCII based and simple. I can switch on the verbose errors and warnings in central_uart and read them on Cutecom along with any other data coming back over BLE from the custom unit.


    I have proven I can get into ATE on the custom unit mode over BLE by using BLE Scanner app on my phone and once I send  status<CR><LF> the unit goes into ATE mode, with the LCD on the front panel confirming it.

    However, if I send that via the central_uart app on the 52840DK the remote does not go into ATE mode. So, I hook up the DK running your central_uart app to Cutecom and I send

    status<CR><LF>

    The central_uart app is clearly getting that exact text in, I can see that by either mirroring the message back to Cutecom or stopping the app with the debugging and looking at what it received.

    Once the status request is received by UART into centrral_uart app, this is  called:

    err = bt_custom_client_send(&custom_client, data_ptr, len);


    I can see with the debugger that

    data_ptr points to:

    "status<CR><LF>"


    len is 8

    err = 0 is the response back from the bt_custom_client_send function which suggests is being sent to the remote unit.

    ble_data_sent
    is getting called , which I think is called when the

    central_uart app gets confirmation that the remote unit has read
    the charctersitic that has been written to locally?

    Is your central_uart app on the DK definitely writing to:

    #define BT_UUID_CUSTOM_SRV_TX_VAL \
    BT_UUID_128_ENCODE(0x651135b0, 0xdfe9, 0x45a2, 0x95c1, 0x46552a8a77e2)

    What do all these functions:

    uint8_t (*received_tx)(struct bt_custom_client *custom, const uint8_t *data, uint16_t len);
    uint8_t (*received_rx)(struct bt_custom_client *custom, const uint8_t *data, uint16_t len);
    void (*sent)(struct bt_custom_client *custom, uint8_t err, const uint8_t *data, uint16_t len);
    void (*unsubscribed)(struct bt_custom_client *custom);

    So and when should they be called? It looks to me,
    (although I might be wrong because all the callbacks
    within structures [sometimes] within further structures really
    are hard to read) that they are set up with

    struct bt_custom_client_init_param init = {
    .cb = {

    .received_tx = ble_data_received_tx,
    .received_rx = ble_data_received_rx,
    .sent = ble_data_sent,
    }
    };


    and

    err = bt_custom_client_init(&custom_client, &init);
    But there are four callbacks in the struct
    struct bt_custom_client_cb
    in custom_client.h and we on;y pass three here.
    So maybe they aren't really connected.

    So, I guess the big question is:

    If the central_uart app is 100% definitely getting the right
    string to send to the remote unit and the central_uart app claims
    that is has sent the string to the remote unit, and I even,
    seemingly get confirmation back that the remote unit has received
    something, how can we ascertain it's actually put the right
    data into the Tx characteristic? I would hook up the other DK and
    see what is received but of course that no longer connects because
    of all the new stuff we have in central now.
  • Sorry for not being more clear. What I meant to say is that the second COM port (/dev/ttyACM1) exposed by the interface MCU isn't available on this board as there are no nRF pins routed to this UART - USB bridge on the board. You can of course hook up an external FTDI device to the pin headers, or simply use the nRF USB.

    I still don't understand this.

    Can we route the

    LOG_ERR etc messages to a specific UART and have one UART for
    control comms to the PC that is being sent on via BLE and another
    purely for debug data, ie the data currently sent to
    LOG_ERR, LOG_WRN and LOG_INF?

    If so, surely we could route it to ttyACM0? I can see ttyAM0 is
    currently spitting out data because it shows exactly the same
    info that I am getting out of the UART I have connected to an
    FTDI lead with:

    CONFIG_LOG_BACKEND_UART=y
« 2 3 4 5 6