Zephyr crashing when I connect to a device

I've been getting help connecting to a device and am finally able to do that, but as soon as I do, Zephyr crashes.

This is with a lightly modified central_uart project

[00:00:00.024,810] <inf> central_uart: Scan module initialized
[00:00:00.024,810] <inf> central_uart: NUS Client module initialized
[00:00:00.029,846] <inf> central_uart: Scanning successfully started
[00:00:00.093,719] <inf> central_uart: Filters matched. Address: E8:A0:DB:24:21:62 (random) connectable: 1
[00:00:00.289,886] <inf> central_uart: Connected: E8:A0:DB:24:21:62 (random)
[00:00:00.290,039] <wrn> central_uart: Failed to set security: -12
[00:00:00.290,100] <err> central_uart: Stop LE scan failed (err 0)
[00:00:00.441,497] <inf> central_uart: MTU exchange done
ASSERTION FAIL [conn] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/gatt.c:5433
    invalid parameters

[00:00:01.791,351] <inf> central_uart: Service discovery completed
[00:00:01.800,994] <err> os: r0/a1: 0x00000004 r1/a2: 0x00001539 r2/a3: 0x20000b60
[00:00:01.800,994] <err> os: r3/a4: 0x00000004 r12/ip: 0x00000000 r14/lr: 0x000217df
[00:00:01.801,025] <err> os: xpsr: 0x21000000
[00:00:01.801,025] <err> os: Faulting instruction address (r15/pc): 0x00031072
[00:00:01.801,086] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
[00:00:01.801,116] <err> os: Current thread: 0x20002568 (BT RX WQ)
[00:00:01.869,415] <err> os: Halting system

The program crashes here:

 

return snprintk(str, len, "%02X:%02X:%02X:%02X:%02X:%02X (%s)",
                addr->a.val[5], addr->a.val[4], addr->a.val[3],
                addr->a.val[2], addr->a.val[1], addr->a.val[0], type);

Which is in addr.h (sic). Quite what code is doing in a header file is beyond me.

The prototype for the function is this:

int bt_addr_le_to_str(const bt_addr_le_t *addr, char *str, size_t len);

The values passed look reasonable to me.
addr->a.val[0] to [5] hold the correct MAC address.
str has (in decimal) 23 0 ... in it, so it looks like it holds just a single character; in any case,
that shouldn't make a difference because, surely, str is what is being written into.

len = 30

type ends up with "random" in it.

I am assuming snprintk is based on snprintf and so shouldn't copy more than len bytes anyway.
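
As a minimal sketch of that assumption, here is the same format string run through standard C snprintf on a desktop compiler. The struct and function names (demo_addr_le, demo_addr_le_to_str) are hypothetical stand-ins, not Zephyr's real bt_addr_le_t or bt_addr_le_to_str, and whether snprintk behaves identically in every respect is assumed rather than checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for Zephyr's bt_addr_le_t; NOT the real definition. */
struct demo_addr_le {
    uint8_t type;   /* 0 = public, 1 = random (encoding assumed for this sketch) */
    uint8_t val[6]; /* address bytes, least significant byte first */
};

/* Same size the post quotes for BT_ADDR_LE_STR_LEN. */
#define DEMO_ADDR_LE_STR_LEN 30

/*
 * Formats the address into str without writing more than len bytes,
 * terminating NUL included. Like snprintf, it returns the length the
 * complete string would have had, so a return value >= len means the
 * output was truncated.
 */
static int demo_addr_le_to_str(const struct demo_addr_le *addr,
                               char *str, size_t len)
{
    assert(addr != NULL && str != NULL);

    const char *type = (addr->type == 1) ? "random" : "public";

    return snprintf(str, len, "%02X:%02X:%02X:%02X:%02X:%02X (%s)",
                    addr->val[5], addr->val[4], addr->val[3],
                    addr->val[2], addr->val[1], addr->val[0], type);
}
```

With a 30-byte buffer the whole string fits; with a smaller buffer the output is cut short and still NUL-terminated, which is the behaviour being relied on above.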

The string that is passed into bt_addr_le_to_str is:
char addr[BT_ADDR_LE_STR_LEN];
and BT_ADDR_LE_STR_LEN is set to 30, so that all stacks up nicely.

Can anyone give me some pointers as to what might be wrong here?
BTW, that little project you sent me does not crash like this, but I can't
see where you set up the characteristics for that project; there is no reference to
BT_GATT_SERVICE_DEFINE anywhere in it. I would happily just modify that if I could work out how you've set up
its service and characteristics.
Thanks


  • Hello,

    ASSERTION FAIL [conn] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/gatt.c:5433

    This error message shows that an assertion is raised in gatt.c at this line: https://github.com/nrfconnect/sdk-zephyr/blob/0bc3393fb112ec80ebeab48cd023d69b1e9db757/subsys/bluetooth/host/gatt.c#L5433 which suggests that the 'conn' pointer passed to your bt_gatt_subscribe() call is set to NULL. 

    that little project you sent me does not crash like this, but I can't
    see where you set up the characteristics for that project, there is no reference to

    I copied the NUS service and NUS client source files and placed them in the project's source directory (I also renamed every occurrence of 'nus' to 'custom').

    Best regards,

    Vidar

  • Morning,

    I did look with the debugger at that pointer and that's not the issue.

    The crash doesn't come until snprintk in bt_addr_le_to_str

    and, as I say in the first post, the values passed to:

    return snprintk(str, len, "%02X:%02X:%02X:%02X:%02X:%02X (%s)",
    addr->a.val[5], addr->a.val[4], addr->a.val[3],
    addr->a.val[2], addr->a.val[1], addr->a.val[0], type);

    look fine.

    This smacks more of a stack overflow to me, but I've massively increased the main stack size in prj.conf to little or no avail.

    The message on crash changes slightly:

    [00:00:07.404,510] <err> os: Fault during interrupt handling
    [00:00:07.404,571] <err> os: Current thread: 0x20002568 (BT RX WQ)

    This looks like it's some sort of Bluetooth receive issue? However, all I'm doing is connecting; no actual data is being sent as yet.

    On the subject of your code, it's not the peripheral_uart I'm trying, it's the central_uart, which doesn't have those definitions in it.

     

  • Sorry for not being more clear. What I meant to say is that the second COM port (/dev/ttyACM1) exposed by the interface MCU isn't available on this board as there are no nRF pins routed to this UART - USB bridge on the board. You can of course hook up an external FTDI device to the pin headers, or simply use the nRF USB.

    I still don't understand this.

    Can we route the LOG_ERR etc. messages to a specific UART and have one UART for
    control comms to the PC that is being sent on via BLE and another
    purely for debug data, i.e. the data currently sent to
    LOG_ERR, LOG_WRN and LOG_INF?

    If so, surely we could route it to ttyACM0? I can see ttyACM0 is
    currently spitting out data because it shows exactly the same
    info that I am getting out of the UART I have connected to an
    FTDI lead with:

    CONFIG_LOG_BACKEND_UART=y
  • DiBosco said:

    ble_data_sent is getting called, which I think is called when the
    central_uart app gets confirmation that the remote unit has read
    the characteristic that has been written to locally?

    Unless an ATT error is reported in the callback, the write command will have been sent. Did you check this error code?

    DiBosco said:
    Is your central_uart app on the DK definitely writing to:

    #define BT_UUID_CUSTOM_SRV_TX_VAL \

    Did you change the handle value from RX to TX as mentioned in my previous reply?

    DiBosco said:
    LOG_ERR etc messages to a specific UART and have one UART for
    control comms to the PC that is being sent on via BLE and another
    purely for debug data, ie the data currently sent to
    LOG_ERR, LOG_WRN and LOG_INF?

    If so, surely we could route it to ttyACM0? I can see ttyACM0 is
    currently spitting out data because it shows exactly the same

    You can select which device(s) the UART logger backend should use by modifying the zephyr,console property in the chosen node. The DK board sets this property to zephyr,console = &uart0 by default. Alternatively, if you want to have console (printf, printk) and log (LOG_*) output on separate devices, you can use the zephyr,log-uart property as shown in the Devicetree overlay I shared. What is not possible is to have the Logger module print debug messages to one device and only error messages to another. It seems like I misunderstood the question.

  • Unless an ATT error is reported in the callback, the write command will have been sent. Did you check this error code?

    If there was an error it would be spat out by this in ble_data_sent:

    if (err != 0)
    {
        LOG_WRN("ATT error code: 0x%02X", err);
    }

    And I see nothing there. If I stop the program there, err is
    optimised out (and yes, I have it building for debugging).

    Did you change the handle value from RX to TX as mentioned in my previous reply?

    Hmmm, no I did not, I missed that. <Scrolls back>

    Damnit, yes, that was not changed.

    OK, having changed that it now works in that I can put the
    remote unit into ATE mode. Thank whatever deity you might
    worship for that.

    AND, I get data back.

    That might have really broken the back of it finally.

    I have a couple of things I'm not sure about:

    ble_data_received_rx

    This is where I get stuff back in from the remote unit over BLE

    What is this for?

    ble_data_received_tx

    ------------------------------------------

    Also, one thing here in
    on_received_tx:

    return custom->cb.received_tx(custom, data, length);

    Why does this have to be done with the obfuscation
    of a callback? Why not call

    ble_data_received_rx

    directly? Or even better, simply do all the work in

    ble_data_received_tx

    Without having to call yet another function? Just
    seems like an extra layer of unnecessary complexity and code.

    The callbacks and their associated structs are a nightmare
    for reducing readability.

    Thanks!



  • I'm glad to hear that you are receiving responses now from your peripheral.

    DiBosco said:
    And I see nothing there. If I stop the program there err is
    optimised out (and yes I have it building for debugging).

    '-Og' still enables some optimizations. You can add CONFIG_NO_OPTIMIZATIONS=y to your prj.conf if you want to use '-O0'. Just make sure that you don't override this in your project build configuration in VS code.

    DiBosco said:
    ble_data_received_rx

    This is where I get stuff back in from the remote unit over BLE

    What is this for?

    ble_data_received_tx

    The first one is invoked when receiving notifications from the RX characteristic, and the other one when you receive them from the TX characteristic. I don't know if your peripheral will actually send anything on the TX characteristic, but the screenshot from the phone showed that both characteristics had the "notify" property. If not, I would have just updated the existing function to use the RX handle.

    DiBosco said:
    Why does this have to be done with the obfuscation
    of a callback? Why not call

    ble_data_received_rx

    directly? Or even better, simply do all the work in

    ble_data_received_tx

    Without having to call yet another function? Just
    seems like an extra layer of unnecessary complexity and code.

    The callbacks and their associated structs are a nightmare
    for reducing readability.

    The goal is to decouple the code and allow callbacks to be registered in the application. It was not designed to contain any application specific logic. I don't see any problems with changing this though. The custom_client.c file I made is more or less a copy of the nrf/subsys/bluetooth/services/nus_client.c file with some renaming and small changes I made along the way to try to match your GATT server.
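
    To illustrate the decoupling idea in isolation, here is a minimal plain-C sketch that builds on a desktop compiler. Every name prefixed demo_ or DEMO_ and the app_ variables are hypothetical; this is not the code from custom_client.c or nus_client.c, only the general pattern: the client module knows the handles and forwards data, while the application registers functions that decide what to do with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical return codes, loosely modelled on "keep/stop receiving". */
enum { DEMO_ITER_STOP = 0, DEMO_ITER_CONTINUE = 1 };

struct demo_client;

/* Callbacks the application registers with the client module. */
struct demo_client_cb {
    uint8_t (*received_rx)(struct demo_client *c, const uint8_t *data, uint16_t len);
    uint8_t (*received_tx)(struct demo_client *c, const uint8_t *data, uint16_t len);
};

/* Client module state: two characteristic handles plus the callbacks. */
struct demo_client {
    uint16_t rx_handle;
    uint16_t tx_handle;
    struct demo_client_cb cb;
};

static void demo_client_init(struct demo_client *c, uint16_t rx_handle,
                             uint16_t tx_handle, const struct demo_client_cb *cb)
{
    c->rx_handle = rx_handle;
    c->tx_handle = tx_handle;
    c->cb = *cb;
}

/* Module side: picks the callback by handle; contains no application logic. */
static uint8_t demo_client_on_notify(struct demo_client *c, uint16_t handle,
                                     const uint8_t *data, uint16_t len)
{
    if (handle == c->rx_handle && c->cb.received_rx != NULL) {
        return c->cb.received_rx(c, data, len);
    }
    if (handle == c->tx_handle && c->cb.received_tx != NULL) {
        return c->cb.received_tx(c, data, len);
    }
    return DEMO_ITER_CONTINUE; /* unknown handle: ignore the data */
}

/* Application side: what to do with the data lives here. */
static char app_last_rx[32];
static int app_tx_count;

static uint8_t ble_data_received_rx(struct demo_client *c, const uint8_t *data,
                                    uint16_t len)
{
    (void)c;
    /* Copy at most sizeof(app_last_rx) - 1 bytes and NUL-terminate. */
    size_t n = (len < sizeof(app_last_rx) - 1) ? len : sizeof(app_last_rx) - 1;
    memcpy(app_last_rx, data, n);
    app_last_rx[n] = '\0';
    return DEMO_ITER_CONTINUE;
}

static uint8_t ble_data_received_tx(struct demo_client *c, const uint8_t *data,
                                    uint16_t len)
{
    (void)c; (void)data; (void)len;
    app_tx_count++; /* just count TX notifications in this sketch */
    return DEMO_ITER_CONTINUE;
}
```

    If the module is only ever used by one application, calling the handler directly instead of going through the struct works just as well; the indirection only pays off when the same module file is reused with different handlers.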

  • The goal is to decouple the code and allow callbacks to be registered in the application. It was not designed to contain any application specific logic. I don't see any problems with changing this though. The custom_client.c file I made is more or less a copy of the nrf/subsys/bluetooth/services/nus_client.c file with some renaming and small changes I made along the way to try to match your GATT server.

    The thing that doesn't make sense to me with that is that custom_client is full of custom code anyway; the UUIDs are unique to that header, as is

    struct bt_custom_client_handles

    which depends on what characteristics are in a service.
    In other words, it's not a generic file that you can use with any
    service. If it were, I could see the advantage in it more.

    The likelihood is that I will go through and simplify as
    much of it as I can.
    I still don't understand a lot of this, and maybe tidying
    up the code as much as possible will help me with that.

    Thanks again for all the help, I'd never have got this
    working otherwise.