
Group address message crashes all nodes?!

Hi there!

Me again... I'm having issues with networks that consist of more than two nodes. When my network consists of three nodes and one node sends a message to another, the node that wasn't addressed crashes. The network consists of one nRF52840 DK, one nRF52840 Feather and one nRF52832 DK. They are all running FreeRTOS and are built with ARMGCC on Windows 10.

The same happens when a message is sent to all nodes (0xFFFF); both receiving nodes crash. I see a pattern, but find it hard to pinpoint a cause. It looks like a node crashes when it receives a message that isn't addressed exclusively to it. Below is a snippet of the terminal output from the server; the logging just stops and FreeRTOS freezes.

<info> app: RX: [aop: 0x00C1]

<info> app: RX: Msg
<info> app: Led is now turned off.
<info> app: TX: [aop: 0x00C4]

<info> app: TX: Msg

Below is a snippet of the terminal output from the client side.

<info> app: TX: [aop: 0x00C1] 

<info> app: TX: Msg
<info> app: Set "off" has been called.
<info> app: RX: [aop: 0x00C1]

<info> app: RX: Msg
<info> app: Led is now turned off.
<info> app: TX: [aop: 0x00C4] 

<info> app: TX: Msg
<info> app: RX: [aop: 0x00C4]

<info> app: RX: Msg
<info> app: Server acknowledged.
<info> app: TX: [aop: 0x00C1] 

<info> app: TX: Msg
<info> app: RX: [aop: 0x00C1]

<info> app: RX: Msg
<info> app: Led is now turned off.
<info> app: TX: [aop: 0x00C4]

<info> app: TX: Msg
<info> app: RX: [aop: 0x00C4]

<info> app: RX: Msg
<info> app: Server acknowledged.

When the client sends a message to a specific unicast address, the server receives and handles it as shown below.

<info> app: RX: [aop: 0x00C1]

<info> app: RX: Msg
<info> app: Led is now turned off.
<info> app: TX: [aop: 0x00C4]

<info> app: TX: Msg
<info> app: RX: [aop: 0x00C1]

<info> app: RX: Msg
<info> app: Led is now turned off.
<info> app: TX: [aop: 0x00C4]

<info> app: TX: Msg
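For reference, the destination addresses mentioned in this post (a unicast address versus the all-nodes address 0xFFFF) fall into the address ranges defined by the Bluetooth Mesh specification. The following is only a minimal sketch of that classification; the names `AddressType`, `classify` and `is_all_nodes` are made up for illustration and are not part of the SDK for Mesh.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types, for illustration only (not SDK identifiers).
enum class AddressType { Unassigned, Unicast, Virtual, Group };

// Classify a 16-bit Bluetooth Mesh address by its numeric range.
AddressType classify(std::uint16_t addr)
{
    if (addr == 0x0000) return AddressType::Unassigned; // never a valid destination
    if (addr <= 0x7FFF) return AddressType::Unicast;    // exactly one element
    if (addr <= 0xBFFF) return AddressType::Virtual;    // label-UUID based
    return AddressType::Group;                          // 0xC000..0xFFFF
}

// 0xFFFF is the fixed group address that every node receives.
constexpr std::uint16_t kAllNodes = 0xFFFF;

bool is_all_nodes(std::uint16_t addr)
{
    return addr == kAllNodes;
}
```

With this, `classify(0x0001)` yields `AddressType::Unicast`, while `classify(0xFFFF)` yields `AddressType::Group`, which is the kind of message that triggered the crash described above.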

Furthermore, this problem also occurs with messages from proxy nodes; getting the TTL value or composition data results in a crash on all other nodes. I have absolutely no idea what might cause the described behaviour. Has anyone got an idea? The weird thing is that it worked for a good while, until I reset all nodes and provisioned them again.

Thanks in advance and kind regards,

Jochem

  • Yep, found something out. When the "DEBUG_NRF" or "DEBUG_NRF_USER" macro is defined, "configASSERT" gets defined in FreeRTOSConfig.h and the nodes report "ASSERTION FAILED at :0". When neither macro is defined, the nodes simply crash.

    Below is the relevant part of FreeRTOSConfig.h.

    #if defined(DEBUG_NRF) || defined(DEBUG_NRF_USER)
    #include "nrf_mesh_assert.h"
    #define configASSERT( x )                                               NRF_MESH_ASSERT(x)
    #endif
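The ":0" in the message suggests that the assert handler did not receive a usable file name or line number, although the thread does not establish why. As a debugging aid, a minimal sketch of the common FreeRTOS-style pattern that captures the call site is shown here. `app_assert_called` and `queue_slots_left` are hypothetical names invented for this example; this is not how NRF_MESH_ASSERT is implemented.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical application hook (not an SDK function): records and prints
// where the failed assertion was located.
static const char* g_last_file = nullptr;
static int g_last_line = 0;

void app_assert_called(const char* file, int line)
{
    g_last_file = file;
    g_last_line = line;
    std::printf("ASSERTION FAILED at %s:%d\n", file, line);
    // On target, one would typically disable interrupts and halt here.
}

// Same shape as the widely used FreeRTOS pattern: pass __FILE__ and
// __LINE__ of the call site to the hook when the condition is false.
#define configASSERT(x) \
    do { if ((x) == 0) { app_assert_called(__FILE__, __LINE__); } } while (0)

// Example function whose precondition is guarded by the assert.
int queue_slots_left(int capacity, int used)
{
    configASSERT(used <= capacity);
    return capacity - used;
}
```

Calling `queue_slots_left(4, 5)` violates the precondition, so the hook is invoked with the file and line of the `configASSERT` inside that function.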

    It's clear that something is wrong, but what? Has anyone got an idea? Now that I think of it, it's also impossible to subscribe to addresses. The app says "Not a subscribe model" when trying to do so. Could it have something to do with that?

    Kind regards,

    Jochem

  • Hi Mttrinh,

    I went back to the first commit I've done and can now say with 100% certainty it has always been this way; I just didn't notice it. Hence, I've come to the conclusion that the FreeRTOS example on GitHub doesn't work with my SDK versions.

    I'll try to revert to V4.1.0 and V16.0.0 to see if the issue is resolved. It's going to take some time though... I have to redo all changes to the SDKs and resolve the name conflicts between FreeRTOS and the SDK for Mesh.

    I'll let you know how it goes and if it turns out to be functional.

    Kind regards,

    Jochem

  • It isn't clear yet what is causing the issue, but it might be something with the FreeRTOS and SDK versions, like you said. Keep me updated :)

  • Hi Mttrinh,

    Okay, I'm glad to let you know that it's working as it's supposed to. None of the behaviour described above occurs anymore. I had to do a rewrite from the bottom up to determine the cause.

    As for the origin of the problem, I've got some suspicions. In the end, the SDKs didn't turn out to be the problem; everything is working with the most recent SDKs (modified according to the example). I'm very happy with that.

    I've found the following things to cause said behaviour:

    1. The mesh_stack_init_params being defined in a local context, so that they are destroyed when the function returns. This is the case when using an object-oriented approach and initializing the mesh stack in the constructor.
    2. Using the Idle Handler method instead of the Bearer Event Handler described in the example. I should note, however, that there are a lot of tasks running in our program. It's very plausible that the Idle Handler works fine with less demanding tasks.
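The lifetime problem in point 1 can be sketched roughly as follows. Everything here is a stand-in: `demo_init_params_t`, `demo_stack_init` and `MeshNode` are hypothetical names, not the real mesh_stack_init_params_t or mesh_stack_init(), and the sketch assumes that the stack keeps a pointer to the parameters rather than copying them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameter struct; NOT the SDK's mesh_stack_init_params_t.
struct demo_init_params_t
{
    std::uint8_t irq_priority;
    void (*models_init_cb)();
};

// Assumption for illustration: the stack retains the pointer, not a copy.
static const demo_init_params_t* g_retained_params = nullptr;

void demo_stack_init(const demo_init_params_t* p_params)
{
    g_retained_params = p_params;
}

void demo_models_init() {}

class MeshNode
{
public:
    MeshNode()
    {
        // Risky variant (avoid): a local struct declared here is destroyed
        // when the constructor returns, leaving the retained pointer dangling:
        //   demo_init_params_t params = { 6, demo_models_init };
        //   demo_stack_init(&params);

        // Safer variant: the parameters are a member, so they live as long
        // as the MeshNode object itself.
        m_params.irq_priority = 6;
        m_params.models_init_cb = demo_models_init;
        demo_stack_init(&m_params);
    }

    const demo_init_params_t* params() const { return &m_params; }

private:
    demo_init_params_t m_params{};
};

// A function-local static has static storage duration, so the node (and
// its parameters) stays alive for the rest of the program.
MeshNode& node_instance()
{
    static MeshNode node;
    return node;
}
```

The design choice is simply to tie the lifetime of the parameters to something that outlives the stack's use of them, such as a member of a long-lived object or a static variable.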

    To be honest, I think writing the application in C++ on top of the SDK and the SDK for Mesh running on FreeRTOS didn't help either. Since that isn't really supported, a mistake or incompatibility can easily slip in...

    All in all I'm glad the problem is solved; it was quite a large one, as you can tell. Neither of the ways out (abandoning C++ or FreeRTOS) was an option for me, so it really had to work in the end.

    Anyway, thanks for your support and have a great day!

    Kind regards,

    Jochem
