OpenThread SED detection of parent

Hi,

I am trying to come up with a different approach to SED reattachment, using an MLE session with back-off. Is there a way for the SED to keep polling even when it is in the detached state? As far as I know, each poll gets an ACK or a NAK from the parent. So when the SED gets an ACK after a period of NAKs, that suggests the parent may have come back, and the SED can start an MLE session then. But as far as I can see, the SED stops polling once it has detached from the Thread network.

Cheers,

Kaushalya

  • Also, I can see current pulses from the SED like this, even when it is detached. Could you explain what they are? None of them seem to increase the MAC counters, so I think none of these are actual radio tx/rx sessions(?).

    The pulses are like this in detail.

    Also there is another pulse here

    Which looks like this in detail

  • Hi Kaushalya,

    Is there a way for the SED to keep on the polling even when in detached state?

    Polling in a detached state does not make sense, as the SED would not have a parent to poll. When a SED detaches from its parent it will send parent requests until a potential parent responds, then it performs a parent selection and attaches to the best parent. 

    If you implement a different way of reattaching your SED, it seems from your description that you will no longer be following the specification.

    kaushalyasat said:
    Also, I can see current pulses from the SED like this, even when it is detached. Could you explain what they are? None of them seem to increase the MAC counters, so I think none of these are actual radio tx/rx sessions(?).

    I must admit that these pulses are unfamiliar to me. The current level could align with the radio being on, though the actual value is lower than the expected typical value from this table. The SED periodically turning on the radio and not sending parent requests in a detached state is also not expected behaviour. 

    Since you are mentioning the lack of increase for MAC counters during these spikes, I assume that you are already sniffing and analyzing the packets? 

    Are you able to get device logs from the SED? The current consumption will of course increase by enabling logging, but it could help to explain what the spikes are. 

    The 8s interval spikes are strange, and I don't have a first step to begin to investigate those at this point. Sorry about that.

    Please also let me know if you are working with a DK or a custom board. 

    Best regards,

    Maria

  • Hi Maria,

    I have now given up on this MAC-level parent detection mechanism.

    Instead, I am trying to keep sensors that have attached and are in the child state in that state for longer while the host is powered down. The sensor data and polls will get NAKs as transmissions fail. The idea is that when the host powers up again in an hour or two, it will retain the same network credentials, and the sensors that are still in the child state can resume data tx as if nothing had happened, without any MLE session.

    I have tried

    1. Enabling child supervision and setting CONFIG_OPENTHREAD_CHILD_SUPERVISION_CHECK_TIMEOUT=3000 and CONFIG_OPENTHREAD_MLE_CHILD_TIMEOUT=3000.
      This didn't work, as MeshForwarder detects the tx failure and seemingly detaches the child immediately. I was sending a confirmable packet, but the NAK only reached the application layer much later. The detection seems very aggressive: it looks like a single tx failure causes a detach. This is what I see on the console:
    [00:00:41.370,300] <inf> [N] Mle-----------: Attach attempt 2, AnyPartition
    [00:00:43.500,854] <inf> [N] Mle-----------: RLOC16 fffe -> dc01
    [00:00:43.500,976] <inf> [N] Mle-----------: Role detached -> child
    [00:00:43.508,239] <inf> coap_client_utils: ------- Sensor Start --------
    [00:00:43.508,239] <inf> coap_client_utils: Ext Addr: e2:59:51:2d:d1:ac:01:37
    [00:00:43.508,270] <inf> coap_client_utils: MAC: f4:ce:36:4b:d5:b9:c6:71
    [00:00:43.508,270] <inf> coap_client_utils: RLOC: dc01
    [00:00:44.069,854] <inf> coap_client: SHT4X: 22.78 Temp. [C] ; 39.90 RH [%]
    [00:00:44.069,885] <inf> coap_client: Temp: 22.78       RH: 39.90
    [00:00:44.074,615] <inf> coap_client_utils: ZS 0, RSSI -75, LQI 3, LQO 2, FW 1241 RET: 0, RLOC: dc01
    [00:01:00.036,407] <dbg> temp_nrf5_mpsl: temp_nrf5_mpsl_sample_fetch: sample: 87
    [00:01:00.036,437] <dbg> temp_nrf5_mpsl: temp_nrf5_mpsl_channel_get: Temperature:21,750000
    [00:01:05.037,567] <dbg> coap_client_utils: parent_monitor_work_handler: Parent status: RESPONSIVE (in CHILD role)
    [00:01:13.514,221] <inf> coap_client_utils: ---- Host ACK -----
    [00:01:13.514,251] <inf> coap_client_utils: Result : 0
    [00:01:13.514,282] <inf> coap_client_utils: ✓ CoAP transmission successful
    [00:01:35.037,689] <dbg> coap_client_utils: parent_monitor_work_handler: Parent status: RESPONSIVE (in CHILD role)
    [00:01:44.083,007] <inf> coap_client: SHT4X: 22.79 Temp. [C] ; 39.88 RH [%]
    [00:01:44.083,038] <inf> coap_client: Temp: 22.79       RH: 39.88
    [00:01:44.087,768] <inf> coap_client_utils: ZS 0, RSSI -75, LQI 3, LQO 2, FW 1241 RET: 0, RLOC: dc01
    [00:01:44.232,666] <inf> [N] MeshForwarder-: Failed to send IPv6 UDP msg, len:103, chksum:2033, ecn:no, to:0xdc00, sec:yes, error:NoAck, prio:normal
    [00:01:44.232,940] <inf> [N] MeshForwarder-:     src:[fdde:ad00:beef:0:dccd:79d4:8f11:f62d]:5683
    [00:01:44.233,184] <inf> [N] MeshForwarder-:     dst:[fdde:ad00:beef:0:8c62:93f7:ebb6:e2c4]:5683
    [00:01:45.612,365] <inf> [N] Mle-----------: Role child -> detached
    [00:01:45.612,579] <inf> [N] Mle-----------: RLOC16 dc01 -> fffe
    [00:01:46.076,538] <inf> coap_client: Paired->Not Connected
    [00:01:46.356,872] <inf> [N] Mle-----------: Attach attempt 1, AnyPartition
    

    How can I relax MLE detachment so that a single tx error does not trigger it?

    2. I have added the following to CMakeLists.txt:
    OPENTHREAD_CONFIG_FAILED_CHILD_TRANSMISSIONS=100
    
    # Also increase MAC retries
    OPENTHREAD_CONFIG_MAC_DEFAULT_MAX_FRAME_RETRIES_DIRECT=15
    
    # Disable parent search 
    OPENTHREAD_CONFIG_PARENT_SEARCH_ENABLE=0
    
    # Disable link request on timeout 
    OPENTHREAD_CONFIG_MLE_SEND_LINK_REQUEST_ON_ADV_TIMEOUT=0
    

    Nothing seems to help. The child still detaches after one tx failure.

    Please help.
    Cheers,
    Kaushalya

  • Hi Maria,

    I have tried changing OPENTHREAD_CONFIG_FAILED_CHILD_TRANSMISSIONS to 100 in [sdk path]\v2.9.0\modules\lib\openthread\src\core\config\misc.h. But my SED still seems to detach within one failed poll cycle.

    How can I keep the SED in the child state for longer in spite of transmission failures?

    Cheers,

    Kaushalya

  • Hi Kaushalya, 

    Sorry for the long delay in replying. Maria is away, and this thread fell through the cracks, so we did not notice it was waiting.

    I have started looking into the configuration files, and to my understanding the detach comes from OpenThread's own link failure logic and not from the Kconfig options you are trying to tune.

    In modules/lib/openthread/src/core/thread/mesh_forwarder.hpp, the child's link to its parent is dropped after kFailedRouterTransmissions consecutive NoAck results. That constant is hard-coded to 4 and is reached quickly when all your polls and data transmissions fail while the host is powered down. OPENTHREAD_CONFIG_FAILED_CHILD_TRANSMISSIONS only affects how a parent drops its children; it doesn't stop the child from detaching itself.

    These thresholds are hard-coded and built into the prebuilt Nordic OpenThread library, so there is not much you can do about them through configuration. To keep your SED attached for longer, you need to use the open-source OpenThread sources and do a pristine build of the whole of OpenThread, instead of using the Nordic prebuilt library:
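
    To make the counting behaviour concrete, here is a minimal sketch in plain C of a consecutive-NoAck counter. It is only an illustrative model and not the OpenThread source: the names parent_link_model, parent_link_on_tx_done and APP_FAILED_ROUTER_TRANSMISSIONS are hypothetical, and the default of 4 simply mirrors the value mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical override point; 4 mirrors the value quoted in this thread. */
#ifndef APP_FAILED_ROUTER_TRANSMISSIONS
#define APP_FAILED_ROUTER_TRANSMISSIONS 4
#endif

/* Illustrative model only, not OpenThread source code. */
struct parent_link_model {
    uint8_t consecutive_no_ack; /* failed transmissions since the last ACK */
};

/*
 * Record the result of one transmission to the parent.
 * Returns true when the model would drop the parent link.
 */
static bool parent_link_on_tx_done(struct parent_link_model *link, bool acked)
{
    if (acked) {
        link->consecutive_no_ack = 0; /* any ACK resets the run */
        return false;
    }
    if (link->consecutive_no_ack < UINT8_MAX) {
        link->consecutive_no_ack++; /* saturate instead of wrapping */
    }
    return link->consecutive_no_ack >= APP_FAILED_ROUTER_TRANSMISSIONS;
}
```

    In this model a single ACK resets the run, so only an unbroken series of failures reaches the threshold. Raising the threshold, or sending less often while the host is down, both delay the point at which the link is dropped.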

    • Rebuild OpenThread from source (not the prebuilt Nordic lib): set CONFIG_OPENTHREAD_NORDIC_LIBRARY=n in your app, then patch modules/lib/openthread/src/core/thread/mesh_forwarder.hpp to raise kFailedRouterTransmissions (e.g., 50 or 100) or wrap it in a macro you define. Example change: static constexpr uint8_t kFailedRouterTransmissions = 50;.
    • Keep your large CONFIG_OPENTHREAD_MLE_CHILD_TIMEOUT so the parent won’t drop the child while the host is down.
    • To avoid burning through the failure counter, back off polling and app traffic when you detect that the host is off (e.g., call otLinkSetPollPeriod() with a large interval, or pause transmissions). That avoids rapid NoAck accumulation.

    After patching, rebuild so that OpenThread is recompiled; then check build/zephyr/.config to confirm that CONFIG_OPENTHREAD_NORDIC_LIBRARY is off, and verify in the logs that detaches no longer occur after a single failed send.
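
    The back-off idea in the last bullet could be sketched as follows. This is a plain C helper under stated assumptions: backoff_poll_period_ms is a hypothetical name, the doubling policy is just one possible choice, and the base and maximum periods are values you would pick for your application.

```c
#include <stdint.h>

/*
 * Hypothetical helper: start from base_ms and double the poll period once
 * per observed failure, never exceeding max_ms. The result is a period in
 * milliseconds that the application could pass to otLinkSetPollPeriod().
 */
static uint32_t backoff_poll_period_ms(uint32_t base_ms, uint32_t max_ms,
                                       unsigned int failures)
{
    uint32_t period = base_ms;

    while (failures-- > 0 && period < max_ms) {
        /* Double, but clamp first so the multiplication cannot overflow. */
        period = (period > max_ms / 2) ? max_ms : period * 2;
    }
    return (period > max_ms) ? max_ms : period;
}
```

    For example, with a 1 s base and a 60 s cap, three failures give an 8 s period. The application would then call otLinkSetPollPeriod() with its OpenThread instance and the computed value, and restore the normal period once a transmission succeeds again.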

    Note:

  • Hi Susheel, many thanks for the detailed reply. In fact, I have moved on to other tasks and this issue is still sitting in the backlog, so I have to remind myself where I left off.

    I guess it is OK if I modify these files in my current SDK 2.9.0 as a temporary workaround?

    Is there a way to keep these mods across SDK updates within 2.9.0? That is, not moving to SDK 2.10.x, but taking updates to 2.9.0 itself.

    Cheers,

    Kaushalya

  • Yes Kaushalya, you can patch those files in your current NCS v2.9.0, but please make sure that you set CONFIG_OPENTHREAD_NORDIC_LIBRARY=n and do a pristine build of the whole OpenThread sources.

    Also note that a local edit in your SDK repo will be overwritten by a west update or similar commands, so you might have to maintain your own branch that branches off from NCS v2.9.0.
