High power usage with an OpenThread SED during the MLE phase

Hi.

Our company has a couple of custom-designed boards that use the nRF52840 and are powered by a battery. OpenThread is used for networking, and the devices are configured as sleepy end devices (SEDs). The average power consumption is acceptable. However, there are periods of high power usage during the mesh link establishment (MLE) procedure when the child device becomes detached from the parent. During this phase, I see a period of high current that lasts 3.4 seconds for each MLE attach attempt. The timing corresponds directly with the MLE attach attempts in the log (although I generally disable logging when doing power measurements).

I have assumed that these high-current periods are caused either by radio usage or by the nRF52 being prevented from entering sleep mode. The long periods of high current are problematic for one of our designs, which runs off a coin cell battery. Is it normal to see this behaviour for OpenThread during MLE?

I am currently using nRF Connect SDK version 2.4.0. I would like to upgrade to version 2.7.0, but I see there is a similar known issue related to MLE with an SED: "KRKNWK-19036: High power consumption after parent loss" https://github.com/openthread/openthread/issues/10302

To reproduce my problem:

  1. Connect a tool such as the PPK2 to measure the current draw from the SED.
  2. Set up a Thread network with one router and the SED being profiled.
  3. Power off the router and watch the current usage as the MLE attach attempts occur.

The gap between attach attempts starts at roughly 0.25 seconds between the first two attempts and keeps doubling until the time between attempts reaches 20 minutes.
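
As a rough illustration of that retry spacing, the sketch below generates the gaps between attempts. It only models the behaviour observed in the measurements (doubling from about 0.25 seconds up to a 20-minute cap); it is not OpenThread's actual backoff code, and the function name and default values are assumptions made for this example.

```python
def attach_retry_intervals(first_s=0.25, cap_s=20 * 60):
    """Return the gaps (in seconds) between attach attempts up to the cap.

    Assumed model: each gap is double the previous one until doubling
    would reach or exceed the cap, after which the gap stays at the cap.
    """
    intervals = []
    gap = first_s
    while gap < cap_s:
        intervals.append(gap)
        gap *= 2
    intervals.append(cap_s)  # steady-state gap once the cap is reached
    return intervals


intervals = attach_retry_intervals()
for index, gap in enumerate(intervals, start=1):
    print(f"gap {index}: {gap:g} s")
```

Comparing the printed gaps with the spacing of the current bursts in a PPK2 capture is a quick way to confirm that each burst really does line up with an attach attempt.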

Here is the project configuration file with the OpenThread settings (and other configuration items).

82252.prj.conf

  • Hi Shawn,

    Here are a few things you can try, in order of priority:

    1. The known issue you reference has a workaround in the ieee802154 driver. Are you able to reproduce the high power consumption when you apply the workaround?
    2. Have you reproduced the issue on an nRF52840 DK? If not, please try to.
      1. If you don't have an nRF52840 DK available, please let us know which firmware we should have on the nRF52840 DK to reproduce the power consumption issue. This can be a sample from NCS or a minimal project provided by you.
    3. Did you use the PPK2 to do the measurements? If you did, please also share the measurements with us.

    Best regards,

    Maria

  • Hi Maria,

    I was able to update my project to work with nRF Connect SDK version 2.6.1 and re-test the problem. The performance was worse, with the intervals of high power usage longer than they were in v2.4.0 of the SDK. After applying the workaround, the performance appears to be similar to that in v2.4.0.

    I have an nRF52840 DK but have not yet tried to reproduce the problem. I will work on it.

    I have a couple of PPK2 files with measurements from today. They are attached as a zip file. In both cases, the SED had already been commissioned. In file 1 (connected_then_router_removed.ppk2), the SED connects, and then I turn off the Thread router at about the 30-second mark. In file 2 (no_thread_router.ppk2), I turn on the SED without having the router powered up.

    In file 1, there is an interval where the current averages about 15 mA for 1.5 seconds before the connection succeeds. After I turn off the router, there are several MLE attach attempts where the current averages about 2 mA over 3.5 seconds. The first interval, where a few attach attempts overlap, is longer.

    In file 2, the situation is worse. There seem to be two different cases during the attach attempts, each with a different average current: about 15 mA in one case and about 2 mA in the other. In both cases the high current persists for about 3.5 seconds.
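
To put these numbers in battery terms, here is a back-of-the-envelope sketch that converts one 3.5-second burst into charge and into the average current it adds once the retry interval has settled at the 20 minutes mentioned in the original post. It treats each burst as a rectangle at the quoted average current and ignores the sleep baseline; the function names are made up for this example.

```python
def burst_charge_uah(current_ma, duration_s):
    """Charge drawn by one burst in microamp-hours.

    mA * s gives millicoulombs, and 1 uAh equals 3.6 mC.
    """
    return current_ma * duration_s / 3.6


def added_average_ua(current_ma, duration_s, period_s):
    """Average current (in microamps) added by one burst per retry period."""
    return current_ma * 1000.0 * duration_s / period_s


BURST_S = 3.5          # burst length seen in the PPK2 captures
RETRY_PERIOD_S = 1200  # 20-minute steady-state retry interval

for label, current_ma in (("2 mA case", 2.0), ("15 mA case", 15.0)):
    charge = burst_charge_uah(current_ma, BURST_S)
    average = added_average_ua(current_ma, BURST_S, RETRY_PERIOD_S)
    print(f"{label}: {charge:.2f} uAh per attempt, "
          f"{average:.2f} uA added on average")
```

On a coin cell the peak current during the burst may matter more than these averages, so the figures are only one part of the picture.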

    The screenshot below shows the display captured for the "file 2" case.

    Thanks,

    Shawn

    ppk2_files.zip

  • Hi Maria,

    I checked and the configuration option CONFIG_OPENTHREAD_DEFAULT_TX_POWER is set to 0 for both of my projects.

    I don't think the 3.5 seconds is related to the poll period of the MTD. I have the poll period set to 20 seconds for the code running on my board.

    I suspect there is a problem related to the CPU not going to sleep or the radio receiver being left on. In the screenshot below (a zoomed section of the CoAP client measurement), the average current is about 6 mA for most of the 3.5-second interval. 6 mA is roughly the current drawn by either the CPU running or the radio receiver running. The spikes are probably the transmitter turning on. If I had to guess, I would say the spikes in the last 1.4 seconds are multicast Discover messages being sent on each of the IEEE 802.15.4 channels.
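
One way to sanity-check that guess is to compare the spike spacing in the capture with the spacing expected if one message were sent per channel. The sketch below assumes all 16 channels of the 2.4 GHz band (11 through 26) are scanned and that the transmissions are evenly spread over the 1.4-second window; both are assumptions, and the function name is made up for this example.

```python
NUM_CHANNELS = 16  # IEEE 802.15.4 2.4 GHz band, channels 11 through 26


def expected_spike_spacing_ms(window_s, channels=NUM_CHANNELS):
    """Spacing between transmit spikes if one is sent per channel, evenly spread."""
    return window_s * 1000.0 / channels


spacing_ms = expected_spike_spacing_ms(1.4)
print(f"expected spacing: {spacing_ms:.1f} ms per channel")
```

If the measured spacing between spikes is far from this value, or the number of spikes is not 16, the per-channel explanation is probably wrong or the channel mask is narrower than assumed.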

  • Hi Shawn, 

    My apologies for the wait here. 

    I will run a few more tests tomorrow morning to compare coap_client from NCS 2.4.0, NCS 2.6.1 with the workaround, and NCS v2.7.0 where the fix for KRKNWK-19036 is implemented. 

    Best regards,

    Maria

  • Hi Shawn,

    Are you still experiencing this issue?

    If so, have you tried switching to v2.7.0 or cherry-picking the fix for KRKNWK-19036 in the downstream Zephyr repository (commit hash: 6c602a1bbd3b3f7811082bce391c6943663a2c64)?

    Best regards,
    Marte

  • Yes, I am still experiencing this issue. I am currently using NCS 2.6.2, which has the fix for KRKNWK-19036 according to the release notes. I believe my problem is different from that one, although the effect seems similar. Our team's product is still at an early stage, and we were able to redesign it to use a 1.6 V cell with a different chemistry and lower internal resistance. The problem therefore no longer has as much of an effect on operation, and there is no urgent need for a fix.

    Regards,

    Shawn

  • Hi Shawn,

    I ran some tests on my side, and I saw the same as you with v2.6.2 and v2.7.0, so I agree that it is not related to KRKNWK-19036. Otherwise, the current consumption would have been high the entire time, not just for 3.4 seconds.

    The current consumption corresponds well with the typical RX current, so my suspicion is that this is due to the radio being turned on for the reattach attempt.

    Is the redesign a satisfactory solution for you, or is this something you want us to investigate further? I am not sure if this is something we can do anything about, but I can check with the developers if this is still a problem for your design.

    Best regards,
    Marte
