nRF9160 goes to sleep and becomes stuck in idle task

Hello,

I am using the Nordic SDK v1.5.0 to create an application that runs on the Nordic nRF9160.

In the typical use case, my application is mains powered and is contacted once per day by our remote TCP client for a readout of sensor data.

However, I noticed that when left unattended for about 1.5 to 2 hours, it appears to go into some sort of power-saving (or sleep?) mode, enters the Zephyr idle task and remains there.

It is also not reachable by our remote client.

I have disabled all power-saving options for the modem part, so I do not understand why or how this could be occurring. As far as I can tell, there are no power-related registers on the user CPU of the nRF9160 that can be configured to prevent this sleep mode.

So my question is: how can I prevent the application from going into power-saving (or sleep) mode?

Thanks in advance,

 Nelson Gonçalves

  • Hello Didrik,

    I did all the steps in order to prepare for collecting modem traces described in the link you gave.

    For info, this is my setup:

    - the hardware is the nRF9160 dev board, PCA10090 v1.0.0

    - the modem hardware revision is: nRF9160 SICA B1A

    - the modem firmware version is: mfw_nrf9160_1.3.0

    - I am using the Nordic SDK v1.5.0

    - the board controller nRF52 chip was flashed to the latest version as indicated in the setup guide for collecting traces

    - I am using the iBasis SIM card

    I flashed the application and started the trace collector. Because the IP address is in a private network, I cannot connect to the TCP listening server from my PC. So I just left the device idle, listening for incoming connections.

    About 50 minutes after startup, the application is constantly notified that there is a new client connection, and when it tries to accept it, the call fails with error (9) "Bad Descriptor".

    I am attaching the output of my application, the traces collected from the modem and the application code running on the board. It is almost identical to the one I posted originally, but with some debug information.

    In these traces there is no client connecting to the application, which is not what you originally had asked for. But for the moment I cannot do it with the dev board, and there is the weird behaviour at 50 mins after startup. I noticed this happening twice, so I believe it can be relevant for the issue.

    Best regards,

    Nelson Gonçalves

    modem_and_app_traces.zip

    3652.tcp_listen.zip

  • I've taken a look at your trace, and I believe I have an idea about what happens. That said, there are some details I am not quite sure of, so I'll ask our modem team to have a look at it as well.

    While we wait for their analysis, here is what I believe is happening:

    After the device has run through the initial startup, there isn't much happening for the first ~50 minutes. Then, the device tries to perform a tracking area update. However, the network rejects it, thereby disconnecting the device from the network. The reason given by the network for the rejection is "UE identity cannot be derived by the network (9)".

    As the device is now not connected to the network, the modem closes the socket. This is why you get a POLLHUP.

    The POLLIN looks like a bug to me. I can't see any signs of an incoming connection in the trace. There have been several improvements to the modem library since NCS 1.5.0. Could you try to upgrade to 1.7.0 and see if you still get the POLLINs?

    Edit: Also, looking at your application a bit more, you never check for POLLHUP or POLLERR in the modem socket, so when the socket is closed, the application keeps polling it. This would explain why you get a "bad file descriptor" when you try to read the non-existing client socket.
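The check described above could look roughly like the following. This is only a minimal sketch using the standard POSIX `poll()` flags; the enum and the function name `classify_revents` are hypothetical and not taken from the attached application. It treats any of `POLLHUP`, `POLLERR` or `POLLNVAL` as "the listening socket is gone", even if `POLLIN` is set at the same time.

```c
#include <poll.h>

/* Hypothetical names, for illustration only. */
enum listener_action {
    LISTENER_WAIT,   /* nothing pending, poll again */
    LISTENER_ACCEPT, /* a client is waiting, call accept() */
    LISTENER_REOPEN  /* socket is unusable, close and recreate it */
};

/* Decide what to do with the listening socket based on the revents
 * field that poll() filled in for it. */
enum listener_action classify_revents(short revents)
{
    /* Error or hang-up conditions take priority over POLLIN, so a
     * closed socket is never passed to accept(). */
    if (revents & (POLLHUP | POLLERR | POLLNVAL)) {
        return LISTENER_REOPEN;
    }
    if (revents & POLLIN) {
        return LISTENER_ACCEPT;
    }
    return LISTENER_WAIT;
}
```

In the polling loop, the application would call `accept()` only when this returns `LISTENER_ACCEPT`, and would stop polling the old descriptor when it returns `LISTENER_REOPEN`.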

  • Hello Didrik,

    Thanks for the update. Is there a way to prevent the network from disconnecting the device, or at least to make it reconnect automatically? Is re-opening the listening socket sufficient to cause the device to re-register on the network?

    I had tried to use v1.7.0 on my actual project (not this demo application) but I could not build my project, so I stayed on v1.5.0. I will use v1.7.0 for this demo app and also fix the handling of the socket events. I will then report back on what happened.

  • Didrik,

    A follow-up question. In LTE networks, 58 min is a typical value for the Tracking Area Update (TAU) timer (T3412). That aligns neatly with the approximately 50 minutes of inactivity.

    So if the TAU is rejected with code 9, the UE should automatically initiate the attach procedure (see 3GPP TS 24.301). Is that attach failing as well? Why?

  • Here's what the modem team says happens on the LTE network:

    Attach was done to network 206 20, but the cell was lost later. The UE searched for the same network but it could no longer be found. Instead, the modem found network 206 10. Because the network changed and the TAI was not in the TAI list, the modem had to initiate a TAU, which was then rejected with cause #9. This is normal: the second network did not recognize the UE due to a lack of inter-working between the two networks.

    So I was a bit wrong in my interpretation.

    NelsonGoncalves said:
    Is there a way to prevent the network from disconnecting the device, or at least auto-reconnecting ?

    You can't prevent it from disconnecting (as it shouldn't disconnect in the first place). But the modem will automatically try to reconnect. In fact, in the trace you provided, it does reconnect to the network. However, as the TCP socket is already closed by the modem, the application doesn't recover.

    Closing the socket you had, and opening a new socket (after reconnecting to the network) should be sufficient for recovering.

    I am still waiting for their analysis of why the POLLIN flag was set.
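The close-and-reopen recovery described above could be sketched as follows. This assumes POSIX-style socket names (as on a desktop system, or in Zephyr when POSIX socket names are enabled); the helper `reopen_listener` is hypothetical, and waiting for the modem to re-register on the network before calling it is left out of this sketch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close the stale listening socket (if any) and
 * return a fresh TCP listening socket on the given port, or -1 on
 * failure. */
int reopen_listener(int old_fd, uint16_t port)
{
    struct sockaddr_in addr;
    int fd;

    /* Release the descriptor that the modem has already closed. */
    if (old_fd >= 0) {
        close(old_fd);
    }

    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

If the call fails because the device has not yet reconnected, the application could retry after a delay rather than giving up.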
