nRF7002 cannot connect to SSID after a while

Hi,

As the title states, the nRF7002 fails to connect to the SSID after a while. It happens whether or not we perform net_if_up/down. I suspect the nRF7002 is in a bad state, because when I measure the consumption with the PPK2 I see around 750 µA, instead of around 80 µA when my system is idle, or ~150 µA when the interface is up but idle.

During a connect attempt, I see spikes of several mA, but nowhere near the ~200 mA spikes and ~60 mA floor I am supposed to see when connecting/transmitting.

I am sorry I did not capture a trace, as I wasn't expecting to post the issue here at first. As it happened after several days of operation, I have not managed to attach a debug session to it yet.

Maybe you have some insight on what is going on, or what could be the issue.

Thank you for your help

Parents
  • Hi Benoit

    Håkon is out of office, so I have taken over this case while he's away. Have you also set the RPU recovery propagation delay, which seems to be required when doing an RPU recovery? See https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/drivers/wifi/nrf70_native.html#kconfig_configuration

    If not, this strange behavior could be the device being caught in a reset loop due to the RPU recovery, because the app/stack hasn't had enough time to clean up the resources used before rebooting.

    Are you able to see this issue when running one of our sample projects on your custom board? Alternatively, can you see the same issue when using an nRF7002 DK with your custom firmware?

    I think a good first step is to narrow down whether this is a SW or HW issue.

    Best regards,

    Simon

  • Hi,

    Sorry for the delay, I was out of office. I could tweak CONFIG_NRF_WIFI_RPU_RECOVERY_PROPAGATION_DELAY_MS, but by how much? What are typical values, and which value do you suggest?

    Best regards

  • Sorry for the late reply, Benoit. Simon is away and I will try to assist you as best I can.

    You can start by setting CONFIG_NRF_WIFI_RPU_RECOVERY_PROPAGATION_DELAY_MS=2000. This gives the app/stack enough time to release everything cleanly before the RPU tries to come back up. Without this delay, it might try to restart while Zephyr is still cleaning up the previous context, and that can easily explain the weird current profile you’re seeing: kind of half-awake, not fully connecting.

    Also, I’d suggest running the wifi_station sample with and without your firmware on both the DK and your custom board. That might help narrow down whether it’s board-related or something deeper in the app logic or configs.

    Let me know what you see with the delay applied.
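
    A minimal prj.conf sketch of the setting discussed above. It assumes RPU recovery is enabled through CONFIG_NRF_WIFI_RPU_RECOVERY; check both symbols and their defaults against the Kconfig reference for your NCS version:

    ```conf
    # Assumption: RPU recovery is not already enabled by your board or sample defaults
    CONFIG_NRF_WIFI_RPU_RECOVERY=y
    # Starting value suggested in this thread, in milliseconds
    CONFIG_NRF_WIFI_RPU_RECOVERY_PROPAGATION_DELAY_MS=2000
    ```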

  • Hello, okay, I will try 2000 ms. I tried 200 ms but it doesn't fix the issue. However, I see the default value is 10 ms. If 2 s is suggested (x200!), would it be relevant to consider increasing this default value significantly?

    "Also, I’d suggest running the wifi_station sample with and without your firmware on both the DK and your custom board"

    It is not likely a custom board issue, as I am not the only one having this issue. Would it be possible for you to run your wifi_station sample on your DK as a long-duration test? As it is Nordic's solution, it would seem consistent for you to run this kind of test to check that everything works as expected. If it does, might it be that running a more complex firmware makes the issue arise?

    Thank you for your help

  • By digging a bit more, I wanted to see if I could use the command from the ticket "nRF7002 randomly unable to connect to Wifi" (Nordic DevZone) to recover the RPU, but I see that the function looks like the following:

    1) What is the use of the mutex if it is used only in this function and not in any sample?  

    2) Also :

    If the stack uses the interface only every 20 minutes, for example, how can it know whether it has to free resources in the meantime? I don't see any propagation mechanism in this function, nor in any piece of code guarded by CONFIG_NRF_WIFI_RPU_RECOVERY that would propagate this information.

    3) How can I call nrf_wifi_util_trigger_rpu_recovery or similar directly from the code, without having to enable and use a shell?

    Thank you for your clarifications

    [EDIT] I updated to NCS 3.0.2 and I see that the semaphore is now used and the default propagation delay is 2000 ms. However, in the ticket that is similar to mine, this does not fix the problem, which means I am back to square one. The only lead I have now is to trigger an RPU recovery the way the nrf70 util rpu_recovery_test command does, but without using the shell (we can't afford it in terms of flash, and we can't interact with the device once it is in the field). Please give instructions.
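
    For the shell-free trigger asked about in 3), the decision of when to recover can at least be sketched independently of the driver. The following is a minimal, hypothetical sketch in plain C: the connect_watchdog names, the threshold, and the count_recovery placeholder hook are all assumptions made for illustration and are not part of the nRF70 driver. Which driver function the hook should call is not confirmed anywhere in this thread.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical hook type: the application plugs in whatever recovery
     * action it has available (not a driver API). */
    typedef void (*recovery_hook_t)(void *user_data);

    struct connect_watchdog {
        uint32_t consecutive_failures; /* failed attempts since last success */
        uint32_t failure_threshold;    /* failures tolerated before recovery */
        uint32_t recoveries_triggered; /* number of times the hook has run */
        recovery_hook_t hook;
        void *user_data;
    };

    void connect_watchdog_init(struct connect_watchdog *wd, uint32_t threshold,
                               recovery_hook_t hook, void *user_data)
    {
        assert(wd != NULL && hook != NULL && threshold > 0);
        wd->consecutive_failures = 0;
        wd->failure_threshold = threshold;
        wd->recoveries_triggered = 0;
        wd->hook = hook;
        wd->user_data = user_data;
    }

    /* Report the result of one connect attempt.
     * Returns true if this report caused the recovery hook to run. */
    bool connect_watchdog_report(struct connect_watchdog *wd, bool connected)
    {
        assert(wd != NULL);
        if (connected) {
            wd->consecutive_failures = 0;
            return false;
        }
        wd->consecutive_failures++;
        if (wd->consecutive_failures < wd->failure_threshold) {
            return false;
        }
        /* Threshold reached: restart the count and run the recovery action. */
        wd->consecutive_failures = 0;
        wd->recoveries_triggered++;
        wd->hook(wd->user_data);
        return true;
    }

    /* Placeholder hook for illustration: it only counts invocations.
     * A real hook would call the recovery entry point of your NCS version. */
    void count_recovery(void *user_data)
    {
        (*(uint32_t *)user_data)++;
    }
    ```

    In a Zephyr application, connect_watchdog_report() would typically be called from the handler that receives the Wi-Fi connect result (for example NET_EVENT_WIFI_CONNECT_RESULT), so that no shell is needed to decide when recovery should run.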
