WiFi does not reconnect after net_if_down / net_if_up cycle

Description

I'm working on a battery-powered application where WiFi needs to be shut down between cloud sync intervals. I'm following the same pattern as the wifi/shutdown sample (calling net_if_down() to power off the nRF7002 and net_if_up() to bring it back). After wakeup, WiFi never reconnects: no supplicant association is logged and NET_EVENT_L4_CONNECTED never fires. I also see an "Unbalanced suspend" warning during shutdown and high current draw (~30–40 mA) while the interface is supposed to be off.

[00:00:59.511] pm_device: Unbalanced suspend
[00:00:59.512] wifi_supplicant: Network interface 1 down
[00:00:59.525] queva_connection_manager: Network disconnected

I'm also using the BLE WiFi provisioning service in the firmware.

Issue 1: WiFi never reconnects after net_if_up()

Root cause: After net_if_up(), the supplicant re-adds the interface via add_interface(), but this runs asynchronously on the supplicant's internal iface_wq work queue. My original code called NET_REQUEST_WIFI_CONNECT_STORED immediately after net_if_up(). It returned 0 but the connection never completed, because the supplicant wasn't ready yet and silently dropped the request.

Relying on L2_WIFI_CONNECTIVITY_AUTO_CONNECT alone doesn't help either. conn_mgr fires conn_mgr_if_connect on NET_EVENT_IF_ADMIN_UP, which also races with add_interface() and gets dropped for the same reason.
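
To make the ordering problem concrete, here is a minimal host-side toy model in plain C. It is not Zephyr code, and every model_* name is hypothetical. It assumes only the behavior described above: a connect request issued before add_interface() has run returns 0 but has no effect.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in Zephyr or the supplicant. */
struct model_supplicant {
    bool iface_added; /* true once the deferred add_interface() step has run */
    bool connecting;  /* true only if a connect request was actually acted on */
};

/* Models net_if_up(): the interface add is queued but has not run yet. */
static void model_if_up(struct model_supplicant *s)
{
    s->iface_added = false;
    s->connecting = false;
}

/* Models the supplicant work queue getting around to add_interface(). */
static void model_run_add_interface(struct model_supplicant *s)
{
    s->iface_added = true;
}

/*
 * Models the stored-credentials connect request. It reports success (0)
 * in both cases, but only takes effect once the interface has been added.
 */
static int model_connect_stored(struct model_supplicant *s)
{
    if (s->iface_added) {
        s->connecting = true;
    }
    return 0;
}
```

In this model, calling model_connect_stored() right after model_if_up() returns 0 and leaves connecting false, while the same call after model_run_add_interface() starts the connection. That is the same symptom I saw on hardware: a successful return code and no association.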

Fix: Enable CONFIG_WIFI_READY_LIB=y and register a wifi_ready_callback. When the callback fires with wifi_ready=true (NET_EVENT_SUPPLICANT_READY received, meaning add_interface() has fully completed), submit a work item that calls NET_REQUEST_WIFI_CONNECT_STORED. This is the correct point to trigger reconnection.

/* Work handler: request a connection using the stored credentials. */
static void wifi_reconnect_work_fn(struct k_work *work)
{
    struct net_if *iface = net_if_get_first_wifi();
    int err = net_mgmt(NET_REQUEST_WIFI_CONNECT_STORED, iface, NULL, 0);

    /* -EALREADY is not treated as a failure. */
    if (err && err != -EALREADY) {
        LOG_WRN("Wi-Fi connect stored failed: %d", err);
    }
}
static K_WORK_DEFINE(wifi_reconnect_work, wifi_reconnect_work_fn);

/* Wi-Fi ready callback: defer the connect request to a work item. */
static void wifi_ready_cb(bool wifi_ready)
{
    if (wifi_ready) {
        app_work_submit(&wifi_reconnect_work);
    }
}

After this fix, the reconnection sequence looks like:

[00:08:14.861] Bringing WiFi back up
[00:08:15.114] wifi_supplicant: Network interface 1 up
[00:08:15.183] Wi-Fi supplicant ready          ← wifi_ready_cb fires here
[00:08:20.320] Network connected               ← L4 up ~5s later
[00:08:23.321] Connecting to Azure IoT Hub
[00:08:27.407] AZURE_IOT_HUB_EVT_READY         ← connected ~7s after L4 up (~12.5s after net_if_up)

Issue 2: pm_device Unbalanced suspend warning

Every call to net_if_down() produces this warning. After investigation I traced it to a trailing pm_device_runtime_put_async() call inside finalize_spi_transaction() in spi_nrfx_spim.c, which fires from the DMA completion ISR after net_if_down() has already driven the SPI PM usage count to zero. The PM runtime returns -EALREADY and takes no action.
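
The warning path can be pictured with a small host-side model of a runtime-PM usage counter. This is a simplified sketch, not the Zephyr implementation, and the model_* names are hypothetical. The only behavior it assumes is the one described above: a put on a count that is already zero logs the warning, returns -EALREADY, and changes nothing.

```c
#include <errno.h>
#include <stdio.h>

/* Toy model of a runtime-PM usage counter; not the Zephyr data structure. */
struct model_pm_dev {
    unsigned int usage;    /* number of outstanding "get" references */
    unsigned int warnings; /* how many unbalanced puts were seen */
};

/* Take a reference, as a driver would before using the bus. */
static void model_get(struct model_pm_dev *d)
{
    d->usage++;
}

/*
 * Drop a reference. A put with no matching get is reported and ignored,
 * so the count never underflows.
 */
static int model_put(struct model_pm_dev *d)
{
    if (d->usage == 0U) {
        d->warnings++;
        fprintf(stderr, "Unbalanced suspend\n");
        return -EALREADY;
    }
    d->usage--;
    return 0;
}
```

With one model_get() followed by two model_put() calls, the first put brings the count to zero and the second is the "trailing" one: it logs the warning and returns -EALREADY without touching the count. If that matches what the SPI driver is doing, the warning would be noise and not a state corruption, but I'd like that confirmed.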

I tried removing zephyr,pm-device-runtime-auto from both &spi4 and the nrf7002@0 child node in the board DTS (based on the suggestion that the nRF Wi-Fi driver manages the SPI bus power state internally), but the warning persists. This suggests it originates in the driver code path itself and not in my PM configuration.

Remaining questions

  1. Is net_if_down() / net_if_up() the confirmed correct API for periodic WiFi power cycling on nRF7002, or is there a preferred alternative?
  2. Is NET_EVENT_SUPPLICANT_READY (via wifi_ready_callback) the right signal to wait for before issuing NET_REQUEST_WIFI_CONNECT_STORED after net_if_up()? Or should we be waiting for a different event?
  3. Can you confirm whether zephyr,pm-device-runtime-auto should or should not be set on the SPI node used by nRF7002? And is the Unbalanced suspend warning on net_if_down() expected/known behavior?