NRF9151DK autonomous resets when simultaneously starting LTE connection and GPS on modem FW 2.0.2

I have updated a board (NRF9151DK) from modem FW 2.0.1 to 2.0.2. The board is also equipped with an nrf7002-EK, and support for the Wi-Fi driver is enabled (SB_CONFIG_WIFI_NRF70, SB_CONFIG_WIFI_NRF70_SCAN_ONLY; I haven't verified whether this makes a difference).

The previous version of the application works fine: we initialize the modem at startup, then simultaneously initiate an LTE connection and activate the internal GPS. With the new 2.0.2 firmware, this simultaneous action results in an autonomous reset, which can be avoided by, e.g., adding a 100 msec sleep before activating the internal GPS. The same problem occurs on other devkits too (observed on 2 out of 2).

I mainly want to notify you of this difference, because I won't be the only person affected by it. I leave it up to you to consider whether a fix is possible.

I can try out alternative solutions if necessary.

Best regards,

Sebastiaan

  • Hi

    After a brief look at your project, it seems like your GPS thread completely takes over from the main thread, so that the watchdog never gets fed in the first place. How exactly have you timed these system resets on your end?

    To confirm whether the feed in the main loop is ever reached, you can try replacing it with a GPIO toggle and checking whether the toggle ever happens. We assume it will be blocked while the GPS runs, so the watchdog is simply never fed. Changing the thread priorities in your application should fix that.

    Best regards,

    Simon

  • I don't agree with your conclusions. The GPS thread does not fully take over the CPU, since the internal loop of "app_gps_internal_get_fix" periodically blocks on a semaphore. Also, again, this whole program runs fine with modem fw 2.0.1 and the watchdog enabled.

    I have even changed the watchdog channel period to 20 seconds (20000 msec). In combination with modem fw 2.0.1, the watchdog kicks in at the expected time if it is not fed, and the callback (task_wdt_callback) is called as expected.

    With modem FW 2.0.2 this is not the case: the watchdog resets much sooner, and the callback is not called. Either of the following changes makes the application work with FW 2.0.2:

    • Delaying the GPS thread startup by, e.g., 1 second
    • Or changing the sleep in the main loop to something like 50 msec instead of the 1 second it used to be
      • 90 msec also works fine, 100 msec does not. By chance (or not?), this is equal to CONFIG_TASK_WDT_MIN_TIMEOUT.
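
    For anyone who wants to check whether the 100 msec boundary is tied to the task watchdog's hardware fallback, a prj.conf fragment along these lines could be tried. The symbols are Zephyr's task watchdog Kconfig options. The default values in the comments are from memory, and the raised delay is only an experiment, not a confirmed fix. According to the Kconfig help text, the hardware watchdog timeout is CONFIG_TASK_WDT_MIN_TIMEOUT plus CONFIG_TASK_WDT_HW_FALLBACK_DELAY.

    ```
    # prj.conf fragment (experiment only, not a confirmed fix)
    CONFIG_TASK_WDT=y
    CONFIG_TASK_WDT_HW_FALLBACK=y
    # Default is 100 ms as far as I know; left unchanged here.
    CONFIG_TASK_WDT_MIN_TIMEOUT=100
    # Default is 20 ms as far as I know; raised to give the hardware watchdog more margin.
    CONFIG_TASK_WDT_HW_FALLBACK_DELAY=200
    ```

    If the early resets disappear with the larger delay, that would suggest the hardware fallback watchdog, rather than a task watchdog channel, is what fires.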

    So please stop asking me to toggle GPIOs. The main loop is running and kicking, but the problem is that the watchdog decides to reset nevertheless, and it resets way before the configured task period of e.g. 5 seconds or 20 seconds. This behavior only occurs with modem fw 2.0.2, not with fw 2.0.1.
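
    The numbers reported above can be put into a small host-side model. This is only a sketch of one possible explanation, not a confirmed diagnosis. It assumes that the hardware fallback watchdog is enabled, that its timeout is CONFIG_TASK_WDT_MIN_TIMEOUT plus CONFIG_TASK_WDT_HW_FALLBACK_DELAY (as the Kconfig help text describes), that the defaults are 100 ms and 20 ms, and that the hardware watchdog is re-fed both on task_wdt_feed() and when the task watchdog's internal timer expires. The function names are made up for illustration and are not part of Zephyr.

    ```c
    #include <stdint.h>

    /* ASSUMED defaults; check the generated .config of your build. */
    #define TASK_WDT_MIN_TIMEOUT_MS       100u /* CONFIG_TASK_WDT_MIN_TIMEOUT */
    #define TASK_WDT_HW_FALLBACK_DELAY_MS  20u /* CONFIG_TASK_WDT_HW_FALLBACK_DELAY */

    /* Hardware fallback timeout: minimum timeout plus the fallback delay. */
    uint32_t hw_fallback_timeout_ms(void)
    {
        return TASK_WDT_MIN_TIMEOUT_MS + TASK_WDT_HW_FALLBACK_DELAY_MS;
    }

    /*
     * Hypothetical helper: longest stall (feeds and timer handling both
     * delayed) that still lets the hardware watchdog be re-fed in time.
     * ASSUMPTION: the hardware watchdog is re-fed on every task_wdt_feed()
     * call and, at the latest, when the internal timer expires after the
     * minimum timeout.
     */
    uint32_t tolerated_stall_ms(uint32_t feed_interval_ms)
    {
        uint32_t refeed_ms = feed_interval_ms < TASK_WDT_MIN_TIMEOUT_MS
                                 ? feed_interval_ms
                                 : TASK_WDT_MIN_TIMEOUT_MS;

        return hw_fallback_timeout_ms() - refeed_ms;
    }
    ```

    Under these assumptions, a main-loop sleep of 1 second or 100 msec leaves 20 ms of margin, 90 msec leaves 30 ms, and 50 msec leaves 70 ms. A stall of between 20 and 30 ms during modem startup would then match the observed behavior, including the missing callback, since a hardware watchdog reset would not be expected to go through the task watchdog callback. Whether such a stall actually occurs with modem FW 2.0.2 is not established here.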

  • I'm just another user.

    AFAIK the watchdog is located on the app CPU, not on the modem. If changing the modem firmware triggers that watchdog on the app CPU, in my experience it is still an issue in the app.

    What changing the mfw may cause is slightly different timing, which may then block unexpectedly. Your countermeasures point to something like that.

    Anyway, it's your time and so your decision.

  • I agree that it's likely an app issue, from what is described here. But I'm not new to watchdogs, not new to Zephyr, not new to threads, ... and I can't figure out why the watchdog fires in this case. That's why I raised the ticket in the first place. I've provided a stripped-down reproduction sample. If anyone is able to point out where the mistake is (in the app), I'm happy to learn from that. But so far, my feeling is that there's a deeper issue.
