Buggy behaviour of GNSS fix intervals

Hello,

I am testing four Thingy:91s with different setups. They all get periodic fixes, some with and some without A-GPS. I use the nrf_modem_gnss.h library to set the interval, similarly to the GNSS sample.

My system polls for a semaphore that is set in case NRF_MODEM_GNSS_EVT_FIX, also similar to the sample, then establishes an MQTT connection, sends data, and closes the connection.

These units have all been running for over a month.

I've had a weird bug show up on three of them now. It happened almost immediately on the first, after a week or so on the second, and after over a month on the third:

They suddenly deviate from the configured interval and keep sending data at irregular intervals, mostly 10-15 minutes apart, sometimes every few seconds. The following is a snippet from the log of what I received. This particular unit was set up with an 18-hour interval:

devName,lat,lon,acc,cnt,met,time,bat
Thingy4,59.682525,10.662728,31,11,GNSS,2022-07-19 15:34:48.710978,3782
Thingy4,0.0,0.0,0,5,GNSS,2022-07-19 15:48:12.262688,3782
Thingy4,59.682517,10.662756,9,19,GNSS,2022-07-19 15:48:31.514295,3782
Thingy4,59.682517,10.662756,9,0,GNSS,2022-07-19 15:48:34.791434,3782
Thingy4,59.682734,10.662883,31,20,GNSS,2022-07-19 16:01:57.522594,3782
Thingy4,59.682413,10.663013,17,24,GNSS,2022-07-19 16:15:31.654655,3790
Thingy4,59.68258,10.662874,11,6,GNSS,2022-07-19 16:28:43.616715,3786
Thingy4,59.682599,10.663027,39,7,GNSS,2022-07-19 16:41:44.757302,3782
Thingy4,0.0,0.0,0,5,GNSS,2022-07-19 16:55:11.643160,3782
Thingy4,59.682506,10.663038,7,17,GNSS,2022-07-19 16:55:30.075524,3782
Thingy4,59.682506,10.663038,7,0,GNSS,2022-07-19 16:55:33.351879,3786
Thingy4,59.682508,10.66305,25,23,GNSS,2022-07-19 17:09:00.687285,3782
Thingy4,59.682404,10.662721,17,24,GNSS,2022-07-19 17:22:31.606709,3782
Thingy4,59.682389,10.663,17,5,GNSS,2022-07-19 17:35:42.687377,3786
Thingy4,59.682486,10.662949,8,19,GNSS,2022-07-19 17:36:01.911524,3790
Thingy4,59.683743,10.686157,1775,0,Cell,2022-07-19 17:36:05.291131,3782
Thingy4,59.682545,10.663071,21,4,GNSS,2022-07-19 17:49:11.531752,3786
Thingy4,59.682452,10.662947,7,19,GNSS,2022-07-19 17:49:30.783164,3786
Thingy4,59.683743,10.686157,1775,0,Cell,2022-07-19 17:49:34.059986,3782

Where "cnt" is roughly the number of seconds it spent searching for a fix (cnt++ in case NRF_MODEM_GNSS_EVT_PVT), and "met" is the method used (gnss or cell, which they fall back to if no fix is produced).
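To put numbers on the irregularity, the gap between consecutive reports can be computed from the time column. This is a minimal sketch, assuming both reports fall on the same day and are in chronological order (as in the snippet above); the function names are made up for illustration and are not part of the firmware:

```c
#include <stdio.h>

/* Parse "YYYY-MM-DD HH:MM:SS.ffffff" and return seconds since midnight,
 * or -1 if the string does not match. The date and the fractional
 * seconds are ignored, so this only works for same-day reports. */
long seconds_of_day(const char *ts)
{
    int year, month, day, hour, minute, second;

    if (sscanf(ts, "%d-%d-%d %d:%d:%d",
               &year, &month, &day, &hour, &minute, &second) != 6) {
        return -1;
    }
    return hour * 3600L + minute * 60L + second;
}

/* Gap in whole seconds between two same-day reports given in
 * chronological order, or -1 if either timestamp fails to parse. */
long report_gap_s(const char *earlier, const char *later)
{
    long a = seconds_of_day(earlier);
    long b = seconds_of_day(later);

    if (a < 0 || b < 0) {
        return -1;
    }
    return b - a;
}
```

For the first two rows of the log this gives 804 seconds (a little over 13 minutes), against a configured interval of 18 hours (64800 seconds).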

A few observations:
1. It sometimes sends empty data (lat, lon = 0). These always have a count of 4-6 seconds. I never clear the pvt data, so the modem actually fills "&pvt_data" with empty data through "nrf_modem_gnss_read" in case "NRF_MODEM_GNSS_EVT_FIX".
2. It sometimes sends the same data twice, meaning the pvt_data gets reloaded with the same data and the semaphore gets set again.
3. They sometimes produce a cell position immediately after a valid fix. The case NRF_MODEM_GNSS_EVT_SLEEP_AFTER_TIMEOUT submits a thread which starts the cell position request routine.

I cannot understand why the modem would produce empty or duplicate pvt_data, or why the NRF_MODEM_GNSS_EVT_SLEEP_AFTER_TIMEOUT case is called after producing a valid GNSS fix. If I reset the devices, they work fine again.
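Observations 1 and 2 could at least be filtered on the application side before publishing. The following is a minimal sketch of such a guard, not the code running on the units. The struct is a stand-in for the relevant fields of struct nrf_modem_gnss_pvt_data_frame, PVT_FLAG_FIX_VALID is assumed to mirror NRF_MODEM_GNSS_PVT_FLAG_FIX_VALID, and fix_time_s is a hypothetical simplification of the frame's date and time fields:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror NRF_MODEM_GNSS_PVT_FLAG_FIX_VALID from nrf_modem_gnss.h. */
#define PVT_FLAG_FIX_VALID 0x01

/* Stand-in for the fields of interest in the modem's PVT frame. */
struct pvt_sample {
    uint8_t flags;        /* validity flags reported with the frame */
    double latitude;
    double longitude;
    uint32_t fix_time_s;  /* hypothetical: time of fix in seconds */
};

/* Decide whether a freshly read sample should be published.
 * last_sent may be NULL if nothing has been published yet. */
bool pvt_should_publish(const struct pvt_sample *now,
                        const struct pvt_sample *last_sent)
{
    /* Observation 1: reject frames without a valid fix or with 0,0 coordinates. */
    if ((now->flags & PVT_FLAG_FIX_VALID) == 0) {
        return false;
    }
    if (now->latitude == 0.0 && now->longitude == 0.0) {
        return false;
    }

    /* Observation 2: reject an exact repeat of the last published sample. */
    if (last_sent != NULL &&
        now->fix_time_s == last_sent->fix_time_s &&
        now->latitude == last_sent->latitude &&
        now->longitude == last_sent->longitude) {
        return false;
    }
    return true;
}
```

A guard like this only hides the symptoms; it does not explain why the events arrive in the first place.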

To me, it seems like a modem bug.

I've solved the issue by creating a new looping thread with a sleep time equal to the interval, which activates the modem, requests a single fix, and turns off the GNSS module after a fix is produced. This also enables me to have intervals greater than 18h20m. Due to this, I no longer have the exact same source code as the units are running. This solution works fine, but I thought I should make you aware of this modem behavior nevertheless.

I cannot produce a modem trace of it, as it can take weeks or months to happen.
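The idea behind the workaround can be sketched as a small state machine in which the application, not the modem, owns the interval. This is an illustrative sketch under assumptions, not the actual source: all names below are hypothetical, and in firmware the actions would presumably map to nrf_modem_gnss_fix_interval_set(0) followed by nrf_modem_gnss_start() for a single fix, and nrf_modem_gnss_stop() afterwards, with the sleep done by the looping thread.

```c
/* Hypothetical application-side states, events, and actions. */
enum gnss_state { ST_SLEEPING, ST_SEARCHING };
enum gnss_event { EV_INTERVAL_ELAPSED, EV_FIX, EV_TIMEOUT };
enum gnss_action {
    ACT_NONE,                   /* ignore the event */
    ACT_START_SINGLE_FIX,       /* activate GNSS and request one fix */
    ACT_STOP_AND_PUBLISH,       /* stop GNSS, send the fix */
    ACT_STOP_AND_CELL_FALLBACK  /* stop GNSS, request a cell position */
};

struct gnss_step {
    enum gnss_state next;
    enum gnss_action action;
};

/* Pure transition function: given the current state and an event,
 * return the next state and the action the thread should perform. */
struct gnss_step gnss_step(enum gnss_state state, enum gnss_event event)
{
    struct gnss_step out = { state, ACT_NONE };

    if (state == ST_SLEEPING && event == EV_INTERVAL_ELAPSED) {
        out.next = ST_SEARCHING;
        out.action = ACT_START_SINGLE_FIX;
    } else if (state == ST_SEARCHING && event == EV_FIX) {
        out.next = ST_SLEEPING;
        out.action = ACT_STOP_AND_PUBLISH;
    } else if (state == ST_SEARCHING && event == EV_TIMEOUT) {
        out.next = ST_SLEEPING;
        out.action = ACT_STOP_AND_CELL_FALLBACK;
    }
    /* Any other combination (for example a fix or timeout event while
     * sleeping) leaves the state unchanged and triggers nothing. */
    return out;
}
```

Because fix and timeout events are ignored outside the searching state, a stray timeout arriving right after a valid fix would no longer trigger a cell position request.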

If you have any ideas as to why this behavior occurs, or how to fix it, I am curious and interested.

Thank you all!

  • Hi Torje, 

    Thanks for reporting this potential bug, and good to hear you already have a solution for it. Here are some of my thoughts.

    1) How long does this behavior last? Or does it keep behaving like this once it starts? I am not sure if it is related to NRF_MODEM_GNSS_EVT_REF_ALT_EXPIRED. GNSS sends the event NRF_MODEM_GNSS_EVT_REF_ALT_EXPIRED when the reference altitude expires. This event can be used to trigger a reference altitude update whenever it is needed.

    2) It sounds like it could also be related to the device hardware, since three out of four show this behaviour, but I hope the fourth one was just lucky not to show it.

    3) To further confirm whether it is modem related, I would like to test with the simplest nRF9160: GNSS sample with an 18h20m interval (CONFIG_GNSS_SAMPLE_PERIODIC_INTERVAL=66000) in Periodic mode.

    Best regards,

    Charlie

  • Hello Charlie,

    And thank you for answering.

    1) This behavior has lasted both permanently until a power cycle, and temporarily for just 15 minutes or so. I am not handling the NRF_MODEM_GNSS_EVT_REF_ALT_EXPIRED case in any way. I also have no good theory as to why it sometimes seems to stick, and other times just resolves itself.

    2) I would like to emphasize that the bug occurred after a long time; on one of the units it was around one month of uptime before it happened. I'm guessing it just hasn't happened _yet_ on the last one.

    3) I would also like to try this, but I can't use one of my four test devices for this sole purpose as it would halt development and testing.

    For info, I am using the latest modem firmware on all devices.

    Let me know if there's any more information I can assist you with in reporting this bug.
