This post is older than 2 years and might not be relevant anymore
More Info: Consider searching for newer posts

Non-Volatile Storage unexpected behaviour

Hi,

I've been working on Thingy:91 under ncs tag 1.2.0, trying to develop simple application using NVS just to see how it works. Based on the zephyr sample I wrote the following app:

#include <zephyr.h>
#include <power/reboot.h>
#include <device.h>
#include <string.h>
#include <drivers/flash.h>
#include <fs/nvs.h>

static struct nvs_fs fs;


#define ADDRESS_ID 1
#define STRING_ID 2

int rc = 0, cnt = 0, i=0;
char buf[16];
char buf_2[16];
char data[16];
char data_2[16];
struct flash_pages_info info;

void main(void)
{
	printk("Application started \n");

        fs.offset = DT_FLASH_AREA_STORAGE_OFFSET;
	rc = flash_get_page_info_by_offs(device_get_binding(DT_FLASH_DEV_NAME),
					 fs.offset, &info);
	if (rc) {
		printk("Unable to get page info");
	}

	fs.sector_size = info.size;
	fs.sector_count = 3U;

        printk("Initiating \n");
	rc = nvs_init(&fs, DT_FLASH_DEV_NAME);
	if (rc) {
		printk("Flash Init failed\n");
	}
        printk("Flash initiated \n");

        while (1) {

                cnt++;
                sprintf(buf, "L: %d " , cnt);
		nvs_write(&fs, ADDRESS_ID, &buf, strlen(buf)+1);

                sprintf(buf_2, "R: %d ", cnt);
                nvs_write(&fs, STRING_ID, &buf_2, strlen(buf_2)+1); 

                for (i=0; i<=cnt-1; i++){

                        rc = nvs_read_hist(&fs, ADDRESS_ID,
                                           &data,
                                           sizeof(data),
                                           i);

                        if (rc < 0) {
                                printk("Something went wrong \n");
                                break;
                        }

                        printk("%s ... ", data);
               }

               printk("\n");

               for (i=0; i<=cnt-1; i++){

                        rc = nvs_read_hist(&fs, STRING_ID,
                                           &data_2,
                                           sizeof(data_2),
                                           i);

                        if (rc < 0) {
                                printk("Something went wrong \n");
                                break;
                        }

                        printk("%s ... ", data_2);
               }

                k_sleep(K_SECONDS(1));
        }


}

The app worked well few times (meaning multiple uploads with different data written into memory, as well as multiple power off/ons), but at some point it just stopped working - the application went all the way to nvs_init function call and got stuck there forever (I was waiting for one hour and then I gave up). I tried to reupload it many times, reboot the device and none of that helped. Finally I changed the fs.sector_count from 3U to 2U and it started working again. Surprisingly after changing it back to 3U it also worked again with no problem.

I'm trying to understand that behaviour and I am not sure - why did that happen? I know that the flash memory on Thingy has a supposed lifetime of 10 000 erase cycles, but I'm pretty sure I did not exceed it (I guess if I reached the lifetime limit it would not work now - and it does). I read somewhere on the forum that memory erase may take a lot of time - but could it be one hour or more?

Why did the sectors number change made it work again?

I guess it's mostly about understanding the work of flash memory, and I don't have that knowledge here, so I will appreciate the explanation from someone who does. It all comes down to reliability - I don't want the app to hang in the future for the same reason that I do not know.

Regards

Maciek

Parents
  • Hi Maciek,

    I cannot say for sure, but I suspect you have seen the bug that is fixed by this patch.

    Einar

  • I changed those lines, but it didn't help. I've run to the issue again.

    I managed to find out, that the program hangs in function nvs_startup - by a printf debugging I noticed it goes all the way to:

    printf("File: %s, line: %d \n", __FILE__, __LINE__);
    fs->ate_wra = addr - ate_size;
    fs->data_wra = addr & ADDR_SECT_MASK;
    printf("File: %s, line: %d \n", __FILE__, __LINE__);

    (it's around line 603 in nvs.c). Only the first printf is printed here. Do you have any ideas why is that? And why changing the number of sectors is apparently the only thing that helps?

    Edit: sorry for the chaotic reporting, but is as chaotic as the behavior of my application...

    I've run the code again few times, and this time it gets stuck again, but in a function nvs_gc (in nvs.c file), exactly in this while loop:

    while (1) {
                printf("File: %s, line: %d \n", __FILE__, __LINE__);
                wlk_prev_addr = wlk_addr;
                rc = nvs_prev_ate(fs, &wlk_addr, &wlk_ate);
                printf("File: %s, line: %d \n", __FILE__, __LINE__);
                if (rc) {
                    printf("File: %s, line: %d \n", __FILE__, __LINE__);
                    return rc;
                }
                /* if ate with same id is reached we might need to copy.
                 * only consider valid wlk_ate's. Something wrong might
                 * have been written that has the same ate but is
                 * invalid, don't consider these as a match.
                 */
                if ((wlk_ate.id == gc_ate.id) &&
                    (!nvs_ate_crc8_check(&wlk_ate))) {
                    printf("File: %s, line: %d \n", __FILE__, __LINE__);
                    break;
                }
            }

    The output I get is (I added one printf to see the rc value:

    The rc is equal: 255 

    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499 
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495
    
    
    
    

    The rc is equal: 69616
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495
    
    
    
    

    The rc is equal: 537031599
    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499 
    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495 
    
    
    
    

    The rc is equal: 4024
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495 
    
    
    
    
    The rc is equal: 4032 
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495
    The rc is equal: 4040
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495
    
    
    
    

    The rc is equal: 4048
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495 
    
    
    
    

    The rc is equal: 4056
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495
    
    
    
    

    The rc is equal: 4064
    
    
    
    
    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 499 
    
    
    
    

    File: ZEPHYR_BASE/subsys/fs/nvs/nvs.c, line: 495

    and so on.




    
    
  • Hi,

    Running your code I see that the writes always succeed (for as long as I tested anyway), but after just a few minutes, nvs_read_hist() starts returning -2, which is -ENOENT:

    	if (((wlk_addr == fs->ate_wra) && (wlk_ate.id != id)) ||
    	    (wlk_ate.len == 0U) || (cnt_his < cnt)) {
    		return -ENOENT;
    	}

    I don't see any reason for this, though. Have you had any progress?

  • I can distinguish two behaviors that occur in my application.

    - everything works well

    - the application gets stuck on nvs_init()

    And this happens with the same code. The failure in second case, as I described in previous post, happens in the nvs_gc function from zephyr's nvs.c library. It gets stuck in the while loop which I pointed out.

    So I uploaded that code for the first time, it worked few times - few reboots or reuploads of the same code with maybe a different data to write.

    Then after few tries it got stuck on nvs_init (which is before write and read functions), so I tried to reupload or reboot again, and the only thing that made it work was to change number of sectors from 3 to 2. Then after few tries it started getting stuck on nvs_init, and again the only thing that helped was a change of number of sectors (back from 2 to 3).

    To add a bit more chaos to this I will point out that change of number of sectors doesn't always help, and I really can't tell what makes it work again for sure.

    However, I've been trying to run littlefs sample on Thingy91, and I was also not able to initiate flash area right at the beginning - at the returned error was also ENOENT (and in case of littlefs it hasn't worked even once, it always returns ENOENT). Could that be somehow related?

  • Hi,

    Gmaciek said:

    I can distinguish two behaviors that occur in my application.

    - everything works well

    - the application gets stuck on nvs_init()

    And this happens with the same code. The failure in second case, as I described in previous post, happens in the nvs_gc function from zephyr's nvs.c library. It gets stuck in the while loop which I pointed out.

    I see. I do not have any explanation for this, and I am not able to reproduce the behavior you are describing when using your test app. I am testing on a nRF91 DK though, but that should not matter.

    As this seems odd and I cannot reproduce on my side I wonder if you have a second Thingy:91 or nRF91 DK you can test to see if it behaves in the same way?

Reply
  • Hi,

    Gmaciek said:

    I can distinguish two behaviors that occur in my application.

    - everything works well

    - the application gets stuck on nvs_init()

    And this happens with the same code. The failure in second case, as I described in previous post, happens in the nvs_gc function from zephyr's nvs.c library. It gets stuck in the while loop which I pointed out.

    I see. I do not have any explanation for this, and I am not able to reproduce the behavior you are describing when using your test app. I am testing on a nRF91 DK though, but that should not matter.

    As this seems odd and I cannot reproduce on my side I wonder if you have a second Thingy:91 or nRF91 DK you can test to see if it behaves in the same way?

Children
No Data
Related