littlefs directory disappears after ~20-30 hours of use

Hi,

we are working on an nRF52840-based device with an external MX25R16 flash (via SPI) that has 3 partitions: secondary-image, mcu-scratch-partition and application-data-partition with littlefs.

Our device is based on Zephyr 2.4 (we know that this is quite old).

Recently we added a flash-file-based event-logging feature to our application, in which various events are logged to a growing number of event-log files (located in one log subdirectory).

These event-log files are published to a cloud service one by one about twice per day and are deleted after successful publishing.

The implementation seems to work perfectly in a debug build with various LOG_INF() calls.

In a release build, however, we observe that about 40% of our devices stop sending event-log files after running for about 20-30 hours, even though everything worked fine between the reboot and the first occurrence of the problem.

When digging into the problem, we found that none of the affected devices still has the logging subdirectory, which was created on demand during app initialization after reboot.

We reviewed our application very carefully and are sure that no line of application code could delete the logging subdirectory.

Is there any known filesystem/littlefs issue that might cause this kind of problem?

Any help or suggestion is highly appreciated!

Volker

  • Hi Volker,

    I've not found any reports of similar issues for littlefs. It's also strange that you are not seeing this with your debug build. As a start, I would suggest that you 'diff' the generated .config files from your release and debug builds to see if there are other differences that may be relevant, apart from the logger configurations (heap sizes, etc.). I've also asked internally if anyone has suggestions on how you can troubleshoot this.

    Searching the release notes for "littlefs" gave the following results:

    /zephyr/doc/releases$ grep -r "littlefs"
    release-notes-3.3.rst:- :github:`52886` - tests: subsys: fs: littlefs: filesystem.littlefs.default and filesystem.littlefs.custom fails
    release-notes-3.3.rst:* :github:`52602` - tests: subsys: settings: file_littlefs: system.settings.file_littlefs.raw fails
    release-notes-2.0.rst:* File Systems: Added support for littlefs
    release-notes-2.0.rst:* :github:`18664` - [Coverity CID :203416]Uninitialized variables in /home/aasthagr/zephyrproject-external-coverity-new/zephyrproject/modules/fs/littlefs/lfs.c
    release-notes-2.0.rst:* :github:`18663` - [Coverity CID :203413]Null pointer dereferences in /home/aasthagr/zephyrproject-external-coverity-new/zephyrproject/modules/fs/littlefs/lfs.c
    release-notes-2.0.rst:* :github:`18458` - [Coverity CID :203422]Memory - illegal accesses in /tests/subsys/fs/littlefs/src/testfs_util.c
    release-notes-2.0.rst:* :github:`18392` - [Coverity CID :203494]Integer handling issues in /subsys/fs/littlefs_fs.c
    release-notes-2.0.rst:* :github:`5529` - Explore Little File System (littlefs) support
    release-notes-2.4.rst:* CVE-2020-13599: Security problem with settings and littlefs
    release-notes-2.4.rst:* :github:`28540` - littlefs: MPU FAULT and failed to run
    release-notes-2.4.rst:* :github:`26279` - littlefs: Unable to erase external flash.
    release-notes-2.4.rst:* :github:`25728` - [Coverity CID :210050] Unchecked return value in tests/subsys/settings/littlefs/src/settings_setup_littlefs.c
    release-notes-2.4.rst:* :github:`24111` - drivers: flash: littlefs: add sync to flash API & update LittleFS to use it
    release-notes-2.4.rst:* :github:`22340` - Security problem with settings and littlefs
    release-notes-3.5.rst:  * Added support of mounting littlefs on the block device from the shell/fs.
    release-notes-2.7.rst:* :github:`38202` - mbedtls and littlefs on a STM32L4
    release-notes-2.7.rst:* :github:`38059` - automount configuration in nrf52840dk_nrf52840.overlay causes error: mount point already exists!! in subsys/fs/littlefs sample
    release-notes-2.7.rst:* :github:`36851` - FS logging backend assumes littlefs
    release-notes-2.7.rst:* :github:`32990` - FS/littlefs: it is possible to write to already deleted file
    release-notes-3.2.rst:* :github:`50033` - tests: subsys: fs: littlefs: filesystem.littlefs.custom fails to build
    release-notes-2.1.rst:* :github:`18341` - settings: test setting FS back-end using littlefs
    release-notes-2.3.rst:* :github:`24585` - How to read/write an big(>16K) file in littlefs shell sample on native posix board?
    release-notes-3.0.rst:* :github:`41395` - littlefs(external spi flash) + mcuboot can't get right mount area
    release-notes-3.0.rst:* :github:`36962` - littlefs: Too small heap for file cache (again).
    release-notes-2.5.rst:* :github:`32078` - build error with llvm: samples/subsys/fs/littlefs
    release-notes-2.5.rst:* :github:`31669` - [Coverity CID :215715] Unchecked return value in tests/subsys/fs/littlefs/src/testfs_mount_flags.c
    release-notes-2.5.rst:* :github:`31524` - littlefs: Too small heap for file cache.
    release-notes-2.5.rst:* :github:`28309` - Sample/subsys/fs/littlefs with board=nucleo_f429zi  don't work
    release-notes-2.2.rst:* :github:`8242` - File system (littlefs & FAT) examples
    release-notes-3.1.rst:* :github:`43020` - samples/subsys/fs/littlefs does not work with native_posix board on WSL2
    

    Best regards,

    Vidar

  • Hi Vidar,

    thank you for your suggestions. The diff did not show anything relevant for this issue.

    I suspect that a race condition is occurring at some point.

    In a debug build, the timing is probably slightly different, which could be why the problem does not occur there.

    Regards

    Volker

  • Hi Volker,

    I see there is a symbol named LFS_THREADSAFE in the littlefs implementation. It may be worth trying to enable it.

    https://github.com/zephyrproject-rtos/littlefs/commit/00a9ba7826318408d280aafe5dc527a43b2c965d 
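For context, littlefs builds that define LFS_THREADSAFE expect the application to supply `lock` and `unlock` callbacks in the configuration struct. The following is only a rough desktop sketch of that shape: `struct demo_lfs_config`, `demo_lock`, `demo_unlock` and `demo_cfg` are hypothetical names, the struct is a stand-in that mirrors just the two callback fields (a real project would include `lfs.h` instead), and a pthread mutex stands in for whatever lock the RTOS provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of the littlefs configuration that
 * matters here: a user context pointer plus the lock/unlock callbacks. */
struct demo_lfs_config {
    void *context;
    int (*lock)(const struct demo_lfs_config *c);
    int (*unlock)(const struct demo_lfs_config *c);
};

/* Take the mutex stored in the context; 0 on success, negative on error. */
static int demo_lock(const struct demo_lfs_config *c)
{
    assert(c != NULL && c->context != NULL);
    return pthread_mutex_lock((pthread_mutex_t *)c->context) == 0 ? 0 : -1;
}

/* Release the mutex stored in the context; 0 on success, negative on error. */
static int demo_unlock(const struct demo_lfs_config *c)
{
    assert(c != NULL && c->context != NULL);
    return pthread_mutex_unlock((pthread_mutex_t *)c->context) == 0 ? 0 : -1;
}

/* Assumption: one mutex per mounted file system instance. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

struct demo_lfs_config demo_cfg = {
    .context = &demo_mutex,
    .lock = demo_lock,
    .unlock = demo_unlock,
};
```

With callbacks like these in place, the file system can bracket each operation with `lock` and `unlock`, so the application does not have to serialize its own calls.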

    Regards,

    Vidar

    I searched through our SDK tree (which includes the v3.5.99 tag of our Zephyr fork), but I do not see any Kconfig symbol that can be used to enable the thread-safe implementation.

    You may try to include a newer revision of littlefs by changing the revision number in the Zephyr manifest followed by a 'west update': https://github.com/nrfconnect/sdk-zephyr/blob/41095df79d11e081ea96d150fbe3dbd93f73af6c/west.yml#L272 

    Regards,

    Vidar

  • Hi Vidar,
    In the Zephyr 2.4 codebase I cannot find any LFS_THREADSAFE symbols.

    I therefore assume that the littlefs implementation there is not thread-safe.

    So I wrapped all the application code that modifies the file system: every file-create, file-delete and open-write-close operation now runs within a single thread. Each open-write-close sequence is performed in a pseudo-atomic manner, so that two files are never open at the same time.
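The serialization described above could look roughly like the sketch below. It is not the actual application code: `event_log_append` and `fs_guard` are hypothetical names, standard C file I/O stands in for the Zephyr file-system calls, and a pthread mutex stands in for an RTOS mutex, so that the example builds on a desktop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* One lock guards every file-system access made by the application.
 * Assumption: on the target this would be an RTOS mutex instead. */
static pthread_mutex_t fs_guard = PTHREAD_MUTEX_INITIALIZER;

/* Append one record using a complete open-write-close cycle under the
 * lock, so that no two files are ever open at the same time.
 * Returns 0 on success, -1 on any failure. */
int event_log_append(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL);

    int rc = -1;
    pthread_mutex_lock(&fs_guard);

    FILE *f = fopen(path, "ab");
    if (f != NULL) {
        size_t written = fwrite(data, 1, len, f);
        /* fclose flushes buffered data; a failed close counts as a failed write. */
        int close_rc = fclose(f);
        if (written == len && close_rc == 0) {
            rc = 0;
        }
    }

    pthread_mutex_unlock(&fs_guard);
    return rc;
}
```

A caller would invoke `event_log_append("events.log", record, record_len)` for each event. Checking the return value on every call, rather than only logging it in debug builds, makes it easier to pin down the first operation that fails once the directory is gone.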

    Surprisingly, the problem with the disappearing directory still persists.

    Any more ideas?

    Regards,

    Volker
