Zephyr SD Card Remount Issue: fs_unmount vs. Disk Deinitialization Leading to EIO or Blocked Workqueue

Zephyr RTOS SD Card Hot-Plug Re-Mount Issue

I'm implementing SD card hot-plug functionality in a Zephyr RTOS application. My workflow for managing the SD card involves the following Zephyr APIs:

Initialization/Insertion:

  • disk_access_ioctl(DISK_DRIVE_NAME, DISK_IOCTL_CTRL_INIT, NULL)
  • fs_mount(&mp)

Removal:

  • fs_unmount(&mp)

Problem Description:

When an SD card is inserted for the first time or on power-up, the sequence of disk_access_ioctl(DISK_IOCTL_CTRL_INIT) followed by fs_mount() works perfectly.

However, if I remove the SD card (triggering fs_unmount()) and then re-insert it, the re-initialization (disk_access_ioctl(DISK_IOCTL_CTRL_INIT)) succeeds, but the subsequent fs_mount() call fails consistently with an EIO (-5) error. The relevant log output is:

<err> sd: Failed to read from SDMMC -5
<err> sd: Card read failed
<err> fs: fs mount error (-5)

My understanding, based on the Zephyr documentation (e.g., the general usage of fs_unmount and the disk_access API), is that fs_unmount() should sufficiently deinitialize or prepare the disk for a future re-mount, and that I should not have to call DISK_IOCTL_CTRL_DEINIT explicitly after fs_unmount().

Additional Observation (if I were to hypothetically force deinitialization):

If I manually deinitialize the disk with disk_access_ioctl() inside a k_work item scheduled on the system workqueue, that work item becomes permanently blocked and never completes after the first removal. I also tried passing a pointer to a bool set to true as the buf argument, but that causes further errors.

My questions are:

  1. What is the correct Zephyr API sequence or state management required to reliably remount an SD card after it has been fs_unmount'ed and then re-inserted?
  2. Why would disk_access_ioctl(DISK_IOCTL_CTRL_INIT) succeed on re-insertion, but fs_mount() subsequently fail with EIO? What internal states or resources might be improperly released or re-initialized?
  3. Am I using disk_access_ioctl commands incorrectly?

Any insights into successfully managing SD card hot-plugging in Zephyr, specifically regarding reliable re-mounting after unmounting, would be greatly appreciated.

  • Which SDK version are you on? The SD SPI stuff changed quite a bit in 2.9.x compared to earlier versions.

  • Hi,

    There doesn't appear to be a DISK_IOCTL_CTRL_DEINIT explicitly documented or commonly used that would be called after fs_unmount.

    The documentation states that de-initializing the disk should be left to the filesystem implementation. A user application should not need to de-initialize the disk manually; calling fs_unmount() should be sufficient. However, for hot-pluggable devices, DISK_IOCTL_CTRL_DEINIT might be needed for disk de-initialization. In that case, initialization and de-initialization calls must be balanced. You can find more information under "Initializing disks" in the documentation, and DISK_IOCTL_CTRL_DEINIT is described in the disk driver interface API reference.
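To illustrate what "balanced" could mean here, the following is a small host-side model, not Zephyr code. It assumes the disk layer keeps a per-disk reference count that an init request increments and a deinit request decrements, with a force flag that clears the count. That behavior is an assumption for this sketch and has not been checked against the driver source; `model_disk`, `model_init`, and `model_deinit` are hypothetical names.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical host-side model of a reference-counted disk. */
struct model_disk {
    unsigned int refcnt;   /* outstanding init requests */
    unsigned int hw_inits; /* times the card was really initialized */
};

/* Models an init request: only the first holder touches the hardware. */
int model_init(struct model_disk *d)
{
    if (d->refcnt == 0) {
        d->hw_inits++;
    }
    d->refcnt++;
    return 0;
}

/* Models a deinit request: each call releases one holder.
 * With force set, the count is cleared regardless of holders. */
int model_deinit(struct model_disk *d, bool force)
{
    if (force) {
        d->refcnt = 0;
        return 0;
    }
    if (d->refcnt == 0) {
        return -EALREADY; /* nothing left to release */
    }
    d->refcnt--;
    return 0;
}
```

In this model, an explicit application init plus an init performed by the filesystem on mount leaves a count of 2. An unmount alone brings it back to 1, so a later init never reaches the hardware and a re-inserted card is not re-initialized. If the real stack behaves this way, it would be consistent with the EIO seen on remount.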

    Best regards,
    Dejan



  • Sorry, my bad. I had tried DISK_IOCTL_CTRL_DEINIT, but it causes the system to hang. I tried it both with NULL as the buf argument and with a pointer to a boolean set to true. In both cases the system workqueue hangs and does not release.

  • So this is how my insertion and removal code was implemented. Everything is done in a workqueue.

    On Insert:

    disk_access_ioctl("SD", DISK_IOCTL_CTRL_INIT, NULL);
    fs_mount(&sd_mount);

    On Removal - Issues Encountered:

    Approach 1: Only unmount

    fs_unmount(&sd_mount);

    Result: Error on re-insert:

    <err> sd: Failed to read from SDMMC -5
    <err> sd: Card read failed
    <err> fs: fs mount error (-5)

    Approach 2: Unmount + deinit

    fs_unmount(&sd_mount);
    disk_access_ioctl("SD", DISK_IOCTL_CTRL_DEINIT, NULL);

    Result: System workqueue permanent blockage with logs:

    [00:00:13.367,187] <dbg> sdhc: unmount_filesystem: Starting filesystem unmount
    [00:00:13.367,248] <dbg> sdhc: unmount_filesystem: Disk unmounted from /SD:
    [00:00:27.875,366] <wrn> sd: Card busy when powering off
    [00:00:27.875,396] <dbg> sdhc: unmount_filesystem: Disk deinitialized successfully
    [00:00:27.875,396] <inf> sdhc: SD card removal cleanup completed
    [00:00:27.875,427] <dbg> sdhc: sdhc_work_handler: SDHC work handler completed

    Note: Long delay between unmount and deinit completion

    Approach 3: Force deinit

    bool force = true;
    res = disk_access_ioctl("SD", DISK_IOCTL_CTRL_DEINIT, &force);

    Result: Same permanent workqueue blockage as Approach 2

  • Hi,

    Unmounting and then deinitializing is correct, but you may have to wait. If this blocks for some reason, then there might be a problem with either VFS (10% chance) or Disk Access (90%).
    You must provide a user-triggered mode in which mount points are unmounted and the disk is deinitialized before a card is removed.

    Best regards,
    Dejan

  • Hi thanks for the reply.

    It is blocking and does not release.

    The proposed user-triggered mode is not possible in a lot of systems, including mine. I expect the user to remove the card and insert it as they please, without unmounting first or deinitializing. This is very common in consumer and embedded devices where users aren’t expected to perform manual unmount procedures - for example, in handheld data loggers or cameras.

    Would you or anyone happen to know how to solve this?

    Thank you! 

    Aanas

  • Hi Aanas,

    The main issue with VFS and Disk Access is that they are disconnected: Disk Access does not tell VFS that storage is being removed. This means that if you just remove the SD card and there is feedback, for example from a presence pin, the driver may be able to de-init the card and maybe complete operations in progress, but only at the device level, not at the file system level. That information does not reach VFS, so anything that was trying to write does not know that the storage has been removed; it still sees it as mounted and starts to get errors.

    Push-push slots have an insertion detection pin, and that pin gets disconnected first, before the other pins lose contact. This means that a driver, if it is able to work with such a pin, can react to card removal. Still, even if the driver does that, there is no feedback to VFS.

    There might be some delay between fs_unmount and the card being disconnected because, depending on caching and other factors, the FS driver may take a moment to commit changes it has cached in RAM to the SD card. The SD card will also take time to commit them at the hardware level. fs_unmount is a blocking function, and the disk_access_write operation is also blocking.

    Even with an SD card detection pin, there is still the problem that you have only a few ms to complete the operation. In push-push slots that time is quite stable, but you still have to make every effort to squeeze completion of all operations into it, and your MCU may be busy. In yank-it-out slots the time depends on how fast somebody yanks the card out.

    User can call fs_sync, to try to immediately write info to the disk, but that is file based, which means that user has to keep track of all open files. Zephyr does not, at this point, keep track of all open files. 
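Since the application has to track open files itself, one possible shape for that bookkeeping is sketched below as plain host-side C. Everything here is hypothetical (`file_registry`, `registry_add`, `registry_remove`, `registry_sync_all`, and the slot count); on Zephyr the callback passed to `registry_sync_all` could wrap fs_sync() on a `struct fs_file_t *`, but that wiring is not shown and is an assumption.

```c
#include <stddef.h>
#include <errno.h>

#define MAX_TRACKED_FILES 4 /* arbitrary size for the sketch */

/* Callback that flushes one file; on Zephyr it could wrap fs_sync(). */
typedef int (*sync_fn)(void *file);

struct file_registry {
    void *files[MAX_TRACKED_FILES]; /* NULL marks a free slot */
};

/* Remember a file the application has just opened. */
int registry_add(struct file_registry *r, void *file)
{
    for (size_t i = 0; i < MAX_TRACKED_FILES; i++) {
        if (r->files[i] == NULL) {
            r->files[i] = file;
            return 0;
        }
    }
    return -ENOMEM; /* no free slot */
}

/* Forget a file the application has closed. */
int registry_remove(struct file_registry *r, void *file)
{
    for (size_t i = 0; i < MAX_TRACKED_FILES; i++) {
        if (r->files[i] == file) {
            r->files[i] = NULL;
            return 0;
        }
    }
    return -ENOENT; /* file was not tracked */
}

/* Flush every tracked file; keeps going and returns the first error. */
int registry_sync_all(const struct file_registry *r, sync_fn sync)
{
    int first_err = 0;

    for (size_t i = 0; i < MAX_TRACKED_FILES; i++) {
        if (r->files[i] != NULL) {
            int rc = sync(r->files[i]);

            if (rc != 0 && first_err == 0) {
                first_err = rc;
            }
        }
    }
    return first_err;
}

/* Stand-in flush callback for host testing: counts calls per file. */
int demo_sync(void *file)
{
    (*(int *)file)++;
    return 0;
}
```

Calling `registry_sync_all` as soon as the detection pin fires would at least attempt a flush of every open file, though it does not change the few-millisecond budget mentioned above.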

    Best regards,
    Dejan

  • Hi Dejan, thanks again for responding. I am using a push-push slot that has an insertion detection pin. However, it doesn't seem there is ever enough time for the deinit to finish before removal.

    As a second point, although those functions are blocking as you mentioned, they never return. So when that happens, the whole system stops forever.

    Now I have solved it using the following:

    1. Did not use either of the following: 

    disk_access_ioctl("SD", DISK_IOCTL_CTRL_INIT, NULL);
    disk_access_ioctl("SD", DISK_IOCTL_CTRL_DEINIT, NULL);

    Earlier I would init the disk, mount, (do stuff), and then on pin-triggered removal unmount and deinit. It seems I need to remove the init/deinit calls altogether, or deinit right after init if I need to read any parameters using the disk_access_ioctl command.

    2. Even with the above solution, for some reason everything would get blocked at unmount. This was resolved once I moved to a lower-priority workqueue. I was using the system workqueue before and it would block forever.
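The two points above can be sketched as a small state machine that only mounts and unmounts, with no explicit disk init or deinit. This is a host-side illustration, not the code used in the project: `sd_ctx`, `sd_ops`, `sd_handle_event`, and the `demo_*` stand-ins are hypothetical names. On Zephyr, the `mount` and `unmount` callbacks would presumably wrap fs_mount(&sd_mount) and fs_unmount(&sd_mount).

```c
#include <stdbool.h>

enum sd_state { SD_ABSENT, SD_MOUNTED };

/* Callbacks that do the real work; on Zephyr they could wrap
 * fs_mount() and fs_unmount() (assumption, not shown here). */
struct sd_ops {
    int (*mount)(void);
    int (*unmount)(void);
};

struct sd_ctx {
    enum sd_state state;
    const struct sd_ops *ops;
};

/* Handle one card-detect event. The state only changes when the
 * callback succeeds, so the caller can retry on a non-zero return. */
int sd_handle_event(struct sd_ctx *ctx, bool card_present)
{
    int rc = 0;

    if (card_present && ctx->state == SD_ABSENT) {
        rc = ctx->ops->mount();
        if (rc == 0) {
            ctx->state = SD_MOUNTED;
        }
    } else if (!card_present && ctx->state == SD_MOUNTED) {
        rc = ctx->ops->unmount();
        if (rc == 0) {
            ctx->state = SD_ABSENT;
        }
    }
    /* Repeated events for the current state are ignored. */
    return rc;
}

/* Stand-ins so the sketch runs on a host; they only count calls. */
int mount_calls;
int unmount_calls;

int demo_mount(void)
{
    mount_calls++;
    return 0;
}

int demo_unmount(void)
{
    unmount_calls++;
    return 0;
}
```

In a Zephyr application, `sd_handle_event` would be called from a k_work handler submitted with k_work_submit_to_queue() to a dedicated queue started with k_work_queue_start() at a lower priority than the system workqueue, matching point 2 above.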

    Thanks,

    Aanas
