A start job is running for Mount ZFS filesystems (time / no limit) and the server hangs forever #14562
Comments
FWIW, I've not experienced this issue on either my laptop or server, both now running Ubuntu 22.04.2 with zfs-2.1.5-1ubuntu6~22.04.1. Both have been upgraded numerous times (not fresh installs). Admittedly, both these machines are booted in a somewhat unusual way (by Ubuntu standards); they're both running ZFS-on-root and are booted through ZFS Boot Menu and dracut (rather than GRUB and initramfs-tools).
We are experiencing a similar issue.
What we discovered today is that the probability of running into this issue depends heavily on the instance type. When it gets stuck, the end of the serial console output looks like this (in this case zfs-import-scan is active and zfs-import-cache is disabled):
and when it works, the import targets usually take 1-2 seconds to finish:
If time permits we'll try to reproduce on the most basic Ubuntu LTS install, but if any of the maintainers want to try, it might be worth looking at instance types like t3.small or smaller.
We just launched a t3.small instance on AWS using the latest Ubuntu 22.04 image available from the AMI launcher: amazon/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230516. Here are the commands that were executed to set it up:
And then we get this sequence when we issue
I have also encountered this situation. I guess the NVMe device handles dedup and cache, but I don't know why ZFS can't mount this NVMe device when I restart.
This is still happening. Is there any plan for a resolution? My workaround is to delete /lib/systemd/system/zfs-mount.service, because disabling it is not enough: it gets executed regardless.
@ezplanet |
In read-only mode the pool's data can be read. Then export the data.
If ZFS is defective you can never boot into your system, that's right. But is it OK that I can never repair or replace the disk to fix the system? All settings are gone even though only /var/vmail is gone. It can't be / because we never got to the start block.
Still an issue in 2024. I am using Mint 21.3. I was playing around with Charmed Kubernetes (it uses LXC containers), which I wiped after I had finished trying it out. However, the next boot hit this issue even though I have no ZFS filesystems (just ext4). I masked the zfs-mount.service. Afterwards I ran
I was facing the same issue after having to reconstruct the boot pool. I found the solution in my case was to edit the
No, it does not. I worked around it by adding TimeoutSec as follows:
However, this is still quite a big issue, because when zfs-mount receives an automated Ubuntu package update the change is wiped out and the server hangs at the next reboot. Since I use headless servers, that means getting them out of the rack and connecting a keyboard and monitor (which I do not normally have plugged in) to regain control.
Is anyone looking into the root cause? Do you need any help with debugging?
Use |
Thank you! That helps. But it is not the solution.
System information
Describe the problem you're observing
When a disk or array with a zfs pool is connected the server will NOT boot. It will hang at:
"A start job is running for Mount ZFS filesystems (time / no limit)"
indefinitely
Describe how to reproduce the problem
Install Ubuntu Server 22.04 LTS and import any existing zpool. Reboot. The system hangs as above (tested on 2 different installations: one a fresh install, the other an upgrade from Ubuntu 20.04 LTS).
Include any warning/errors/backtraces from the system logs
No relevant logs available. The servers must be hard reset and the hardware including the zfs pool must be disconnected before rebooting to regain control (testing with headless servers).
After further investigation I found that this service is hanging indefinitely on boot whenever a zfs pool is present on any connected drive:
I tried to disable this service but it appears to be running on boot regardless, even if it is marked as "disabled" by "systemd".
By adding "TimeoutSec=60" to the [Service] section, the issue is worked around and the servers will boot; however, this change is overwritten whenever new zfs package updates are installed.
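One way to keep the TimeoutSec=60 workaround across package updates might be a systemd drop-in rather than an edit to the packaged unit file. systemd merges `*.conf` files from `/etc/systemd/system/<unit>.d/` on top of the unit shipped under /lib/systemd/system, and package upgrades do not replace files in that /etc directory. The sketch below is not taken from this thread: the function name `write_zfs_mount_timeout` and the file name `timeout.conf` are illustrative assumptions, and it has not been confirmed here that the drop-in fixes the hang any better than the in-place edit does.

```shell
#!/bin/sh
# Sketch: persist the TimeoutSec=60 workaround as a systemd drop-in.
# The function name and the timeout.conf file name are illustrative
# assumptions; they are not provided by the zfs packages.
write_zfs_mount_timeout() {
    # Default: the standard drop-in directory for zfs-mount.service.
    # Pass another directory as $1 to try the function without root.
    dropin_dir="${1:-/etc/systemd/system/zfs-mount.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Same setting as the workaround, stored outside the packaged unit file.
    printf '[Service]\nTimeoutSec=60\n' > "$dropin_dir/timeout.conf"
}
```

Run as root, `write_zfs_mount_timeout && systemctl daemon-reload` would install the override, and `systemctl cat zfs-mount.service` should then show the packaged unit followed by the drop-in. This only bounds how long boot waits; it does not address why the mount job hangs.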
When the server boots after the timeout, the previously imported zpools are found imported and all filesystems are present and mounted (tested only with regular zfs filesystems). I wonder what this zfs-mount.service does.
NOTE: if the zpool drives are physically connected AFTER the server is up and running, then a zpool import gets access to the pool without issues (until a reboot with the physical drives still connected).