Version update for Kernel 6.7 compatibility #15759

Open
Alexandero89 opened this issue Jan 11, 2024 · 48 comments
Labels
Type: Feature Feature request or new feature

Comments

@Alexandero89

Since kernel 6.7 was released recently and Fedora will probably update to it soon, I wanted to ask whether you could publish a small update that adds 6.7 compatibility.

As far as I can see, the needed changes are already merged (#15681)?

@Alexandero89 Alexandero89 added the Type: Feature Feature request or new feature label Jan 11, 2024
@Austaras

Arch has already released linux 6.7
https://archlinux.org/packages/core/x86_64/linux/

@WonderMr

Another vote for 6.7 for ZFS.

@morrolinux

Following!

@devsk

devsk commented Jan 27, 2024

Any ideas if something is being done for this? Typically, this support has been released way before the kernel releases. We are at 6.7.2 already.

@robn
Member

robn commented Jan 27, 2024

Support for 6.7 is already available on master (#15681) and will be in 2.2.3 (#15694).

@devsk

devsk commented Jan 27, 2024

Thanks for the update @robn! Is there a timeline for official 2.2.3 release and what's pending?

@devsk

devsk commented Feb 4, 2024

I wonder why tagging 2.2.3 is taking so long when 2.2.1 was tagged a month after 2.2.0 on Nov 13th and 2.2.2 was tagged 2 weeks after 2.2.1 on Nov 28. It's been more than 2 months since 2.2.2.

Is there a showstopper blocking 2.2.3? I thought 2.2.x is now a patch series that would see very minimal changes and not break much, while 2.3 will take the new features etc. Is this not the correct interpretation of the release versions?

@robn
Member

robn commented Feb 4, 2024

A point release has certain expectations of stability and it takes time to test and polish. 2.2.3 is also perhaps a little bigger than a normal point release, as 2.2.1 and 2.2.2 were both a little unusual in that they included rush fixes for critical issues. So we really want to take the time to feel confident with it.

The best thing you can do to help is try the staging branch (#15836) and report back on how it goes.

@devsk

devsk commented Feb 5, 2024

@robn Makes sense. But is there a current blockers list for 2.2.3? I think we should have a timeline for the release. We shouldn't wait to accumulate fixes beyond a cut off.

I don't have a problem trying out 2.2.3-staging. But I would like to stick with Gentoo ebuilds instead of doing point-in-time git-based builds and manually copying the modules. Does anybody have a 2.2.3-staging ebuild that they are using?

@toastal

toastal commented Feb 5, 2024

@devsk 2.2.3-staging is in nixpkgs-unstable as zfsUnstable. You could try building your kernel + zfs with Nix instead for now if it’s a hassle to do it with Gentoo in the meantime. I have an overlay you can borrow to compile the kernel with gcc for your specific architecture if trying to do the -march=native Gentoo route.

@kerberizer

ZFS being my main filesystem for the last 15 years, I have deepest respect for the outstanding effort the developers put into ensuring maximum stability and reliability. In fact, the recent problems with block cloning (and the adventures I had with unreliable RAM a few months ago) only made me appreciate even more how robust ZFS is—which is arguably the single most important characteristic of a filesystem.

Maintaining compatibility with the Linux kernel is also not always the easiest of tasks for different reasons. And this might be somewhat of an understatement.

However, because more streamlined releases would also benefit many users (granted, in enterprise use cases most of us typically use LTS kernels, so these issues usually do not affect us) I've been thinking would it be too cumbersome to have releases with only the changes required for kernel compatibility once a new Linux kernel is released—and leave the rest of the accumulated changes in "master" for the next release? <-- TL;DR

Depending on the status of "master" I guess it might require sometimes significant backporting effort of the kernel-related changes to the latest stable release—which I suppose is the main reason why this isn't done. And it will mean yet another thing to monitor in terms of integration testing.

But there are probably benefits as well—fewer changes in a release usually leads to more easily debuggable problems (or additional noise, of course, because of the additional work to port the changes).

I recognize the intricacies of integrating such a process within the existing development workflow and am curious about the team's perspective on this idea. Needless to say, I understand my suggestion might be oversimplifying these complexities, and—in open-source projects in particular—resources might be rather scarce and thus must be managed very carefully. Even if this means sacrificing otherwise desirable features.

Perhaps there could also be different, more efficient approaches to achieve the same goal of more streamlined kernel-compatibility releases?

@devsk

devsk commented Feb 5, 2024

Same here @kerberizer. I started using ZFS as zfs-fuse a long time ago and have been one of the earliest adopters of the kernel module since its very birth. Brian used to be the lone warrior at the time!

ZFS has saved me several times! So, I am an extremely thankful user!

@tonyhutter
Contributor

But is there a current blockers list for 2.2.3?

@devsk I think we'd like to get #15842 (block cloning fix) in before releasing. Once it's merged into master we'll pull it into 2.2.3.

We shouldn't wait to accumulate fixes beyond a cut off.

We put out point releases as needed rather than following a strict calendar schedule. Typically a new release is driven by a need for new kernel compatibility fixes, or a pile-up of patches in -staging. I would say our typical cadence is every 3-4 months for point releases, and once a year for major releases.

would it be too cumbersome to have releases with only the changes required for kernel compatibility once a new Linux kernel is released—and leave the rest of the accumulated changes in "master" for the next release

I recognize the intricacies of integrating such a process within the existing development workflow and am curious about the team's perspective on this idea. Needless to say, I understand my suggestion might be oversimplifying these complexities, and—in open-source projects in particular—resources might be rather scarce and thus must be managed very carefully. Even if this means sacrificing otherwise desirable features.

@kerberizer It's a fair question, and I think you may have already guessed the answer. In point releases we have:

  1. Linux kernel compatibility fixes
  2. Code cleanups / documentation updates / test case updates
  3. New features
  4. Bug-fixes
  5. FreeBSD updates
  6. Compiler fixes/workarounds

We can of course pick one category to optimize for and get quicker releases for that category. However we do that at the expense of slower releases for the other categories. Every ZFS user is going to have different ideas about what categories are the most important to them. A RHEL 8 user may care a lot about bug-fixes, but not about 6.x kernel compatibility. As a result, we have to split the difference and try to (imperfectly) target a release for all categories.

@devsk

devsk commented Feb 5, 2024

@tonyhutter

  3. New features

Doesn't this go against the very idea that a patch release should be super duper stable? Adding any new code is a potential instability exposure. Adding a new feature can multiply that risk by a much bigger factor.

@behlendorf
Contributor

@devsk I'd say "new feature" is probably an overstatement. Historically, we're talking about minor improvements in non-critical areas. For example, an additional option to zdb to facilitate debugging, or a new module parameter which is useful but doesn't change the default behavior. This wouldn't include any significant/major features.

@devsk

devsk commented Feb 6, 2024

Thanks, @behlendorf! That makes the policy clear!

@Alexandero89
Author

Thank you for taking the time to explain your processes and why you do what you do.
And I can totally understand it.

By the way, slightly off-topic, but worth asking so I don't have to open another issue next time:
Since #15842 has now been merged into master, I will just wait until the official release comes out. But could it be that OpenZFS has discontinued the testing repository for Fedora?

According to http://download.zfsonlinux.org/fedora-testing/ Fedora 34 is the last version for which staging packages were built.
So there is no official way to get staging packages without building them yourself, right?

@tonyhutter
Contributor

tonyhutter commented Feb 6, 2024

@Alexandero89 that's correct, you will need to build it from source or wait for 2.2.3 to get #15842. None of the testing repos have -staging code.

Testing repo in a nutshell:

  • Fedora testing - has occasionally been used over the years for -rc testing, but it's kind of an unused repo.
  • EL testing - The 2.2.x branch.

The normal EL repo contains the 2.1.x branch.

@tomchiverton

I wonder why tagging 2.2.3 is taking so long when 2.2.1 was tagged a month after 2.2.0 on Nov 13th and 2.2.2 was tagged 2 weeks after 2.2.2 on Nov 28. Its been more than 2 months since 2.2.2.

Is there a showstopper blocking issue for 2.2.3? I thought 2.2.x is now a patchset series which would see very minimal changes and not break much, while 2.3 will take the new features etc. Is this not correct interpretation of release versions?

Fedora 38 is now shipping a 6.7.x kernel and this bug makes me nervous about running the update...

@Alexandero89
Author

So #15842 is merged, but the tests did not pass completely.
It seems to be a GitHub Actions problem rather than a ZFS problem.
Either there was a network connectivity issue or the machine simply ran out of disk space (see the GitHub Actions run).

@sabirovrinat85

Fedora 38 is now shipping a 6.7.x kernel and this bug makes me nervous about running the update...

Maybe applying

echo 'zfs' > /etc/dnf/protected.d/zfs.conf

as described in the OpenZFS docs would help?

@Alexandero89
Author

Fedora 38 is now shipping a 6.7.x kernel and this bug makes me nervous about running the update...

maybe applying

echo 'zfs' > /etc/dnf/protected.d/zfs.conf

as guided in openzfs docs help?

Thanks, I did not know this was possible.

The only consequence on my system, if there is no compatible ZFS yet, is that a new kernel is installed without a ZFS module. In other words, if you reboot and automatically boot the new kernel, the system will no longer boot. However, if you have the GRUB menu with kernel selection, you can simply select the second-to-last kernel, which still has ZFS, and everything will continue to work perfectly.
Of course, this makes the kernel update pointless.

I simply exclude the kernel packages with dnf for as long as there is no suitable ZFS release:
sudo dnf upgrade --exclude="kernel*"
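
A minimal sketch of how that manual exclude could be made conditional, assuming a `sort` that supports `-V` (GNU coreutils does). The function names `kernel_supported` and `upgrade_command` are hypothetical, the highest supported kernel series (e.g. 6.6) has to be supplied by hand, and the function only prints the dnf command instead of running it:

```shell
# Succeed when kernel release $1 (e.g. "6.7.4-200.fc39.x86_64") is no newer
# than the highest supported kernel series $2 (e.g. "6.6").
kernel_supported() {
    # Reduce the release to major.minor: "6.7.4-200.fc39.x86_64" -> "6.7"
    series=$(printf '%s\n' "$1" | cut -d. -f1,2 | cut -d- -f1)
    # Version-sort both values; the supported series must be the newest one.
    newest=$(printf '%s\n%s\n' "$series" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$2" ]
}

# Print the dnf command to use for candidate kernel $1 and supported series $2.
upgrade_command() {
    if kernel_supported "$1" "$2"; then
        echo 'dnf upgrade'
    else
        echo 'dnf upgrade --exclude=kernel*'
    fi
}
```

The candidate kernel release still has to be looked up separately (for example from the package list dnf proposes), so this only covers the comparison step.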

@tomchiverton

echo 'zfs' > /etc/dnf/protected.d/zfs.conf doesn't prevent installing a kernel (like 6.7.x) for which zfs-dkms cannot build a module, so on reboot all ZFS filesystems fail to mount. It just stops yum/dnf from uninstalling zfs itself.

So even with that, dnf upgrade (without exclude) is a bad idea right now :(

I haven't updated to Fedora 39, but I expect that would be an even worse idea than updating the Fedora 38 kernel (from 6.6.x) right now :(

@sabirovrinat85

So even with that, dnf upgrade (without exclude) is a bad idea right now :(

So at least for new installations (specifically netinstall), would an option to select the kernel version be nice to implement?

@tomchiverton

That sounds like not-a-zfs question, unless I'm misunderstanding.

@tomchiverton

So #15842 is merged, but the tests did not pass competly. .... ust the machine just ran out of disk space ( see github action)

Certainly seems like it just needs some TLC. Can someone poke it back to life?

Lack of support for 6.7.x is starting to hurt people ...

@GruenSein

So #15842 is merged, but the tests did not pass competly. .... ust the machine just ran out of disk space ( see github action)

Certainly seems like it just needs some TLC. Can someone poke it back to life ?

Lack of support for 6.7.x is starting to hurt people ...

Can confirm that a naive dnf upgrade led to an unusable ZFS pool. Same situation as when F39 was launched. Makes me consider migrating to some in-kernel FS...

@toastal

toastal commented Feb 12, 2024

Copr offers LTS support for those on Fedora. 6.6 is an LTS kernel.

@ghost

ghost commented Feb 13, 2024

So #15842 is merged, but the tests did not pass competly. .... ust the machine just ran out of disk space ( see github action)

Certainly seems like it just needs some TLC. Can someone poke it back to life ?
Lack of support for 6.7.x is starting to hurt people ...

Can confirm that a naive dnf upgrade led to an unusable ZFS pool. Same situation as it was when F39 was launched. Makes me consider to migrate to some in-kernel FS...

Same issue here. Thankfully still had a 6.6 Kernel installed that I could boot into.

@lloesche

Ha, found this thread the same way. Did a dnf update on Fedora 39 and after reboot my ZFS filesystems were gone. No biggy though, just rebooted into an older Kernel and pinned it for now.

And to be fair, had I bothered to scroll up a couple of screens I'd have seen the dkms autoinstall on 6.7.4-200.fc39.x86_64/x86_64 failed for zfs(10) message. Unlucky that dnf doesn't seem to exit with an error when a dkms build fails.
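
One way to catch a silent build failure like this before rebooting is to check whether a zfs module file exists for the newly installed kernel. A rough sketch, assuming modules live under /lib/modules/<release>/ and that the file name starts with zfs.ko (it may carry a compression suffix); `zfs_module_present` is a hypothetical helper, not something dkms or dnf provides:

```shell
# Succeed when a zfs kernel module file exists for kernel release $1.
# $2 is the module root directory and defaults to /lib/modules.
zfs_module_present() {
    dir="${2:-/lib/modules}/$1"
    # No module directory at all means no module for that kernel.
    [ -d "$dir" ] || return 1
    # Match zfs.ko as well as compressed variants such as zfs.ko.xz.
    found=$(find "$dir" -name 'zfs.ko*' -print 2>/dev/null | head -n 1)
    [ -n "$found" ]
}
```

Called with the release string from the failed message (6.7.4-200.fc39.x86_64 in this case), a non-zero exit status would be the cue to keep booting the older kernel.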

On the other hand, why doesn't zfs-dkms 2.2.2 conflict with kernel versions >= 6.7? The spec file lists some Requires but doesn't prevent my system from upgrading to an incompatible kernel version. Wouldn't a Conflicts: kernel > @[email protected] solve the issue? I feel like this question must have been asked before and I'm missing something very obvious.

@tomchiverton

tomchiverton commented Feb 13, 2024 via email

@colemickens

It seems like it might be good for DKMS users to open a separate issue specifically tracking that issue, if there isn't one.

@tomchiverton

I don't think the OpenZFS team need an issue to tell them they need to better track kernel changes, and sync their releases with those.

@hubick

hubick commented Feb 18, 2024

I just lost half my Saturday afternoon after doing a dnf update on Fedora and stumbling through trying to fix my zfs and downgrade various kernel/header/devel/matched packages like an idiot.

Why doesn't every zfs release conflict with the next major kernel version by default, until that kernel is released and zfs can be verified as working, and then issue a package update just to bump the compatibility declaration?

Or, if you can't be bothered to make compatibility releases, I'm fine if zfs just sits there "protected" and prevents all kernel upgrades until the next scheduled zfs point release comes out, which can then also bump compatibility to allow kernel upgrades or not again.

But, please, something, anything, to prevent regular system updates from leaving my zfs inoperable? PLEASE?

@gc-user

gc-user commented Feb 18, 2024

tl;dr version:

Why not make a snapshot before updating the kernel on a distro that (apparently) removes the previous kernel during updates?


Up until a few months ago I didn't even know it was possible to "swap out" a kernel on a running machine, i.e. to update the kernel and remove the previous one, the one the machine was booted with and is still running while the update is performed. Apparently there are distros that do just that.

In these cases, I would not seek to change ZFS but would ask the distro devs not to do that. Or, even easier, if I were using ZFS, e.g. in order to have snapshots and BEs, I would just make a snapshot before updating something as important as a kernel.

I have encountered a non-booting 6.7 kernel because the ZFS DKMS module wouldn't build successfully. But on the distro I'm using that's not a biggy, especially in combination with zfsbootmenu. I just rebooted and chose the old kernel of that BE for booting.
Even before that, probably since early last year, I have been checking whether ZFS is compatible with a new major kernel version before updating, if that kernel is made the new default on the distro I use. But I think that distro doesn't do that as long as ZFS isn't yet compatible with the new major kernel version.

I can definitely see it being a helpful "feature" to have the zfs package throw a conflict and prevent upgrading to a new major kernel version if ZFS isn't yet proven to be compatible with it, if that is technically possible.
But one can also change one's kernel update routine (snapshots, BEs), or ask the devs of the distro one is using not to auto-delete the previous kernel when updating, and maybe also to wait for ZFS compatibility before making a new major kernel version the default. Especially in the current situation, where kernel 6.6 is an LTS version, that would be a sane approach.

Let's not forget: The whole zfs - kernel "mess" is apparently the fault of Oracle's habit of suing and thus preventing Linus Torvalds from integrating it into the kernel... :-)
Maybe someone knows some higher-up in Oracle and can talk them into signing some "we will never sue over zfs" binding promise...

@GruenSein

tl;dr version:

Why not make a snapshot before updating the kernel on a distro that (apparently) removes the previous kernel during updates?

Typically, it is recommended to keep the OS up to date for security patches, driver updates etc. So, installing OS updates is a quite frequent occurrence. And yes, taking a snapshot before any update is possible. Yes, it is possible to then recover from an update that breaks an integral part of your OS. But is this really what updating should require?

I think, it is not unreasonable to expect updates of an OS to not regularly break the system. This is the core principle of any distro: To provide a combination of packages that works instead of requiring the user to manage all possible combinations of software packages on his own.

In this particular case, this is a bit more difficult because the OpenZFS repo is not part of the distro. However, it targets a specific distro (Fedora in my case) but frequently gets out of sync with it. I do not think that it is too much to ask to limit the compatibility of the ZFS packages to the kernels with which it has been tested because it is clear that the distros cannot do this as ZFS is not part of their repositories. At least, this would prevent updates to the kernel until ZFS has caught up.

@AllKind
Contributor

AllKind commented Feb 18, 2024

On the other hand, why doesn't zfs-dkms 2.2.2 conflict with kernel versions >= 6.7? The spec file lists some Requires but doesn't prevent my system from upgrading to an incompatible kernel version. Wouldn't a Conflicts: kernel > @[email protected] solve the issue? I feel like this question must have been asked before and I'm missing something very obvious.

Yes, that should work. I guess nobody has done the work to implement it yet. With a little knowledge of m4 and autoconf it should not be too hard.

1: Grab the max kernel version from META. Put it in a variable to reference in the spec file.
2: Define the conflicts in the rpm spec file. The problem there is that it has to match the different names of the kernel packages of the various distros.
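
The two steps above can be sketched in shell, outside of m4/autoconf, just to show the idea. This assumes the META file carries a `Linux-Maximum:` field in major.minor form (e.g. 6.6); the helper names `meta_kver_max` and `conflicts_line` are hypothetical, and whether a `Conflicts:` on the plain `kernel` package name is enough to block an upgrade on a given distro is an open assumption:

```shell
# Step 1: print the Linux-Maximum value from a META file, e.g. "6.6".
# $1 = path to the META file.
meta_kver_max() {
    awk '$1 == "Linux-Maximum:" { print $2; exit }' "$1"
}

# Step 2: print an RPM Conflicts line for the first unsupported kernel series.
# $1 = path to the META file, $2 = kernel package name (varies per distro).
conflicts_line() {
    max=$(meta_kver_max "$1")
    major=${max%%.*}        # "6.6" -> "6"
    minor=${max#*.}         # "6.6" -> "6"
    printf 'Conflicts: %s >= %s.%s\n' "$2" "$major" "$((minor + 1))"
}
```

For a META file with `Linux-Maximum: 6.6`, `conflicts_line META kernel` prints `Conflicts: kernel >= 6.7`. The package name is a parameter precisely because of the naming problem mentioned in step 2.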

@digitalsignalperson

I wonder if it would be worth doing disk passthrough to a lightweight VM to run ZFS on a stable kernel, maybe using 9p-virtio for direct access to the mountpoints between host/guest, and then the host can be on the bleeding edge kernel without worrying about ZFS breaking. Could also create zpool/zfs command wrappers on the host that ssh or incus/lxd exec them in the VM.

I just lost half my Saturday afternoon after doing a dnf update on Fedora and stumbling through trying to fix my zfs and downgrade various kernel/header/devel/matched packages like an idiot.

Many stay on an LTS kernel to avoid these surprises. I guess Fedora doesn't provide one, so maybe it's not a great choice for ZFS. Quoting a Reddit comment:

there is no official LTS-kernel in Fedora, because the distribution is meant as a testing ground for the latest software.

But, if you wish to pin a kernel version and keep it from updating, just add excludepkgs='kernel* linux-firmware* perf' to your /etc/dnf/dnf.conf

@tomchiverton

tomchiverton commented Feb 18, 2024 via email

@digitalsignalperson

Would having a ZFS fuse implementation allow decoupling from the kernel version, since it's a userspace driver?
See #8

@grantgray79

I just lost half my Saturday afternoon after doing a dnf update on Fedora and stumbling through trying to fix my zfs and downgrade various kernel/header/devel/matched packages like an idiot.

Why doesn't every zfs release conflict with the next major kernel version by default, until that kernel is released and zfs can be verified as working, and then issue a package update just to bump the compatibility declaration?

Or, if you can't be bothered to make compatibility releases, I'm fine if zfs just sits there "protected" and prevents all kernel upgrades until the next scheduled zfs point release comes out, which can then also bump compatibility to allow kernel upgrades or not again.

But, please, something, anything, to prevent regular system updates from leaving my zfs inoperable? PLEASE?

See: /etc/sysconfig/kernel
Set UPDATEDEFAULT=no

Then you can decide for yourself when to activate a new kernel using 'grubby --set-default-index'.

Please be more respectful; there is a sense of entitlement in your comments that may repulse people that might otherwise assist you.

@devsk

devsk commented Feb 19, 2024

We can make a policy change and can get ahead of all this, and we have all the pieces that are needed.

We should reserve Z in X.Y.Z purely to track with latest kernel as it evolves from l.m-RC0 to l.m-RCn to l.m.

Typically, as the changes for l.m-RCx make their way upstream, compatibility issues are found. We keep accumulating and stabilizing the compatibility patches until the last transition from l.m-RCn to l.m and then we go from X.Y.Z to X.Y.Z+1 and put it out.

For example, we could have released 2.2.3 back in December, along with 6.7. All the patches for 6.7 compatibility were there. The compatibility changes should be low risk, but we stabilize as we go from RC0 to RCn to release.

Make all other changes (5 of the 6 that Tony Hutter listed above) in Y of the X.Y.Z.

This way the distros will see linux-l.m release and zfs-X.Y.Z+1 release happen within the same day and can adapt their processes to this schedule.

@robn
Member

robn commented Feb 19, 2024

Linux isn't the only platform OpenZFS runs on, and Linux has the development and testing resources for much more frequent releases than OpenZFS ever can. Pegging OpenZFS release to Linux releases seems like it would just create additional work for OpenZFS devs.

It's also worth noting that the compatibility patches aren't always low risk, and aren't always easy to figure out how to do, so no one should imagine it's a simple thing to achieve in general.

That said, I could certainly imagine a world where someone released (say) a 2.2.2-compat1 with only OS-compatibility patches. I'd certainly want to see someone(s) step up to create and maintain that though. It also doesn't necessarily have to be done under the auspices of the OpenZFS project proper; you could totally just do it for now, and if it's working out, bring it to OpenZFS.

As to modifying the OpenZFS RPM build files to indicate a "maximum version", I would say that if that's as easy as claimed (I honestly have no idea), then someone submitting a PR against OpenZFS with that change would be very welcome.

Of course, distributions that don't use the OpenZFS packaging aren't going to benefit from such a change (Debian/Ubuntu don't use OpenZFS-provided packaging; I don't know what the deal is for Fedora variants). If they do their own thing, then you'll need to file a bug against those distributions.

Finally, I'll note that OpenZFS is pretty conservative about the kernel support it claims. Maybe we can do more to automatically inform local package managers of this fact (ie via "maximum version" metadata), but the fact remains that if you're getting OpenZFS from here and trying to use it against a newer kernel than it claims support for, that's unfortunately on you when something goes wrong. If that's something you're unable to manage yourself, then you may need to consider a distribution with different stability guarantees, or paying someone to support your specific needs.

@devsk

devsk commented Feb 19, 2024

Linux isn't the only platform OpenZFS runs on

the compatibility patches aren't always low risk

Both are fair points.

I can't speak to the first one, but on the second I can share my observation over the years: the compatibility patches mostly get figured out during the l.m-RCx cycle itself and are usually not very intrusive. I think the team here already does an excellent job of coming up with them!

In case they are intrusive, we can make an exception and hold off on zfs-X.Y.Z+1 beyond linux-l.m. So, in most cases we end up with no breakage. The point is to narrow this window as much as possible.

@robn
Member

robn commented Feb 23, 2024

FYI, 2.2.3 is released: https://github.com/openzfs/zfs/releases/tag/zfs-2.2.3

@turowicz

FYI, 2.2.3 is released: https://github.com/openzfs/zfs/releases/tag/zfs-2.2.3

Is there a .deb?

@gdevenyi
Contributor

#10333

@AllKind
Contributor

AllKind commented Feb 28, 2024

@turowicz
Clone the zfs GitHub repository with git and then run the few commands described here:
https://openzfs.github.io/openzfs-docs/Developer%20Resources/Custom%20Packages.html
