[6.11 w/ CONFIG_RANDSTRUCT=y] sysctl table check failed #16620

Closed
ossimoi opened this issue Oct 7, 2024 · 8 comments · Fixed by #16805
Labels
Type: Building (indicates an issue related to building binaries)
Type: Defect (incorrect behavior, e.g. crash, hang)

Comments

@ossimoi

ossimoi commented Oct 7, 2024

| Type | Version/Name |
| --- | --- |
| Distribution Name | Gentoo |
| Distribution Version | 2.15 |
| Kernel Version | 6.11.2 |
| Architecture | x86_64 |
| OpenZFS Version | ab777f4 |

Modprobing zfs fails with:

sysctl table check failed: kernel/spl/(null) procname is null
sysctl table check failed: kernel/spl/(null) No proc_handler
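
For readers unfamiliar with these messages, here is a minimal userspace sketch of the kind of per-entry validation that could produce them. It is not the kernel's code: `struct chk_entry`, `chk_entry_validate`, `chk_noop_handler`, and the `CHK_ERR_*` flags are hypothetical names, and only the two message strings are taken from the log above. It shows that a table entry with every field zeroed trips both checks at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a sysctl table entry (not the kernel struct). */
struct chk_entry {
    const char *procname;       /* file name of the entry */
    int (*proc_handler)(void);  /* callback that would serve reads/writes */
};

/* Hypothetical error flags returned by the validator below. */
enum {
    CHK_ERR_NO_PROCNAME = 1 << 0,
    CHK_ERR_NO_HANDLER = 1 << 1
};

/* Validate one entry and report each problem in the style of the log. */
static int chk_entry_validate(const char *path, const struct chk_entry *entry)
{
    const char *name;
    int err = 0;

    assert(path != NULL && entry != NULL);

    /* Print "(null)" explicitly instead of passing NULL to %s. */
    name = entry->procname != NULL ? entry->procname : "(null)";

    if (entry->procname == NULL) {
        fprintf(stderr, "sysctl table check failed: %s/%s procname is null\n",
                path, name);
        err |= CHK_ERR_NO_PROCNAME;
    }
    if (entry->proc_handler == NULL) {
        fprintf(stderr, "sysctl table check failed: %s/%s No proc_handler\n",
                path, name);
        err |= CHK_ERR_NO_HANDLER;
    }
    return err;
}

/* Trivial handler so that a valid entry can be constructed for comparison. */
static int chk_noop_handler(void)
{
    return 0;
}
```

Passing an all-zero entry with the path `"kernel/spl"` sets both error flags and prints both lines, while an entry with a name and `chk_noop_handler` as its handler passes cleanly.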
@ossimoi ossimoi added the Type: Defect Incorrect behavior (e.g. crash, hang) label Oct 7, 2024
@behlendorf behlendorf added the Type: Building Indicates an issue related to building binaries label Oct 7, 2024
@RzTen1

RzTen1 commented Oct 21, 2024

It looks like this may be due to CONFIG_RANDSTRUCT being enabled. I had two builds: one with it enabled, which produces the failure above, and one with it disabled, which lets the current build modprobe cleanly. This worked in 6.10, but I'm not certain what changed in 6.11 to break it.

@ossimoi
Author

ossimoi commented Oct 21, 2024

> It looks like this may be due to CONFIG_RANDSTRUCT being enabled. I had two builds: one with it enabled, which produces the failure above, and one with it disabled, which lets the current build modprobe cleanly. This worked in 6.10, but I'm not certain what changed in 6.11 to break it.

Thanks for the input. Will give it a try without CONFIG_RANDSTRUCT.

@ossimoi
Author

ossimoi commented Oct 25, 2024

Works without randstruct. Not a solution, but it lets me boot 6.11 for now.

@qjim-github

Great. Finally I'm able to use ZFS on 6.11 kernels. Gentoo just removed all 6.10 sources...

@thulle

thulle commented Nov 12, 2024

Might be an idea to update the title?
"sysctl table check fails on kernel 6.11 compiled with CONFIG_RANDSTRUCT"

Adding the modprobe error message "Protocol driver not attached" for searchability.

@ossimoi ossimoi changed the title from "[6.11] sysctl table check failed" to "[6.11 w/ CONFIG_RANDSTRUCT=y] sysctl table check failed" Nov 13, 2024
@IvanVolosyuk
Contributor

It is rather unfortunate that we have to sacrifice a security feature for compatibility with ZFS, especially since it was working before on linux-6.6.x. I can try to bisect this to see which kernel commit triggers the issue.

@IvanVolosyuk
Contributor

Bisected to this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d7a76ec87195ced6910b0ca10ca133bb316c90f5

commit d7a76ec87195ced6910b0ca10ca133bb316c90f5
Author: Joel Granados <[email protected]>
Date:   Tue Jun 4 08:29:21 2024 +0200

    sysctl: Remove check for sentinel element in ctl_table arrays
    
    Use ARRAY_SIZE exclusively by removing the check to ->procname in the
    stopping criteria of the loops traversing ctl_table arrays. This commit
    finalizes the removal of the sentinel elements at the end of ctl_table
    arrays which reduces the build time size and run time memory bloat by
    ~64 bytes per sentinel (further information Link :
    https://lore.kernel.org/all/ZO5Yx5JFogGi%[email protected]/)
    
    Remove the entry->procname evaluation from the for loop stopping
    criteria in sysctl and sysctl_net.
    
    Signed-off-by: Joel Granados <[email protected]>

 fs/proc/proc_sysctl.c |  2 +-
 net/sysctl_net.c      | 11 ++---------

I can see that OpenZFS still puts a sentinel record at the end of the table for each proc directory, e.g.
https://github.com/openzfs/zfs/blob/master/module/os/linux/spl/spl-proc.c#L400

  • Before this commit, ZFS-2.3.0-rc3 does not crash on startup with root on ZFS.
  • After this commit, it crashes with CONFIG_RANDSTRUCT_PERFORMANCE=y, reporting the error: sysctl table check failed: kernel/spl/(null) No proc_handler
  • With CONFIG_RANDSTRUCT_NONE=y, the crash does not happen and a kernel with root on ZFS boots normally.
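
The effect of the bisected commit on a table that still ends with a sentinel can be illustrated with a small userspace sketch. Everything in it is hypothetical (`struct sent_entry`, `sent_table`, `sent_count_until_sentinel`, `sent_count_unnamed_by_size`, and the entry names); it only mirrors the idea from the commit message that traversal now relies on the array size instead of stopping at an entry whose `procname` is NULL. It does not model why CONFIG_RANDSTRUCT matters.

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of the usual ARRAY_SIZE idiom. */
#define SENT_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified table entry (not the kernel struct). */
struct sent_entry {
    const char *procname;
    int mode;
};

/* Two real entries followed by an all-zero sentinel. */
static const struct sent_entry sent_table[] = {
    { .procname = "alpha", .mode = 0444 },
    { .procname = "beta", .mode = 0644 },
    { .procname = NULL, .mode = 0 },
};

/*
 * Old-style traversal: stop at the first entry without a name.
 * This requires the table to end with a sentinel.
 */
static size_t sent_count_until_sentinel(const struct sent_entry *table)
{
    size_t n = 0;

    assert(table != NULL);
    while (table[n].procname != NULL)
        n++;
    return n;
}

/*
 * Size-based traversal: visit every slot and count those without a name.
 * A leftover sentinel is visited like any other entry.
 */
static size_t sent_count_unnamed_by_size(const struct sent_entry *table,
                                         size_t size)
{
    size_t i, unnamed = 0;

    assert(table != NULL);
    for (i = 0; i < size; i++) {
        if (table[i].procname == NULL)
            unnamed++;
    }
    return unnamed;
}
```

With this table, the sentinel-based walk sees two entries, while the size-based walk over all `SENT_ARRAY_SIZE(sent_table)` slots encounters one entry with a NULL name, which is the shape of entry the `kernel/spl/(null)` message complains about.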

[    0.504772] zbud: loaded
[    0.505476] sysctl table check failed: kernel/spl/(null) No proc_handler
[    0.525384] BUG: unable to handle page fault for address: ffffffffffffff00
[    0.525496] #PF: supervisor read access in kernel mode
[    0.525496] #PF: error_code(0x0000) - not-present page
[    0.525496] PGD 20a37067 P4D 20a37067 PUD 20a39067 PMD 0
[    0.525496] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[    0.525496] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                T  6.10.0-rc2-x86_64+ #22
[    0.525496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20240908_211521-localhost 04/01/2014
[    0.525496] RIP: 0010:strncmp+0x1e/0x40
[    0.525496] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 85 d2 74 24 31 c0 eb 0d 84 c9 74 1c 48 83 c0 01 48 39 d0 74 13 0f b6 0c 07 <3a> 0c 06 74 ea 19 c0 83 c8 01 e90
[    0.525496] RSP: 0018:ffffb4a040013da8 EFLAGS: 00010246
[    0.525496] RAX: 0000000000000000 RBX: ffffffffffffff00 RCX: 000000000000007a
[    0.525496] RDX: 00000000000000ff RSI: ffffffffffffff00 RDI: ffff97ce0227fea7
[    0.525496] RBP: ffff97ce0227fda8 R08: 0000000000000000 R09: ffff97ce0227fda8
[    0.525496] R10: 00000000000000ff R11: fefefefefefefeff R12: ffff97ce0227fea7
[    0.525496] R13: 00000000000001a4 R14: ffff97ce00d22940 R15: 0000000000000000
[    0.525496] FS:  0000000000000000(0000) GS:ffff97d16f800000(0000) knlGS:0000000000000000
[    0.525496] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.525496] CR2: ffffffffffffff00 CR3: 0000000020a32000 CR4: 0000000000750ef0
[    0.525496] PKRU: 55555554
[    0.525496] Call Trace:
[    0.525496]  <TASK>
[    0.525496]  ? __die+0x23/0x70
[    0.525496]  ? page_fault_oops+0x149/0x4e0
[    0.525496]  ? search_bpf_extables+0x5f/0x80
[    0.525496]  ? srso_alias_return_thunk+0x5/0xfbef5
[    0.525496]  ? fixup_exception+0x26/0x2e0
[    0.525496]  ? exc_page_fault+0x165/0x170
[    0.525496]  ? asm_exc_page_fault+0x26/0x30
[    0.525496]  ? strncmp+0x1e/0x40
[    0.525496]  ? srso_alias_return_thunk+0x5/0xfbef5
[    0.525496]  kstat_proc_entry_install+0x8e/0x3f0
[    0.525496]  fletcher_4_init+0x133/0x150
[    0.525496]  ? __pfx_openzfs_init+0x10/0x10
[    0.525496]  zcommon_init+0xe/0x20
[    0.525496]  openzfs_init+0xf/0xd0
[    0.525496]  ? __pfx_openzfs_init+0x10/0x10
[    0.525496]  do_one_initcall+0x45/0x300
[    0.525496]  kernel_init_freeable+0x30a/0x450
[    0.525496]  ? __pfx_kernel_init+0x10/0x10
[    0.525496]  kernel_init+0x1a/0x1d0
[    0.525496]  ret_from_fork+0x31/0x50
[    0.525496]  ? __pfx_kernel_init+0x10/0x10
[    0.525496]  ret_from_fork_asm+0x1a/0x30
[    0.525496]  </TASK>
[    0.525496] Modules linked in:
[    0.525496] CR2: ffffffffffffff00
[    0.525496] ---[ end trace 0000000000000000 ]---
[    0.525496] RIP: 0010:strncmp+0x1e/0x40
[    0.525496] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 85 d2 74 24 31 c0 eb 0d 84 c9 74 1c 48 83 c0 01 48 39 d0 74 13 0f b6 0c 07 <3a> 0c 06 74 ea 19 c0 83 c8 01 e90
[    0.525496] RSP: 0018:ffffb4a040013da8 EFLAGS: 00010246
[    0.525496] RAX: 0000000000000000 RBX: ffffffffffffff00 RCX: 000000000000007a
[    0.525496] RDX: 00000000000000ff RSI: ffffffffffffff00 RDI: ffff97ce0227fea7
[    0.525496] RBP: ffff97ce0227fda8 R08: 0000000000000000 R09: ffff97ce0227fda8
[    0.525496] R10: 00000000000000ff R11: fefefefefefefeff R12: ffff97ce0227fea7
[    0.525496] R13: 00000000000001a4 R14: ffff97ce00d22940 R15: 0000000000000000
[    0.525496] FS:  0000000000000000(0000) GS:ffff97d16f800000(0000) knlGS:0000000000000000
[    0.525496] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.525496] CR2: ffffffffffffff00 CR3: 0000000020a32000 CR4: 0000000000750ef0
[    0.525496] PKRU: 55555554
[    0.525496] note: swapper/0[1] exited with irqs disabled
[    0.588723] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

@IvanVolosyuk
Contributor

IvanVolosyuk commented Nov 25, 2024

I tested that removing the sentinel records does indeed fix the issue on 6.11:
https://github.com/IvanVolosyuk/zfs/commits/sentinel-test/

There should be a proper guard for these changes, as 6.10 still relies on sentinels. @robn do you know how to add the guard? I can add a kernel version check, but it will not work for intermediate commits.
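
The two kinds of guard mentioned here can be pictured with the sketch below. It is illustrative only and is not the change that was eventually merged: `GUARD_HAVE_SYSCTL_SENTINEL` is a hypothetical macro standing in for the result of a build-time feature test, and `GUARD_KERNEL_VERSION` / `GUARD_VERSION_CODE` are simplified local stand-ins for the kernel's `KERNEL_VERSION` / `LINUX_VERSION_CODE` so that the snippet compiles in userspace. The 6.11 threshold follows the commit message below.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified local stand-in for the kernel's KERNEL_VERSION macro. */
#define GUARD_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumed target version for this demo; a real build gets it from headers. */
#ifndef GUARD_VERSION_CODE
#define GUARD_VERSION_CODE GUARD_KERNEL_VERSION(6, 11, 2)
#endif

/*
 * Prefer an explicit feature-test result if the build system supplied one
 * (hypothetical macro); otherwise fall back to a version comparison.
 */
#if defined(GUARD_HAVE_SYSCTL_SENTINEL)
#define GUARD_NEED_SENTINEL GUARD_HAVE_SYSCTL_SENTINEL
#elif GUARD_VERSION_CODE < GUARD_KERNEL_VERSION(6, 11, 0)
#define GUARD_NEED_SENTINEL 1
#else
#define GUARD_NEED_SENTINEL 0
#endif

/* Hypothetical, simplified table entry (not the kernel struct). */
struct guard_entry {
    const char *procname;
};

/* The sentinel is compiled in only when the guard says it is needed. */
static const struct guard_entry guard_table[] = {
    { .procname = "alpha" },
    { .procname = "beta" },
#if GUARD_NEED_SENTINEL
    { .procname = NULL },
#endif
};

/* Total number of slots in the table, sentinel included if present. */
static size_t guard_table_slots(void)
{
    return sizeof(guard_table) / sizeof(guard_table[0]);
}

/* Number of named entries, whether or not a sentinel was compiled in. */
static size_t guard_named_entries(void)
{
    size_t slots = guard_table_slots();

    assert(slots >= (size_t)GUARD_NEED_SENTINEL);
    return slots - (size_t)GUARD_NEED_SENTINEL;
}
```

A version comparison can only distinguish releases, so a kernel built from a commit partway through a development cycle may fall on the wrong side of it. A feature test that probes the actual behavior does not have that problem, which is the concern raised above.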

IvanVolosyuk added a commit to IvanVolosyuk/zfs that referenced this issue Nov 25, 2024
Kernel 6.11-rc1 and later no longer use proc sentinel and zfs module
will fail loading if compiled with CONFIG_RANDSTRUCT_PERFORMANCE=y

The change makes the sentinels used only on earlier kernel versions.
Ideally we should check for sentinels instead of kernel version.

Signed-off-by: Ivan Volosyuk <[email protected]>
Closes openzfs#16620
IvanVolosyuk added a commit to IvanVolosyuk/zfs that referenced this issue Nov 25, 2024
Kernel 6.11-rc1 and later no longer use proc sentinels and zfs module
will fail loading if compiled with CONFIG_RANDSTRUCT_PERFORMANCE=y

The change restricts use of the sentinel values to earlier kernel
versions. Ideally we should check for sentinels instead of kernel
version.

Signed-off-by: Ivan Volosyuk <[email protected]>
Closes openzfs#16620
@amotin amotin closed this as completed in f29dcc2 Nov 30, 2024
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Dec 3, 2024
Adjust the m4 function to mimic sentinel we use in spl-proc.c
This fixes the detection on kernels compiled with CONFIG_RANDSTRUCT=y

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Pavel Snajdr <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ivan Volosyuk <[email protected]>
Closes: openzfs#16620
Closes: openzfs#16805
arter97 pushed a commit to arter97/zfs that referenced this issue Dec 9, 2024
Adjust the m4 function to mimic sentinel we use in spl-proc.c
This fixes the detection on kernels compiled with CONFIG_RANDSTRUCT=y

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Pavel Snajdr <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ivan Volosyuk <[email protected]>
Closes: openzfs#16620
Closes: openzfs#16805