Document issue with multiple threads #9

Closed
NicolasT opened this issue Aug 19, 2022 · 2 comments · Fixed by #12
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@NicolasT
Owner

NicolasT commented Aug 19, 2022

In Linux, security contexts and related things are kept per-thread by the kernel, even though users (and POSIX) expect many of them to be per-process. As an example, when one calls setuid(...), one expects the whole process, including all its threads (some of which the application author may not even know exist), to now run as the new UID, not only the thread performing the syscall.

Within Glibc and other libcs, there's code in place to make setuid, setgid and similar calls behave as expected/required: instead of simply invoking the syscall in the calling thread, the syscall is invoked in all threads of the process, through a bunch of highly tricky code involving signals and whatnot. This is known as the setxid issue.

The Landlock API has the very same problem with landlock_restrict_self (and its prerequisite prctl(PR_SET_NO_NEW_PRIVS, ...)): the restrictions are applied only to the thread invoking landlock_restrict_self, not all pre-existing threads in a process. Hence, even after invoking landlock_restrict_self, some other threads would still be able to access files etc. which were supposed to be restricted.
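The per-thread behaviour can be observed with the `prctl` half alone, which needs no Landlock support in the kernel. Below is a minimal sketch, assuming Linux 3.5 or later and POSIX threads; `nnp_probe`, `nnp_result` and `nnp_per_thread_demo` are hypothetical names made up for this illustration and are not part of any library.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <semaphore.h>
#include <sys/prctl.h>

/* Hypothetical helper types for this demo, not part of any library. */
struct nnp_probe {
    sem_t go;  /* posted once the calling thread has invoked prctl */
    int seen;  /* no_new_privs value observed by the probe thread */
};

struct nnp_result {
    int before;       /* value in the calling thread before the change */
    int caller_after; /* value in the calling thread after the change */
    int other_after;  /* value in the thread that already existed */
};

static void *nnp_probe_thread(void *arg) {
    struct nnp_probe *p = arg;
    sem_wait(&p->go);
    p->seen = prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
    return NULL;
}

/* Returns 0 if PR_SET_NO_NEW_PRIVS succeeded, -1 otherwise. */
int nnp_per_thread_demo(struct nnp_result *out) {
    struct nnp_probe p;
    pthread_t t;
    int rc;

    p.seen = -1;
    if (sem_init(&p.go, 0, 0) != 0)
        return -1;
    /* Start the second thread BEFORE changing the attribute. */
    if (pthread_create(&t, NULL, nnp_probe_thread, &p) != 0) {
        sem_destroy(&p.go);
        return -1;
    }

    out->before = prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
    /* Only the calling thread is affected by this call. */
    rc = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
    out->caller_after = prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);

    /* Let the pre-existing thread report what it sees. */
    sem_post(&p.go);
    pthread_join(t, NULL);
    sem_destroy(&p.go);
    out->other_after = p.seen;
    return rc;
}
```

In a process that starts with the attribute unset, `caller_after` becomes 1 while `other_after` keeps the value recorded in `before`. Note that the attribute cannot be cleared again once set.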

The Glibc machinery to run some syscall in all threads is not exposed, and can hence not be repurposed. There's a library, libpsx, part of libcap (which suffers from the exact same setxid problem), which provides a user-facing API to run some syscalls in all pre-existing threads, relying on some linker functionality to hook into pthread_create. However, when attempting to use libpsx with a GHC Haskell program, things don't work out, potentially due to how libpsx and the GHC RTS interact, or maybe some bug(s) in the RTS, not expecting some function calls to be interrupted by SIGSYS.
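The general idea behind running a call in every thread through signals can be sketched as follows. This is a deliberately simplified illustration, not the code used by Glibc or libpsx: the caller has to pass in the list of threads (libpsx instead tracks them by hooking pthread_create), `SIGUSR1` is an arbitrary choice of signal, and `prctl(PR_SET_NO_NEW_PRIVS, ...)` stands in for the syscall to broadcast. All function names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>

static sem_t acks;    /* posted once by every thread that handled the signal */
static sem_t release; /* lets the demo worker proceed after the broadcast */

/* Signal handler: runs on the thread the signal was directed at. */
static void apply_in_this_thread(int signo) {
    int saved_errno = errno;
    (void)signo;
    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
    sem_post(&acks); /* sem_post is async-signal-safe */
    errno = saved_errno;
}

/* Set no_new_privs in the caller and in each of the n listed threads.
 * A listed thread that blocks SIGUSR1 would never acknowledge, so this
 * function would then wait forever. Returns 0 on success, -1 on failure. */
int broadcast_no_new_privs(const pthread_t *threads, int n) {
    struct sigaction sa, old;
    int i, sent = 0, rc = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = apply_in_this_thread;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    if (sem_init(&acks, 0, 0) != 0)
        return -1;
    if (sigaction(SIGUSR1, &sa, &old) != 0) {
        sem_destroy(&acks);
        return -1;
    }

    /* The calling thread can simply make the call directly. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        rc = -1;

    /* Every other thread is asked to make it from its signal handler. */
    for (i = 0; rc == 0 && i < n; i++) {
        if (pthread_kill(threads[i], SIGUSR1) == 0)
            sent++;
        else
            rc = -1;
    }
    /* Wait until each signalled thread has acknowledged. */
    for (i = 0; i < sent; i++)
        while (sem_wait(&acks) != 0 && errno == EINTR)
            ;

    sigaction(SIGUSR1, &old, NULL);
    sem_destroy(&acks);
    return rc;
}

/* Demo worker: waits for the broadcast, then reports what it observes. */
static void *worker(void *out) {
    while (sem_wait(&release) != 0 && errno == EINTR)
        ;
    *(int *)out = prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
    return NULL;
}

/* Returns the no_new_privs value seen by a pre-existing thread. */
int demo_broadcast(void) {
    pthread_t t;
    int seen = -1;

    if (sem_init(&release, 0, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, &seen) != 0)
        return -1;
    if (broadcast_no_new_privs(&t, 1) != 0)
        seen = -1;
    sem_post(&release);
    pthread_join(t, NULL);
    sem_destroy(&release);
    return seen;
}
```

A real implementation additionally has to discover every thread in the process and cope with threads being created or exiting during the broadcast, which is where most of the trickiness lies.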

In Go, the syscall.AllThreadsSyscall function was added to invoke some syscall in all OS threads managed by the Go runtime. If GHC were to have a similar feature, the setxid problem of landlock_restrict_self could be fixed trivially (assuming no OS threads were created using other means).

Alternatively, if Glibc gets built-in bindings for landlock_restrict_self which use the setxid functionality under the hood, we could use these instead of invoking the syscall directly.

For now, this issue is documented in the API docs, and the library will throw an exception when using landlock with the threaded RTS.

See https://github.com/NicolasT/landlock-hs/blob/f22b7e4450991f7cdbec37271f56550c1d747b10/test/ThreadedScenario.hs for a scenario exposing the issue (when using the threaded RTS).

See also a related article by Kazu Yamamoto at https://kazu-yamamoto.hatenablog.jp/entry/2020/12/04/141308.

See: golang/go#1435
See: https://ewontfix.com/17/
See: https://sites.google.com/site/fullycapable/who-ordered-libpsx
See: https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/psx

@NicolasT NicolasT added documentation Improvements or additions to documentation enhancement New feature or request labels Aug 19, 2022
NicolasT added a commit that referenced this issue Aug 20, 2022
When running with the threaded RTS, the test currently fails, which is
expected: we can't support this RTS for now, see #9.

See: #9
NicolasT added a commit that referenced this issue Aug 20, 2022
These API-impacting changes are no true work-around for the
`setxid`-style issue of `landlock_restrict_self` and `prctl`, but
instead cause the `landlock` function to throw an exception when used
with the threaded RTS, until a real fix can be put in place.

Fixes: #9
@NicolasT NicolasT linked a pull request Aug 20, 2022 that will close this issue
@NicolasT
Owner Author

After some more research, it turns out the GHC threaded RTS and libpsx are not compatible because GHC creates a thread (the ticker) which explicitly blocks all signals (see the use of sigfillset in rts/posix/ticker/Pthread.c). Since this also prevents delivery of SIGSYS as used by libpsx, we end up with a deadlock.

There's a work-around: wrapping sigfillset to ensure SIGSYS is not set in the resulting sigset_t, similar to how libpsx requires pthread_create to be wrapped. This could, obviously, cause weird runtime issues if SIGSYS is used for other purposes...

@NicolasT
Owner Author

It's possible to use libpsx with the GHC RTS. However, while working on a package exposing this functionality, I ran into hard-to-debug issues in the GitHub Actions CI environment, which I could not reproduce at all on my Fedora 36 host.

Turns out the version of libcap shipped with Ubuntu (up to at least Jammy 22.04) and current Debian, i.e., libcap 2.44, comes with an utterly buggy libpsx when using psx_syscall6: instead of passing the requested syscall number and arguments to syscall, it passes junk, due to a bug fixed in https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=e7e0e1b9e2cf3378d329174ed5b0c716b0539c72 which is included in libcap 2.46 and later.

Hence, this won't work unless

  • we include a working version of libpsx in the build
  • we somehow assert a working version of libpsx is linked in
  • we write a libpsx equivalent as part of this library
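For the second option, one rough proxy would be a version check against the libcap release that contains the fix. The helper below is a hypothetical sketch: it assumes the version is available as a "major.minor" string (for example from the packaging metadata at build time) and only encodes the 2.46 threshold mentioned above. It does not verify that the linked libpsx actually behaves correctly.

```c
#include <stdio.h>

/* Returns 1 if `version` (e.g. "2.46") is 2.46 or later, 0 if it is
 * older, and -1 if the string cannot be parsed as "major.minor". */
int libcap_has_fixed_psx(const char *version) {
    int major, minor;

    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major != 2)
        return major > 2;
    return minor >= 46;
}
```

With this, `libcap_has_fixed_psx("2.44")` reports the version shipped by Ubuntu Jammy as too old, while `"2.46"` is accepted.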

NicolasT added a commit that referenced this issue Aug 23, 2022
This is an optional feature, behind a flag, whose default is however
`True`, since `libpsx` as shipped on current Ubuntu and Debian is
broken: these ship with `libcap` 2.44, a version of the software in
which `libpsx`'s `psx_syscall6` incorrectly handles syscall arguments.
This was fixed in `libcap` 2.46 and later only.

Furthermore, this old version of `libpsx` uses `SIGRTMAX` instead of
`SIGSYS` as the signaling signal, which isn't handled in our
`sigfillset` wrapper. The "correct" signal isn't exposed in any of the
`libpsx` header files, so being compatible is difficult.

Hence, this library now, by default, compiles and links in a bundled
version of `libpsx`. A system-installed one can still be used (e.g., on
Fedora systems) by disabling the flag.

See: #9 (comment)
See: https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=e7e0e1b9e2cf3378d329174ed5b0c716b0539c72
NicolasT added a commit that referenced this issue Aug 24, 2022
In order to be resilient against `setxid`-style issues when invoking
`prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)` or `landlock_restrict_self`,
whilst still being compatible with the threaded GHC RTS, the `landlock`
library now depends on the `psx` library which embeds `libpsx` in the
running process. The relevant library functions are then used to ensure
the necessary syscalls are invoked in all process threads.

This reverts the "throw-in-threaded-rts"-behaviour of before (which was
never in a released version of the library).

The `psx` library is a non-optional dependency. Even though it's not
strictly required when using the single-threaded RTS, it's simpler
to have it enabled in all cases.

See: #9
See: https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/psx
See: https://hackage.haskell.org/package/psx (eventually)