Cannot execute binaries stored in an NFS Server running on a Bottlerocket node #4116
Comments
Thanks for the report; I am investigating, and I will let you know what I find out. In the meantime, I can offer some other persistent storage options, in case any of them would be helpful. You mention both self-hosted NFS and EFS. A few other possibilities you might consider:
You may be running into a variation of the behavior discussed here:
One way to work around this might be to mount in a directory from the host's
If you can, we'd love to hear back how these suggestions are working (or not working) for you. Thanks!
Hi! Sorry for the late reply, for some reason, GitHub decided that I did not want to receive emails about this issue 🤦.
Unfortunately, we still get the same issue mounting from
The file:
It still seems that
I need to set up a repro case locally to try to understand what's going on with SELinux, but I expect it'll need a policy fix on the Bottlerocket side.
Hi Ben! I was wondering if there's anything I could do to help repro this issue locally, or if I can help with my existing repro at all?
Hey Liam - I've been able to repro the issue using the steps you provided. Thanks for the detailed instructions. Despite what I wrote earlier, there doesn't seem to be any overlayfs involvement here.
I prodded at it with ftrace:
... and it just looks like a straightforward SELinux permission check failure, where
Unfortunately I'm still not sure on what the best way to fix this is.
Thanks for the update, Ben! At this point, I think the only way this would work is if
I don't know enough about SELinux to tell if that's a terrible idea or not, or if that's even possible.
I think we can probably close this for now, with the understanding that userspace NFS implementations are preferred.
I have a couple ideas that I'd like to explore, so I'm happy to keep it open until there's some kind of resolution.
For the first idea: the
My other idea is to allow
That would have the property that
That's an ingenious way to solve the problem! Hopefully it works - I think it's a better fix than forcing NFS to use
Thanks for your help with this, by the way. Investigating this problem has opened my eyes a lot to how
I have this working now, or at least I think I do. I need to write some additional test cases but hope to have the policy change up for review soon.
One surprise was that the kernel will silently fall back to the current label in some cases, per this code in
I caught this when running an automated test that checks for a container escape via
Fortunately SELinux also has an "execute with no transition" permission (
If I still find gaps, then I'll need to fix this in a different way, probably by using
Similar issue.. Switched to Ganesha User Space NFS server. Works great.
The fix for this should be coming in 1.27.0, which is expected to be released this week.
Thanks so much, Ben! Really appreciate your work on this, and it was great to see the SELinux changes that were required to get this working, it's really helped me understand transitions a lot more! If you happen to be in SLC for Kubecon, let me know and I'll pop by and say thanks in person :)
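Since the thread revolves around SELinux labels such as the `system_u:system_r:kernel_t:s0` context reported for the `nfsd` processes in the issue description, here is a minimal sketch of how such a label breaks down into its fields. It is only an illustration, not part of the Bottlerocket policy or the fix: `SELinuxContext` and `parse_context` are hypothetical names, and the sketch assumes the usual `user:role:type:level` layout.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    """Hypothetical container for the four fields of an SELinux label."""
    user: str
    role: str
    type: str
    level: str


def parse_context(label: str) -> SELinuxContext:
    """Split a label of the assumed form user:role:type:level."""
    # The level can itself contain colons (for example "s0:c1,c2"),
    # so split at most three times and keep the remainder as the level.
    parts = label.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected user:role:type:level, got {label!r}")
    return SELinuxContext(*parts)


ctx = parse_context("system_u:system_r:kernel_t:s0")
print(ctx.type)  # kernel_t
```

The type field (`kernel_t` here) is the part that type-enforcement rules are written against, which is why the denial follows the `nfsd` kernel threads rather than the NFS server pod.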
Image I'm using:
Bottlerocket OS 1.20.4 (aws-k8s-1.30)
Context:
We have some software that runs multiple pods for multiple stages in a pipeline. To be able to complete this dynamically and allow retries on specific steps, we spawn short-lived pods that connect to an NFS server running in-cluster for its ephemeral data. A typical installation would have the orchestrator and the NFS server to begin with. When the orchestrator receives a piece of work, it will:
The NFS server is a simple variant of this alpine server.
What I expected to happen:
When running an NFS server in a container on Bottlerocket, you should be able to execute files on the share from a mount in a different container.
What actually happened:
The `NFSD` process is denied `execute` access. This is exhibited in this AVC denial log:
From what I can tell, this is because the process is running as a `kernel` task, even though it's actually exposing data from a share from a container. My current line of thinking is that this is because it's a privileged container and actually hooking into the kernel-level support. The `nfsd` processes have the `system_u:system_r:kernel_t:s0` SELinux context, and are not children of the NFS server pod.
What I've tried to do to work around the problem:
I've attempted to work around this problem by using EFS rather than locally hosting, but when using access points and dynamically provisioned volumes, `chmod` commands get `permission denied`, which fails many scripts (and even `tar` in some cases).
How to reproduce the problem:
To reproduce the problem, you can create the resources I've added below in a Kubernetes cluster that is running Bottlerocket OS `1.20.4`. I have been doing this in an AWS EKS cluster.
You will be able to see the logs after running `logdog` from the admin container in the node running the NFS server, not the nfs-client pod. To run this reproduction, you will also need the NFS CSI driver, which you can install using helm:
If you deploy this outside of the `default` namespace, please adjust the server URL to instead point to the namespace you're deploying to - replace `nfs.default.svc.cluster.local` with `nfs.<your-namespace>.svc.cluster.local`.
Resources:
NFS Server
PV/PVC
Client Pod
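The namespace adjustment mentioned in the reproduction steps can be sketched as a small helper. This is only an illustration: `nfs_server_host` is a hypothetical name, the service name `nfs` and the `cluster.local` domain are taken from the URL in the issue, and `pipeline` is a made-up namespace.

```python
def nfs_server_host(namespace: str = "default", service: str = "nfs") -> str:
    """Build the in-cluster DNS name <service>.<namespace>.svc.cluster.local."""
    # Assumes the cluster uses the cluster.local domain, as in the issue.
    if not namespace:
        raise ValueError("namespace must not be empty")
    return f"{service}.{namespace}.svc.cluster.local"


print(nfs_server_host())            # nfs.default.svc.cluster.local
print(nfs_server_host("pipeline"))  # nfs.pipeline.svc.cluster.local
```

The resulting host name is what the server URL in the resources would need to point at when they are deployed outside the `default` namespace.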