KEP 1981: Windows privileged container KEP updates for alpha #2288
Conversation
/sig api-machinery
suggest adding sig-auth as a participating sig, since we have an interest in the SecurityContext aspects of the pod.
cc @tallclair for pod security standards intersection
cc @IanColdwater @tabbysable for podsecuritypolicy intersection
A new boolean field named `privileged` will be added to [WindowsSecurityContextOptions](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#windowssecuritycontextoptions-v1-core).

On Windows, all containers in a pod must be privileged. Because of this behavior, and because `WindowsSecurityContextOptions` already exists on both `PodSecurityContext` and `Container.SecurityContext`, Windows containers will use this new field instead of reusing the existing `privileged` field, which only exists on `SecurityContext`.
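For concreteness, a minimal sketch of what a pod using the proposed field might look like, assuming the field name `privileged` under `windowsOptions` exactly as the excerpt above describes (the API shape was still under discussion at this point, so this is illustrative only):

```yaml
# Hypothetical pod spec exercising the proposed pod-level field.
apiVersion: v1
kind: Pod
metadata:
  name: privileged-windows-pod
spec:
  securityContext:
    windowsOptions:
      privileged: true          # proposed new boolean field
  containers:
  - name: example
    image: mcr.microsoft.com/windows/nanoserver:1809
    command: ["cmd", "/c", "ping -t 127.0.0.1"]
```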
Adding the field to `WindowsSecurityContextOptions` so it is set at the pod level could be ok, but I would recommend requiring (in validation) that pods that set `spec.securityContext.windowsOptions.privileged=true` also set `securityContext.privileged=true` on all containers. Policy tools already look at the container field... letting that be false while adding another field that makes a pod privileged will confuse them.
If we only require/validate that `windowsOptions.privileged=true` is set on all containers (in addition to the pod-level field), would that be sufficient for policy tools?
no existing policy tools would know about the new field... it seems misleading to allow privileged containers that don't set the existing privileged field in the API
@liggitt If we require that `securityContext.privileged=true` is set for all containers, what are your thoughts on the current behavior where pod-wide `WindowsSecurityContextOptions` are applied to all containers if not present? Would it be OK if the pod-wide `WindowsSecurityContextOptions.privileged=true` and each container only sets `securityContext.privileged=true`? For example, would this be OK:
```yaml
spec:
  securityContext:
    windowsOptions:
      privileged: true
  containers:
  - name: foo
    securityContext:
      privileged: true
```
Or, if the pod-wide privileged field is true, should we ensure each container sets `securityContext.privileged=true` and also explicitly sets `securityContext.windowsOptions.privileged`? Example:
```yaml
spec:
  securityContext:
    windowsOptions:
      privileged: true
  containers:
  - name: foo
    securityContext:
      privileged: true
      windowsOptions:
        privileged: true
```
As discussed below, both random-liu and I feel it would be a better user experience if we didn't rely on the existing `securityContext.privileged` field. Hopefully, with sufficient documentation and announcement, policy tools can learn about the new `windowsOptions.hostProcess` flag for Windows containers.
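As a rough illustration of that direction, assuming the flag lands as `hostProcess` under `windowsOptions` as mentioned above (the `runAsUserName` value and image are illustrative, not prescribed by this thread):

```yaml
# Hypothetical HostProcess pod sketch using the renamed flag.
apiVersion: v1
kind: Pod
metadata:
  name: hostprocess-example
spec:
  hostNetwork: true                          # HostProcess pods are expected to use the host network
  securityContext:
    windowsOptions:
      hostProcess: true                      # the new flag policy tools would need to learn about
      runAsUserName: "NT AUTHORITY\\SYSTEM"  # illustrative host account
  containers:
  - name: example
    image: mcr.microsoft.com/windows/nanoserver:1809
```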
Yeah, some more context in #2288 (comment)
In beta there is a possibility to enable the privileged container to be part of a different network compartment. If this feature is enabled we will use the existing Pod `hostNetwork` field to enable/disable it.

- This pod must run on a Windows host, and kubelets must reject it if not on Windows hosts.
- All pods marked privileged on Windows must have host network enabled; if not, the pod does not validate.
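A sketch of a spec satisfying both rules above; the `nodeSelector` is one common way to steer the pod to Windows nodes, and the field names follow the examples earlier in this thread (exact enforcement would live in apiserver/kubelet validation, not in the pod spec itself):

```yaml
spec:
  hostNetwork: true                  # required: privileged Windows pods must use the host network
  nodeSelector:
    kubernetes.io/os: windows        # schedule onto Windows hosts; kubelets elsewhere reject the pod
  securityContext:
    windowsOptions:
      hostProcess: true
  containers:
  - name: example
    image: mcr.microsoft.com/windows/nanoserver:1809
```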
this seems good
- OS support: 1809/Windows Server 2019 LTSC and 2004
- containerd: v1.5
- Kubernetes target: 1.22 or later
what version of Windows SAC? cc @jeremyje
Any and all versions of Windows that support k8s + containerd would support this.
Will update the enhancement.
@derekwaynecarr @dchen1107 @mrunalp @mikebrow Would any of you be able to help review the proposed CRI changes?

@liggitt I think I addressed most of your feedback. Can you take another look and let me know if I missed anything?
Also, we have a proof-of-concept for Windows privileged containers working today. Currently, to test this out you would need the following:
the podspec bits lgtm for sig-auth

lgtm for sig-node

/lgtm

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: dchen1107, deads2k, marosset. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
/hold cancel
See kubernetes/enhancements#2288 for more background.

To avoid any confusion here: the name chosen for this container type for the CRI API and the user-facing k8s settings is HostProcess containers. Internally we've coined these as job containers, but it's referring to the same type of container; we'd just like to keep the name the same as we use internally at the OCI level and in our code. The CRI HostProcess field being set would be our key to fill in the WindowsJobContainer field on the runtime spec, for example.

There have been asks for Windows privileged containers, or something analogous to them, for quite some time. While in the Linux world this can be achieved just by loosening some of the security restrictions normally in place for containers, this isn't as easy on Windows for many reasons; there's no such thing as just mounting in /dev, to take the easy example. The model we've landed on to support something akin to privileged containers on Windows is to keep using the container layer technology we currently use for Windows Server and Hyper-V isolated containers, and to simply have the runtime manage a process, or set of processes, in a job object as the container. The work for job containers is open source and lives here: https://github.com/microsoft/hcsshim/tree/master/internal/jobcontainers

This approach covers all of the use cases we've currently heard that privileged containers would be useful for. Some of these include configuring network settings, administrative tasks, viewing/manipulating storage devices, and the ability to simplify running daemons that need host access (kube-proxy) on Windows. Without these changes we'd likely set an annotation to specify that the runtime should create one of these containers, which isn't ideal.

As for the one optional field, this is really the only thing that actually differs/isn't configurable for normal Windows Server Containers. With job containers the final writable layer (volume) for the container is mounted on the host so it's accessible and viewable without enumerating the volumes on the host and trying to correlate which volume is the container's. This is contrary to Windows Server Containers, where the volume is never mounted to a directory anywhere, although it's still accessible from the host for the curious.

Signed-off-by: Daniel Canter <[email protected]>
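To make the CRI-to-runtime handoff described above concrete, here is a rough sketch of the Windows security context the kubelet might pass down, rendered as YAML; the `host_process` field name follows the commit message's description of the CRI change, and the `run_as_username` value is illustrative rather than taken from the actual protobuf:

```yaml
# Illustrative CRI container config fragment (not the exact protobuf):
# a set host_process flag is the runtime's cue to build a job container
# (the WindowsJobContainer field on the runtime spec) instead of an
# isolated Windows Server container.
windows:
  security_context:
    host_process: true
    run_as_username: "NT AUTHORITY\\SYSTEM"
```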
KEP to add support for Windows privileged containers.
This was merged as provisional in v1.20.
SIG-Windows would like to target an alpha release for v1.21.
k/enhancements issue #1981
LGTMs: