Updating Windows KEP for GA #729
Changes from all commits
79ce009
5f73943
eedb967
d940cb5
0a5d694
ff1a31e
b13160c
849dcde
6be3e1b
83f66eb
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,18 +4,24 @@ authors: | |
- "@astrieanna" | ||
- "@benmoss" | ||
- "@patricklang" | ||
- "@michmike" | ||
owning-sig: sig-windows | ||
participating-sigs: | ||
- sig-architecture | ||
- sig-node | ||
reviewers: | ||
- sig-architecture | ||
- sig-node | ||
- sig-testing | ||
- sig-release | ||
approvers: | ||
- "@bgrant0607" | ||
- "@michmike" | ||
- "@patricklang" | ||
- "@spiffxp" | ||
editor: TBD | ||
creation-date: 2018-11-29 | ||
last-updated: 2019-01-21 | ||
last-updated: 2019-01-25 | ||
status: provisional | ||
--- | ||
|
||
|
@@ -55,39 +61,52 @@ There is strong interest in the community for adding support for workloads runni | |
|
||
## Motivation | ||
|
||
Windows-native workloads still account for a significant portion of the enterprise software space. While containerization technologies emerged first in the UNIX ecosystem, Microsoft has made investments in recent years to enable support for containers in its Windows OS. As users of Windows increasingly turn to containers as the preferred abstraction for running software, the Kubernetes ecosystem stands to benefit by becoming a cross-platform cluster manager. | ||
Windows-based workloads still account for a significant portion of the enterprise software space. While containerization technologies emerged first in the UNIX ecosystem, Microsoft has made investments in recent years to enable support for containers in its Windows OS. As users of Windows increasingly turn to containers as the preferred abstraction for running software and modernizing existing applications, the Kubernetes ecosystem stands to benefit by becoming a cross-platform cluster manager. | ||
|
||
### Goals | ||
|
||
- Enable users to run nodes on Windows servers | ||
- Enable users to schedule Windows Server containers in Kubernetes through the introduction of support for Windows compute nodes | ||
- Document the differences and limitations compared to Linux | ||
- Test results added to testgrid to prevent regression of functionality | ||
- Create a test suite in testgrid to maintain high quality for this feature and prevent regression of functionality | ||
|
||
### Non-Goals | ||
|
||
- Adding Windows support to all projects in the Kubernetes ecosystem (Cluster Lifecycle, etc) | ||
- Enable the Kubernetes master components to run on Windows | ||
|
||
## Proposal | ||
|
||
As of 29-11-2018 much of the work for enabling Windows nodes has already been completed. Both `kubelet` and `kube-proxy` have been adapted to work on Windows Server, and so the first goal of this KEP is largely already complete. | ||
|
||
### What works today | ||
- Windows-based containers can be created by kubelet, [provided the host OS version matches the container base image](https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility) | ||
- ConfigMap, Secrets: as environment variables or volumes | ||
- Pod (single or multiple containers per Pod with process isolation), Deployment, ReplicaSet | ||
Review comment: It's confusing to mention Deployment and ReplicaSet here, and DaemonSet and StatefulSet below. Please discuss all the workload controllers adjacent to one another. Do Job and CronJob have any issues? If not, please list them with ReplicaSet and Deployment. |
||
- Services types NodePort, ClusterIP, LoadBalancer, and ExternalName | ||
Review comment: Headless services? Are there any DNS differences? |
||
- ConfigMap, Secrets: as environment variables or volumes | ||
- Resource limits | ||
- Pod & container metrics | ||
- Pod networking with [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), [OVN-Kubernetes](https://github.com/openvswitch/ovn-kubernetes), [two CNI meta-plugins](https://github.com/containernetworking/plugins), [Flannel](https://github.com/coreos/flannel) and [Calico](https://github.com/projectcalico/calico) | ||
- Horizontal Pod Autoscaling | ||
Review comment: Are system OOMs reported? Are there notable differences in Pod Status fields? |
||
- Windows Server 2019 is the only Windows operating system we will support at GA timeframe. Note above that the host operating system version and the container base image need to match. This is a Windows limitation we cannot overcome. | ||
- Customers can deploy a heterogeneous cluster, with Windows and Linux compute nodes side-by-side and schedule Docker containers on both operating systems. Of course, Windows Server containers have to be scheduled on Windows and Linux containers on Linux | ||
- Out-of-tree Pod networking with [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), [OVN-Kubernetes](https://github.com/openvswitch/ovn-kubernetes), [two CNI meta-plugins](https://github.com/containernetworking/plugins), [Flannel (VXLAN and Host-Gateway)](https://github.com/coreos/flannel) | ||
Review comment: Isn't VXLAN support only in 1903 currently?
Review comment: @astrieanna by the time we GA, it will be supported for Server 2019 |
||
- Dockershim CRI | ||
- Many<sup id="a1">[1]</sup> of the e2e conformance tests when run with [alternate Windows-based images](https://hub.docker.com/r/e2eteam/) which are being moved to [kubernetes-sigs/windows-testing](https://www.github.com/kubernetes-sigs/windows-testing) | ||
- Persistent storage: FlexVolume with [SMB + iSCSI](https://github.com/Microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows), and in-tree AzureFile and AzureDisk providers | ||
Review comment: Some questions/notes from the storage perspective:
Review comment: For NFS, just came across kubernetes/kubernetes#56188 (comment). So sounds like NFS [#4 above] is beyond scope.
Review comment: we will try to get answers to your questions |
||
- Windows Server containers can take advantage of StatefulSet functionality for stateful applications and distributed systems | ||
- Windows Pods can take advantage of DaemonSet, with the exception that privileged containers are not supported on Windows (more on that below) | ||
Review comment: Above you mentioned "Windows server containers" and here "Windows pods". Is there any difference in meaning between the two?
Review comment: no difference. i will update the naming to be consistent. |
||
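Several items from the list above (ConfigMap values as environment variables, resource limits, and a NodePort Service) can be combined in one manifest. The following is a minimal sketch, not a tested configuration: the names `win-demo` and `win-demo-config`, the `APP_MODE` key, the limit values, and the choice of an IIS image built on Windows Server 2019 are assumptions made for illustration.

```yaml
# Hypothetical names and values throughout; only the field names
# come from the Kubernetes API.
apiVersion: v1
kind: ConfigMap
metadata:
  name: win-demo-config
data:
  APP_MODE: "demo"
---
apiVersion: v1
kind: Pod
metadata:
  name: win-demo
  labels:
    app: win-demo
spec:
  # Keep the Pod on Windows nodes.
  nodeSelector:
    "beta.kubernetes.io/os": windows
  containers:
  - name: web
    # Assumed image; the base image version must match the host OS version.
    image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
    env:
    # ConfigMap value exposed as an environment variable.
    - name: APP_MODE
      valueFrom:
        configMapKeyRef:
          name: win-demo-config
          key: APP_MODE
    resources:
      limits:
        cpu: "1"
        memory: 800Mi
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: win-demo
spec:
  type: NodePort
  selector:
    app: win-demo
  ports:
  - port: 80
    targetPort: 80
```

The nodeSelector is included so the example does not land on a Linux node in a heterogeneous cluster.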
|
||
### What will work eventually | ||
- Group Managed Service Accounts, a way to assign an Active Directory identity to a Windows container, is forthcoming with KEP `Windows Group Managed Service Accounts for Container Identity` | ||
- `kubectl port-forward` hasn't been implemented due to lack of an `nsenter` equivalent to run a process inside a network namespace. | ||
- CRIs other than Dockershim: CRI-containerd support is forthcoming | ||
- Some kubeadm work was done in the past to add Windows nodes to Kubernetes, but that effort has been dormant since. We will need to revisit that work and complete it in the future. | ||
- Calico CNI for Pod networking | ||
- Hyper-V isolation (Currently this is limited to 1 container per Pod and is an alpha feature) | ||
- It is unclear whether the RuntimeClass proposal from sig-node will simplify scheduling Windows containers. We will work with sig-node on this. | ||
Review comment: If this is still not well understood I don't think it needs to be included here
Review comment: folks from sig-architecture will likely ask about this, which is why i included here. indicating we will do more work on this.
Review comment: sounds good
Review comment: My meta-point here is that Windows stable shouldn't require supporting an alpha or beta feature. We should continue working on a plan for this alongside SIG-Node. I think this is ok as-is
Review comment: I think it might be clearer if the section was renamed to "Windows Node Roadmap" to make it explicit that the eventually is beyond the scope of GA
Review comment: I asked about RuntimeClass back in November. :-) @craiglpeters has a good point. I assume "eventually" is post-GA for all of these? |
||
|
||
### What will never work (without underlying OS changes) | ||
- Certain Pod functionality | ||
- Privileged containers | ||
- Privileged containers and other Pod security context privilege and access control settings | ||
Review comment: I asked a bunch of other questions on the original KEP PR:
Review comment: @bgrant0607 , which linux capabilities specifically do you mean? these ones? https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ |
||
- Reservations are not enforced by the OS, but overprovisioning could be blocked with `--enforce-node-allocatable=pods` (pending: tests needed) | ||
- Certain volume mappings | ||
- Single file & subpath volume mounting | ||
|
@@ -96,8 +115,9 @@ As of 29-11-2018 much of the work for enabling Windows nodes has already been co | |
- readOnly root filesystem. Mapped volumes still support readOnly | ||
- Termination Message - these require single file mappings | ||
- CSI plugins, which require privileged containers | ||
- Host networking is not available in Windows | ||
- [Some parts of the V1 API](https://github.com/kubernetes/kubernetes/issues/70604) | ||
- Overlay networking support in Windows Server 1803 is not fully functional using the `win-overlay` CNI plugin. Specifically service IPs do not work on Windows nodes. This is currently specific to `win-overlay` - other CNI plugins (OVS, AzureCNI) work. | ||
- Overlay networking support in Windows Server 1803 is not fully functional using the `win-overlay` CNI plugin. Specifically service IPs do not work on Windows nodes. This is currently specific to `win-overlay`; other CNI plugins (OVS, AzureCNI) work. Since Windows Server 1803 is not supported for GA, this is mostly not applicable. We left it here since it impacts beta | ||
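The `--enforce-node-allocatable=pods` flag mentioned in the list above has a counterpart in the kubelet configuration file. A sketch, assuming the node uses the `kubelet.config.k8s.io/v1beta1` `KubeletConfiguration` format; the reservation amounts are placeholder values, and as noted above this behavior still needs tests on Windows.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Counterpart of --enforce-node-allocatable=pods
enforceNodeAllocatable:
- pods
# Placeholder reservations (assumed values). These are subtracted from
# the node's allocatable resources; the Windows OS does not enforce them.
systemReserved:
  cpu: 500m
  memory: 1Gi
kubeReserved:
  cpu: 250m
  memory: 512Mi
```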
|
||
### Relevant resources/conversations | ||
|
||
|
@@ -110,13 +130,51 @@ As of 29-11-2018 much of the work for enabling Windows nodes has already been co | |
|
||
**Second class support**: Kubernetes contributors are likely to be thinking of Linux-based solutions to problems, as Linux remains the primary OS supported. Keeping Windows support working will be an ongoing burden potentially limiting the pace of development. | ||
|
||
**User experience**: Users today will need to use some combination of taints and node selectors in order to keep Linux and Windows workloads separated. In the best case this imposes a burden only on Windows users, but this is still less than ideal. | ||
**User experience**: Users today will need to use some combination of taints and node selectors in order to keep Linux and Windows workloads separated. In the best case this imposes a burden only on Windows users, but this is still less than ideal. The recommended approach is outlined below. | ||
|
||
## Graduation Criteria | ||
#### Ensuring OS-specific workloads land on appropriate container host | ||
As described below, we plan to document how Windows containers can be scheduled on the appropriate host using Taints and Tolerations. All nodes today have the following default labels: | ||
- beta.kubernetes.io/os = [windows|linux] | ||
Review comment: It's worth noting the promotion of these to stable: |
||
- beta.kubernetes.io/arch = [amd64|arm64|...] | ||
|
||
If a deployment does not specify a nodeSelector such as `"beta.kubernetes.io/os": windows`, its Pods can be scheduled on any host, Windows or Linux. This is problematic because a Windows container can only run on Windows and a Linux container can only run on Linux. The best practice we will recommend is to use a nodeSelector. | ||
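As an illustration of the nodeSelector recommendation, a minimal Pod spec might look like the sketch below. The Pod name, the image, and the command are assumptions chosen for the example; only the label key and value come from the text above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: windows-only-pod  # hypothetical name
spec:
  # Restrict scheduling to nodes whose OS label is "windows".
  nodeSelector:
    "beta.kubernetes.io/os": windows
  containers:
  - name: app
    # Assumed image; the base image version must match the host OS version.
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    # Keeps the container running for demonstration purposes.
    command: ["cmd", "/c", "ping -t localhost"]
```

Without the `nodeSelector` stanza, the scheduler is free to place this Pod on a Linux node, where the image cannot run.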
|
||
## Implementation History | ||
However, we understand that in certain cases customers already have a large number of deployments for Linux containers. Since they will not want to change all of those deployments to add nodeSelectors, the alternative is to use Taints. Because the kubelet can set Taints during registration, it could easily be modified to automatically add a taint when running on Windows only (`--register-with-taints=os=Win1809:NoSchedule`). With a taint on all Windows nodes, no Pod without a matching toleration will be scheduled on them (that includes existing Linux Pods). For a Windows Pod to be scheduled on a Windows node, it needs both the nodeSelector to choose Windows and a toleration for the taint. | ||
Review comment: It's not just deployments, but also ecosystem off-the-shelf configurations, such as community Helm charts, and programmatic pod generation cases, such as with Operators. I think taints are going to be needed in most cases. |
||
``` | ||
nodeSelector: | ||
  "beta.kubernetes.io/os": windows | ||
tolerations: | ||
- key: "os" | ||
  operator: "Equal" | ||
  value: "Win1809" | ||
  effect: "NoSchedule" | ||
``` | ||
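The fragment above is only part of a Pod spec. A sketch of how it could fit into a complete Deployment follows, with the corresponding node taint shown as a comment for reference. The Deployment name, labels, replica count, and image are assumptions for illustration; the taint key `os` and value `Win1809` are taken from the text.

```yaml
# Taint as it would appear on a Windows Node registered with
# --register-with-taints=os=Win1809:NoSchedule (fragment of the Node
# object, shown for reference only):
#
#   spec:
#     taints:
#     - key: os
#       value: Win1809
#       effect: NoSchedule
apiVersion: apps/v1
kind: Deployment
metadata:
  name: win-webserver  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
    spec:
      # The nodeSelector steers the Pods to Windows nodes.
      nodeSelector:
        "beta.kubernetes.io/os": windows
      # The toleration allows them onto nodes carrying the taint.
      tolerations:
      - key: "os"
        operator: "Equal"
        value: "Win1809"
        effect: "NoSchedule"
      containers:
      - name: web
        # Assumed image built on Windows Server 2019.
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
```

The toleration alone is not sufficient: it permits scheduling onto tainted Windows nodes but does not prevent placement on untainted Linux nodes, which is why the nodeSelector is kept alongside it.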
Review comment: Because Windows containers are specific to the os version, does it make sense to have the taint/toleration include the windows version? While only 2019 is supported at GA, eventually there will be more versions of windows support (as new Windows versions are released). A version-specific taint could help containers land on the right nodes.
Review comment: We were going to add that in the docs, but i made the change here as well for additional clarity |
||
|
||
## Graduation Criteria | ||
- All features and functionality under `What works today` are fully tested and vetted to be working by SIG-Windows | ||
Review comment: Is this section complete, or is @craiglpeters still working on it? My previous comment:
Review comment: i made some more edits now that i will be pushing through |
||
- SIG-Windows has high confidence in the stability and reliability of Windows Server containers on Kubernetes | ||
- 100% green/passing conformance tests that are applicable to Windows (see the Testing Plan section for details on these tests) | ||
- Comprehensive documentation that includes but is not limited to the following sections. Documentation will reside at https://kubernetes.io/docs | ||
1. Outline of Windows Server containers on Kubernetes | ||
2. Getting Started Guide, including Prerequisites | ||
3. How to deploy Windows nodes in Kubernetes | ||
4. Overview of Networking on Windows | ||
5. Links to documentation on how to deploy and use CNI plugins for Windows (example for OVN - https://github.com/openvswitch/ovn-kubernetes/tree/master/contrib) | ||
6. Links to documentation on how to deploy Windows nodes for public cloud providers or other Kubernetes distributions (example for Rancher - https://rancher.com/docs//rancher/v2.x/en/cluster-provisioning/rke-clusters/windows-clusters/) | ||
7. How to schedule Windows Server containers, including examples | ||
8. Advanced: How to use metrics and the Horizontal Pod Autoscaler | ||
9. Advanced: How to use Group Managed Service Accounts | ||
10. Advanced: How to use Taints and Tolerations for a heterogeneous compute cluster (Windows + Linux) | ||
11. Advanced: How to use Hyper-V isolation (not a stable feature yet) | ||
12. Advanced: How to build Kubernetes for Windows from source | ||
13. Supported functionality (with examples where appropriate) | ||
14. Known Limitations | ||
Review comment: Are there any node addons work, such as node problem detector? |
||
15. Unsupported functionality | ||
16. Resources for contributing and getting help - Includes troubleshooting help and links to additional troubleshooting guides like https://docs.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/common-problems | ||
|
||
## Implementation History | ||
- Alpha was released with Kubernetes v1.5 | ||
- Beta was released with Kubernetes v1.9 | ||
|
||
## Testing Plan | ||
|
||
|
Review comment: Is supporting LCOW a non goal?
Review comment: for now yes. i will clarify