From b8d483fa1a6f87d0f3df285a52c00b79f703ad5c Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Tue, 1 Oct 2019 11:57:40 -0700
Subject: [PATCH] docs/user/*/install_upi: Explicitly set control-plane unschedulable

We grew replicas-zeroing in c22d042 (docs/user/aws/install_upi: Add
'sed' call to zero compute replicas, 2019-05-02, #1649) to set the
stage for changing the 'replicas: 0' semantics from "we'll make you
some dummy MachineSets" to "we won't make you MachineSets". But that
hasn't happened yet, and since 64f96df (scheduler: Use schedulable
masters if no compute hosts defined, 2019-07-16, #2004) 'replicas: 0'
for compute has also meant "add the 'worker' role to control-plane
nodes". That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]). If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4]. Seth says:

> pod nodeSelectors are not like taints/tolerations. They only have
> effect at scheduling time. They are not continually enforced.

which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling. So
until we get the changes from [3], we can either drop the zeroing [5]
or adjust the scheduler configuration to remove the effect of the
zeroing. In both cases, this is a change we'll want to revert later
once we bump Kubernetes to pick up a fix for the service load-balancer
targets.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: https://github.com/kubernetes/kubernetes/issues/65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
[5]: https://github.com/openshift/installer/pull/2402/
---
 docs/user/aws/install_upi.md | 16 ++++++++++++++++
 docs/user/gcp/install_upi.md | 16 ++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/docs/user/aws/install_upi.md b/docs/user/aws/install_upi.md
index 5e8c6dc5efa..80f8c5d1feb 100644
--- a/docs/user/aws/install_upi.md
+++ b/docs/user/aws/install_upi.md
@@ -50,6 +50,21 @@ $ rm -f openshift/99_openshift-cluster-api_master-machines-*.yaml openshift/99_o

 You are free to leave the compute MachineSets in if you want to create compute machines via the machine API, but if you do you may need to update the various references (`subnet`, etc.) to match your environment.

+### Make control-plane nodes unschedulable
+
+Currently [emptying the compute pools](#empty-compute-pools) makes control-plane nodes schedulable.
+But due to a [Kubernetes limitation][kubernetes-service-load-balancers-exclude-masters], router pods running on control-plane nodes will not be reachable by the ingress load balancer.
+Update the scheduler configuration to keep router pods and other workloads off the control-plane nodes:
+
+```sh
+python -c '
+import yaml;
+path = "manifests/cluster-scheduler-02-config.yml"
+data = yaml.load(open(path));
+data["spec"]["mastersSchedulable"] = False;
+open(path, "w").write(yaml.dump(data, default_flow_style=False))'
+```
+
 ### Remove DNS Zones

 If you don't want [the ingress operator][ingress-operator] to create DNS records on your behalf, remove the `privateZone` and `publicZone` sections from the DNS configuration:
@@ -341,6 +356,7 @@ prometheus-k8s-openshift-monitoring.apps.your.cluster.domain.example.com
 [cloudformation]: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html
 [delete-stack]: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-delete-stack.html
 [ingress-operator]: https://github.com/openshift/cluster-ingress-operator
+[kubernetes-service-load-balancers-exclude-masters]: https://github.com/kubernetes/kubernetes/issues/65618
 [machine-api-operator]: https://github.com/openshift/machine-api-operator
 [route53-alias]: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-choosing-alias-non-alias.html
 [route53-zones-for-load-balancers]: https://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region
diff --git a/docs/user/gcp/install_upi.md b/docs/user/gcp/install_upi.md
index 578134e066e..d6e735620ce 100644
--- a/docs/user/gcp/install_upi.md
+++ b/docs/user/gcp/install_upi.md
@@ -78,6 +78,21 @@ If you do not want the cluster to provision compute machines, remove the compute
 rm -f openshift/99_openshift-cluster-api_worker-machineset-*.yaml
 ```

+### Make control-plane nodes unschedulable
+
+Currently [emptying the compute pools](#empty-compute-pools) makes control-plane nodes schedulable.
+But due to a [Kubernetes limitation][kubernetes-service-load-balancers-exclude-masters], router pods running on control-plane nodes will not be reachable by the ingress load balancer.
+Update the scheduler configuration to keep router pods and other workloads off the control-plane nodes:
+
+```sh
+python -c '
+import yaml;
+path = "manifests/cluster-scheduler-02-config.yml"
+data = yaml.load(open(path));
+data["spec"]["mastersSchedulable"] = False;
+open(path, "w").write(yaml.dump(data, default_flow_style=False))'
+```
+
 ### Remove DNS Zones (Optional)

 If you don't want [the ingress operator][ingress-operator] to create DNS records on your behalf, remove the `privateZone` and `publicZone` sections from the DNS configuration.
@@ -682,4 +697,5 @@ openshift-service-catalog-controller-manager-operator openshift-service-catalo

 [deploymentmanager]: https://cloud.google.com/deployment-manager/docs
 [ingress-operator]: https://github.com/openshift/cluster-ingress-operator
+[kubernetes-service-load-balancers-exclude-masters]: https://github.com/kubernetes/kubernetes/issues/65618
 [machine-api-operator]: https://github.com/openshift/machine-api-operator
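
The `python -c` snippet added to both documents calls `yaml.load` without a `Loader` argument, which PyYAML 5.1 and later warn about and PyYAML 6 rejects. A minimal sketch of the same manifest edit as a standalone script follows, assuming PyYAML is installed. The function names `make_masters_unschedulable` and `rewrite_manifest` are hypothetical and are not part of the patch or of the installer.

```python
import copy


def make_masters_unschedulable(config):
    """Return a copy of a parsed scheduler config with mastersSchedulable set to False."""
    updated = copy.deepcopy(config)
    # Tolerate a manifest whose "spec" key is missing or null.
    spec = updated.get("spec") or {}
    spec["mastersSchedulable"] = False
    updated["spec"] = spec
    return updated


def rewrite_manifest(path):
    """Rewrite the manifest at `path` in place. Assumes PyYAML is available."""
    # Third-party import (PyYAML), kept local so the helper above needs only the stdlib.
    import yaml

    with open(path) as stream:
        # safe_load avoids the Loader-less yaml.load call used in the one-liner.
        config = yaml.safe_load(stream)
    with open(path, "w") as stream:
        yaml.safe_dump(
            make_masters_unschedulable(config), stream, default_flow_style=False
        )
```

Calling `rewrite_manifest("manifests/cluster-scheduler-02-config.yml")` from the installer asset directory would apply the same change as the one-liner. Keeping the dictionary update separate from the file I/O makes it possible to check the edit without a manifest on disk.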