Commit
docs/user/*/install_upi: Explicitly set control-plane unschedulable
We grew replicas-zeroing in c22d042 (docs/user/aws/install_upi: Add 'sed' call to zero compute replicas, 2019-05-02, openshift#1649) to set the stage for changing the 'replicas: 0' semantics from "we'll make you some dummy MachineSets" to "we won't make you MachineSets". But that hasn't happened yet, and since 64f96df (scheduler: Use schedulable masters if no compute hosts defined, 2019-07-16, openshift#2004) 'replicas: 0' for compute has also meant "add the 'worker' role to control-plane nodes". That leads to racy problems when ingress comes through a load balancer, because Kubernetes load balancers exclude control-plane nodes from their target set [1,2] (although this may get relaxed soonish [3]). If the router pods get scheduled on the control-plane machines due to the 'worker' role, they are not reachable from the load balancer and ingress routing breaks [4]. Seth says:

> pod nodeSelectors are not like taints/tolerations. They only have
> effect at scheduling time. They are not continually enforced.

which means that attempting to address this issue as a day-2 operation would mean removing the 'worker' role from the control-plane nodes and then manually evicting the router pods to force rescheduling.

So until we get the changes from [3], we can either drop the zeroing [5] or adjust the scheduler configuration to remove the effect of the zeroing. In both cases, this is a change we'll want to revert later once we bump Kubernetes to pick up a fix for the service load-balancer targets.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: kubernetes/kubernetes#65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
[5]: openshift#2402
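
As a rough illustration of the "adjust the scheduler configuration" option, the sketch below flips a 'mastersSchedulable: true' entry to 'false' in the text of a Scheduler manifest. The sample manifest contents and the helper name set_masters_unschedulable are assumptions made for this example; the manifest the installer actually generates may differ in layout, and the documented procedure may use a different tool (for example a 'sed' call) to make the same edit.

```python
import re

# Hypothetical sample of a Scheduler manifest; the file the installer
# generates may contain different or additional fields.
SAMPLE_MANIFEST = """\
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster
spec:
  mastersSchedulable: true
  policy:
    name: ""
status: {}
"""


def set_masters_unschedulable(manifest_text: str) -> str:
    """Return manifest_text with 'mastersSchedulable: true' set to false.

    Only a line consisting of the key and the literal value 'true' is
    rewritten; everything else is passed through unchanged.
    """
    return re.sub(
        r"^([ \t]*mastersSchedulable:[ \t]*)true[ \t]*$",
        r"\g<1>false",
        manifest_text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    # Print the adjusted manifest so the change can be inspected.
    print(set_masters_unschedulable(SAMPLE_MANIFEST))
```

Applying the helper twice gives the same result as applying it once, so it should be safe to re-run against a manifest that has already been adjusted.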