Release notes for Calico Pod CIDR changes made in kubernetes#2768
Also document the migration procedure necessary for existing calico
clusters
ottoyiu committed Jul 13, 2017
1 parent 87ad3ee commit 171f258
Showing 4 changed files with 202 additions and 0 deletions.
17 changes: 17 additions & 0 deletions docs/calico_cidr_migration/create_migration_manifest.sh
@@ -0,0 +1,17 @@
#!/bin/bash
set -e

type jq >/dev/null 2>&1 || { echo >&2 "This script requires jq but it's not installed. Aborting."; exit 1; }
type kops >/dev/null 2>&1 || { echo >&2 "This script requires kops but it's not installed. Aborting."; exit 1; }
[ -z "$NAME" ] && echo "Please set NAME to the name of your cluster you wish to perform this migration against." && exit 1;

export MIGRATION_TEMPLATE="jobs.yaml.template"
export MIGRATION_MANIFEST="jobs.yaml"
# Pull the cluster-wide and pod CIDRs from the cluster spec and substitute them into the template.
export NON_MASQUERADE_CIDR="$(kops get cluster "$NAME" -o json --full | jq .spec.nonMasqueradeCIDR --raw-output)"
export POD_CIDR="$(kops get cluster "$NAME" -o json --full | jq .spec.kubeControllerManager.clusterCIDR --raw-output)"
cp ${MIGRATION_TEMPLATE} ${MIGRATION_MANIFEST}
sed -i -e "s@{{NON_MASQUERADE_CIDR}}@${NON_MASQUERADE_CIDR}@g" ${MIGRATION_MANIFEST}
sed -i -e "s@{{POD_CIDR}}@${POD_CIDR}@g" ${MIGRATION_MANIFEST}

echo "jobs.yaml created. Please run: "
echo "kubectl apply -f jobs.yaml"
108 changes: 108 additions & 0 deletions docs/calico_cidr_migration/jobs.yaml.template
@@ -0,0 +1,108 @@
# This ConfigMap is used in the creation of a new Calico IP Pool.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config-ippool
  namespace: kube-system
data:
  # The default IP Pool to be created for the cluster.
  # Pod IP addresses will be assigned from this pool.
  ippool.yaml: |
    apiVersion: v1
    kind: ipPool
    metadata:
      cidr: {{POD_CIDR}}
    spec:
      ipip:
        enabled: true
        mode: cross-subnet
      nat-outgoing: true
---
## This manifest deploys a Job which adds a new ippool to calico
apiVersion: batch/v1
kind: Job
metadata:
  name: configure-calico-ippool
  namespace: kube-system
  labels:
    k8s-app: calico
    role.kubernetes.io/networking: "1"
spec:
  template:
    metadata:
      name: configure-calico-ippool
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      serviceAccountName: calico
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
        - key: CriticalAddonsOnly
          operator: Exists
      restartPolicy: OnFailure
      containers:
        - name: configure-calico
          image: calico/ctl:v1.2.1
          args:
            - apply
            - -f
            - /etc/config/calico/ippool.yaml
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
          env:
            # The location of the etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
      volumes:
        - name: config-volume
          configMap:
            name: calico-config-ippool
            items:
              - key: ippool.yaml
                path: calico/ippool.yaml

---
## This manifest deploys a Job which deletes the old ippool from calico
apiVersion: batch/v1
kind: Job
metadata:
  name: configure-calico-ippool-remove
  namespace: kube-system
  labels:
    k8s-app: calico
    role.kubernetes.io/networking: "1"
spec:
  template:
    metadata:
      name: configure-calico-ippool-remove
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      serviceAccountName: calico
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
        - key: CriticalAddonsOnly
          operator: Exists
      restartPolicy: OnFailure
      containers:
        - name: configure-calico
          image: calico/ctl:v1.2.1
          args:
            - delete
            - ipPool
            - {{NON_MASQUERADE_CIDR}}
          env:
            # The location of the etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
5 changes: 5 additions & 0 deletions docs/releases/1.7-NOTES.md
@@ -3,3 +3,8 @@ _This is a WIP document describing changes to the upcoming kops 1.7 release_
# Significant changes

* Default disk size increased to 64GB (masters) and 128GB (nodes). This does have a higher cost, but also gives us more inodes & more iops (and more disk space, of course!)
* Calico is now configured with the correct pod CIDR: #2768. Please refer to the *Required Actions* section below for details.

# Required Actions

* Existing Calico users on clusters created prior to kops 1.7 are susceptible to IP conflicts between Pods and Services due to an overlap of the two IP ranges. Migration to a new pod CIDR is recommended; it is a manual procedure because of the risk of downtime during the operation. For the migration procedure, please refer to [this document](../upgrade_from_kops_1.6_to_1.7_calico_cidr_migration.md).
72 changes: 72 additions & 0 deletions docs/upgrade_from_kops_1.6_to_1.7_calico_cidr_migration.md
@@ -0,0 +1,72 @@
# Calico Pod CIDR Migration Procedure
Prior to kops 1.7, Calico and other CNI providers were misconfigured to use the
`.NonMasqueradeCIDR` field as the CIDR range for Pod IPs. As a result, an IP
conflict can occur when a Service is allocated an IP that has already been
assigned to a Pod, or vice versa. To prevent this from occurring, manual steps
are necessary before upgrading your cluster with kops 1.7 or later.


## Background
The field in the clusterSpec, `.NonMasqueradeCIDR`, captures the IP
range of the cluster.

Within this IP range, smaller IP ranges are then carved out for:
* Service IPs - as defined by `.serviceClusterIPRange`
* Pod IPs - as defined by `.kubeControllerManager.clusterCIDR` (see the sketch below for reading these values off a live cluster)
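All three ranges can be read back from an existing cluster. The following is only a minimal sketch: it assumes the same field paths that the migration script in this commit reads, and `cluster-spec.json` is just a throwaway local file:

```bash
export NAME="YOUR_CLUSTER_NAME"
kops get cluster $NAME -o json --full > cluster-spec.json
jq --raw-output .spec.nonMasqueradeCIDR cluster-spec.json                  # whole-cluster range
jq --raw-output .spec.serviceClusterIPRange cluster-spec.json              # Service IPs
jq --raw-output .spec.kubeControllerManager.clusterCIDR cluster-spec.json  # Pod IPs
```

The problem described below is that Calico was handed the first, widest range instead of the dedicated pod range.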

It was found in Issue [#1171](https://github.com/kubernetes/kops/issues/1171)
that Weave and Calico were misconfigured to use the wider IP range rather than
the range dedicated to Pods only. This was fixed in PRs [#2717](https://github.com/kubernetes/kops/pull/2717)
and [#2768](https://github.com/kubernetes/kops/pull/2768) for the two CNIs, by
switching over to the `.kubeControllerManager.clusterCIDR` field instead.

The `--ip-alloc-range` flag change for Weave effectively creates a new network.
Pods in the existing network will not necessarily be able to talk to those in
the new network, so all nodes need to be restarted to guarantee that every Pod
comes up with an IP in the new network. See [here](
https://github.com/weaveworks/weave/issues/2874) for more details.

Just like with Weave, the configuration change alone is not enough to mitigate
the problem on existing clusters running Calico. A new network (an IP Pool, in
Calico terms) needs to be created first, and all nodes then need to be restarted
so that Pods are allocated the proper IP addresses.
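Before starting, it can be useful to confirm which IP pools Calico currently has. This is only a sketch: it assumes a locally installed `calicoctl` v1.x matching the cluster's Calico version, and that the etcd endpoints stored in the `calico-config` ConfigMap (the same ConfigMap the Jobs in this commit read) are reachable from your workstation:

```bash
# Read the etcd endpoints Calico uses from the calico-config ConfigMap.
export ETCD_ENDPOINTS="$(kubectl -n kube-system get configmap calico-config \
  -o jsonpath='{.data.etcd_endpoints}')"

# List the current IP pools; before the migration you should see a pool
# covering the wide nonMasqueradeCIDR range.
calicoctl get ipPool
```

If direct etcd access is not available from your workstation, skip this check; the Jobs below do not depend on it.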

## Prerequisites

* `kops` >= 1.7
* `jq` for retrieving the field values from the clusterSpec
* A Kubernetes cluster with Calico as the CNI, created prior to kops 1.7
* A scheduled maintenance window: this procedure *WILL* result in cluster degradation (a few optional pre-checks are sketched below)
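A few optional pre-checks, sketched here; note that the `calico-node` DaemonSet name is an assumption based on the standard Calico add-on manifest and may differ on your cluster:

```bash
kops version                    # should report 1.7 or later
jq --version
kubectl config current-context  # must point at the cluster being migrated
# Assumed name: the Calico add-on normally ships a calico-node DaemonSet.
kubectl -n kube-system get daemonset calico-node
```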

## Procedure
**WARNING** - This procedure will cause disruption to Pods running on the cluster.
New Pods may not be able to resolve DNS through kube-dns or reach other services
through their service IPs during the rolling restart.
Attempt this migration procedure on a staging cluster before performing it in production.

---
Calico only uses the `CALICO_IPV4POOL_CIDR` to create a default IPv4 pool if a
pool doesn't exist already:
https://github.com/projectcalico/calicoctl/blob/v1.3.0/calico_node/startup/startup.go#L463

Therefore, we need to run two Jobs: one creates the new IPv4 pool that we want,
and the other deletes the existing IP pool that we no longer want. A manifest
template and a bash script are provided for this. They are to be executed after
a `kops update cluster --yes` using kops 1.7 or later,
and before a `kops rolling-update cluster`.

1. Using kops >= 1.7, update your cluster using `kops update cluster [--yes]`.
2. Specify your cluster name in a `NAME` variable and run the following bash script:
```bash
export NAME="YOUR_CLUSTER_NAME"
wget {{FILL_IN_GITHUB_RAW_URL_HERE_AFTER_FIRST_COMMIT}} -O create_migration_manifest.sh
chmod +x create_migration_manifest.sh
./create_migration_manifest.sh
```
This will create a `jobs.yaml` manifest file that is used by the next step.
3. Make sure the current-context in your kubeconfig points at the cluster on which you want to perform this migration.
Run the Jobs: `kubectl apply -f jobs.yaml`
4. Run `kops rolling-update cluster --force --yes` to initiate a rolling restart of the cluster.
This forces a restart of all nodes in the cluster.

That's it! You should see new Pods allocated IPs in the new IP range. A few commands for verifying the result are sketched below.
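A sketch of how you might verify the migration; the Job names come from the manifest template above, and the `kubectl` invocations are standard but worth adapting to your environment:

```bash
# Confirm both migration Jobs completed successfully.
kubectl -n kube-system get jobs configure-calico-ippool configure-calico-ippool-remove
kubectl -n kube-system logs job/configure-calico-ippool

# After the rolling update, check that Pod IPs fall inside the new pod CIDR
# (.kubeControllerManager.clusterCIDR) rather than the old wide range.
kubectl get pods --all-namespaces -o wide
# Or list just the unique Pod IPs:
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' | sort -u
```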
