Add zero downtime deployments tutorial
1 parent e2abcd1, commit b5adee2. Showing 2 changed files with 176 additions and 0 deletions.
# Zero downtime deployments

This is a list of things you should consider when dealing with a high traffic production environment if you want to
minimise the impact of rolling updates and downscaling.

### Deployment strategy

Limit the number of unavailable pods during a rolling update:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  progressDeadlineSeconds: 120
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
```
The default progress deadline for a deployment is ten minutes. | ||
You should consider adjusting this value to make the deployment process fail faster. | ||
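
To make the effect of `maxUnavailable: 0` concrete, here is a minimal sketch of the rolling update arithmetic. It assumes the Kubernetes defaults of 25% for both `maxUnavailable` and `maxSurge`, with percentages rounded down for `maxUnavailable` and up for `maxSurge`. The function names `resolve` and `rollout_bounds` are made up for this illustration.

```python
import math


def resolve(value, replicas, round_up):
    """Turn an absolute count or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1]) / 100
        return math.ceil(scaled) if round_up else math.floor(scaled)
    return int(value)


def rollout_bounds(replicas, max_unavailable="25%", max_surge="25%"):
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)  # rounds down
    surge = resolve(max_surge, replicas, round_up=True)  # rounds up
    return replicas - unavailable, replicas + surge


print(rollout_bounds(4))                     # defaults: (3, 5)
print(rollout_bounds(4, max_unavailable=0))  # the setting above: (4, 5)
```

With `maxUnavailable: 0` the rollout never drops below the desired replica count, so new pods have to become ready before old ones are removed.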

### Liveness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app has transitioned to a broken state from which it can't recover and needs to be restarted.

```yaml
livenessProbe:
  exec:
    command:
    - wget
    - --quiet
    - --tries=1
    - --timeout=4
    - --spider
    - http://localhost:8080/healthz
  timeoutSeconds: 5
  initialDelaySeconds: 5
```
If you've enabled mTLS, you'll have to use `exec` for liveness and readiness checks since | ||
kubelet is not part of the service mesh and doesn't have access to the TLS cert. | ||
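
On the application side, a liveness endpoint can be very small. The sketch below uses only the Python standard library. The `broken` flag, the `HealthHandler` class and `start_health_server` are hypothetical names for this illustration, and your app would decide for itself when to set the flag. It answers both `GET` and `HEAD`, since the probe client may use either.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: the app sets it when it detects a state it cannot recover from.
broken = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def _respond(self, send_body):
        if self.path != "/healthz":
            self.send_error(404)
            return
        # 200 means the process is fine; 500 makes the probe command fail.
        status, body = (500, b"broken\n") if broken.is_set() else (200, b"ok\n")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond(send_body=True)

    def do_HEAD(self):
        self._respond(send_body=False)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging in this example


def start_health_server(host="0.0.0.0", port=8080):
    """Serve /healthz on a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Keep the liveness check cheap and local to the process: if it also fails when an external dependency is down, Kubernetes will restart pods that a restart cannot fix.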

### Readiness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app is ready to receive traffic.

```yaml
readinessProbe:
  exec:
    command:
    - wget
    - --quiet
    - --tries=1
    - --timeout=4
    - --spider
    - http://localhost:8080/readyz
  timeoutSeconds: 5
  initialDelaySeconds: 5
  periodSeconds: 5
```

If your app depends on external services, you should check if those services are available before allowing Kubernetes
to route traffic to an app instance. Keep in mind that the Envoy sidecar can have a slower startup than your app.
This means that on application start you should retry any external connection for at least a couple of seconds.
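
One way to implement the startup retry is a small polling helper, sketched here under a few assumptions: `wait_for_dependency` and `tcp_check` are hypothetical names, a plain TCP connect stands in for whatever check your dependency really needs, and the timeout and interval values are placeholders.

```python
import socket
import time


def tcp_check(host, port, timeout=1.0):
    """Build a check that succeeds once a TCP connection to host:port can be opened."""
    def check():
        with socket.create_connection((host, port), timeout=timeout):
            return True
    return check


def wait_for_dependency(check, timeout=10.0, interval=0.5,
                        sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return True
        except OSError:
            pass  # refused, reset or timed out, e.g. while the sidecar is still starting
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `/readyz` handler would then return a success status only after `wait_for_dependency` has returned `True`, so the pod receives no traffic while its dependencies are still unreachable.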

### Graceful shutdown

Before a pod gets terminated, Kubernetes sends a `SIGTERM` signal to every container and waits for a period of
time (30s by default) for all containers to exit gracefully. If your app doesn't handle the `SIGTERM` signal or if it
doesn't exit within the grace period, Kubernetes will kill the container and any in-flight requests that your app is
processing will fail.
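
Handling `SIGTERM` in the app could look roughly like the following sketch, which uses the Python standard library HTTP server. `AppHandler`, `serve_until` and `main` are hypothetical names, and a real app would apply the same idea through its own framework's shutdown hooks.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging in this example


def serve_until(server, stop_event):
    """Serve requests until stop_event is set, then shut down without dropping work."""
    worker = threading.Thread(target=server.serve_forever)
    worker.start()
    stop_event.wait()
    server.shutdown()      # stop accepting new connections
    worker.join()
    server.server_close()  # joins the non-daemon threads still handling requests


def main(port=8080):
    stop_event = threading.Event()
    # SIGTERM only sets the event; the actual shutdown runs in serve_until.
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_event.set())
    server = ThreadingHTTPServer(("0.0.0.0", port), AppHandler)
    server.daemon_threads = False  # so server_close() waits for in-flight requests
    serve_until(server, stop_event)
```

Calling `main()` from the container entrypoint lets the process exit on its own shortly after `SIGTERM`, well inside the grace period, instead of being killed.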

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        lifecycle:
          preStop:
            exec:
              command:
              - sleep
              - "10"
```

Your app container should have a `preStop` hook that delays the container shutdown.
This will allow the service mesh to drain the traffic and remove this pod from all other Envoy sidecars before your app
becomes unavailable.
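
The time spent in the `preStop` hook counts against the termination grace period, so the two values should be chosen together. A tiny sketch of that arithmetic, with `sigterm_budget` as a hypothetical helper name:

```python
def sigterm_budget(grace_period_seconds, pre_stop_seconds):
    """Seconds left for the app to exit after SIGTERM, once the preStop hook has run."""
    remaining = grace_period_seconds - pre_stop_seconds
    if remaining <= 0:
        raise ValueError("preStop hook uses up the whole termination grace period")
    return remaining


# Values from the manifest above: 60s grace period, 10s preStop sleep.
print(sigterm_budget(60, 10))  # 50
```

With the manifest above, the app has about 50 seconds to finish in-flight requests after it receives `SIGTERM`.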

### Resource requests and limits

Setting CPU and memory requests/limits for all workloads is a mandatory step if you're running a production system.
Without limits your nodes could run out of memory or become unresponsive due to CPU exhaustion.
Without CPU and memory requests,
the Kubernetes scheduler will not be able to make informed decisions about which nodes to place pods on.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          limits:
            cpu: 1000m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 128Mi
```

Note that without resource requests the horizontal pod autoscaler can't determine when to scale your app.

### Autoscaling

A production environment should be able to handle traffic bursts without impacting the quality of service.
This can be achieved with Kubernetes autoscaling capabilities.
Autoscaling in Kubernetes has two dimensions: the Cluster Autoscaler that deals with node scaling operations and
the Horizontal Pod Autoscaler that automatically scales the number of pods in a deployment.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageValue: 900m
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 768Mi
```

The above HPA ensures your app will be scaled up before the pods reach the CPU or memory limits.
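
To see how the targets translate into replica counts, here is a simplified sketch of the documented HPA formula, `ceil(currentReplicas * currentValue / targetValue)`. The largest proposal across metrics wins and the result is clamped to the replica bounds. It assumes the default 10% tolerance and ignores pod readiness, missing metrics and stabilization windows. The function names are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_avg, target_avg, tolerance=0.1):
    """Replica count proposed by a single metric."""
    ratio = current_avg / target_avg
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target, no change
    return math.ceil(current_replicas * ratio)


def hpa_decision(current_replicas, metrics, min_replicas=2, max_replicas=4):
    """metrics maps a name to (current average per pod, target average per pod)."""
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    # The largest proposal wins, clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, max(proposals)))


# Two pods averaging 450m CPU and 1000Mi memory against the targets above.
print(hpa_decision(2, {"cpu": (450, 900), "memory": (1000, 768)}))  # 3
```

In this example memory drives the decision: CPU alone would propose one replica, while memory proposes three.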

### Ingress retries

To minimise the impact of downscaling operations you can make use of Envoy retry capabilities.

```yaml
apiVersion: flagger.app/v1alpha3
kind: Canary
spec:
  service:
    port: 9898
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    hosts:
    - app.example.com
    appendHeaders:
      x-envoy-upstream-rq-timeout-ms: "15000"
      x-envoy-max-retries: "10"
      x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
```

When the HPA scales down your app, your users could run into 503 errors.
The above configuration will make Envoy retry the HTTP requests that failed due to gateway errors.
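
As a rough mental model of the `gateway-error` part of this policy, the toy loop below retries a request while the upstream answers with 502, 503 or 504, up to the configured number of retries. This is not Envoy's implementation: it leaves out the overall timeout, backoff, and the `connect-failure` and `refused-stream` conditions, and `send_with_retries` is a hypothetical name.

```python
GATEWAY_ERRORS = {502, 503, 504}  # statuses covered by the gateway-error policy


def send_with_retries(send, max_retries=10):
    """Call send() until it returns a non-gateway-error status or retries run out.

    Returns (final status, number of attempts made).
    """
    attempts = 0
    while True:
        status = send()
        attempts += 1
        if status not in GATEWAY_ERRORS or attempts > max_retries:
            return status, attempts


# A pod being removed answers 503 twice before another pod serves the request.
responses = iter([503, 503, 200])
print(send_with_retries(lambda: next(responses)))  # (200, 3)
```

From the client's point of view the request simply succeeds, slightly later, as long as a healthy pod answers within the retry budget.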