This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

terraform: add reusable worker and controller modules
This commit adds 3 Terraform modules, which aim to eventually
deduplicate Ignition configuration between all the platforms by
providing an extendable base configuration.

For now, only the Tinkerbell platform will consume these modules.
Eventually, we should migrate all platforms to use them; however, this is
not a trivial task, as we lack tests at this level, so it must be done
carefully.

Signed-off-by: Mateusz Gozdek <[email protected]>
invidian committed Oct 14, 2020
1 parent 43fe5a2 commit 6ffbdca
Showing 28 changed files with 1,085 additions and 0 deletions.
15 changes: 15 additions & 0 deletions assets/terraform-modules/controller/README.md
@@ -0,0 +1,15 @@
# Controller Terraform module

This Terraform module aims to be a reusable module for generating the Ignition configuration
of controller nodes.

It builds on top of the [node](../node) module and adds controller-specific settings,
such as:
- extra `kubelet.service` dependencies etc.
- bootkube script and systemd unit
- etcd scripts and units
- controller labels
- controller taints

Additionally, it exposes various input variables, which allow adding platform-specific changes to the
Ignition configuration.
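
As a rough sketch of how a platform might consume this module, the configuration below instantiates it once per controller using a module-level `count`. The module path, cluster name, DNS zone, API server address, and CA file are placeholders, not values from this commit. The symlinked files (`bootstrap-tokens.tf`, `output-node.tf`) may declare further required inputs that are not visible here.

```hcl
# Hypothetical caller configuration; all concrete values are placeholders.
variable "controllers_count" {
  type    = number
  default = 3
}

module "controller" {
  # Assumed relative path from the calling configuration to this module.
  source = "./assets/terraform-modules/controller"

  # Module-level count requires Terraform 0.13 or newer.
  count = var.controllers_count

  # Inputs without defaults in variables.tf.
  count_index       = count.index
  controllers_count = var.controllers_count
  cluster_name      = "demo"
  dns_zone          = "example.com"
  apiserver         = "demo.example.com"
  ca_cert           = file("${path.module}/ca.crt") # placeholder CA file

  # Optional inputs.
  ssh_keys = ["ssh-rsa AAAAB3N..."]
}

# etcd_servers depends only on cluster-wide inputs, so every instance
# reports the same list; reading it from the first instance is enough.
output "etcd_servers" {
  value = module.controller[0].etcd_servers
}
```

Passing `count.index` explicitly as `count_index` is needed because a module cannot read the `count` of its own call.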
1 change: 1 addition & 0 deletions assets/terraform-modules/controller/bootstrap-tokens.tf
61 changes: 61 additions & 0 deletions assets/terraform-modules/controller/main.tf
@@ -0,0 +1,61 @@
locals {
  kubelet_require_kubeconfig = <<EOF
systemd:
  units:
    - name: kubelet.service
      dropins:
        - name: 10-controller.conf
          contents: |
            [Service]
            ConditionPathExists=/etc/kubernetes/kubeconfig
            ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
            ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
EOF

  bootkube = templatefile("${path.module}/templates/bootkube.yaml.tmpl", {
    bootkube_rkt_extra_args = var.bootkube_rkt_extra_args
    bootkube_image_name     = var.bootkube_image_name
    bootkube_image_tag      = var.bootkube_image_tag
    kubelet_image_name      = var.kubelet_image_name
    kubelet_image_tag       = var.kubelet_image_tag
  })

  etcd_servers = [for i in range(var.controllers_count) : format("%s-etcd%d.%s", var.cluster_name, i, var.dns_zone)]

  etcd = templatefile("${path.module}/templates/etcd.yaml.tmpl", {
    etcd_name   = "etcd${var.count_index}"
    etcd_domain = "${var.cluster_name}-etcd${var.count_index}.${var.dns_zone}"

    # etcd0=https://cluster-etcd0.example.com:2380,etcd1=https://cluster-etcd1.example.com:2380,...
    etcd_initial_cluster = join(",", [for i, server in local.etcd_servers : format("etcd%d=https://%s:2380", i, server)])
  })

  snippets = [
    local.kubelet_require_kubeconfig,
    local.bootkube,
    local.etcd,
  ]
}

data "ct_config" "config" {
  pretty_print = false

  content = templatefile("${path.module}/templates/node.yaml.tmpl", {
    ssh_keys               = jsonencode(var.ssh_keys)
    cluster_dns_service_ip = var.cluster_dns_service_ip
    cluster_domain_suffix  = var.cluster_domain_suffix
    kubelet_image_name     = var.kubelet_image_name
    kubelet_image_tag      = var.kubelet_image_tag
    host_dns_ip            = var.host_dns_ip
    kubelet_rkt_extra_args = []
    kubelet_labels = {
      "node.kubernetes.io/master"     = "",
      "node.kubernetes.io/controller" = "true",
    }
    kubelet_taints = {
      "node-role.kubernetes.io/master" = ":NoSchedule"
    }
  })

  snippets = concat(local.snippets, var.clc_snippets)
}
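
To make the two `for` expressions in `main.tf` concrete, here is a standalone sketch that evaluates the same expressions with made-up inputs. The cluster name `demo`, the zone `example.com`, and the count of 3 are illustrative assumptions only. The expected values in the comments follow directly from the `format` strings.

```hcl
# Standalone illustration; input values are invented for this example.
locals {
  cluster_name      = "demo"
  dns_zone          = "example.com"
  controllers_count = 3

  # Same expression as in main.tf, with locals instead of module variables.
  # Result: ["demo-etcd0.example.com", "demo-etcd1.example.com", "demo-etcd2.example.com"]
  etcd_servers = [for i in range(local.controllers_count) : format("%s-etcd%d.%s", local.cluster_name, i, local.dns_zone)]

  # Result:
  # "etcd0=https://demo-etcd0.example.com:2380,etcd1=https://demo-etcd1.example.com:2380,etcd2=https://demo-etcd2.example.com:2380"
  etcd_initial_cluster = join(",", [for i, server in local.etcd_servers : format("etcd%d=https://%s:2380", i, server)])
}

output "etcd_servers" {
  value = local.etcd_servers
}

output "etcd_initial_cluster" {
  value = local.etcd_initial_cluster
}
```

Every controller gets the same `etcd_initial_cluster` string. Only `etcd_name` and `etcd_domain` differ per node, through `count_index`.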
1 change: 1 addition & 0 deletions assets/terraform-modules/controller/output-node.tf
7 changes: 7 additions & 0 deletions assets/terraform-modules/controller/output.tf
@@ -0,0 +1,7 @@
output "bootstrap_kubeconfig" {
  value = local.bootstrap_kubeconfig
}

output "etcd_servers" {
  value = local.etcd_servers
}
39 changes: 39 additions & 0 deletions assets/terraform-modules/controller/templates/bootkube.yaml.tmpl
@@ -0,0 +1,39 @@
systemd:
  units:
    - name: bootkube.service
      contents: |
        [Unit]
        Description=Bootstrap a Kubernetes cluster
        ConditionPathExists=!/opt/bootkube/init_bootkube.done
        [Service]
        Type=oneshot
        RemainAfterExit=true
        WorkingDirectory=/opt/bootkube
        ExecStart=/opt/bootkube/bootkube-start
        ExecStartPost=/bin/touch /opt/bootkube/init_bootkube.done
        [Install]
        WantedBy=multi-user.target
storage:
  files:
    - path: /opt/bootkube/bootkube-start
      filesystem: root
      mode: 0544
      user:
        id: 500
      group:
        id: 500
      contents:
        inline: |
          #!/bin/bash
          # Wrapper for bootkube start
          set -e
          # Pre-pull hyperkube image because when it is later pulled but takes too long it times out
          docker pull ${kubelet_image_name}:${kubelet_image_tag}
          # Move experimental manifests
          [ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
          exec docker run \
            -v /opt/bootkube/assets:/assets:ro \
            -v /etc/kubernetes:/etc/kubernetes:rw \
            --network=host \
            ${bootkube_image_name}:${bootkube_image_tag} \
            /bootkube start --asset-dir=/assets
117 changes: 117 additions & 0 deletions assets/terraform-modules/controller/templates/etcd.yaml.tmpl
@@ -0,0 +1,117 @@
systemd:
  units:
    - name: etcd.service
      enable: true
      contents: |
        [Unit]
        Description=etcd (System Application Container)
        Documentation=https://github.com/etcd-io/etcd
        Wants=docker.service
        After=docker.service
        [Service]
        Type=simple
        Restart=always
        RestartSec=5s
        TimeoutStartSec=0
        LimitNOFILE=40000
        EnvironmentFile=/etc/kubernetes/etcd.env
        ExecStartPre=-docker rm -f etcd
        ExecStartPre=sh -c "docker run -d \
          --name=etcd \
          --restart=unless-stopped \
          --log-driver=journald \
          --network=host \
          -u $(id -u \"$${ETCD_USER}\"):$(id -u \"$${ETCD_USER}\") \
          -v $${ETCD_DATA_DIR}:$${ETCD_DATA_DIR}:rw \
          -v $${ETCD_SSL_DIR}:$${ETCD_SSL_DIR}:ro \
          --env-file /etc/kubernetes/etcd.env \
          $${ETCD_IMAGE_URL}:$${ETCD_IMAGE_TAG}"
        ExecStart=docker logs -f etcd
        ExecStop=docker stop etcd
        ExecStopPost=docker rm etcd
        ExecStopPost=-/opt/etcd-rejoin
        [Install]
        WantedBy=multi-user.target
storage:
  files:
    - path: /etc/kubernetes/etcd.env
      filesystem: root
      mode: 0644
      contents:
        inline: |
          ETCD_IMAGE_TAG=v3.4.13
          ETCD_IMAGE_URL=quay.io/coreos/etcd
          ETCD_SSL_DIR=/etc/ssl/etcd
          ETCD_DATA_DIR=/var/lib/etcd
          ETCD_USER=etcd
          ETCD_NAME=${etcd_name}
          ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379
          ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380
          ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379
          ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
          ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381
          ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}
          ETCD_STRICT_RECONFIG_CHECK=true
          ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/etcd/server-ca.crt
          ETCD_CERT_FILE=/etc/ssl/etcd/etcd/server.crt
          ETCD_KEY_FILE=/etc/ssl/etcd/etcd/server.key
          ETCD_CLIENT_CERT_AUTH=true
          ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/etcd/peer-ca.crt
          ETCD_PEER_CERT_FILE=/etc/ssl/etcd/etcd/peer.crt
          ETCD_PEER_KEY_FILE=/etc/ssl/etcd/etcd/peer.key
          ETCD_PEER_CLIENT_CERT_AUTH=true
    - path: /etc/tmpfiles.d/etcd-wrapper.conf
      filesystem: root
      mode: 0644
      contents:
        inline: |
          d /var/lib/etcd 0700 etcd etcd - -
    - path: /opt/etcd-rejoin
      filesystem: root
      mode: 0555
      contents:
        inline: |
          #!/bin/bash
          set -eou pipefail
          # Rejoin a cluster as fresh node when etcd cannot join
          # (e.g., after reprovisioning, crashing or node being down).
          # Set ExecStopPost=-/opt/etcd-rejoin to run when etcd failed and
          # use env vars of etcd-member.service.
          # Skip if not provisioned
          if [ ! -d "/etc/ssl/etcd/" ]; then exit 0; fi
          # or got stopped.
          if [ "$EXIT_CODE" = "killed" ]; then exit 0; fi
          now=$(date +%s)
          if [ -f /var/lib/etcd-last-fail ]; then
            last=$(cat /var/lib/etcd-last-fail)
          else
            last=0
          fi
          echo "$now" > /var/lib/etcd-last-fail
          let "d = $now - $last"
          # Skip and restart regularly if it does not fail within 120s.
          if [ "$d" -gt 120 ]; then exit 0; fi
          export ETCDCTL_API=3
          urls=$(echo "$ETCD_INITIAL_CLUSTER" | tr "," "\n" | cut -d "=" -f 2 | tr "\n" "," | head -c -1)
          # $$ for terraform
          endpoints="$${urls//2380/2379}"
          ARGS="--cacert=/etc/ssl/etcd/etcd-client-ca.crt --cert=/etc/ssl/etcd/etcd-client.crt --key=/etc/ssl/etcd/etcd-client.key --endpoints=$endpoints"
          # Check if unhealthy (should be because etcd is not running)
          unhealty=$((etcdctl endpoint health $ARGS 2> /dev/stdout | grep "is unhealthy" | grep "$ETCD_NAME") || true)
          if [ -z "$unhealty" ]; then exit 0; fi
          # Remove old ID if still exists
          ID=$((etcdctl member list $ARGS | grep "$ETCD_NAME" | cut -d "," -f 1) || true)
          if [ ! -z "$ID" ]; then
            etcdctl member remove "$ID" $ARGS
          fi
          # Re-add as new member
          etcdctl member add "$ETCD_NAME" --peer-urls="$ETCD_INITIAL_ADVERTISE_PEER_URLS" $ARGS
          # Join fresh without state
          mv /var/lib/etcd "/var/lib/etcd-bkp-$(date +%s)" || true
          if [ -z "$(grep ETCD_INITIAL_CLUSTER_STATE=existing /etc/systemd/system/etcd-member.service.d/40-etcd-cluster.conf)" ]; then
            echo 'Environment="ETCD_INITIAL_CLUSTER_STATE=existing"' >> /etc/systemd/system/etcd-member.service.d/40-etcd-cluster.conf
            # Apply change
            systemctl daemon-reload
          fi
          # Restart unit (yes, within itself)
          systemctl restart etcd-member &
91 changes: 91 additions & 0 deletions assets/terraform-modules/controller/variables.tf
@@ -0,0 +1,91 @@
variable "dns_zone" {
  type = string
}

variable "bootkube_rkt_extra_args" {
  description = "Extra parameters to pass to bootkube rkt container"
  type        = list(string)
  default     = []
}

variable "bootkube_image_name" {
  description = "Docker image name to use for rkt container running bootkube"
  type        = string
  default     = "quay.io/kinvolk/bootkube"
}

variable "bootkube_image_tag" {
  description = "Docker image tag to use for rkt container running bootkube"
  type        = string
  default     = "v0.14.0-helm-ec64535-amd64"
}

# Duplicated variable.
variable "cluster_name" {
  description = "Cluster name"
  type        = string
}

variable "cluster_domain_suffix" {
  type        = string
  description = "Cluster domain suffix. Passed to kubelet as --cluster_domain flag."
  default     = "cluster.local"
}

# Required variables.
variable "ssh_keys" {
  type        = list(string)
  description = "List of SSH public keys for user `core`. Each element must be specified in a valid OpenSSH public key format, as defined in RFC 4253 Section 6.6, e.g. 'ssh-rsa AAAAB3N...'."
  default     = []
}

# Optional variables.
variable "count_index" {
  type        = number
  description = "Index passed as count.index from count on module."
}

variable "controllers_count" {
  type        = number
  description = "Number of controller nodes in the cluster."
}

variable "cluster_dns_service_ip" {
  type        = string
  description = "IP address of cluster DNS Service. Passed to kubelet as --cluster_dns parameter."
  default     = "10.3.0.10"
}

variable "clc_snippets" {
  type        = list(string)
  description = "Extra CLC snippets to include in the configuration."
  default     = []
}

variable "kubelet_image_name" {
  type        = string
  description = "Source of kubelet Docker image"
  default     = "quay.io/poseidon/kubelet"
}

variable "kubelet_image_tag" {
  type        = string
  description = "Tag for kubelet Docker image"
  default     = "v1.18.8"
}

variable "host_dns_ip" {
  type        = string
  description = "IP address of DNS server to configure on the nodes."
  default     = "8.8.8.8"
}

variable "apiserver" {
  type        = string
  description = "FQDN or IP address for kubelet to use for talking to Kubernetes API server."
}

variable "ca_cert" {
  type        = string
  description = "Kubernetes CA certificate in PEM format for bootstrap kubeconfig for kubelet."
}
16 changes: 16 additions & 0 deletions assets/terraform-modules/controller/versions.tf
@@ -0,0 +1,16 @@
# Terraform version and plugin versions

terraform {
  required_version = ">= 0.13"

  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "0.6.1"
    }
    template = {
      source  = "hashicorp/template"
      version = "2.1.2"
    }
  }
}
12 changes: 12 additions & 0 deletions assets/terraform-modules/node/README.md
@@ -0,0 +1,12 @@
# Node Terraform module

This Terraform module aims to be a base for the [worker](../worker) and [controller](../controller) Terraform
modules by providing the common parts of the Ignition configuration.

Those modules do not instantiate it. Instead, they link its files using symlinks, which avoids
deeply nested Terraform modules.

Additionally, it exposes various input variables, which allow adding worker- and controller-specific changes
to the configuration.

The main use case is providing extra snippets via the `clc_snippets` variable.
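
A minimal sketch of passing an extra snippet through `clc_snippets` is shown below, using the controller module because its `clc_snippets` input is visible in this commit. The snippet content, module path, and all input values are hypothetical. The snippet follows the same Container Linux Config layout as the templates in this commit.

```hcl
locals {
  # Hypothetical platform-specific snippet: writes a message-of-the-day file.
  motd_snippet = <<-EOF
    storage:
      files:
        - path: /etc/motd
          filesystem: root
          mode: 0644
          contents:
            inline: |
              Provisioned by Terraform.
  EOF
}

module "controller" {
  # Assumed relative path from the calling configuration to the module.
  source = "./assets/terraform-modules/controller"

  # Placeholder values for inputs without defaults.
  count_index       = 0
  controllers_count = 1
  cluster_name      = "demo"
  dns_zone          = "example.com"
  apiserver         = "demo.example.com"
  ca_cert           = file("${path.module}/ca.crt") # placeholder CA file

  # Extra snippets are appended to the module's own snippets via concat().
  clc_snippets = [local.motd_snippet]
}
```

Because the module appends `var.clc_snippets` to its built-in snippets before handing them to `ct_config`, a platform can add files or units without editing the shared templates.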