
Kubernetes v1.13 Installation

Set up a single-master Kubernetes cluster with kubeadm.

Planning

Servers

Machines:

hostname    ip             role
host01      10.11.11.11    master
host02      10.11.11.12    node1
host03      10.11.11.13    node2

Versions

Software versions:

  • OS: CentOS 7
  • Docker: 18.03.0-ce (installed in advance)
  • Kubernetes: 1.13.2

Components

master: apiserver, scheduler, controller-manager

node: kubelet, proxy

addons:

  • pod network: Flannel
  • dns: CoreDNS (enabled by default starting with v1.13)
  • ui: dashboard
  • monitor: Heapster
  • monitor: Prometheus + Grafana (TODO)

Networks

Cluster (service) network: 192.66.0.0/16

Pod network: 192.88.0.0/16

Installation prerequisites

Before you begin

firewall

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

selinux

Disable SELinux

setenforce 0

vi /etc/selinux/config
SELINUX=disabled
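
The same persistent change can be made in one line, as a sketch:

# set SELINUX=disabled in the config file (equivalent to the manual edit above)
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config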

swap

Disable swap

swapoff -a

Comment out the swap entry in /etc/fstab so swap stays off after reboot.
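
For example, a sketch that comments out every swap line in /etc/fstab:

# comment out all swap mount entries
sed -ri 's/^([^#].*\bswap\b.*)$/#\1/' /etc/fstab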

iptables Chain FORWARD

Run iptables --list and check that the policy of the FORWARD chain is ACCEPT. Starting with Docker 1.13, the default firewall rules changed: the FORWARD chain in the iptables filter table is set to DROP, which breaks pod-to-pod traffic across nodes in a Kubernetes cluster. If the policy is not ACCEPT, modify the docker service configuration on every Docker node:

Add to /usr/lib/systemd/system/docker.service:

ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT

Restart docker

systemctl daemon-reload
systemctl restart docker
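
To verify, the FORWARD chain policy should now read ACCEPT:

iptables -L FORWARD -n | head -1
# Chain FORWARD (policy ACCEPT)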

yum install

Add the Aliyun Kubernetes yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes-aliyun.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF

yum update

Check the available versions

yum list | grep kube

cockpit-kubernetes.x86_64               176-4.el7.centos                extras  
cri-tools.x86_64                        1.12.0-0                        kubernetes
kubeadm.x86_64                          1.13.2-0                        kubernetes
kubectl.x86_64                          1.13.2-0                        kubernetes
kubelet.x86_64                          1.13.2-0                        kubernetes
kubernetes.x86_64                       1.5.2-0.7.git269f928.el7        extras  
kubernetes-ansible.noarch               0.6.0-0.1.gitd65ebd5.el7        epel    
kubernetes-client.x86_64                1.5.2-0.7.git269f928.el7        extras  
kubernetes-cni.x86_64                   0.6.0-0                         kubernetes
kubernetes-master.x86_64                1.5.2-0.7.git269f928.el7        extras  
kubernetes-node.x86_64                  1.5.2-0.7.git269f928.el7        extras  
rkt.x86_64                              1.27.0-1                        kubernetes
rsyslog-mmkubernetes.x86_64             8.24.0-34.el7                   os      

Install on all servers

yum install -y kubelet-1.13.2 kubeadm-1.13.2 kubectl-1.13.2
systemctl enable kubelet && systemctl start kubelet

iptables bridge settings

There are reported issues on RHEL/CentOS 7 where iptables is bypassed for bridged traffic. Configure the bridge parameters so that traffic does not bypass iptables:

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

After this, kubelet will restart every few seconds in a crash loop; this is expected, as it is waiting for instructions from kubeadm.

cgroup

Make sure the cgroup driver used by kubelet matches the one used by Docker; kubelet defaults to cgroupfs.

Check Docker's cgroup driver: docker info | grep cgroup

Output: Cgroup Driver: cgroupfs

If it is not cgroupfs, follow the official documentation to align the two drivers.
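
One way to align them, as a hedged sketch (assuming you prefer to change Docker rather than kubelet), is to set Docker's cgroup driver explicitly in /etc/docker/daemon.json:

# make Docker use cgroupfs so it matches kubelet's default
cat <<EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
systemctl daemon-reload && systemctl restart docker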

docker images

List the images required by the specified Kubernetes version

kubeadm config images list --kubernetes-version v1.13.2

k8s.gcr.io/kube-apiserver:v1.13.2
k8s.gcr.io/kube-controller-manager:v1.13.2
k8s.gcr.io/kube-scheduler:v1.13.2
k8s.gcr.io/kube-proxy:v1.13.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.6

If the servers cannot reach k8s.gcr.io directly (blocked by the GFW), pull the images from an alternative registry on every machine and re-tag them:

docker pull mirrorgooglecontainers/kube-apiserver:v1.13.2
docker tag mirrorgooglecontainers/kube-apiserver:v1.13.2 k8s.gcr.io/kube-apiserver:v1.13.2

docker pull mirrorgooglecontainers/kube-controller-manager:v1.13.2
docker tag mirrorgooglecontainers/kube-controller-manager:v1.13.2 k8s.gcr.io/kube-controller-manager:v1.13.2

docker pull mirrorgooglecontainers/kube-scheduler:v1.13.2
docker tag mirrorgooglecontainers/kube-scheduler:v1.13.2 k8s.gcr.io/kube-scheduler:v1.13.2

docker pull mirrorgooglecontainers/kube-proxy:v1.13.2
docker tag mirrorgooglecontainers/kube-proxy:v1.13.2 k8s.gcr.io/kube-proxy:v1.13.2

docker pull mirrorgooglecontainers/pause:3.1
docker tag mirrorgooglecontainers/pause:3.1 k8s.gcr.io/pause:3.1

docker pull mirrorgooglecontainers/etcd:3.2.24
docker tag mirrorgooglecontainers/etcd:3.2.24 k8s.gcr.io/etcd:3.2.24

docker pull netonline/coredns:1.2.6
docker tag netonline/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6
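
The same pulls can be written as a short loop (a sketch, using the same mirror registries as above):

for img in kube-apiserver:v1.13.2 kube-controller-manager:v1.13.2 kube-scheduler:v1.13.2 \
           kube-proxy:v1.13.2 pause:3.1 etcd:3.2.24; do
  docker pull mirrorgooglecontainers/$img
  docker tag  mirrorgooglecontainers/$img k8s.gcr.io/$img
done
docker pull netonline/coredns:1.2.6
docker tag  netonline/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6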

Initialize the cluster

Run on the master server

Parameters supported by kubeadm

kubeadm init \
--kubernetes-version=v1.13.2 \
--service-cidr=192.66.0.0/16 \
--pod-network-cidr=192.88.0.0/16 \
--apiserver-advertise-address=10.11.11.11

The installation output looks like this:

[init] Using Kubernetes version: v1.13.2
[preflight] Running pre-flight checks
	[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.03.1-ce. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [host01 localhost] and IPs [10.11.11.11 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [host01 localhost] and IPs [10.11.11.11 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [192.66.0.1 10.11.11.11]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 20.001856 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "host01" as an annotation
[mark-control-plane] Marking the node host01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node host01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: gdh6ih.abw6qwc1ox16e7a3
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.11.11.11:6443 --token tvgqy4.dh2ocb4sanqwwfs8 --discovery-token-ca-cert-hash sha256:c113734c9333cf088b66c2058fd79925b765790f3420acdb36c490837975bc90

To run kubectl as a non-root user as well, execute the following:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Or, as root, run export KUBECONFIG=/etc/kubernetes/admin.conf

Check the cluster component status: kubectl get cs

NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok                   
controller-manager   Healthy   ok                   
etcd-0               Healthy   {"health": "true"}   
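
A quick sanity check (note: until the Flannel add-on below is applied, the master node usually reports NotReady, which is expected):

kubectl get nodes    # expect host01 with STATUS NotReady until the pod network is installed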

Install the Flannel add-on

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

In kube-flannel.yml, delete all kind: DaemonSet entries except the amd64 one.

In that entry, change the image quay.io/coreos/flannel:v0.10.0-amd64 (not reachable without going through the GFW) to:

jmgao1983/flannel:v0.10.0-amd64 (in both places)

Also change "Network": "10.244.0.0/16" to "Network": "192.88.0.0/16" so it matches --pod-network-cidr; a sed sketch of these edits follows below.

Create it: kubectl create -f kube-flannel.yml

Check it: kubectl get daemonset --all-namespaces

Add worker nodes

On the master, run kubectl get pod --all-namespaces -o wide and confirm that all pods are Running.

Create a new token on the master

kubeadm token create

inbhxg.5v4valxaaa0ggv4w

Get the hash of the CA certificate on the master

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null |  openssl dgst -sha256 -hex | sed 's/^.* //'

c113734c9333cf088b66c2058fd79925b765790f3420acdb36c490837975bc90
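
Alternatively (a shortcut), kubeadm can print a ready-to-use join command that already includes the token and hash:

kubeadm token create --print-join-command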

On each node, run the join command

kubeadm join 10.11.11.11:6443 --token ujpk9a.e4md2z6c6qm9up3a --discovery-token-ca-cert-hash sha256:c113734c9333cf088b66c2058fd79925b765790f3420acdb36c490837975bc90

Check the nodes on the master

kubectl get nodes

NAME     STATUS   ROLES    AGE   VERSION
host02   Ready    <none>   17m   v1.13.2
host01   Ready    master   44m   v1.13.2
host03   Ready    <none>   13s   v1.13.2

Test DNS

kubectl run curl --image=radial/busyboxplus:curl -it

Inside the container, run

nslookup kubernetes.default

Server:    192.66.0.10
Address 1: 192.66.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 192.66.0.1 kubernetes.default.svc.cluster.local

Install the dashboard add-on

Official installation documentation

wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

The image k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 is not reachable without going through the GFW; change the image in the yaml to:

image: siriuszg/kubernetes-dashboard-amd64:v1.10.1

Expose the port (change the Service to NodePort):

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard

Create a new file kubernetes-dashboard-admin.rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-admin
  namespace: kube-system

---

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard-admin
    namespace: kube-system

kubectl create -f .

Open https://NODEIP:30001

Get the login token: kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')

Log in with the token.

dashboard

Install the Heapster add-on (deprecated)

Heapster was deprecated in Kubernetes v1.11 and is only recommended for clusters running v1.7 or earlier.

Heapster Deprecation Timeline

To get graphical cluster metrics in the dashboard, install the Heapster add-on.

Download the yaml files

wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/grafana.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml

Replace the images:

k8s.gcr.io/heapster-grafana-amd64:v5.0.4   ->  fishchen/heapster-grafana-amd64:v5.0.4
k8s.gcr.io/heapster-amd64:v1.5.4           ->  fishchen/heapster-amd64:v1.5.4
k8s.gcr.io/heapster-influxdb-amd64:v1.5.2  ->  fishchen/heapster-influxdb-amd64:v1.5.2
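
A sed sketch of these substitutions (assuming each image appears in the correspondingly named yaml file downloaded above; verify afterwards):

sed -i 's#k8s.gcr.io/heapster-grafana-amd64:v5.0.4#fishchen/heapster-grafana-amd64:v5.0.4#' grafana.yaml
sed -i 's#k8s.gcr.io/heapster-amd64:v1.5.4#fishchen/heapster-amd64:v1.5.4#' heapster.yaml
sed -i 's#k8s.gcr.io/heapster-influxdb-amd64:v1.5.2#fishchen/heapster-influxdb-amd64:v1.5.2#' influxdb.yaml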

Heapster errors

Error in scraping containers from kubelet:10.11.11.13:10255: failed to get all container stats from Kubelet URL "http://10.11.11.13:10255/stats/container/": Post http://10.11.11.13:10255/stats/container/: dial tcp 10.11.11.13:10255: getsockopt: connection refused

Modify heapster.yaml:

#- --source=kubernetes:https://kubernetes.default
- --source=kubernetes.summary_api:''?kubeletPort=10250&kubeletHttps=true&insecure=true

Apply the update: kubectl apply -f .

Another error:

Error in scraping containers from kubelet_summary:10.11.11.11:10250: request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=get, resource=nodes, subresource=stats)"

The system:heapster ClusterRole lacks the nodes/stats permission; as a shortcut, bind the heapster ServiceAccount to the system:kubelet-api-admin ClusterRole instead.

Modify heapster-rbac.yaml:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubelet-api-admin
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system

Delete and recreate:

kubectl delete -f .
kubectl create -f .

Graphical resource usage now shows up in the dashboard.

heapster

High availability

The official documentation describes two HA topologies: stacked control-plane nodes (etcd co-located with the masters) and an external etcd cluster.

A default kubeadm installation runs only a single-node etcd and a stateless kube-apiserver.

Stacked etcd topology:

stacked etcd

External etcd topology:

external etcd

Also note that kube-dns/CoreDNS should run more than one replica; a single replica is a single point of failure.
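
For example (a hedged sketch, assuming the default coredns Deployment name in kube-system):

# run two CoreDNS replicas to avoid a DNS single point of failure
kubectl -n kube-system scale deployment coredns --replicas=2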

Additional notes

Reset the installation

Use kubeadm reset to reset the installation environment, then clean up:

# clean up the cni and flannel interfaces
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/

Flush iptables

iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat

Remove the bridge interface

ip link del flannel.1

Restart the docker service: systemctl restart docker

Remove a node

Run on the master:

kubectl get nodes

NAME                     STATUS     ROLES    AGE     VERSION
host02   Ready      <none>   8m35s   v1.13.2
host01   NotReady   master   9m29s   v1.13.2

Drain and delete the node:

kubectl drain host02 --delete-local-data --force --ignore-daemonsets
kubectl delete node host02

Run on the node:

kubeadm reset

ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/