RKE cluster is stuck in "Removing member from etcd cluster" when removing/adding a node #3354

Closed
vynguyenchantal opened this issue Sep 6, 2023 · 5 comments

vynguyenchantal commented Sep 6, 2023

RKE version: 1.3.15

Docker version: (docker version,docker info preferred)

docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-208.git7d71120.el7_9.x86_64
 Go version:      go1.10.3
 Git commit:      7d71120/1.13.1
 Built:           Fri Jun  4 10:20:12 2021
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-208.git7d71120.el7_9.x86_64
 Go version:      go1.10.3
 Git commit:      7d71120/1.13.1
 Built:           Fri Jun  4 10:20:12 2021
 OS/Arch:         linux/amd64
 Experimental:    false

docker info
Containers: 49
 Running: 23
 Paused: 0
 Stopped: 26
Images: 21
Server Version: 1.13.1
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Init Binary: /usr/libexec/docker/docker-init-current
containerd version:  (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: 66aedde759f33c190954815fb765eedc1d782dd9 (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: fec3683b971d9c3ef73f284f176672c44b448662 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  WARNING: You're not using the default seccomp profile
  Profile: /etc/docker/seccomp.json
 selinux
Kernel Version: 3.10.0-1160.76.1.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.9 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 2
Total Memory: 15.22 GiB
Name: ip-10-15-1-61
ID: GYON:UNLC:RKKM:ZFAH:5VAC:2GSD:WEJ4:67MJ:SFME:GYZU:WHMS:UY7V
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.access.redhat.com/v1/
Experimental: false
Insecure Registries:
 harbor.gimsd6.internal.udev.nga.mil:5000
 127.0.0.0/8
Live Restore Enabled: false
Registries: registry.access.redhat.com (secure), registry.redhat.io (secure), docker.io (secure), docker.io (secure)

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.9 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"

uname -r
3.10.0-1160.76.1.el7.x86_64

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) AWS

cluster.yml file:

nodes:
- address: x.x.x.x
  user: rke-user
  role: [ "controlplane", "etcd", "worker" ]
  ssh_key_path: /root/myenv/rkeuser.pem
- address: x.x.x.x
  user: rke-user
  role: [ "controlplane", "etcd", "worker" ]
  ssh_key_path: /root/myenv/rkeuser.pem
- address: x.x.x.x
  user: rke-user
  role: [ "controlplane", "etcd", "worker" ]
  ssh_key_path: /root/myenv/rkeuser.pem

addon_job_timeout: 30

cluster_name: local-rancher

enable_cri_dockerd: false

ssh_agent_auth: false

kubernetes_version: v1.23.10-rancher1-1

ignore_docker_version: true

private_registries:
  - url: x.x.x.x:xxxx
    is_default: true

authentication:
  type: /v3/schemas/authnConfig
  strategy: "x509"
  sans:
    - "x.x.x.x"

authorization:
  type: /v3/schemas/authzConfig
  mode: rbac

dns:
  provider: coredns
  upstreamnameservers: [x.x.x.x, x.x.x.x]
  update_strategy:
    strategy: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 25%

ingress:
  provider: "nginx"
  options:
    use-forwarded-headers: "true"
  extra_args:
    default-ssl-certificate: "cattle-system/tls-rancher-ingress"
  update_strategy:
    strategy: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1

monitoring:
  provider: metrics-server
  update_strategy:
    strategy: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 25%

network:
  plugin: calico
  update_strategy:
    strategy: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1

# Configuration for the essential components of Kubernetes.
services:
  etcd:
    backup_config:
      enabled: true     # enables recurring etcd snapshots
      interval_hours: 6 # time increment between snapshots
      retention: 30     # time in days before snapshot purge
      s3backupconfig:
         bucket_name: "myenv-rancher"
         endpoint: "s3-us-gov-west-1.amazonaws.com"
         region: "us-gov-west-1"
         folder: "myenv/local-cluster/etcd-snapshots"
         custom_ca: "xxx"
    extra_args:
      election-timeout: "5000"
      heartbeat-interval: "500"

  kube-api:
    pod_security_policy: false
    extra_args:
      external-hostname: x.x.x.x
  kubelet:
    fail_swap_on: false
    extra_args:
      kube-reserved: "cpu=500m,memory=1Gi"
      system-reserved: "cpu=500m,memory=4Gi"

upgrade_strategy:
  max_unavailable_controlplane: 1
  drain: true
  node_drain_input:
    force: true
    ignore_daemonsets: true
    delete_local_data: true
    grace_period: -1
    timeout: 900

addons: |-
  ---
  apiVersion: v1
  kind: Namespace
  metadata:
    name: cattle-system
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: tls-rancher-ingress
    namespace: cattle-system
  type: Opaque
  data:
    tls.crt: xxx
    tls.key: xxx
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: tls-ca
    namespace: cattle-system
  type: Opaque
  data:
    cacerts.pem: xxx

Steps to Reproduce: See rancher/rancher#42684
We need to roll a new AMI through the local cluster, one node at a time. This is what I did:

  1. Provision a NEW EC2 instance
  2. Update cluster.yaml, removing the OLD node and adding the NEW node
  3. Run rke up --config cluster.yaml --update-only
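
Step 2 above amounts to swapping one entry in the nodes list. Here is a minimal Python sketch of that edit, assuming the config has already been loaded into a dict (for example with a YAML library). The swap_node helper and the 10.0.0.x addresses are hypothetical illustrations, not part of RKE or of the reporter's tooling:

```python
from copy import deepcopy


def swap_node(config, old_address, new_node):
    """Return a copy of a cluster config dict with one node replaced.

    Hypothetical helper: `config` mirrors the structure of cluster.yaml
    (a top-level "nodes" list of dicts with an "address" key).
    """
    nodes = config.get("nodes", [])
    if not any(n["address"] == old_address for n in nodes):
        raise ValueError(f"{old_address} is not in the nodes list")
    if any(n["address"] == new_node["address"] for n in nodes):
        raise ValueError(f"{new_node['address']} is already in the nodes list")

    updated = deepcopy(config)  # leave the caller's config untouched
    updated["nodes"] = [n for n in updated["nodes"] if n["address"] != old_address]
    updated["nodes"].append(deepcopy(new_node))
    return updated


# Placeholder addresses; the real ones are redacted in the issue.
ROLES = ["controlplane", "etcd", "worker"]
config = {
    "nodes": [
        {"address": "10.0.0.1", "user": "rke-user", "role": ROLES},
        {"address": "10.0.0.2", "user": "rke-user", "role": ROLES},
        {"address": "10.0.0.3", "user": "rke-user", "role": ROLES},
    ]
}

updated = swap_node(
    config,
    old_address="10.0.0.3",
    new_node={"address": "10.0.0.4", "user": "rke-user", "role": ROLES},
)
print([n["address"] for n in updated["nodes"]])
```

The result would then be written back to cluster.yaml before running step 3. Swapping a single node per run leaves two of the three original etcd members in place during each change.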

Results: See rancher/rancher#42684

  1. When it works, the OLD node is removed from the cluster and the NEW node is added:
TestHelper :: INFO    :: 2023-09-07 16:07:59,019 :: Generating rancher-cluster.yaml
TestHelper :: INFO    :: 2023-09-07 16:07:59,019 :: Performing the 'rke up' command to update the RKE-provisioned Rancher 'local' cluster.
time="2023-09-07T16:07:59Z" level=info msg="Running RKE version: v1.3.15"
time="2023-09-07T16:07:59Z" level=info msg="Initiating Kubernetes cluster"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Generating Kubernetes API server certificates"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Generating admin certificates and kubeconfig"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-61 certificate and key"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-11 certificate and key"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-155 certificate and key"
time="2023-09-07T16:07:59Z" level=info msg="[certificates] Deleting unused certificate: kube-etcd-x-x-x-197"
time="2023-09-07T16:07:59Z" level=info msg="Successfully Deployed state file at [/root/myenv/rancher-cluster.rkestate]"
time="2023-09-07T16:07:59Z" level=info msg="Building Kubernetes cluster"
time="2023-09-07T16:07:59Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.155]"
time="2023-09-07T16:07:59Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.61]"
time="2023-09-07T16:07:59Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.11]"
time="2023-09-07T16:07:59Z" level=info msg="[network] Deploying port listener containers"
time="2023-09-07T16:07:59Z" level=info msg="Pulling image [harbor:5000/rancher/rke-tools:v0.1.87] on host [x.x.x.11], try #1"
time="2023-09-07T16:07:59Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:07:59Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:07:59Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:07:59Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:05Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:06Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:07Z" level=info msg="[network] Successfully started [rke-etcd-port-listener] container on host [x.x.x.11]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:07Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:07Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:07Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:07Z" level=info msg="[network] Successfully started [rke-cp-port-listener] container on host [x.x.x.11]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:07Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:07Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:08Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:08Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Successfully started [rke-worker-port-listener] container on host [x.x.x.11]"
time="2023-09-07T16:08:08Z" level=info msg="[network] Port listener containers deployed successfully"
time="2023-09-07T16:08:08Z" level=info msg="[network] Running etcd <-> etcd port checks"
time="2023-09-07T16:08:08Z" level=info msg="[network] Checking if host [x.x.x.61] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:08:08Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:08Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:08Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:08Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:08Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:08Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:08:08Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:08:08Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:08Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.61]"
time="2023-09-07T16:08:09Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Running control plane -> etcd port checks"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.61] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.61]"
time="2023-09-07T16:08:09Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:08:09Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:08:09Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Running control plane -> worker port checks"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.61] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:08:09Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:09Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:09Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.61]"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Running workers -> control plane port checks"
time="2023-09-07T16:08:10Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Checking if host [x.x.x.61] can connect to host(s) [x.x.x.61 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:10Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:10Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:10Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.61]"
time="2023-09-07T16:08:10Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[network] Checking KubeAPI port Control Plane hosts"
time="2023-09-07T16:08:10Z" level=info msg="[network] Removing port listener containers"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:10Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:10Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:08:10Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="[network] Port listener containers removed successfully"
time="2023-09-07T16:08:11Z" level=info msg="[selinux] Checking if host [x.x.x.61] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:08:11Z" level=info msg="[selinux] Checking if host [x.x.x.11] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:08:11Z" level=info msg="[selinux] Checking if host [x.x.x.155] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-selinux-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Removing container [rke-selinux-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-selinux-checker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:08:11Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:11Z" level=info msg="[remove/rke-selinux-checker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:08:11Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:11Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:11Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:11Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:08:12Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.155]"
time="2023-09-07T16:08:12Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.155]"
time="2023-09-07T16:08:12Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.155]"
time="2023-09-07T16:08:12Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:08:12Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.61]"
time="2023-09-07T16:08:12Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.61]"
time="2023-09-07T16:08:12Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.61]"
time="2023-09-07T16:08:12Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.61]: stderr: [], stdout: []"
time="2023-09-07T16:08:13Z" level=info msg="[certificates] kube-apiserver certificate changed, force deploying certs"
time="2023-09-07T16:08:13Z" level=info msg="[certificates] Deploying kubernetes certificates to Cluster nodes"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:13Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:13Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:13Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:13Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:18Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:18Z" level=info msg="[reconcile] Rebuilding and updating local kube config"
time="2023-09-07T16:08:18Z" level=info msg="Successfully Deployed local admin kubeconfig at [/root/myenv/kube_config_rancher-cluster.yaml]"
time="2023-09-07T16:08:18Z" level=info msg="[reconcile] host [x.x.x.61] is a control plane node with reachable Kubernetes API endpoint in the cluster"
time="2023-09-07T16:08:18Z" level=info msg="[certificates] Successfully deployed kubernetes certificates to Cluster nodes"
time="2023-09-07T16:08:18Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.61]"
time="2023-09-07T16:08:18Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:08:19Z" level=info msg="Starting container [file-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:19Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.61]"
time="2023-09-07T16:08:19Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.61]"
time="2023-09-07T16:08:19Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.61]"
time="2023-09-07T16:08:19Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.61]: stderr: [], stdout: []"
time="2023-09-07T16:08:20Z" level=info msg="Removing container [file-deployer] on host [x.x.x.61], try #1"
time="2023-09-07T16:08:20Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:08:20Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.11]"
time="2023-09-07T16:08:20Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:20Z" level=info msg="Starting container [file-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:21Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.11]"
time="2023-09-07T16:08:21Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:21Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:21Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:08:22Z" level=info msg="Removing container [file-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:22Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:22Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.155]"
time="2023-09-07T16:08:22Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:08:22Z" level=info msg="Starting container [file-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:22Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.155]"
time="2023-09-07T16:08:22Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.155]"
time="2023-09-07T16:08:22Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.155]"
time="2023-09-07T16:08:22Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:08:23Z" level=info msg="Removing container [file-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:08:23Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:08:23Z" level=info msg="[/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes"
time="2023-09-07T16:08:23Z" level=info msg="[reconcile] Reconciling cluster state"
time="2023-09-07T16:08:23Z" level=info msg="[reconcile] Check etcd hosts to be deleted"
time="2023-09-07T16:08:23Z" level=info msg="[remove/etcd] Removing member [etcd-x.x.x.197] from etcd cluster"
time="2023-09-07T16:08:23Z" level=info msg="[remove/etcd] Checking etcd cluster health on [etcd-x.x.x.61] after removing [etcd-x.x.x.197]"
time="2023-09-07T16:08:31Z" level=info msg="[etcd] etcd host [x.x.x.61] reported healthy=true"
time="2023-09-07T16:08:31Z" level=info msg="[remove/etcd] etcd cluster health is healthy on [etcd-x.x.x.61] after removing [etcd-x.x.x.197]"
time="2023-09-07T16:08:31Z" level=info msg="[remove/etcd] Successfully removed member [etcd-x.x.x.197] from etcd cluster"
time="2023-09-07T16:08:31Z" level=info msg="[hosts] host [x.x.x.197] has another role, skipping delete from kubernetes cluster"
time="2023-09-07T16:08:31Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.197]"
time="2023-09-07T16:08:31Z" level=info msg="[etcd] Tearing down etcd plane.."
time="2023-09-07T16:08:31Z" level=info msg="Removing container [etcd] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:31Z" level=info msg="[remove/etcd] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:32Z" level=info msg="Removing container [etcd-rolling-snapshots] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:32Z" level=info msg="[remove/etcd-rolling-snapshots] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:32Z" level=info msg="[etcd] Successfully tore down etcd plane.."
time="2023-09-07T16:08:32Z" level=info msg="[hosts] Host [x.x.x.197] is already a worker or control host, skipping cleanup certs."
time="2023-09-07T16:08:32Z" level=info msg="[hosts] Cleaning up host [x.x.x.197]"
time="2023-09-07T16:08:32Z" level=info msg="[hosts] Running cleaner container on host [x.x.x.197]"
time="2023-09-07T16:08:32Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.197]"
time="2023-09-07T16:08:32Z" level=info msg="Starting container [kube-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:33Z" level=info msg="[kube-cleaner] Successfully started [kube-cleaner] container on host [x.x.x.197]"
time="2023-09-07T16:08:33Z" level=info msg="Waiting for [kube-cleaner] container to exit on host [x.x.x.197]"
time="2023-09-07T16:08:33Z" level=info msg="Container [kube-cleaner] is still running on host [x.x.x.197]: stderr: [], stdout: []"
time="2023-09-07T16:08:34Z" level=info msg="[hosts] Removing cleaner container on host [x.x.x.197]"
time="2023-09-07T16:08:34Z" level=info msg="Removing container [kube-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:34Z" level=info msg="[hosts] Removing dead container logs on host [x.x.x.197]"
time="2023-09-07T16:08:34Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.197]"
time="2023-09-07T16:08:34Z" level=info msg="Starting container [rke-log-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:34Z" level=info msg="[cleanup] Successfully started [rke-log-cleaner] container on host [x.x.x.197]"
time="2023-09-07T16:08:34Z" level=info msg="Removing container [rke-log-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:35Z" level=info msg="[remove/rke-log-cleaner] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:35Z" level=info msg="[hosts] Successfully cleaned up host [x.x.x.197]"
time="2023-09-07T16:08:35Z" level=info msg="[reconcile] Check etcd hosts to be added"
time="2023-09-07T16:08:35Z" level=info msg="[add/etcd] Adding member [etcd-x.x.x.11] to etcd cluster"
time="2023-09-07T16:08:35Z" level=info msg="[add/etcd] Successfully Added member [etcd-x.x.x.11] to etcd cluster"
time="2023-09-07T16:08:35Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:35Z" level=info msg="Starting container [etcd-fix-perm] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:36Z" level=info msg="Successfully started [etcd-fix-perm] container on host [x.x.x.11]"
time="2023-09-07T16:08:36Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:36Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.11]"
time="2023-09-07T16:08:36Z" level=info msg="Container [etcd-fix-perm] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:08:37Z" level=info msg="Removing container [etcd-fix-perm] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:37Z" level=info msg="[remove/etcd-fix-perm] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:37Z" level=info msg="Pulling image [harbor:5000/rancher/mirrored-coreos-etcd:v3.5.3] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:41Z" level=info msg="Image [harbor:5000/rancher/mirrored-coreos-etcd:v3.5.3] exists on host [x.x.x.11]"
time="2023-09-07T16:08:42Z" level=info msg="Starting container [etcd] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:42Z" level=info msg="[etcd] Successfully started [etcd] container on host [x.x.x.11]"
time="2023-09-07T16:08:42Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:08:42Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:43Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:08:43Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:08:43Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:08:53Z" level=info msg="[etcd] etcd host [x.x.x.61] reported healthy=true"
time="2023-09-07T16:08:53Z" level=info msg="[hosts] host [x.x.x.197] has another role, skipping delete from kubernetes cluster"
time="2023-09-07T16:08:53Z" level=info msg="[worker] Tearing down Worker Plane.."
time="2023-09-07T16:08:53Z" level=info msg="[worker] Host [x.x.x.197] is already a controlplane host, nothing to do."
time="2023-09-07T16:08:53Z" level=info msg="[worker] Successfully tore down Worker Plane.."
time="2023-09-07T16:08:53Z" level=info msg="[hosts] Host [x.x.x.197] is already a controlplane or etcd host, skipping cleanup."
time="2023-09-07T16:08:53Z" level=info msg="[hosts] Cordoning host [x.x.x.197]"
time="2023-09-07T16:08:53Z" level=info msg="[hosts] Deleting host [x.x.x.197] from the cluster"
time="2023-09-07T16:08:53Z" level=info msg="[hosts] Successfully deleted host [x.x.x.197] from the cluster"
time="2023-09-07T16:08:53Z" level=info msg="[controlplane] Tearing down the Controller Plane.."
time="2023-09-07T16:08:53Z" level=info msg="Removing container [kube-apiserver] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:53Z" level=info msg="[remove/kube-apiserver] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:53Z" level=info msg="Removing container [kube-controller-manager] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:53Z" level=info msg="[remove/kube-controller-manager] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:53Z" level=info msg="Removing container [kube-scheduler] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:53Z" level=info msg="[remove/kube-scheduler] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:53Z" level=info msg="Removing container [kubelet] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:54Z" level=info msg="[remove/kubelet] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="Removing container [kube-proxy] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:54Z" level=info msg="[remove/kube-proxy] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="Removing container [service-sidekick] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:54Z" level=info msg="[remove/service-sidekick] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="[controlplane] Successfully tore down Controller Plane.."
time="2023-09-07T16:08:54Z" level=info msg="[hosts] Cleaning up host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="[hosts] Running cleaner container on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="Starting container [kube-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:54Z" level=info msg="[kube-cleaner] Successfully started [kube-cleaner] container on host [x.x.x.197]"
time="2023-09-07T16:08:54Z" level=info msg="Waiting for [kube-cleaner] container to exit on host [x.x.x.197]"
time="2023-09-07T16:08:55Z" level=info msg="Container [kube-cleaner] is still running on host [x.x.x.197]: stderr: [], stdout: []"
time="2023-09-07T16:08:56Z" level=info msg="[hosts] Removing cleaner container on host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="Removing container [kube-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:56Z" level=info msg="[hosts] Removing dead container logs on host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="Starting container [rke-log-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:56Z" level=info msg="[cleanup] Successfully started [rke-log-cleaner] container on host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="Removing container [rke-log-cleaner] on host [x.x.x.197], try #1"
time="2023-09-07T16:08:56Z" level=info msg="[remove/rke-log-cleaner] Successfully removed container on host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="[hosts] Successfully cleaned up host [x.x.x.197]"
time="2023-09-07T16:08:56Z" level=info msg="[reconcile] Rebuilding and updating local kube config"
time="2023-09-07T16:08:56Z" level=info msg="Successfully Deployed local admin kubeconfig at [/root/myenv/kube_config_rancher-cluster.yaml]"
time="2023-09-07T16:08:56Z" level=info msg="[reconcile] host [x.x.x.61] is a control plane node with reachable Kubernetes API endpoint in the cluster"
time="2023-09-07T16:08:56Z" level=info msg="Restarting container [kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:09:02Z" level=info msg="[restart/kube-apiserver] Successfully restarted container on host [x.x.x.155]"
time="2023-09-07T16:09:02Z" level=info msg="Restarting container [kube-controller-manager] on host [x.x.x.155], try #1"
time="2023-09-07T16:09:02Z" level=info msg="[restart/kube-controller-manager] Successfully restarted container on host [x.x.x.155]"
time="2023-09-07T16:09:02Z" level=info msg="Restarting container [kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:08Z" level=info msg="[restart/kube-apiserver] Successfully restarted container on host [x.x.x.61]"
time="2023-09-07T16:09:08Z" level=info msg="Restarting container [kube-controller-manager] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:08Z" level=info msg="[restart/kube-controller-manager] Successfully restarted container on host [x.x.x.61]"
time="2023-09-07T16:09:08Z" level=info msg="Restarting container [etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:10Z" level=info msg="[restart/etcd] Successfully restarted container on host [x.x.x.61]"
time="2023-09-07T16:09:10Z" level=info msg="Restarting container [etcd] on host [x.x.x.11], try #1"
time="2023-09-07T16:09:11Z" level=info msg="[restart/etcd] Successfully restarted container on host [x.x.x.11]"
time="2023-09-07T16:09:11Z" level=info msg="Restarting container [etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:09:13Z" level=info msg="[restart/etcd] Successfully restarted container on host [x.x.x.155]"
time="2023-09-07T16:09:13Z" level=info msg="[reconcile] Reconciled cluster state successfully"
time="2023-09-07T16:09:13Z" level=info msg="max_unavailable_worker got rounded down to 0, resetting to 1"
time="2023-09-07T16:09:13Z" level=info msg="Setting maxUnavailable for worker nodes to: 1"
time="2023-09-07T16:09:13Z" level=info msg="Setting maxUnavailable for controlplane nodes to: 1"
time="2023-09-07T16:09:13Z" level=info msg="Pre-pulling kubernetes images"
time="2023-09-07T16:09:13Z" level=info msg="Pulling image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] on host [x.x.x.11], try #1"
time="2023-09-07T16:09:13Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.155]"
time="2023-09-07T16:09:13Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.61]"
time="2023-09-07T16:09:47Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:09:47Z" level=info msg="Kubernetes images pulled successfully"
time="2023-09-07T16:09:47Z" level=info msg="[etcd] Building up etcd plane.."
time="2023-09-07T16:09:47Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:09:47Z" level=info msg="Starting container [etcd-fix-perm] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:48Z" level=info msg="Successfully started [etcd-fix-perm] container on host [x.x.x.61]"
time="2023-09-07T16:09:48Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.61]"
time="2023-09-07T16:09:48Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.61]"
time="2023-09-07T16:09:48Z" level=info msg="Container [etcd-fix-perm] is still running on host [x.x.x.61]: stderr: [], stdout: []"
time="2023-09-07T16:09:49Z" level=info msg="Removing container [etcd-fix-perm] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:49Z" level=info msg="[remove/etcd-fix-perm] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:09:49Z" level=info msg="Finding container [etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:49Z" level=info msg="Image [harbor:5000/rancher/mirrored-coreos-etcd:v3.5.3] exists on host [x.x.x.61]"
time="2023-09-07T16:09:49Z" level=info msg="Finding container [old-etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:49Z" level=info msg="Stopping container [etcd] on host [x.x.x.61] with stopTimeoutDuration [5s], try #1"
time="2023-09-07T16:09:50Z" level=info msg="Waiting for [etcd] container to exit on host [x.x.x.61]"
time="2023-09-07T16:09:50Z" level=info msg="Renaming container [etcd] to [old-etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:50Z" level=info msg="Starting container [etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:50Z" level=info msg="[etcd] Successfully updated [etcd] container on host [x.x.x.61]"
time="2023-09-07T16:09:50Z" level=info msg="Removing container [old-etcd] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:50Z" level=info msg="[etcd] Snapshots configured to S3 compatible backend at [s3-us-gov-west-1.amazonaws.com] to bucket [myenv-rancher] using accesskey [] and using region [us-gov-west-1] and using endpoint CA [xxx] and using folder [myenv/local-cluster/etcd-snapshots]"
time="2023-09-07T16:09:50Z" level=info msg="[etcd] Running rolling snapshot container [etcd-snapshot-once] on host [x.x.x.61]"
time="2023-09-07T16:09:50Z" level=info msg="Removing container [etcd-rolling-snapshots] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:51Z" level=info msg="[remove/etcd-rolling-snapshots] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:09:51Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:09:51Z" level=info msg="Starting container [etcd-rolling-snapshots] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:52Z" level=info msg="[etcd] Successfully started [etcd-rolling-snapshots] container on host [x.x.x.61]"
time="2023-09-07T16:09:57Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:09:57Z" level=info msg="Starting container [rke-bundle-cert] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:57Z" level=info msg="[certificates] Successfully started [rke-bundle-cert] container on host [x.x.x.61]"
time="2023-09-07T16:09:57Z" level=info msg="Waiting for [rke-bundle-cert] container to exit on host [x.x.x.61]"
time="2023-09-07T16:09:57Z" level=info msg="Container [rke-bundle-cert] is still running on host [x.x.x.61]: stderr: [], stdout: []"
time="2023-09-07T16:09:58Z" level=info msg="[certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [x.x.x.61]"
time="2023-09-07T16:09:58Z" level=info msg="Removing container [rke-bundle-cert] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:58Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:09:59Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:59Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.61]"
time="2023-09-07T16:09:59Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:09:59Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:09:59Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:09:59Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:10:00Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.61]"
time="2023-09-07T16:10:00Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:10:00Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:10:00Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:10:04Z" level=info msg="Starting container [etcd-fix-perm] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:05Z" level=info msg="Successfully started [etcd-fix-perm] container on host [x.x.x.11]"
time="2023-09-07T16:10:05Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.11]"
time="2023-09-07T16:10:05Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.11]"
time="2023-09-07T16:10:05Z" level=info msg="Container [etcd-fix-perm] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:10:06Z" level=info msg="Removing container [etcd-fix-perm] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:06Z" level=info msg="[remove/etcd-fix-perm] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:10:06Z" level=info msg="[etcd] Snapshots configured to S3 compatible backend at [s3-us-gov-west-1.amazonaws.com] to bucket [myenv-rancher] using accesskey [] and using region [us-gov-west-1] and using endpoint CA [xxx] and using folder [myenv/local-cluster/etcd-snapshots]"
time="2023-09-07T16:10:06Z" level=info msg="[etcd] Running rolling snapshot container [etcd-snapshot-once] on host [x.x.x.11]"
time="2023-09-07T16:10:06Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:10:06Z" level=info msg="Starting container [etcd-rolling-snapshots] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:06Z" level=info msg="[etcd] Successfully started [etcd-rolling-snapshots] container on host [x.x.x.11]"
time="2023-09-07T16:10:11Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:10:12Z" level=info msg="Starting container [rke-bundle-cert] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:12Z" level=info msg="[certificates] Successfully started [rke-bundle-cert] container on host [x.x.x.11]"
time="2023-09-07T16:10:12Z" level=info msg="Waiting for [rke-bundle-cert] container to exit on host [x.x.x.11]"
time="2023-09-07T16:10:12Z" level=info msg="Container [rke-bundle-cert] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:10:13Z" level=info msg="[certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [x.x.x.11]"
time="2023-09-07T16:10:13Z" level=info msg="Removing container [rke-bundle-cert] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:13Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:10:14Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:14Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:10:14Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:14Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:10:14Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:10:14Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:15Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:10:15Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:10:15Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:10:15Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:10:15Z" level=info msg="Starting container [etcd-fix-perm] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:16Z" level=info msg="Successfully started [etcd-fix-perm] container on host [x.x.x.155]"
time="2023-09-07T16:10:16Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.155]"
time="2023-09-07T16:10:16Z" level=info msg="Waiting for [etcd-fix-perm] container to exit on host [x.x.x.155]"
time="2023-09-07T16:10:16Z" level=info msg="Container [etcd-fix-perm] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:10:17Z" level=info msg="Removing container [etcd-fix-perm] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:17Z" level=info msg="[remove/etcd-fix-perm] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:10:17Z" level=info msg="Finding container [etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:17Z" level=info msg="Image [harbor:5000/rancher/mirrored-coreos-etcd:v3.5.3] exists on host [x.x.x.155]"
time="2023-09-07T16:10:17Z" level=info msg="Finding container [old-etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:17Z" level=info msg="Stopping container [etcd] on host [x.x.x.155] with stopTimeoutDuration [5s], try #1"
time="2023-09-07T16:10:18Z" level=info msg="Waiting for [etcd] container to exit on host [x.x.x.155]"
time="2023-09-07T16:10:18Z" level=info msg="Renaming container [etcd] to [old-etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:18Z" level=info msg="Starting container [etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:18Z" level=info msg="[etcd] Successfully updated [etcd] container on host [x.x.x.155]"
time="2023-09-07T16:10:18Z" level=info msg="Removing container [old-etcd] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:18Z" level=info msg="[etcd] Snapshots configured to S3 compatible backend at [s3-us-gov-west-1.amazonaws.com] to bucket [myenv-rancher] using accesskey [] and using region [us-gov-west-1] and using endpoint CA [xxx] and using folder [myenv/local-cluster/etcd-snapshots]"
time="2023-09-07T16:10:18Z" level=info msg="[etcd] Running rolling snapshot container [etcd-snapshot-once] on host [x.x.x.155]"
time="2023-09-07T16:10:18Z" level=info msg="Removing container [etcd-rolling-snapshots] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:19Z" level=info msg="[remove/etcd-rolling-snapshots] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:10:19Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:10:19Z" level=info msg="Starting container [etcd-rolling-snapshots] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:19Z" level=info msg="[etcd] Successfully started [etcd-rolling-snapshots] container on host [x.x.x.155]"
time="2023-09-07T16:10:24Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:10:25Z" level=info msg="Starting container [rke-bundle-cert] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:25Z" level=info msg="[certificates] Successfully started [rke-bundle-cert] container on host [x.x.x.155]"
time="2023-09-07T16:10:25Z" level=info msg="Waiting for [rke-bundle-cert] container to exit on host [x.x.x.155]"
time="2023-09-07T16:10:25Z" level=info msg="Container [rke-bundle-cert] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:10:26Z" level=info msg="[certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [x.x.x.155]"
time="2023-09-07T16:10:26Z" level=info msg="Removing container [rke-bundle-cert] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:26Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:10:27Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:27Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.155]"
time="2023-09-07T16:10:27Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:27Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:10:27Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:10:27Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:28Z" level=info msg="[etcd] Successfully started [rke-log-linker] container on host [x.x.x.155]"
time="2023-09-07T16:10:28Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:10:28Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:10:28Z" level=info msg="[etcd] Successfully started etcd plane.. Checking etcd cluster health"
time="2023-09-07T16:10:28Z" level=info msg="[etcd] etcd host [x.x.x.61] reported healthy=true"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Now checking status of node x.x.x.61, try #1"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Now checking status of node x.x.x.155, try #1"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Processing controlplane hosts for upgrade 1 at a time"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Adding controlplane nodes x.x.x.11 to the cluster"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Parameters provided to drain command: \"Force: true, IgnoreAllDaemonSets: true, DeleteEmptyDirData: true, Timeout: 15m0s, GracePeriodSeconds: -1\""
time="2023-09-07T16:10:28Z" level=info msg="Processing controlplane host x.x.x.61"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Now checking status of node x.x.x.61, try #1"
time="2023-09-07T16:10:28Z" level=info msg="[controlplane] Getting list of nodes for upgrade"
time="2023-09-07T16:10:59Z" level=info msg="Upgrading controlplane components for control host x.x.x.61"
time="2023-09-07T16:10:59Z" level=info msg="Finding container [service-sidekick] on host [x.x.x.61], try #1"
time="2023-09-07T16:10:59Z" level=info msg="[sidekick] Sidekick container already created on host [x.x.x.61]"
time="2023-09-07T16:10:59Z" level=info msg="Finding container [kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:00Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.61]"
time="2023-09-07T16:11:00Z" level=info msg="Finding container [old-kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:00Z" level=info msg="Stopping container [kube-apiserver] on host [x.x.x.61] with stopTimeoutDuration [5s], try #1"
time="2023-09-07T16:11:05Z" level=info msg="Waiting for [kube-apiserver] container to exit on host [x.x.x.61]"
time="2023-09-07T16:11:05Z" level=info msg="Renaming container [kube-apiserver] to [old-kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:05Z" level=info msg="Starting container [kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:05Z" level=info msg="[controlplane] Successfully updated [kube-apiserver] container on host [x.x.x.61]"
time="2023-09-07T16:11:05Z" level=info msg="Removing container [old-kube-apiserver] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:05Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-apiserver] on host [x.x.x.61]"
time="2023-09-07T16:11:13Z" level=info msg="[healthcheck] service [kube-apiserver] on host [x.x.x.61] is healthy"
time="2023-09-07T16:11:13Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:11:14Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:14Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.61]"
time="2023-09-07T16:11:14Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:14Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:11:14Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-controller-manager] on host [x.x.x.61]"
time="2023-09-07T16:11:14Z" level=info msg="[healthcheck] service [kube-controller-manager] on host [x.x.x.61] is healthy"
time="2023-09-07T16:11:14Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:11:15Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:15Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.61]"
time="2023-09-07T16:11:15Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:15Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:11:15Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-scheduler] on host [x.x.x.61]"
time="2023-09-07T16:11:16Z" level=info msg="[healthcheck] service [kube-scheduler] on host [x.x.x.61] is healthy"
time="2023-09-07T16:11:16Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:11:16Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:16Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.61]"
time="2023-09-07T16:11:16Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.61], try #1"
time="2023-09-07T16:11:17Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:11:17Z" level=info msg="[controlplane] Now checking status of node x.x.x.61, try #1"
time="2023-09-07T16:11:17Z" level=info msg="Processing controlplane host x.x.x.11"
time="2023-09-07T16:11:17Z" level=info msg="Finding container [service-sidekick] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:17Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:17Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:11:17Z" level=info msg="Starting container [kube-apiserver] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:17Z" level=info msg="[controlplane] Successfully started [kube-apiserver] container on host [x.x.x.11]"
time="2023-09-07T16:11:17Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-apiserver] on host [x.x.x.11]"
time="2023-09-07T16:11:25Z" level=info msg="[healthcheck] service [kube-apiserver] on host [x.x.x.11] is healthy"
time="2023-09-07T16:11:25Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:26Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:26Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:11:26Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:26Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:11:26Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:11:26Z" level=info msg="Starting container [kube-controller-manager] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:26Z" level=info msg="[controlplane] Successfully started [kube-controller-manager] container on host [x.x.x.11]"
time="2023-09-07T16:11:26Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-controller-manager] on host [x.x.x.11]"
time="2023-09-07T16:11:32Z" level=info msg="[healthcheck] service [kube-controller-manager] on host [x.x.x.11] is healthy"
time="2023-09-07T16:11:32Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:32Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:32Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:11:32Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:32Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:11:32Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:11:32Z" level=info msg="Starting container [kube-scheduler] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:33Z" level=info msg="[controlplane] Successfully started [kube-scheduler] container on host [x.x.x.11]"
time="2023-09-07T16:11:33Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-scheduler] on host [x.x.x.11]"
time="2023-09-07T16:11:38Z" level=info msg="[healthcheck] service [kube-scheduler] on host [x.x.x.11] is healthy"
time="2023-09-07T16:11:38Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:38Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:39Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:11:39Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:39Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:11:39Z" level=info msg="Finding container [service-sidekick] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:39Z" level=info msg="[sidekick] Sidekick container already created on host [x.x.x.11]"
time="2023-09-07T16:11:39Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:11:39Z" level=info msg="Starting container [kubelet] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:39Z" level=info msg="[worker] Successfully started [kubelet] container on host [x.x.x.11]"
time="2023-09-07T16:11:39Z" level=info msg="[healthcheck] Start Healthcheck on service [kubelet] on host [x.x.x.11]"
time="2023-09-07T16:11:45Z" level=info msg="[healthcheck] service [kubelet] on host [x.x.x.11] is healthy"
time="2023-09-07T16:11:45Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:46Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:47Z" level=info msg="[worker] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:11:47Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:47Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:11:47Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.11]"
time="2023-09-07T16:11:47Z" level=info msg="Starting container [kube-proxy] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:47Z" level=info msg="[worker] Successfully started [kube-proxy] container on host [x.x.x.11]"
time="2023-09-07T16:11:47Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-proxy] on host [x.x.x.11]"
time="2023-09-07T16:11:53Z" level=info msg="[healthcheck] service [kube-proxy] on host [x.x.x.11] is healthy"
time="2023-09-07T16:11:53Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:11:54Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:55Z" level=info msg="[worker] Successfully started [rke-log-linker] container on host [x.x.x.11]"
time="2023-09-07T16:11:55Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.11], try #1"
time="2023-09-07T16:11:55Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:11:55Z" level=info msg="Processing controlplane host x.x.x.155"
time="2023-09-07T16:11:55Z" level=info msg="[controlplane] Now checking status of node x.x.x.155, try #1"
time="2023-09-07T16:11:55Z" level=info msg="[controlplane] Getting list of nodes for upgrade"
time="2023-09-07T16:12:29Z" level=info msg="Upgrading controlplane components for control host x.x.x.155"
time="2023-09-07T16:12:29Z" level=info msg="Finding container [service-sidekick] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:29Z" level=info msg="[sidekick] Sidekick container already created on host [x.x.x.155]"
time="2023-09-07T16:12:29Z" level=info msg="Finding container [kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:29Z" level=info msg="Image [harbor:5000/rancher/hyperkube:v1.23.10-rancher1] exists on host [x.x.x.155]"
time="2023-09-07T16:12:29Z" level=info msg="Finding container [old-kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:29Z" level=info msg="Stopping container [kube-apiserver] on host [x.x.x.155] with stopTimeoutDuration [5s], try #1"
time="2023-09-07T16:12:35Z" level=info msg="Waiting for [kube-apiserver] container to exit on host [x.x.x.155]"
time="2023-09-07T16:12:35Z" level=info msg="Renaming container [kube-apiserver] to [old-kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:35Z" level=info msg="Starting container [kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:35Z" level=info msg="[controlplane] Successfully updated [kube-apiserver] container on host [x.x.x.155]"
time="2023-09-07T16:12:35Z" level=info msg="Removing container [old-kube-apiserver] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:35Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-apiserver] on host [x.x.x.155]"
time="2023-09-07T16:12:49Z" level=info msg="[healthcheck] service [kube-apiserver] on host [x.x.x.155] is healthy"
time="2023-09-07T16:12:49Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:12:49Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:50Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.155]"
time="2023-09-07T16:12:50Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:51Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:12:51Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-controller-manager] on host [x.x.x.155]"
time="2023-09-07T16:12:51Z" level=info msg="[healthcheck] service [kube-controller-manager] on host [x.x.x.155] is healthy"
time="2023-09-07T16:12:51Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:12:51Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:52Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.155]"
time="2023-09-07T16:12:52Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:52Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:12:52Z" level=info msg="[healthcheck] Start Healthcheck on service [kube-scheduler] on host [x.x.x.155]"
time="2023-09-07T16:12:52Z" level=info msg="[healthcheck] service [kube-scheduler] on host [x.x.x.155] is healthy"
time="2023-09-07T16:12:52Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:12:53Z" level=info msg="Starting container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:53Z" level=info msg="[controlplane] Successfully started [rke-log-linker] container on host [x.x.x.155]"
time="2023-09-07T16:12:53Z" level=info msg="Removing container [rke-log-linker] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:53Z" level=info msg="[remove/rke-log-linker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:12:53Z" level=info msg="[controlplane] Now checking status of node x.x.x.155, try #1"
time="2023-09-07T16:12:53Z" level=info msg="[controlplane] Successfully upgraded Controller Plane.."
time="2023-09-07T16:12:53Z" level=info msg="[authz] Creating rke-job-deployer ServiceAccount"
time="2023-09-07T16:12:53Z" level=info msg="[authz] rke-job-deployer ServiceAccount created successfully"
time="2023-09-07T16:12:53Z" level=info msg="[authz] Creating system:node ClusterRoleBinding"
time="2023-09-07T16:12:53Z" level=info msg="[authz] system:node ClusterRoleBinding created successfully"
time="2023-09-07T16:12:53Z" level=info msg="[authz] Creating kube-apiserver proxy ClusterRole and ClusterRoleBinding"
time="2023-09-07T16:12:54Z" level=info msg="[authz] kube-apiserver proxy ClusterRole and ClusterRoleBinding created successfully"
time="2023-09-07T16:12:54Z" level=info msg="Successfully Deployed state file at [/root/myenv/rancher-cluster.rkestate]"
time="2023-09-07T16:12:54Z" level=info msg="[state] Saving full cluster state to Kubernetes"
time="2023-09-07T16:12:54Z" level=info msg="[state] Successfully Saved full cluster state to Kubernetes ConfigMap: full-cluster-state"
time="2023-09-07T16:12:54Z" level=info msg="[worker] Upgrading Worker Plane.."
time="2023-09-07T16:12:54Z" level=info msg="[worker] Parameters provided to drain command: \"Force: true, IgnoreAllDaemonSets: true, DeleteEmptyDirData: true, Timeout: 15m0s, GracePeriodSeconds: -1\""
time="2023-09-07T16:12:54Z" level=info msg="[worker] Parameters provided to drain command: \"Force: true, IgnoreAllDaemonSets: true, DeleteEmptyDirData: true, Timeout: 15m0s, GracePeriodSeconds: -1\""
time="2023-09-07T16:12:54Z" level=info msg="[worker] Successfully upgraded Worker Plane.."
time="2023-09-07T16:12:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:12:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:12:54Z" level=info msg="Starting container [rke-log-cleaner] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.61]"
time="2023-09-07T16:12:54Z" level=info msg="Starting container [rke-log-cleaner] on host [x.x.x.11], try #1"
time="2023-09-07T16:12:54Z" level=info msg="[cleanup] Successfully started [rke-log-cleaner] container on host [x.x.x.155]"
time="2023-09-07T16:12:54Z" level=info msg="Removing container [rke-log-cleaner] on host [x.x.x.155], try #1"
time="2023-09-07T16:12:54Z" level=info msg="[remove/rke-log-cleaner] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:12:55Z" level=info msg="[cleanup] Successfully started [rke-log-cleaner] container on host [x.x.x.11]"
time="2023-09-07T16:12:55Z" level=info msg="Removing container [rke-log-cleaner] on host [x.x.x.11], try #1"
time="2023-09-07T16:12:55Z" level=info msg="[remove/rke-log-cleaner] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:13:20Z" level=info msg="Starting container [rke-log-cleaner] on host [x.x.x.61], try #1"
time="2023-09-07T16:13:20Z" level=info msg="[cleanup] Successfully started [rke-log-cleaner] container on host [x.x.x.61]"
time="2023-09-07T16:13:20Z" level=info msg="Removing container [rke-log-cleaner] on host [x.x.x.61], try #1"
time="2023-09-07T16:13:27Z" level=info msg="[remove/rke-log-cleaner] Successfully removed container on host [x.x.x.61]"
time="2023-09-07T16:13:27Z" level=info msg="[sync] Syncing nodes Labels and Taints"
time="2023-09-07T16:13:27Z" level=info msg="[sync] Successfully synced nodes Labels and Taints"
time="2023-09-07T16:13:27Z" level=info msg="[network] Setting up network plugin: calico"
time="2023-09-07T16:13:27Z" level=info msg="[addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes"
time="2023-09-07T16:13:27Z" level=info msg="[addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes"
time="2023-09-07T16:13:27Z" level=info msg="[addons] Executing deploy job rke-network-plugin"
time="2023-09-07T16:13:32Z" level=info msg="[addons] Setting up coredns"
time="2023-09-07T16:13:32Z" level=info msg="[addons] Saving ConfigMap for addon rke-coredns-addon to Kubernetes"
time="2023-09-07T16:13:32Z" level=info msg="[addons] Successfully saved ConfigMap for addon rke-coredns-addon to Kubernetes"
time="2023-09-07T16:13:32Z" level=info msg="[addons] Executing deploy job rke-coredns-addon"
time="2023-09-07T16:13:42Z" level=info msg="[addons] CoreDNS deployed successfully"
time="2023-09-07T16:13:42Z" level=info msg="[dns] DNS provider coredns deployed successfully"
time="2023-09-07T16:13:42Z" level=info msg="[addons] Setting up Metrics Server"
time="2023-09-07T16:13:42Z" level=info msg="[addons] Saving ConfigMap for addon rke-metrics-addon to Kubernetes"
time="2023-09-07T16:13:42Z" level=info msg="[addons] Successfully saved ConfigMap for addon rke-metrics-addon to Kubernetes"
time="2023-09-07T16:13:42Z" level=info msg="[addons] Executing deploy job rke-metrics-addon"
time="2023-09-07T16:13:47Z" level=info msg="[addons] Metrics Server deployed successfully"
time="2023-09-07T16:13:47Z" level=info msg="[ingress] Setting up nginx ingress controller"
time="2023-09-07T16:13:47Z" level=info msg="[ingress] removing admission batch jobs if they exist"
time="2023-09-07T16:13:47Z" level=info msg="[addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes"
time="2023-09-07T16:13:47Z" level=info msg="[addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes"
time="2023-09-07T16:13:47Z" level=info msg="[addons] Executing deploy job rke-ingress-controller"
time="2023-09-07T16:13:52Z" level=info msg="[ingress] removing default backend service and deployment if they exist"
time="2023-09-07T16:13:52Z" level=info msg="[ingress] ingress controller nginx deployed successfully"
time="2023-09-07T16:13:52Z" level=info msg="[addons] Setting up user addons"
time="2023-09-07T16:13:52Z" level=info msg="[addons] Saving ConfigMap for addon rke-user-addon to Kubernetes"
time="2023-09-07T16:13:52Z" level=info msg="[addons] Successfully saved ConfigMap for addon rke-user-addon to Kubernetes"
time="2023-09-07T16:13:52Z" level=info msg="[addons] Executing deploy job rke-user-addon"
time="2023-09-07T16:13:57Z" level=info msg="[addons] User addons deployed successfully"
time="2023-09-07T16:13:57Z" level=info msg="Finished building Kubernetes cluster successfully"
I0907 16:11:58.781299   18237 request.go:601] Waited for 1.156102866s due to client-side throttling, not priority and fairness, request: GET:https://x.x.x.61:6443/api/v1/namespaces/cattle-fleet-system/pods/fleet-controller-568944f85f-kb4k8?timeout=30s
I0907 16:12:08.781320   18237 request.go:601] Waited for 1.988549942s due to client-side throttling, not priority and fairness, request: GET:https://x.x.x.61:6443/api/v1/namespaces/cattle-resources-system/pods/rancher-backup-865dfdb4cf-lpmx6?timeout=30s
TestHelper :: INFO    :: 2023-09-07 16:13:57,675 :: Waiting for the Rancher workload instances to become available.
TestHelper :: INFO    :: 2023-09-07 16:13:57,676 :: Willing to wait for at most 1800 seconds.
TestHelper :: INFO    :: 2023-09-07 16:14:02,681 :: Checking the availability of rancher pods...
TestHelper :: INFO    :: 2023-09-07 16:14:02,930 :: 3 pods are desired, 3 are actually available. Waited 5 seconds so far, timeout in 1795 seconds.
TestHelper :: INFO    :: 2023-09-07 16:14:02,931 :: Sleeping for an additional 60 seconds to allow Rancher instances to complete their setup...
TestHelper :: INFO    :: 2023-09-07 16:14:07,935 :: Slept 5 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:14:12,940 :: Slept 10 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:14:17,945 :: Slept 15 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:14:22,950 :: Slept 20 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:14:27,956 :: Slept 25 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:14:32,960 :: Slept 30 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:15:02,984 :: Slept 60 out of 60 seconds
TestHelper :: INFO    :: 2023-09-07 16:15:02,995 :: executing terminate_ec2_instance()
TestHelper :: INFO    :: 2023-09-07 16:15:02,996 :: terminating EC2 instance: i-0cae2df89873e4e76
TestHelper :: INFO    :: 2023-09-07 16:15:03,244 :: EC2 instance state: shutting-down
TestHelper :: INFO    :: 2023-09-07 16:15:03,244 :: EC2 instance state: shutting-down
TestHelper :: INFO    :: 2023-09-07 16:15:03,246 :: Completed patch sequence for Rancher node 'x.x.x.197'.

In this run, RKE was able to remove the old node (x.x.x.197) from the etcd cluster:

time="2023-09-07T16:08:23Z" level=info msg="[remove/etcd] Removing member [etcd-x.x.x.197] from etcd cluster"
time="2023-09-07T16:08:23Z" level=info msg="[remove/etcd] Checking etcd cluster health on [etcd-x.x.x.61] after removing [etcd-x.x.x.197]"
time="2023-09-07T16:08:31Z" level=info msg="[etcd] etcd host [x.x.x.61] reported healthy=true"
time="2023-09-07T16:08:31Z" level=info msg="[remove/etcd] etcd cluster health is healthy on [etcd-x.x.x.61] after removing [etcd-x.x.x.197]"
time="2023-09-07T16:08:31Z" level=info msg="[remove/etcd] Successfully removed member [etcd-x.x.x.197] from etcd cluster"
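
In the log excerpts in this issue, a completed removal shows both a `Removing member` line and a matching `Successfully removed member` line under the `[remove/etcd]` prefix. As a rough sketch for checking a captured `rke up` log for removals that started but never completed, something like the following could be used. The helper name `pending_etcd_removals` is hypothetical and not part of RKE, and the patterns assume the message wording stays exactly as it appears in these logs.

```python
import re

# Patterns copied from the [remove/etcd] messages in the logs above.
# Assumption: RKE keeps this exact wording; adjust if your version differs.
START_RE = re.compile(
    r"\[remove/etcd\] Removing member \[([^\]]+)\] from etcd cluster"
)
DONE_RE = re.compile(
    r"\[remove/etcd\] Successfully removed member \[([^\]]+)\] from etcd cluster"
)


def pending_etcd_removals(log_lines):
    """Return members whose removal was started but never logged as successful."""
    started = []      # keeps first-seen order of members being removed
    finished = set()  # members with a "Successfully removed" line
    for line in log_lines:
        match = START_RE.search(line)
        if match:
            if match.group(1) not in started:
                started.append(match.group(1))
            continue
        match = DONE_RE.search(line)
        if match:
            finished.add(match.group(1))
    return [member for member in started if member not in finished]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_removals.py rke-up.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        stuck = pending_etcd_removals(handle)
    print("pending removals:", stuck if stuck else "none")
```

An empty result means every removal that started also finished in the captured output. A non-empty result only means the success line is missing from that log. It does not say why the removal stalled.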
  1. When it doesn't work: RKE cannot remove the old node from the etcd cluster.
TestHelper :: INFO    :: 2023-09-07 16:21:24,192 :: Generating rancher-cluster.yaml
TestHelper :: INFO    :: 2023-09-07 16:21:24,192 :: Performing the 'rke up' command to update the RKE-provisioned Rancher 'local' cluster.
time="2023-09-07T16:21:24Z" level=info msg="Running RKE version: v1.3.15"
time="2023-09-07T16:21:24Z" level=info msg="Initiating Kubernetes cluster"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Generating Kubernetes API server certificates"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Generating admin certificates and kubeconfig"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-156 certificate and key"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-11 certificate and key"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Generating kube-etcd-x-x-x-155 certificate and key"
time="2023-09-07T16:21:24Z" level=info msg="[certificates] Deleting unused certificate: kube-etcd-x-x-x-61"
time="2023-09-07T16:21:24Z" level=info msg="Successfully Deployed state file at [/root/myenv/rancher-cluster.rkestate]"
time="2023-09-07T16:21:24Z" level=info msg="Building Kubernetes cluster"
time="2023-09-07T16:21:24Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.155]"
time="2023-09-07T16:21:24Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.156]"
time="2023-09-07T16:21:24Z" level=info msg="[dialer] Setup tunnel for host [x.x.x.11]"
time="2023-09-07T16:21:25Z" level=info msg="[network] Deploying port listener containers"
time="2023-09-07T16:21:25Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:25Z" level=info msg="Pulling image [harbor:5000/rancher/rke-tools:v0.1.87] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:25Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:26Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:30Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:32Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:32Z" level=info msg="[network] Successfully started [rke-etcd-port-listener] container on host [x.x.x.156]"
time="2023-09-07T16:21:45Z" level=info msg="Starting container [rke-etcd-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:45Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:45Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:45Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:45Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:45Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:45Z" level=info msg="[network] Successfully started [rke-cp-port-listener] container on host [x.x.x.156]"
time="2023-09-07T16:21:46Z" level=info msg="Starting container [rke-cp-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:46Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:46Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:46Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:47Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:47Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:47Z" level=info msg="[network] Successfully started [rke-worker-port-listener] container on host [x.x.x.156]"
time="2023-09-07T16:21:47Z" level=info msg="Starting container [rke-worker-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:48Z" level=info msg="[network] Port listener containers deployed successfully"
time="2023-09-07T16:21:48Z" level=info msg="[network] Running etcd <-> etcd port checks"
time="2023-09-07T16:21:48Z" level=info msg="[network] Checking if host [x.x.x.156] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:21:48Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:21:48Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379 2380], try #1"
time="2023-09-07T16:21:48Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:48Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:48Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:48Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:48Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:48Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.156]"
time="2023-09-07T16:21:48Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:48Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:48Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:21:48Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:21:49Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Running control plane -> etcd port checks"
time="2023-09-07T16:21:49Z" level=info msg="[network] Checking if host [x.x.x.156] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [2379], try #1"
time="2023-09-07T16:21:49Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:49Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:49Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:49Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.156]"
time="2023-09-07T16:21:49Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:49Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:49Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:49Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:21:50Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:50Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:21:50Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:50Z" level=info msg="[network] Running control plane -> worker port checks"
time="2023-09-07T16:21:50Z" level=info msg="[network] Checking if host [x.x.x.156] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:21:50Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:21:50Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [10250], try #1"
time="2023-09-07T16:21:50Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:50Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:50Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:50Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:51Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.156]"
time="2023-09-07T16:21:51Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:51Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:51Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:51Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:21:51Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:51Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:21:51Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:52Z" level=info msg="[network] Running workers -> control plane port checks"
time="2023-09-07T16:21:52Z" level=info msg="[network] Checking if host [x.x.x.156] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:21:52Z" level=info msg="[network] Checking if host [x.x.x.11] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:21:52Z" level=info msg="[network] Checking if host [x.x.x.155] can connect to host(s) [x.x.x.156 x.x.x.11 x.x.x.155] on port(s) [6443], try #1"
time="2023-09-07T16:21:52Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:52Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:52Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:52Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:52Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.156]"
time="2023-09-07T16:21:52Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:52Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:52Z" level=info msg="Starting container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:53Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.11]"
time="2023-09-07T16:21:53Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:53Z" level=info msg="[network] Successfully started [rke-port-checker] container on host [x.x.x.155]"
time="2023-09-07T16:21:53Z" level=info msg="Removing container [rke-port-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:53Z" level=info msg="[network] Checking KubeAPI port Control Plane hosts"
time="2023-09-07T16:21:53Z" level=info msg="[network] Removing port listener containers"
time="2023-09-07T16:21:53Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:53Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:53Z" level=info msg="Removing container [rke-etcd-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:53Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-etcd-port-listener] Successfully removed container on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-cp-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-cp-port-listener] Successfully removed container on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-worker-port-listener] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-worker-port-listener] Successfully removed container on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="[network] Port listener containers removed successfully"
time="2023-09-07T16:21:54Z" level=info msg="[selinux] Checking if host [x.x.x.156] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:21:54Z" level=info msg="[selinux] Checking if host [x.x.x.155] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:21:54Z" level=info msg="[selinux] Checking if host [x.x.x.11] recognizes SELinux label [label=type:rke_container_t], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-selinux-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Removing container [rke-selinux-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-selinux-checker] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:21:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:54Z" level=info msg="[remove/rke-selinux-checker] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:21:54Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:54Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:54Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.156]"
time="2023-09-07T16:21:54Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.156]: stderr: [], stdout: []"
time="2023-09-07T16:21:54Z" level=info msg="Starting container [rke-selinux-checker] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:55Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.11]"
time="2023-09-07T16:21:55Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.11]"
time="2023-09-07T16:21:55Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.11]"
time="2023-09-07T16:21:55Z" level=info msg="Successfully started [rke-selinux-checker] container on host [x.x.x.155]"
time="2023-09-07T16:21:55Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.155]"
time="2023-09-07T16:21:55Z" level=info msg="Waiting for [rke-selinux-checker] container to exit on host [x.x.x.155]"
time="2023-09-07T16:21:55Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:21:55Z" level=info msg="Container [rke-selinux-checker] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:21:56Z" level=info msg="[certificates] kube-apiserver certificate changed, force deploying certs"
time="2023-09-07T16:21:56Z" level=info msg="[certificates] Deploying kubernetes certificates to Cluster nodes"
time="2023-09-07T16:21:56Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:56Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:56Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:56Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:21:56Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:21:56Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:21:56Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:56Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:56Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:21:57Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:21:57Z" level=info msg="Starting container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:21:57Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Finding container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:22:02Z" level=info msg="Removing container [cert-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:22:02Z" level=info msg="[reconcile] Rebuilding and updating local kube config"
time="2023-09-07T16:22:02Z" level=info msg="Successfully Deployed local admin kubeconfig at [/root/myenv/kube_config_rancher-cluster.yaml]"
time="2023-09-07T16:22:02Z" level=warning msg="[reconcile] host [x.x.x.156] is a control plane node without reachable Kubernetes API endpoint in the cluster"
time="2023-09-07T16:22:02Z" level=info msg="Successfully Deployed local admin kubeconfig at [/root/myenv/kube_config_rancher-cluster.yaml]"
time="2023-09-07T16:22:02Z" level=info msg="[reconcile] host [x.x.x.11] is a control plane node with reachable Kubernetes API endpoint in the cluster"
time="2023-09-07T16:22:02Z" level=info msg="[certificates] Successfully deployed kubernetes certificates to Cluster nodes"
time="2023-09-07T16:22:02Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.156]"
time="2023-09-07T16:22:02Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.156]"
time="2023-09-07T16:22:03Z" level=info msg="Starting container [file-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:22:03Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.156]"
time="2023-09-07T16:22:03Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.156]"
time="2023-09-07T16:22:03Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.156]"
time="2023-09-07T16:22:03Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.156]: stderr: [], stdout: []"
time="2023-09-07T16:22:04Z" level=info msg="Removing container [file-deployer] on host [x.x.x.156], try #1"
time="2023-09-07T16:22:04Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.156]"
time="2023-09-07T16:22:04Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.11]"
time="2023-09-07T16:22:04Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]"
time="2023-09-07T16:22:05Z" level=info msg="Starting container [file-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:22:05Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.11]"
time="2023-09-07T16:22:05Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.11]"
time="2023-09-07T16:22:05Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.11]"
time="2023-09-07T16:22:05Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.11]: stderr: [], stdout: []"
time="2023-09-07T16:22:06Z" level=info msg="Removing container [file-deployer] on host [x.x.x.11], try #1"
time="2023-09-07T16:22:06Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.11]"
time="2023-09-07T16:22:06Z" level=info msg="[file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.155]"
time="2023-09-07T16:22:06Z" level=info msg="Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]"
time="2023-09-07T16:22:06Z" level=info msg="Starting container [file-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:22:07Z" level=info msg="Successfully started [file-deployer] container on host [x.x.x.155]"
time="2023-09-07T16:22:07Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.155]"
time="2023-09-07T16:22:07Z" level=info msg="Waiting for [file-deployer] container to exit on host [x.x.x.155]"
time="2023-09-07T16:22:07Z" level=info msg="Container [file-deployer] is still running on host [x.x.x.155]: stderr: [], stdout: []"
time="2023-09-07T16:22:08Z" level=info msg="Removing container [file-deployer] on host [x.x.x.155], try #1"
time="2023-09-07T16:22:08Z" level=info msg="[remove/file-deployer] Successfully removed container on host [x.x.x.155]"
time="2023-09-07T16:22:08Z" level=info msg="[/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes"
time="2023-09-07T16:22:08Z" level=info msg="[reconcile] Reconciling cluster state"
time="2023-09-07T16:22:08Z" level=info msg="[reconcile] Check etcd hosts to be deleted"
time="2023-09-07T16:22:08Z" level=info msg="[remove/etcd] Removing member [etcd-x.x.x.61] from etcd cluster"

.156 is the NEW node. rke up does not deploy the hyperkube containers on it:

# docker ps -a
CONTAINER ID        IMAGE                                                                COMMAND                  CREATED             STATUS                      PORTS               NAMES
4e2b00a2aa5d        harbor:5000/rancher/rke-tools:v0.1.87   "/docker-entrypoin..."   35 minutes ago      Exited (0) 35 minutes ago                       rke-selinux-checker

# docker inspect rke-selinux-checker|grep -i error
            "Error": "",
@vynguyenchantal
Author

vynguyenchantal commented Sep 7, 2023

#3073 describes a similar issue, but it was closed without an explanation or a resolution.

Sometimes (1 out of 10 tries) the 1st node is replaced okay, but the 2nd one then gets stuck at Removing member [etcd-x.x.x.x] from etcd cluster forever.

I've also tried rke up --config cluster.yaml without --update-only, and it still can't remove the node from etcd.

For now, I have to do this workaround:

  1. Rebuild the local cluster
  2. Restore from etcd (so that I don't have to build the other clusters managed by this local cluster)

@vynguyenchantal
Author

.30 is the NEW node, but no hyperkube containers are deployed on it. .61 is the OLD node. Can it not be removed from the etcd cluster because .30 can't join?

#  rke up --config cluster.yaml --update-only
INFO[0012] Successfully Deployed local admin kubeconfig at [./kube_config_rancher-cluster.yaml]
WARN[0012] [reconcile] host [x.x.x.30] is a control plane node without reachable Kubernetes API endpoint in the cluster
INFO[0012] Successfully Deployed local admin kubeconfig at [./kube_config_rancher-cluster.yaml]
INFO[0012] [reconcile] host [x.x.x.11] is a control plane node with reachable Kubernetes API endpoint in the cluster
INFO[0012] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0012] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.30]
INFO[0012] Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.30]
INFO[0013] Starting container [file-deployer] on host [x.x.x.30], try #1
INFO[0013] Successfully started [file-deployer] container on host [x.x.x.30]
INFO[0013] Waiting for [file-deployer] container to exit on host [x.x.x.30]
INFO[0013] Waiting for [file-deployer] container to exit on host [x.x.x.30]
INFO[0013] Container [file-deployer] is still running on host [x.x.x.30]: stderr: [], stdout: []
INFO[0014] Removing container [file-deployer] on host [x.x.x.30], try #1
INFO[0014] [remove/file-deployer] Successfully removed container on host [x.x.x.30]
INFO[0014] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.11]
INFO[0014] Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.11]
INFO[0014] Starting container [file-deployer] on host [x.x.x.11], try #1
INFO[0015] Successfully started [file-deployer] container on host [x.x.x.11]
INFO[0015] Waiting for [file-deployer] container to exit on host [x.x.x.11]
INFO[0015] Waiting for [file-deployer] container to exit on host [x.x.x.11]
INFO[0015] Container [file-deployer] is still running on host [x.x.x.11]: stderr: [], stdout: []
INFO[0016] Removing container [file-deployer] on host [x.x.x.11], try #1
INFO[0016] [remove/file-deployer] Successfully removed container on host [x.x.x.11]
INFO[0016] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [x.x.x.155]
INFO[0016] Image [harbor:5000/rancher/rke-tools:v0.1.87] exists on host [x.x.x.155]
INFO[0016] Starting container [file-deployer] on host [x.x.x.155], try #1
INFO[0016] Successfully started [file-deployer] container on host [x.x.x.155]
INFO[0016] Waiting for [file-deployer] container to exit on host [x.x.x.155]
INFO[0016] Waiting for [file-deployer] container to exit on host [x.x.x.155]
INFO[0016] Container [file-deployer] is still running on host [x.x.x.155]: stderr: [], stdout: []
INFO[0017] Removing container [file-deployer] on host [x.x.x.155], try #1
INFO[0017] [remove/file-deployer] Successfully removed container on host [x.x.x.155]
INFO[0017] [/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes
INFO[0017] [reconcile] Reconciling cluster state
INFO[0017] [reconcile] Check etcd hosts to be deleted
INFO[0017] [remove/etcd] Removing member [etcd-x.x.x.61] from etcd cluster
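
When the run hangs at the last line above, one way to see what etcd itself thinks of its membership is to query it from a node that is still healthy. The following is only a sketch under assumptions that the thread does not confirm: that the etcd container is named etcd (the RKE default) and that etcdctl inside it is already configured with the cluster's endpoints and certificates. The run helper and the DRY_RUN switch are hypothetical names added for illustration. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: inspect (and, if needed, manually edit) etcd membership on a healthy
# etcd node while "rke up" hangs at "Removing member ... from etcd cluster".
# Assumptions: container name "etcd" and a preconfigured etcdctl inside it.

# By default commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List current members; a stale entry would be named etcd-<old node IP>.
list_members() {
    run docker exec etcd etcdctl member list
}

# Remove a member by the hex ID printed in the first column of the list.
remove_member() {
    member_id="$1"
    if [ -z "$member_id" ]; then
        echo "usage: remove_member <member-id>" >&2
        return 1
    fi
    run docker exec etcd etcdctl member remove "$member_id"
}

list_members
```

Removing a member by hand changes the quorum size, so it is worth checking the member list first and only calling remove_member once it is clear that the remaining members are healthy. Whether this unblocks the hang reported here is not established in this thread.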

@RajaniG-AWS

> #3073 describes a similar issue, but it was closed without an explanation or a resolution.
>
> Sometimes (1 out of 10 tries) the 1st node is replaced okay, but the 2nd one then gets stuck at Removing member [etcd-x.x.x.x] from etcd cluster forever.
>
> I've also tried rke up --config cluster.yaml without --update-only, and it still can't remove the node from etcd.
>
> For now, I have to do this workaround:
>
>   1. Rebuild the local cluster
>   2. Restore from etcd (so that I don't have to build the other clusters managed by this local cluster)

How did you restore from etcd?

@vynguyenchantal
Author

Hi!
Can you describe your problem to me? Restoring from etcd is done with the rke etcd snapshot-restore command.
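
As a rough illustration of that restore step, the sketch below wraps the command in a small script. It assumes cluster.yaml is the same config file used earlier in this thread; mysnapshot is a hypothetical snapshot name, which RKE looks up in its local snapshot directory (/opt/rke/etcd-snapshots by default) on the etcd nodes. The run helper, the DRY_RUN switch and the CONFIG variable are names made up for this example. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: restore an RKE cluster from a local etcd snapshot.
# Assumptions: cluster.yaml is the cluster config; "mysnapshot" is a
# hypothetical snapshot name, not one taken from this thread.

# By default commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
CONFIG="${CONFIG:-cluster.yaml}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Restore the cluster state from the named snapshot.
restore_snapshot() {
    snapshot_name="$1"
    if [ -z "$snapshot_name" ]; then
        echo "usage: restore_snapshot <snapshot-name>" >&2
        return 1
    fi
    run rke etcd snapshot-restore --config "$CONFIG" --name "$snapshot_name"
}

restore_snapshot mysnapshot
```

The restore replaces the current etcd data with the snapshot contents, so the dry-run default is deliberate: the printed command can be reviewed before it is run for real.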

Contributor

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.
