Issue with k8s.io/docs/setup/windows/user-guide-windows-nodes/ #13501

Closed
1 task
x8091 opened this issue Mar 28, 2019 · 12 comments
Labels

  • kind/support: Categorizes issue or PR as a support question.
  • language/en: Issues or PRs related to English language
  • lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@x8091

x8091 commented Mar 28, 2019

This is a...

  • [ ] Feature Request
  • [x] Bug Report

Problem:
The kube-proxy component cannot run on the Windows node.

Cluster info: 3 nodes, all running v1.14
master-node: Ubuntu 16.04
worker-node01: Ubuntu 16.04
worker-node02: Windows Server 2019 (installed all recent updates)

In addition, there are mistakes in this guide.
The first part says the kube-proxy component needs to be installed on Windows:

Components that run on Windows
While the Kubernetes control plane runs on your Linux node(s), the following components are configured and run on your Windows node(s).
kubelet
kube-proxy
kubectl (optional)
Container runtime

In a later section (applying the node-selector patch), I found that os=linux is also applied to kube-proxy.
[Screenshot: Screen Shot 2019-03-28 at 6 02 36 PM]

This causes a lot of confusion.
So I tried both cases:

  • If I let kube-proxy run on the Windows node, it fails to run.

[Screenshot: Screen Shot 2019-03-28 at 12 58 55 PM]

  • If I prevent kube-proxy from running on the Windows node (by using the node selector), I cannot schedule pods/containers on the Windows node; they hang in ContainerCreating status.

Proposed Solution:
Please clarify whether kube-proxy is needed or not. If it is, how can I fix my issue?

Page to Update:
https://kubernetes.io/...

@PatrickLang
Contributor

kube-proxy is needed, but it should be launched directly on the Windows node by the start.ps1 script https://github.com/Microsoft/SDN/blob/ce70380ccb6b71b69efdb313c2192b786cde14cc/Kubernetes/flannel/start.ps1#L67

Can you check the following to help narrow it down?

  1. After you run that script, can you run get-process kube-proxy and confirm it's running?
  2. Log into the Windows node and make sure docker run kubeletwin/pause works
  3. If the pod still won't start, can you check the logs and include the output from kubectl describe pod … so we can see the events?

The kube-proxy daemonset won't work on Windows because we don't have privileged container support.

cc @daschott
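
For readers following the thread, here is a minimal sketch of the node-selector patch discussed above, which restricts the kube-proxy DaemonSet to Linux nodes so that kube-proxy on Windows is started only by start.ps1. The label key is an assumption: clusters around v1.14 commonly used `beta.kubernetes.io/os`, while newer releases use `kubernetes.io/os`. The helper name `linux_only_patch` is hypothetical.

```python
import json

# Assumption: the OS label key. Clusters around v1.14 commonly used
# "beta.kubernetes.io/os"; newer releases use "kubernetes.io/os".
OS_LABEL_KEY = "beta.kubernetes.io/os"


def linux_only_patch(label_key: str = OS_LABEL_KEY) -> dict:
    """Build a patch body that adds a Linux-only nodeSelector to a pod template."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "nodeSelector": {label_key: "linux"},
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body. It could then be passed to a command such as
    #   kubectl patch ds/kube-proxy -n kube-system --patch '<json>'
    print(json.dumps(linux_only_patch(), indent=2))
```

With a selector like this in place, the DaemonSet controller should stop creating kube-proxy pods on the Windows node, which is the behaviour the comment above describes as expected.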

@x8091
Author

x8091 commented Mar 29, 2019

After you run that script, can you run get-process kube-proxy and confirm it's running?

Yes, it's running

PS C:\K\cni\config> get-process kube-proxy

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    235      16    21644      29660     584.20   8920   2 kube-proxy

Log into the Windows node and make sure docker run kubeletwin/pause works

Yes, it works

PS C:\K\cni\config> docker run kubeletwin/pause

Pinging aea7144e66fe [::1] with 32 bytes of data:
Reply from ::1: time<1ms
Reply from ::1: time<1ms
Reply from ::1: time<1ms

If the pod still won't start, can you check the logs and include the output from kubectl describe pod … so we can see the events?

Here is the log:

Name:               kube-proxy-87mdh
Namespace:          kube-system
Priority:           2000001000
PriorityClassName:  system-node-critical
Node:               win-bu3nulfkq1v/10.10.1.242
Start Time:         Thu, 28 Mar 2019 06:18:02 -0400
Labels:             controller-revision-hash=b7775b676
                    k8s-app=kube-proxy
                    pod-template-generation=5
Annotations:        <none>
Status:             Pending
IP:                 10.10.1.242
Controlled By:      DaemonSet/kube-proxy
Containers:
  kube-proxy:
    Container ID:  
    Image:         k8s.gcr.io/kube-proxy:v1.14.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/local/bin/kube-proxy
      --config=/var/lib/kube-proxy/config.conf
      --hostname-override=$(NODE_NAME)
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /var/lib/kube-proxy from kube-proxy (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-proxy-token-qdtph (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-proxy:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-proxy
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  kube-proxy-token-qdtph:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-proxy-token-qdtph
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     
                 CriticalAddonsOnly
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason                  Age                     From                      Message
  ----     ------                  ----                    ----                      -------
  Warning  FailedCreatePodSandBox  9m2s (x50668 over 17h)  kubelet, win-bu3nulfkq1v  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-proxy-87mdh": Error response from daemon: network host not found
  Normal   SandboxChanged          4m2s (x50935 over 17h)  kubelet, win-bu3nulfkq1v  Pod sandbox changed, it will be killed and re-created.

The error message "network host not found" also keeps appearing in PowerShell on the Windows node.

There are a few things to note:

  1. The 3 nodes are on different subnets, but they have L3 connectivity.
  2. I created the cluster by using kubeadm.

Thanks.
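
When an Events table repeats tens of thousands of times, as in the output above, it can help to pull the reason and repeat count out programmatically. This is a rough sketch, assuming the columns are separated by two or more spaces as in the pasted `kubectl describe pod` output. The names `PodEvent` and `parse_event_line` are hypothetical and not part of any Kubernetes tooling.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PodEvent:
    type: str
    reason: str
    age: str
    occurrences: int
    source: str
    message: str


def parse_event_line(line: str) -> Optional[PodEvent]:
    """Parse one row of the Events table; return None for headers or other text."""
    # Columns are assumed to be separated by runs of two or more spaces.
    fields = re.split(r"\s{2,}", line.strip(), maxsplit=4)
    if len(fields) != 5:
        return None
    etype, reason, age, source, message = fields
    # Kubernetes event types are "Normal" and "Warning"; skip header rows.
    if etype not in ("Normal", "Warning"):
        return None
    # The Age column may look like "9m2s (x50668 over 17h)".
    repeat = re.search(r"\(x(\d+) over", age)
    occurrences = int(repeat.group(1)) if repeat else 1
    return PodEvent(etype, reason, age, occurrences, source, message)


if __name__ == "__main__":
    sample = ("  Warning  FailedCreatePodSandBox  9m2s (x50668 over 17h)  "
              "kubelet, win-bu3nulfkq1v  Failed create pod sandbox: "
              "network host not found")
    event = parse_event_line(sample)
    if event is not None:
        print(event.reason, event.occurrences)
```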

@neolit123
Member

I created the cluster by using kubeadm.

kubeadm is not yet officially supported with the Windows containers GA release, but let me know if you find any problems related to the tool: kubernetes/kubeadm#1393

having a quick look at the issue, i'm not even sure how kubeadm join passes on the Windows worker node.

can you show the output of the kubeadm join -v=4 .... command?
are you even using kubeadm join to join this worker?

@x8091
Author

x8091 commented Mar 29, 2019

@neolit123: I didn't use the kubeadm join command. As the document describes, a Windows node is joined with a PowerShell script. Here is the exact command:
.\start.ps1 -ManagementIP 10.10.1.242 -NetworkMode overlay -ClusterCIDR 10.244.0.0/16 -ServiceCIDR 10.96.0.0/12 -KubeDnsServiceIP 10.96.0.10 -LogDir C:\k -KubeletFeatureGates "WinOverlay=true"

Note that the master node already sees the Windows node as a worker with status Ready.

Also, if the WinOverlay feature gate is not provided, the script keeps printing "Waiting for the network to be created".
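
Since the start.ps1 invocation above takes several address ranges by hand, a quick sanity check of how they relate can rule out simple typos. The sketch below uses only Python's standard `ipaddress` module. The three checks are generic assumptions about a sensible layout, not documented requirements of start.ps1, and `check_network_args` is a hypothetical helper name.

```python
import ipaddress
from typing import List


def check_network_args(management_ip: str, cluster_cidr: str,
                       service_cidr: str, dns_service_ip: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    node_ip = ipaddress.ip_address(management_ip)
    pods = ipaddress.ip_network(cluster_cidr)
    services = ipaddress.ip_network(service_cidr)
    dns_ip = ipaddress.ip_address(dns_service_ip)

    problems = []
    # Pod and service ranges are expected to be disjoint.
    if pods.overlaps(services):
        problems.append(f"pod range {pods} overlaps service range {services}")
    # The node's own address should not sit inside a cluster-internal range.
    if node_ip in pods or node_ip in services:
        problems.append(f"node IP {node_ip} falls inside a cluster-internal range")
    # The DNS service is a cluster service, so its IP should be in the service range.
    if dns_ip not in services:
        problems.append(f"DNS service IP {dns_ip} is outside {services}")
    return problems


if __name__ == "__main__":
    # Values copied from the start.ps1 invocation in the comment above.
    for problem in check_network_args("10.10.1.242", "10.244.0.0/16",
                                      "10.96.0.0/12", "10.96.0.10"):
        print(problem)
```

For the values used in this thread the returned list is empty, so the address arguments themselves do not look like the cause of the failure.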

@neolit123
Member

if you are creating the control plane node with kubeadm, but then you want to join a worker node to the same cluster without kubeadm join, it will probably fail to bootstrap the worker properly.

https://kubernetes.io/docs/reference/setup-tools/kubeadm/implementation-details/#configure-tls-bootstrapping-for-node-joining

kubeadm uses bootstrap tokens:
https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/
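
As a small illustration of the bootstrap tokens mentioned above: the linked documentation describes them as having the form `abcdef.0123456789abcdef` (a 6-character token ID, a dot, and a 16-character secret) and being stored in a `kube-system` Secret named `bootstrap-token-<token id>`. The sketch below only checks that shape. The helper names are hypothetical and this is not how kubeadm itself is implemented.

```python
import re
from typing import Tuple

# Format described in the bootstrap token documentation:
# "<6 char id>.<16 char secret>", lowercase letters and digits only.
BOOTSTRAP_TOKEN_RE = re.compile(r"[a-z0-9]{6}\.[a-z0-9]{16}")


def split_bootstrap_token(token: str) -> Tuple[str, str]:
    """Return (token_id, token_secret), or raise ValueError if malformed."""
    if not BOOTSTRAP_TOKEN_RE.fullmatch(token):
        raise ValueError("token does not match <6 chars>.<16 chars>")
    token_id, token_secret = token.split(".")
    return token_id, token_secret


def secret_name_for(token: str) -> str:
    """Name of the kube-system Secret that would hold this token."""
    token_id, _ = split_bootstrap_token(token)
    return f"bootstrap-token-{token_id}"


if __name__ == "__main__":
    # Placeholder token in the documented shape; not a real credential.
    print(secret_name_for("abcdef.0123456789abcdef"))
```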

@x8091
Author

x8091 commented Mar 29, 2019

Your explanation makes sense to me. It seems that to join a Windows node we have to use either AKS or GCE, right?

@neolit123
Member

or any "bootstrapper" that currently works with Windows nodes.

as far as i know kube-up also supports that, but the tool is deprecated:
https://github.com/kubernetes/kubernetes/blob/8dd09e0b36d510ddbedb6da446a47e2ffa86c4a4/cluster/gce/windows/README-GCE-Windows-kube-up.md

@x8091
Author

x8091 commented Mar 29, 2019

Thank you very much for the information.
I will try it and update here if anything comes up.
Regards

@sftim
Contributor

sftim commented Jun 4, 2019

/language en

@k8s-ci-robot added the language/en label Jun 4, 2019

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Sep 2, 2019

@sftim
Contributor

sftim commented Sep 10, 2019

This was mostly handled like a support request
/triage support
/close

@k8s-ci-robot
Contributor

@sftim: Closing this issue.


@k8s-ci-robot added the kind/support label Sep 10, 2019