Environmental Info:
K3s Version:
k3s version v1.20.6+k3s1 (8d04328)
go version go1.15.10
Node(s) CPU architecture, OS, and Version:
Linux 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
1 server, 1 agent
Describe the bug:
My cluster consists of 1 server and 1 agent, each behind a different NAT. Creating the cluster worked without any problems, and access to all GPUs works as well. I installed kubeflow (https://github.com/kubeflow/kubeflow) on top of k3s, which also works. However, Jupyter notebooks created in the kubeflow dashboard only work on the server: I can create one notebook that uses the server's GPU, but when I create a second one with a GPU (which has to run on the agent, since the server has only 1 GPU), the notebook starts but is not accessible.
notebook-4 runs on the server (working fine), and notebook-3 runs on the agent (not accessible).
kubectl describe pod notebook-3-0 -n kubeflow-user-example-com
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m33s default-scheduler Successfully assigned kubeflow-user-example-com/notebook-3-0 to untersberg
Normal Pulling 5m33s kubelet Pulling image "docker.io/istio/proxyv2:1.9.0"
Normal Pulled 5m32s kubelet Successfully pulled image "docker.io/istio/proxyv2:1.9.0" in 1.492170458s
Normal Created 5m32s kubelet Created container istio-init
Normal Started 5m32s kubelet Started container istio-init
Normal Pulling 5m31s kubelet Pulling image "docker.io/istio/proxyv2:1.9.0"
Normal Pulled 5m31s kubelet Container image "gcr.io/arrikto-public/tensorflow-1.15.2-notebook-gpu:1.0.0.arr1" already present on machine
Normal Created 5m31s kubelet Created container notebook-3
Normal Started 5m31s kubelet Started container notebook-3
Normal Pulled 5m30s kubelet Successfully pulled image "docker.io/istio/proxyv2:1.9.0" in 1.366426061s
Normal Created 5m30s kubelet Created container istio-proxy
Normal Started 5m30s kubelet Started container istio-proxy
Warning Unhealthy 5m2s (x14 over 5m28s) kubelet Readiness probe failed: Get "http://10.42.1.16:15021/healthz/ready": dial tcp 10.42.1.16:15021: connect: connection refused
Warning Unhealthy 32s (x96 over 3m42s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
kubectl logs notebook-3-0 -n kubeflow-user-example-com
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/notebook-3-0. Please use `kubectl.kubernetes.io/default-container` instead
[I 15:49:10.702 LabApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 15:49:10.839 LabApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 15:49:11.009 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.6/dist-packages/jupyterlab
[I 15:49:11.009 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 15:49:11.172 LabApp] Serving notebooks from local directory: /home/jovyan
[I 15:49:11.172 LabApp] The Jupyter Notebook is running at:
[I 15:49:11.172 LabApp] http://notebook-3-0:8888/notebook/kubeflow-user-example-com/notebook-3/
[I 15:49:11.172 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Steps To Reproduce:
Installed K3s:
On the server:
export K3S_EXTERNAL_IP=<server_public_ip>
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666 --tls-san $K3S_EXTERNAL_IP --node-external-ip=$K3S_EXTERNAL_IP" sh -
On the agent:
export K3S_TOKEN=<token>
export K3S_URL=https://<server_public_ip>:6443
export INSTALL_K3S_EXEC="--token $K3S_TOKEN --server $K3S_URL"
curl -sfL https://get.k3s.io | sh -
kubectl logs notebook-4-0 -n kubeflow-user-example-com
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/notebook-4-0. Please use `kubectl.kubernetes.io/default-container` instead
[I 15:49:58.940 LabApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 15:49:59.093 LabApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 15:49:59.263 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.6/dist-packages/jupyterlab
[I 15:49:59.263 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 15:49:59.434 LabApp] Serving notebooks from local directory: /home/jovyan
[I 15:49:59.434 LabApp] The Jupyter Notebook is running at:
[I 15:49:59.434 LabApp] http://notebook-4-0:8888/notebook/kubeflow-user-example-com/notebook-4/
[I 15:49:59.434 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
kubectl describe pod notebook-4-0 -n kubeflow-user-example-com
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m11s default-scheduler Successfully assigned kubeflow-user-example-com/notebook-4-0 to gaisberg
Normal Pulling 6m10s kubelet Pulling image "docker.io/istio/proxyv2:1.9.0"
Normal Pulled 6m9s kubelet Successfully pulled image "docker.io/istio/proxyv2:1.9.0" in 1.23111517s
Normal Created 6m9s kubelet Created container istio-init
Normal Started 6m9s kubelet Started container istio-init
Normal Pulled 6m8s kubelet Container image "gcr.io/arrikto-public/tensorflow-1.15.2-notebook-gpu:1.0.0.arr1" already present on machine
Normal Created 6m8s kubelet Created container notebook-4
Normal Started 6m8s kubelet Started container notebook-4
Normal Pulling 6m8s kubelet Pulling image "docker.io/istio/proxyv2:1.9.0"
Normal Pulled 6m7s kubelet Successfully pulled image "docker.io/istio/proxyv2:1.9.0" in 1.224361895s
Normal Created 6m7s kubelet Created container istio-proxy
Normal Started 6m7s kubelet Started container istio-proxy
It sounds like the agent and server can communicate with each other, but pods on different nodes cannot. What flannel backend are you using? You may have better luck with wireguard compared to vxlan if both nodes are behind different firewalls.
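One way to answer the backend question is sketched below. It assumes flannel records its backend in the node annotation `flannel.alpha.coreos.com/backend-type`, and that k3s falls back to vxlan when no backend flag is given. The helper name `backend_port` is hypothetical, and the UDP port numbers are the ones I recall from the k3s networking requirements, so verify them for your release. The live `kubectl` query is left as a comment so the snippet runs without a cluster.

```shell
#!/bin/sh
# Sketch: report which UDP port a flannel backend needs open between nodes.
# backend_port is a hypothetical helper, not part of k3s or flannel.
backend_port() {
    case "$1" in
        vxlan)     echo 8472 ;;    # flannel VXLAN traffic
        wireguard) echo 51820 ;;   # flannel WireGuard traffic
        *)         echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Against a live cluster the backend could be read from the node annotation
# (node name taken from the events in the report):
#   backend=$(kubectl get node untersberg \
#       -o jsonpath='{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-type}')
backend=${backend:-vxlan}   # assumption: vxlan is the default when nothing was configured

echo "backend=$backend needs UDP $(backend_port "$backend") reachable between the nodes"
```

If that port is not forwarded through both NATs, pod-to-pod traffic between the nodes would fail even though the agent can still reach the server on 6443.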
Yes, I thought so, too. They are behind different firewalls. I did not change anything, so I guess the one that comes with the standard installation? I am pretty new to k3s, do you have any idea what I could specifically do? That would be very much appreciated!
You might try wireguard instead so that it can tunnel between the two networks; you'll need to start both nodes with --node-external-ip=$PUBLIC_IP and --flannel-backend=wireguard. Most CNI backends assume your nodes are on a flat network and can reach each other directly. Even with wireguard you may have a hard time getting this working, depending on how restrictive your firewalls are.
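A minimal sketch of what that reinstall could look like, under several assumptions: the IP addresses are documentation placeholders, the helper names `server_exec` and `agent_exec` are hypothetical, and WireGuard is already installed on both hosts. As far as I know, `--flannel-backend` is accepted only by the server and agents pick the backend up when they join, so the sketch passes it on the server side only; treat that as something to verify. The script only prints the argument strings, and the installer calls are left as comments.

```shell
#!/bin/sh
# Sketch: build k3s install arguments for a WireGuard-backed flannel.
# Nothing is installed here; the strings are printed for inspection.

server_exec() {
    # $1 = public IP of the server (placeholder value below)
    printf '%s' "--tls-san $1 --node-external-ip=$1 --flannel-backend=wireguard"
}

agent_exec() {
    # $1 = public IP of the agent (placeholder value below)
    printf '%s' "--node-external-ip=$1"
}

SERVER_ARGS=$(server_exec 203.0.113.10)
AGENT_ARGS=$(agent_exec 198.51.100.20)

echo "server: INSTALL_K3S_EXEC=\"$SERVER_ARGS\""
echo "agent:  INSTALL_K3S_EXEC=\"$AGENT_ARGS\""

# To apply, pass the strings to the installer, for example:
#   curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="$SERVER_ARGS" sh -
#   curl -sfL https://get.k3s.io | K3S_URL=https://<server_public_ip>:6443 \
#       K3S_TOKEN=<token> INSTALL_K3S_EXEC="$AGENT_ARGS" sh -
```

Each node advertises its own public address through `--node-external-ip`, which is what lets the tunnel endpoints find each other across the two NATs.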
Afterwards, I installed kubeflow as described here: https://github.com/kubeflow/manifests#install-with-a-single-command
Expected behavior:
I expected both notebooks to be accessible in the dashboard.
Actual behavior:
The notebook notebook-3 (on the agent) is not accessible.
Additional context / logs:
Logs from notebook-4:
kubectl logs notebook-4-0 -n kubeflow-user-example-com
kubectl describe pod notebook-4-0 -n kubeflow-user-example-com