Environmental Info:
K3s Version:

Node(s) CPU architecture, OS, and Version:

Cluster Configuration:
Four servers at the moment I'm hitting this problem (I'm still in the process of spinning up the servers). Config looks like:

Describe the bug:
I had a single-server k3s cluster with a wireguard backend running last night. I recently tried adding additional servers to it. The additional servers came online, and they have wireguard connections to each other. Here's server 3.3.3.3 successfully connecting to 2.2.2.2 and 4.4.4.4:
```
> wg show
interface: flannel.1
  public key: BKyc3q6MpDFaVqLYrPU3NX7kmr9RahhgQ7JvYV0XFSg=
  private key: (hidden)
  listening port: 51820

peer: XjnFZcY1o/sbom6/8Z6SSuWbq0cbdMa/w4DOWC5q8Do=
  endpoint: 4.4.4.4:51820
  allowed ips: 10.42.1.0/24
  latest handshake: 25 seconds ago
  transfer: 6.11 KiB received, 4.30 KiB sent
  persistent keepalive: every 25 seconds

peer: pWv5a3iIfdURa/wlPK5wivy9KleCgeWL//ZJ2eAFbyY=
  endpoint: 2.2.2.2:51820
  allowed ips: 10.42.2.0/24
  latest handshake: 50 seconds ago
  transfer: 6.11 KiB received, 4.30 KiB sent
  persistent keepalive: every 25 seconds
```
The original server supposedly has all of this configuration, but none of the connections are open:
```
> wg show
interface: flannel.1
  public key: OeFOZblQVwEkYBEwRGx0cefR+ChN+KNYM1vJjYs70w0=
  private key: (hidden)
  listening port: 51820

peer: MwE3kq82mJGbBc55suKWQNq1+Tn5/DjHCFp05BrmalI=
  endpoint: 4.4.4.4:51820
  allowed ips: (none)
  transfer: 0 B received, 78.91 KiB sent
  persistent keepalive: every 25 seconds

peer: BECrn2WJkNGay50K405E7B7OT3TKoRevg9xZw5xupwc=
  endpoint: 2.2.2.2:51820
  allowed ips: (none)
  transfer: 0 B received, 78.48 KiB sent
  persistent keepalive: every 25 seconds

peer: BKyc3q6MpDFaVqLYrPU3NX7kmr9RahhgQ7JvYV0XFSg=
  endpoint: 3.3.3.3:51820
  allowed ips: 10.42.3.0/24
  transfer: 0 B received, 78.05 KiB sent
  persistent keepalive: every 25 seconds
```
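One telltale detail is `allowed ips: (none)` on the first two peers: WireGuard routes traffic to a peer only for destinations in its allowed-ips list, so an empty list means nothing will ever be sent over that peer. A quick way to dump just that table (assuming the `flannel.1` interface name from the output above):

```
# Print only the peer -> allowed-ips mapping for flannel.1;
# a peer listed as "(none)" will never carry traffic, even if reachable
wg show flannel.1 allowed-ips
```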
I tried to fix this with a `systemctl restart k3s`, and now nothing shows up at all on the 1.1.1.1 server:
```
> wg show
interface: flannel.1
  public key: xOsXaDV5NZx0ZBbpuj9TGvwrFBYHJKs3f3bf0p5i534=
  private key: (hidden)
  listening port: 51820
```
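To check whether the interface itself survived the restart rather than just losing its peers, something like this should still report a wireguard-type link (again assuming the `flannel.1` name):

```
# Show link details for flannel.1; with -d, ip prints the link type,
# which should read "wireguard" for this flannel backend
ip -d link show flannel.1
```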
Steps To Reproduce:
Not sure yet.
Expected behavior:
Established WireGuard connections. Or, presumably, `k3s` should fix up the local networking environment.
Actual behavior:
No connections, all traffic to other nodes gets blackholed.
EDIT: Trying to diagnose further... it looks like the other servers in the cluster can't contact each other either, even though the wireguard connections are up. Flannel clearly having problems here, not sure why yet.
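For anyone comparing notes, a rough way to check whether flannel's per-node routes are actually in place (assuming the default `10.42.0.0/16` cluster CIDR seen above):

```
# Each remote node should contribute a /24 route via flannel.1
ip route show | grep flannel.1

# Flannel puts the .0 address of a node's pod subnet on that node's
# flannel.1 interface, so it makes a reasonable cross-node probe
ping -c 3 10.42.3.0
```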
`k3s check-config` says I'm okay, but does come up with this... aren't those routes the ones that k3s created?
```
System:
- /usr/sbin iptables v1.8.4 (legacy): ok
- swap: should be disabled
- routes: default CIDRs 10.42.0.0/16 or 10.43.0.0/16 already routed
```
EDIT 2: Seemingly resolved with a rolling `systemctl restart k3s` on all affected servers... the routes came back one at a time. Interesting.
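For reference, the rolling restart amounted to something like this (hostnames are placeholders; the point is one server at a time, giving flannel a moment to re-establish routes between restarts):

```
# Hypothetical rolling restart across the affected servers
for host in server-1 server-2 server-3 server-4; do
  ssh "$host" 'sudo systemctl restart k3s'
  sleep 30   # let flannel re-sync routes before moving to the next node
done
```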
We have also encountered issues when adding new nodes with wireguard via Ansible.
We believe that changing the wireguard keys (the Ansible role does that on update) somehow breaks the k3s connections.
Restarting the servers fixed the issue, but we are planning to migrate away from wireguard to a local private network.
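If key rotation is the culprit, the mismatch should be visible by comparing a node's own interface key against what its peers have recorded for it, e.g.:

```
# The key this node currently presents on flannel.1...
wg show flannel.1 public-key

# ...versus the peer keys configured locally (compare across nodes)
wg show flannel.1 peers
```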
@ieugen Interesting. I'm also using Ansible, but with my own playbooks, not https://github.com/k3s-io/k3s-ansible or anything like that. I wonder if it has to do with restarting k3s at the wrong time? Our playbook restarts the k3s service two or three times over the course of a run; that could be the issue.