Releases: vitobotta/hetzner-k3s
v2.0.0
This is the new major release packed with new features, improvements and fixes. Read the upgrade notes below carefully before upgrading. Thanks to the contributors!
New
- Added support for external datastores for HA clusters as an alternative to embedded etcd; using a database like Postgres or MySQL makes for better scalability compared to embedded etcd
- Added support for Cilium as CNI, offering significantly improved performance and scalability compared to the default Flannel. Please note that this has been tested with the latest Ubuntu (24.04). There are some known issues when using Cilium with k3s on Ubuntu 22.04 and perhaps other Linux flavours/versions
- Added a configuration option to disable the private network, allowing for much larger clusters (private networks are limited to 100 servers per network)
- Updated all manifests (CCM, CSI, Autoscaler, System Upgrade Controller)
- Spegel - known as embedded registry mirror in k3s - is a new optional software component (enabled by default) that can be installed by hetzner-k3s in the cluster. This allows peer-to-peer distribution of container images between nodes, which helps work around a known issue with some Hetzner IPs being banned by some registries. With Spegel, if an image is already present on other nodes it will be fetched from those nodes instead of the registry, so it helps with the banned IPs issue while also speeding up image pulling. Important: please check here if the embedded registry is available in your k3s version. If it's not, disable it in the config file or k3s won't be able to start properly.
- Added support for Visual Studio Code dev container to make developing the tool easier
- Added support for the Hillsboro, Oregon region, which was not available in the previous version due to a conflict with network zones
- Enabled local path storage class, for workloads like databases that benefit from max IOPS (see this for more info)
- For HA clusters, a load balancer for the API is no longer created. Instead, a multi-context kubeconfig file is generated, so that you can interact with the cluster through whichever master you select. This saves costs and is more secure, since direct connections to the masters can be restricted to the networks you specify in the config file.
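As a rough illustration of the multi-context kubeconfig mentioned in the last item above, the following Python sketch builds a kubeconfig structure with one cluster entry and one context per master. It is not the code used by hetzner-k3s: the function name, the input shape, the token-based user entry and the choice of the first master as the current context are all assumptions made for the example. Only the general kubeconfig layout (clusters, contexts, users, current-context) and the default API port 6443 are standard.

```python
def build_multi_context_kubeconfig(cluster_name, masters, ca_data, token):
    """Return a kubeconfig dict with one cluster and one context per master.

    `masters` maps a master name to its IP address (hypothetical input shape).
    """
    names = sorted(masters)
    clusters = [
        {
            "name": name,
            "cluster": {
                # 6443 is the default Kubernetes API port
                "server": f"https://{masters[name]}:6443",
                "certificate-authority-data": ca_data,
            },
        }
        for name in names
    ]
    # Every context points at a different master but reuses the same user.
    contexts = [
        {"name": name, "context": {"cluster": name, "user": cluster_name}}
        for name in names
    ]
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": clusters,
        "contexts": contexts,
        # Assumption: token authentication, chosen only to keep the sketch short
        "users": [{"name": cluster_name, "user": {"token": token}}],
        # Assumption: the first master (by name) is selected by default
        "current-context": names[0],
    }
```

With a file of this shape, switching to another master is a matter of running kubectl config use-context with the name of the desired context.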
Improvements
- Made creation of cloud resources more reliable, with automatic recovery in some scenarios (operations are retried multiple times when something fails, e.g. due to high concurrency or a temporary issue with the Hetzner API)
- Implemented automatic handling of Hetzner API rate limits. The tool will automatically wait when the rate limit has been hit and will resume automatically when possible. This makes it easier to create larger clusters that might require a lot of API calls
- Reduced the number of API calls required to handle existing instances when rerunning the tool. To save on API calls, the tool now uses kubectl to check whether a node is already a member of the cluster; if it is, and it is reachable at the external IP address reported by kubectl, the tool skips the API calls that look up information about the instance and proceeds with updating the instance directly. This makes it easier to add many nodes to an existing cluster, since a second run requires fewer API calls in total than before
- Massively improved cluster creation speed: during tests with private network disabled (private networks support max 100 servers per network), I was able to create a 200 node cluster in less than 4 minutes
- Improved logging of all actions so it's easier to identify which item some log lines refer to
- Smarter handling of placement groups: each Hetzner Cloud project allows max 50 placement groups, with max 10 servers per placement group. This means that a cluster using placement groups would normally be limited to 500 servers (with private network disabled). With this update hetzner-k3s makes smarter use of placement groups: once the limit has been reached, extra servers are created without a placement group
- Improved structure of the configuration file to group related settings together in a more coherent way
- The instance type is no longer included in the instance names as this was causing confusion when changing instance type directly from the Hetzner console.
- Raised the timeout for SSH connections to 5 seconds, since a connection can sometimes take longer than 1 second (the previous limit), causing the check of whether a server is ready to hang
- Added information on contributing with the VSCode dev container (by @jpetazzo)
- Use the IP of the load balancer in the kubeconfig instead of a hostname: since the IP of the LB cannot be known in advance, a hostname could cause problems with the first interactions with the API server until the DNS record resolves to the correct IP (by @axgkl)
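The placement group arithmetic in the list above (50 groups per project, 10 servers per group, so 500 grouped servers at most) can be sketched as follows. This is a minimal illustration of the overflow idea only, not the allocation logic hetzner-k3s actually uses; the function name and the groups_already_used parameter are hypothetical.

```python
MAX_GROUPS_PER_PROJECT = 50  # limit quoted in the release notes
MAX_SERVERS_PER_GROUP = 10   # limit quoted in the release notes


def assign_placement_groups(server_count, groups_already_used=0):
    """Return one entry per server: a group index, or None once groups run out."""
    available_groups = max(0, MAX_GROUPS_PER_PROJECT - groups_already_used)
    capacity = available_groups * MAX_SERVERS_PER_GROUP
    assignments = []
    for i in range(server_count):
        if i < capacity:
            # Fill each group with 10 servers before moving to the next one.
            assignments.append(groups_already_used + i // MAX_SERVERS_PER_GROUP)
        else:
            # Limit reached: the server is created without a placement group.
            assignments.append(None)
    return assignments
```

For example, asking for 505 servers in an empty project places the first 500 in groups 0 to 49 and leaves the remaining 5 without a group.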
Fixes
- Addressed a known issue with DNS ("Nameserver Limits Exceeded" warning) by forcing k3s to use a custom resolv.conf with a single nameserver (Google's)
- Made it possible to reliably replace the "seed" master, that is, the first master used to initialize the cluster. Prior to this change, if the first master needed to be replaced due to faulty hardware or for other reasons, the risk of compromising the whole cluster was significant
- Placement groups are now deleted automatically when unused (e.g. after deleting a node pool or the whole cluster)
- Fixed an issue with the detection of the private network interface (by @cwilhelm)
- The default node port range is now automatically opened in the firewall
- Ensured we wait for Cloud Init to complete the initialization process before setting up k3s (by @axgkl)
- When public IPs are disabled in the configuration, now they are also disabled for autoscaled nodes (by @Funzinator)
- Fixed support for multiline post create commands
- Fixed an issue when using a custom SSH port with newer distros that use socket activation (by @jpetazzo)
- Autoscaled nodes are now automatically deleted like static pool nodes
Upgrading from v1.1.5
Important: Read these upgrade notes carefully and test the upgrade with a test cluster first, if possible.
Before upgrading:
- Delete existing kubeconfig
- Create the file /etc/k8s-resolv.conf on ALL instances (both masters and workers); the file should include a single line: nameserver 8.8.8.8
- Update the config file following the new structure you can see here. For example, move the setting use_ssh_agent from the root of the config file to
  networking:
    ssh:
      use_agent: ...
Follow the same pattern for these settings:
ssh_port -> networking.ssh.port
public_ssh_key_path -> networking.ssh.public_key_path
private_ssh_key_path -> networking.ssh.private_key_path
ssh_allowed_networks -> networking.allowed_networks.ssh
api_allowed_networks -> networking.allowed_networks.api
private_network_subnet -> networking.private_network.subnet
disable_flannel -> networking.cni.enabled = false
enable_encryption -> networking.cni.encryption = true
cluster_cidr -> networking.cluster_cidr
service_cidr -> networking.service_cidr
cluster_dns -> networking.cluster_dns
enable_public_net_ipv4 -> networking.public_network.ipv4
enable_public_net_ipv6 -> networking.public_network.ipv6
existing_network -> settings.networking.private_network.existing_network_name
cloud_controller_manager_manifest_url -> manifests.cloud_controller_manager_manifest_url
csi_driver_manifest_url -> manifests.csi_driver_manifest_url
system_upgrade_controller_deployment_manifest_url -> manifests.system_upgrade_controller_deployment_manifest_url
system_upgrade_controller_crd_manifest_url -> manifests.system_upgrade_controller_crd_manifest_url
cluster_autoscaler_manifest_url -> manifests.cluster_autoscaler_manifest_url
- Set networking.private_network.enabled to true, as all existing clusters were using a private network while the new default is false to allow creating larger clusters more easily
- Set include_instance_type_in_instance_name to true; this is because historically the instance type was included in the names of the instances, causing confusion when changing instance type from the Hetzner console. Since clusters created prior to v2 used that old naming scheme, this new setting must be set to true to preserve that behavior with v2
- If the cluster is HA, delete the load balancer created by previous versions of hetzner-k3s for the Kubernetes API as it's no longer needed (see improvements section)
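The renames listed above are mechanical, so they can be scripted. The Python sketch below operates on a config that has already been loaded into a dictionary and covers only a subset of the settings. It is an illustration under several assumptions, not an official migration tool: the helper names are hypothetical, disable_flannel is assumed to map to the inverse value of networking.cni.enabled, enable_encryption is assumed to carry its value over unchanged, and include_instance_type_in_instance_name is assumed to live at the root of the config. Review the result against the new config structure before using it.

```python
import copy

# Old root-level key -> new nested path (a subset of the list above).
RENAMES = {
    "use_ssh_agent": "networking.ssh.use_agent",
    "ssh_port": "networking.ssh.port",
    "public_ssh_key_path": "networking.ssh.public_key_path",
    "private_ssh_key_path": "networking.ssh.private_key_path",
    "ssh_allowed_networks": "networking.allowed_networks.ssh",
    "api_allowed_networks": "networking.allowed_networks.api",
    "private_network_subnet": "networking.private_network.subnet",
    "enable_encryption": "networking.cni.encryption",  # assumption: value copied as-is
    "cluster_cidr": "networking.cluster_cidr",
    "service_cidr": "networking.service_cidr",
    "cluster_dns": "networking.cluster_dns",
    "enable_public_net_ipv4": "networking.public_network.ipv4",
    "enable_public_net_ipv6": "networking.public_network.ipv6",
}


def set_path(config, dotted_path, value):
    """Set a value at a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def migrate_v1_config(old):
    """Return a copy of a v1 config dict with settings moved to the v2 layout."""
    new = copy.deepcopy(old)  # leave the caller's dict untouched
    for old_key, new_path in RENAMES.items():
        if old_key in new:
            set_path(new, new_path, new.pop(old_key))
    if "disable_flannel" in new:
        # Assumption: "disabled" in v1 becomes "enabled = false" in v2.
        set_path(new, "networking.cni.enabled", not new.pop("disable_flannel"))
    # Settings that existing clusters need according to the notes above.
    set_path(new, "networking.private_network.enabled", True)
    # Assumption: this setting sits at the root of the config.
    new["include_instance_type_in_instance_name"] = True
    return new
```

Keys that are not in the mapping (for example the cluster name or the node pools) are left where they are, so the remaining settings in the list still have to be moved by hand.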
Contributing:
If you are a Visual Studio Code user and would like to contribute to the project, you can now more easily work on it by using a dev container with Code. Crystal and all the other dependencies are already included in the container. See docs for more details.
v1.1.5
- New: Add configuration options for cluster-cidr, service-cidr and cluster-dns by @Floppy012
- Update Hetzner CSI to v2.5.1
- Update Hetzner CCM to v1.18.0
- Update cluster autoscaler to 1.28.0
- Fix: don't force Ubuntu as the OS for autoscaled nodes if an image has not been specified for autoscaling. Use default image from the config instead
- Improvement: for large clusters, create servers in batches of 10 to avoid problems with the Hetzner API
- Improvement: limit concurrency for upgrading K3s on worker nodes to 25% of nodes at a time
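The two limits in the last improvements above (batches of 10 servers, 25% of workers at a time) come down to simple arithmetic, sketched here in Python. The function names are hypothetical, and the rounding rule (round up, never below one node) is an assumption, since the notes do not say how fractional results are handled.

```python
import math


def batches(items, size=10):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def upgrade_concurrency(worker_count, fraction=0.25):
    """Number of workers to upgrade at once.

    Assumption: round up and always allow at least one node.
    """
    return max(1, math.ceil(worker_count * fraction))
```

Under these assumptions, 25 servers would be created in three batches (10, 10 and 5), and a pool of 40 workers would be upgraded 10 nodes at a time.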
v1.1.4
NOTE: the hetzner-k3s --version command will show v1.1.3 as I forgot to update the version before releasing, but it's actually 1.1.4 if you install with homebrew or download the binaries from this release.
- Added config parameter to disable public interfaces when creating nodes (ipv4 and/or ipv6) by @derlinuxer
- Cloud init function can be configured in pool section too (pool config overrules global config) by @derlinuxer
- Improved release workflow by @derlinuxer
- Build for Linux ARM by @derlinuxer
- Add private ip of API load balancer to TLS SAN of K8s API by @mgalesloot
- Allow adding a custom DNS name to the TLS SAN of K8s API by @mgalesloot
- Bump cluster-autoscaler to v1.27.3 by @Funzinator
v1.1.3
- Made it possible to configure URLs for manifests (CSI driver, Cloud Controller Manager and System Upgrade Controller)
- Upgraded Cluster Autoscaler to 1.26.3
- Added a switch that allows disabling Flannel for cases when you want to install a different CNI
- Made it possible to set the SSH port
v1.1.2
- Added support for the new ARM instances 🎉
v1.1.1
This is mostly a maintenance release to refactor and improve the code. In addition:
- Removed the restriction of max 10 servers per worker pool. This was due to a limitation in Hetzner's placement groups that can contain max 10 servers each. With this update, hetzner-k3s creates as many placement groups as needed to accommodate the number of servers specified in the pool
- Upgraded Cluster Autoscaler to 1.26.2
v1.1.0
- Fixed an issue introduced in the previous release that prevented traffic from flowing through the private network
v1.0.9
- Make it possible to customize the subnet
v1.0.8
- Fixed: now it is possible to specify the Hetzner Cloud token in the environment variable HCLOUD_TOKEN instead of the config file. This makes it possible to safely commit the config file to a repository without leaking the token
- Fixed: there was an issue preventing cluster upgrades due to a bug in the comparison of the new k3s version with the current
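The token lookup described above can be sketched as follows. This is only an illustration of the pattern: the config key hetzner_token, the function name and the precedence (environment variable first, config file second) are assumptions, not confirmed behavior of the tool.

```python
import os


def resolve_token(config, env=None):
    """Return the Hetzner Cloud token from the environment or the config."""
    env = os.environ if env is None else env
    # Assumption: HCLOUD_TOKEN wins over the (hypothetical) config key.
    token = env.get("HCLOUD_TOKEN") or config.get("hetzner_token")
    if not token:
        raise ValueError("set HCLOUD_TOKEN or add the token to the config file")
    return token
```

With this approach the config file can omit the token entirely and still be committed to a repository.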
v1.0.7
- Makes it possible to use keys with old SHA1 crypto on Red Hat systems, until the upstream ssh2 library is updated