This is a repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate and GitHub Actions.
This semi hyper-converged cluster operates on Talos Linux, an immutable and ephemeral Linux distribution tailored for Kubernetes, and is deployed on bare-metal MS-01 workstations. Rook supplies my workloads with persistent block, object, and file storage, while a separate server handles media file storage. The cluster is designed to enable a full teardown without any data loss.
There is a template at onedr0p/cluster-template if you want to follow along with some of the practices I use here.
- actions-runner-controller: Self-hosted GitHub runners.
- cert-manager: Creates SSL certificates for services in my cluster.
- cilium: Internal Kubernetes container networking interface.
- cloudflared: Enables Cloudflare secure access to my ingresses.
- external-dns: Automatically syncs ingress DNS records to a DNS provider.
- external-secrets: Managed Kubernetes secrets using 1Password Connect.
- ingress-nginx: Kubernetes ingress controller using NGINX as a reverse proxy and load balancer.
- multus: Multi-homed pod networking.
- rook: Distributed block storage for persistent storage.
- sops: Managed secrets for Kubernetes which are committed to Git.
- spegel: Stateless cluster local OCI registry mirror.
- volsync: Backup and recovery of persistent volume claims.
Flux watches my kubernetes folder (see Directories below) and makes the changes to my clusters based on the state of my Git repository.
Flux recursively searches the kubernetes/apps folder until it finds the top-most kustomization.yaml in each directory, then applies all the resources listed in it. That kustomization.yaml will generally contain only a namespace resource and one or more Flux kustomizations (ks.yaml). Each of those Flux kustomizations in turn controls a HelmRelease or other resources related to the application, which are then applied.
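As a rough illustration of that layout, a namespace folder might look like the sketch below. The namespace name default, the app name atuin, the file paths, the interval, and the GitRepository name flux-system are assumptions made for the example, not values taken from this repository.

```yaml
# kubernetes/apps/default/kustomization.yaml (hypothetical path)
# Top-level Kustomize file: lists the namespace and the Flux kustomizations.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml
  - ./atuin/ks.yaml
---
# kubernetes/apps/default/atuin/ks.yaml (hypothetical path)
# Flux Kustomization: points Flux at the folder holding the HelmRelease.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  interval: 30m            # assumed reconcile interval
  path: ./kubernetes/apps/default/atuin/app
  prune: true              # remove resources that disappear from Git
  sourceRef:
    kind: GitRepository
    name: flux-system      # assumed name of the Git source
  targetNamespace: default
  wait: true               # report Ready only once applied resources are healthy
```

Note that the two files share the kind name Kustomization but belong to different API groups: the first is read by Kustomize, the second is reconciled by Flux.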
Renovate monitors my entire repository for dependency updates and automatically creates a PR when updates are found. When a PR is merged, Flux applies the changes to my cluster.
This Git repository contains the following directories under kubernetes.
📁 kubernetes # Kubernetes cluster defined as code
├─📁 apps # Apps deployed into my cluster grouped by namespace (see below)
├─📁 bootstrap # Initial resources to bootstrap my cluster
└─📁 flux # Flux system configuration
This is a high-level look at how Flux deploys my applications with dependencies. Below there are 3 Flux kustomizations: postgres, postgres-cluster, and atuin. postgres is the first app that needs to be running and healthy before postgres-cluster, and once postgres-cluster is healthy, atuin will be deployed.
graph TD;
id1>Kustomization: flux-system] -->|Creates| id2>Kustomization: cluster-apps];
id2>Kustomization: cluster-apps] -->|Creates| id3>Kustomization: postgres];
id2>Kustomization: cluster-apps] -->|Creates| id5>Kustomization: postgres-cluster];
id2>Kustomization: cluster-apps] -->|Creates| id8>Kustomization: atuin];
id3>Kustomization: postgres] -->|Creates| id4(HelmRelease: postgres);
id5>Kustomization: postgres-cluster] -->|Depends on| id3>Kustomization: postgres];
id5>Kustomization: postgres-cluster] -->|Creates| id10(Cluster: postgres);
id8>Kustomization: atuin] -->|Creates| id9(HelmRelease: atuin);
id8>Kustomization: atuin] -->|Depends on| id5>Kustomization: postgres-cluster];
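The "Depends on" edges in the graph map onto the dependsOn field of a Flux Kustomization. The fragment below is a minimal sketch of that chain; the namespaces, paths, intervals, and source name are assumptions for illustration, and only the three kustomization names come from the graph.

```yaml
# Sketch only: paths, namespaces and the GitRepository name are hypothetical.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: postgres
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/database/postgres/app
  prune: true
  wait: true               # health of this object gates its dependents
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: postgres-cluster
  namespace: flux-system
spec:
  dependsOn:
    - name: postgres       # reconciled only after postgres is Ready
  interval: 30m
  path: ./kubernetes/apps/database/postgres/cluster
  prune: true
  wait: true
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  dependsOn:
    - name: postgres-cluster   # reconciled only after postgres-cluster is Ready
  interval: 30m
  path: ./kubernetes/apps/default/atuin/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

Setting wait: true on the upstream kustomizations is what makes "healthy" meaningful here: Flux marks them Ready only after the resources they apply pass their health checks.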
In my cluster there are two instances of ExternalDNS running. One syncs private DNS records to my UDM Pro Max using the ExternalDNS webhook provider for UniFi, while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with two specific classes: internal for private DNS and external for public DNS. The external-dns instances then sync the DNS records to their respective platforms accordingly.
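To make the split concrete, here is a hedged sketch of two ingresses, one per class. The hostnames, service names, ports, and namespace are invented for the example; only the class names internal and external come from the description above.

```yaml
# Hypothetical private service: picked up by the ExternalDNS instance
# that syncs records to the UniFi gateway.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dashboard
  namespace: default
spec:
  ingressClassName: internal
  rules:
    - host: dashboard.example.com     # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dashboard       # placeholder service
                port:
                  number: 80
---
# Hypothetical public service: picked up by the ExternalDNS instance
# that syncs records to Cloudflare.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blog
  namespace: default
spec:
  ingressClassName: external
  rules:
    - host: blog.example.com          # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: blog            # placeholder service
                port:
                  number: 80
```

How each ExternalDNS instance is restricted to its own class (for example through a filter in its own configuration) is not shown in this README, so that part is left out of the sketch.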
Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose |
---|---|---|---|---|---|---|
MS-01 (i9-13900H) | 3 | 1.92TB M.2 | 3.84TB U.2 + 1.92TB M.2 | 96GB | Talos | Kubernetes |
Synology NAS RS1221+ | 1 | - | 8x22TB HDD | 32GB | DSM 7 | NFS |
PiKVM (RasPi 4) | 1 | 64GB (SD) | - | 4GB | PiKVM | KVM |
TESmart 8 Port KVM Switch | 1 | - | - | - | - | Network KVM (for PiKVM) |
UniFi UDM Pro Max | 1 | - | 2x16TB HDD | - | UniFi OS | Router & NVR |
UniFi USW Pro Aggregation | 1 | - | - | - | UniFi OS | 10Gb/25Gb Core Switch |
UniFi USW Pro Max 24 PoE | 1 | - | - | - | UniFi OS | 2.5Gb PoE Switch |
UniFi USP PDU Pro | 1 | - | - | - | UniFi OS | PDU |
APC SMT15000RM2UNC | 1 | - | - | - | - | UPS |
Many thanks to my friend @onedrop and all the fantastic people who donate their time to the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on what to deploy and how to deploy it.
See the latest release notes.
See LICENSE.