Networkd containers #140669
base: master
Conversation
Thanks a lot for your comments! As soon as we start discussing the RFC in January and I know how much of this PR is actually useful, I'll start fixing these :)
Force-pushed from fed3c5e to 2fc5579
How would old things from /nix/var/nix/profiles/per-container/nixos/ and /var/lib/containers/ be migrated? I am not using the nixos-container script anymore because it is buggy, but the container was originally created with it. Hope the comments help you.
I implemented a VM test demonstrating how a migration should work: https://github.com/NixOS/nixpkgs/pull/140669/files#diff-86feebe7d88f2d7c0dd00d87e110566c6e8fcb98cefdc7a06f3478789ef55a79 The same principle applies to imperative containers.
Sometimes it is necessary to build a configuration within a `nix-build` for systemd units. While this is fairly easy for `.service` units (where overrides are simple to define), it's not possible for `systemd-nspawn(1)`. This is mostly a hack to get dedicated bind-mounts of store paths from `pkgs.closureInfo` into the configuration without IFD. In the long term we either want to fix this in systemd or find a better-suited solution.
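As a rough illustration of the kind of bind-mount this enables (a minimal sketch, not code from this PR; the container name `demo` and the mount target are made up):

```nix
{ pkgs, ... }:
let
  # closureInfo computes the runtime closure of the given root paths at
  # build time; reading its output during evaluation would be IFD, which
  # is what the hack described above works around.
  closure = pkgs.closureInfo { rootPaths = [ pkgs.hello ]; };
in {
  # "demo" and the mount target are placeholders.
  systemd.nspawn.demo.filesConfig = {
    BindReadOnly = [ "${closure}:/run/closure-info" ];
  };
}
```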
…s w/networkd

This is the first batch of changes for a new container module which is intended to replace the current `nixos-container` subsystem in the long term. The state in here is still strongly inspired by the `containers`[1] module: declarative nspawn instances are described with NixOS config for both the host and the container itself. For now, this module uses the tentative namespace `nixos.containers`, but that's subject to change.

This new module also contains the following key differences:

* Rather than writing a big abstraction layer on top, we'll rely on `.nspawn` units[2]. This has two benefits. First, we can stop adding options for each new nspawn feature (such as MACVLANs, ephemeral instances, etc.) because it can be written directly into the `.nspawn` unit using the module system, like

      systemd.nspawn.foo.filesConfig = { BindReadOnly = /* ... */ };

  Second, administrators don't need to learn much about our abstractions; they only need to know a few basics about the module system and how to write systemd units.

* This feature strictly enforces `systemd-networkd` on both the container and the host. It can be turned off for containers in the host namespace without a private network, though. The reason for this is that the current `nixos-container` implementation has the long-standing bug that the container's uplink is broken *until* the container has booted, because the host side of the veth pair is configured in `ExecStartPost=`[3]; there is no proper way to take care of it at an earlier stage, since `systemd-nspawn` creates the interface itself. One implication is that services inside the container wrongly assume they can reach e.g. an external database over the network (since `network{,-online}.target` was reached), even though they cannot due to the unconfigured host-side veth interface. When using `systemd-networkd(8)` on both sides this is no longer the case, since systemd will automatically configure the network correctly when an nspawn unit starts and `networkd` is active.

Apart from a basic draft, this also contains support for RFC 1918 IPv4 addresses configured via DHCP and ULA IPv6 addresses configured via SLAAC and `radvd(8)`, including support for ephemeral containers. Further additions such as a better config-activation mechanism and a tool to manage containers imperatively will follow.

[1] https://nixos.org/manual/nixos/stable/options.html#opt-containers
[2] https://www.freedesktop.org/software/systemd/man/systemd.nspawn.html#
[3] https://github.com/NixOS/nixpkgs/blob/8b0f315b7691adcee291b2ff139a1beed7c50d94/nixos/modules/virtualisation/nixos-containers.nix#L189-L240
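To make the networkd point concrete, here is a minimal sketch (not taken from this PR) of host-side configuration in the spirit of systemd's bundled `80-container-ve.network`: with networkd running, the `ve-*` interface that `systemd-nspawn` creates is picked up and configured as soon as the unit starts, instead of in an `ExecStartPost=` hook. The attribute name and exact settings are illustrative assumptions.

```nix
{
  # Run systemd-networkd on the host (this module enforces it anyway).
  networking.useNetworkd = true;

  # Match the host side of nspawn veth pairs and configure it immediately,
  # mirroring what systemd's stock 80-container-ve.network does.
  systemd.network.networks."80-container-ve" = {
    matchConfig = {
      Name = "ve-*";
      Driver = "veth";
    };
    networkConfig = {
      # Hand out an address to the container and NAT its traffic.
      DHCPServer = true;
      IPMasquerade = "both"; # use "yes" on older systemd versions
    };
  };
}
```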
This exposes a given `containerPort` on the host address. For example, if port 80 from the container is forwarded to the host's port 8080, the container uses `2001:DB8::42`, and the host side of the veth interface uses `2001:DB8::23`, then `[2001:DB8::42]:80` becomes reachable on the host as `[2001:DB8::23]:8080`.
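For comparison, `systemd-nspawn` ships its own port-forwarding mechanism via `Port=` in the `[Network]` section of an `.nspawn` unit; since this PR's forwarding covers IPv6 (which, as far as I know, nspawn's built-in forwarding does not), the PR presumably implements something different. The sketch below only shows the stock facility, with a made-up container name:

```nix
{
  # "web" is a placeholder container name. Port= only applies when private
  # networking is used; nspawn's built-in forwarding covers IPv4 only.
  systemd.nspawn.web.networkConfig = {
    Port = "tcp:8080:80"; # host port 8080 -> container port 80
  };
}
```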
This change tests various combinations of static & dynamic addressing and also fixes a bug where `radvd(8)` was erroneously configured for veth pairs where it's actually not needed. This test is also supposed to show how to use `systemd` configuration to implement most of the features (for instance, there's no custom set of options to implement MACVLANs) and serves as a regression test for future `systemd` updates in NixOS. Please note that the `ndppd` hack is only here because QEMU doesn't do proper IPv6 neighbour resolution. In fact, I left comments wherever workarounds were needed for the testing facility.
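As an example of the "no custom options" approach mentioned above, a MACVLAN can be requested directly through the `.nspawn` unit's `[Network]` section (a minimal sketch; the container name `mycontainer` and host interface `eno1` are placeholders):

```nix
{
  # Create a MACVLAN interface on top of the host's eno1 and move it into
  # the container; no dedicated NixOS option is needed for this.
  systemd.nspawn.mycontainer.networkConfig = {
    MACVLAN = "eno1";
  };
}
```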
This test is supposed to demonstrate how to migrate a single container to the new subsystem. Of course, documentation on how to rewrite the config hasn't been written yet; this is mainly a POC showing that a migration is generally possible by

* deploying a new configuration (using `nixos.containers`) equivalent to the old one,
* moving the state from `/var/lib/containers` to `/var/lib/machines`, and
* rebooting the host (unfortunately), because otherwise `systemd-networkd` reaches an inconsistent state, at least with v247.

For the reboot part I also had to change the QEMU VM builder a bit to actually support persistent boot disks.
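A rough sketch of those steps as a NixOS VM-test fragment follows. Assumptions, not copied from the PR's actual test: the test machine is called `host`, the container `mycontainer`, and the legacy unit follows the `container@.service` naming.

```nix
{
  testScript = ''
    # Stop the legacy container and move its state to where
    # systemd-nspawn/machinectl expect it.
    host.succeed("systemctl stop container@mycontainer.service")
    host.succeed("mv /var/lib/containers/mycontainer /var/lib/machines/mycontainer")

    # After switching the host to the new `nixos.containers` configuration,
    # reboot so systemd-networkd (v247) comes up in a consistent state.
    host.shutdown()
    host.start()

    host.wait_for_unit("systemd-nspawn@mycontainer.service")
  '';
}
```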
Applied the diff from NixOS#140669 at revision cd533c3.
I've been using this PR successfully for a while now and made a few changes. The current NixOS container module allows using an existing nixosConfiguration or path, so I've implemented a similar feature in my branch. I also added an option to allow using a specialisation of the container's nixosConfiguration (see `containers-next/default.nix` in my branch). A clever use case for these features is that you can easily deploy a production container and then a development/test specialisation of that same container. I still have a bit to clean up, but I thought I'd comment on my experience with this PR, which has been great.
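Purely as an illustration of that production/development workflow, and with entirely hypothetical option names (the actual interface lives in the linked branch), such a setup might look roughly like:

```nix
{ inputs, ... }:
{
  # Hypothetical sketch: reuse an existing flake nixosConfiguration for a
  # container, and deploy a second instance running one of its
  # specialisations. All option names here are invented for illustration;
  # `inputs` is assumed to be passed in via specialArgs.
  nixos.containers.instances = {
    myapp.config = inputs.self.nixosConfigurations.myapp;
    myapp-dev = {
      config = inputs.self.nixosConfigurations.myapp;
      specialisation = "development";
    };
  };
}
```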
(also posted in #nix-rfc-108:matrix.org) Hi!
Thanks for the update ma27. I am indeed eager to contribute/take over and continue this work. It's a shame you no longer have time to contribute, but I hope that myself and whoever else comes on board can continue the great work you've done here 🙂

For those reading this who are maybe not aware, I had developed my own version of the nixos-nspawn declarative container tooling here: https://github.com/m1cr0man/python-nixos-nspawn. Short of what is in #216025 (which I do need to update to reflect further work done here), it is standalone and can be imported as a flake. I have been running declarative containers now for over a year at least and it's been working great.

What I'd like to suggest as an action plan right now is getting the minimal viable changes into nixpkgs master as soon as possible, and then creating a flake to iterate on the imperative and declarative container modules/components before merging that into master too (if it makes sense). Right now it's a bit inconvenient to have to run a forked nixpkgs for any of RFC 108 to work correctly, and I think this will ease adoption over time too.

I still need to familiarize myself more with the imperative container management suite and also with the migration path/solution for legacy containers. These are just my thoughts of course, and I'm eager to hear what others think we should do. 😄
Small update: I'm currently seeing if it's possible to consolidate the generation of systemd units (nspawn + networkd) between imperative and declarative containers. Right now, declarative container units are generated in Nix whilst the imperative container units are generated in Python, the reason being that it's really awkward to make the module work correctly in both scenarios. I have a working (read: buildable but untested) POC here, so I'm hopeful it will be possible. My main motive is to reduce the chance of the two deployment methods diverging and to cut down on duplicate code.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/nixcon-governance-workshop/32705/9
Motivation for this change
POC for NixOS/rfcs#108
Tests are regularly built at https://hydra.ist.nicht-so.sexy/jobset/nixpkgs/networkd-containers.
(This is also the reason for the temporary `jobset.nix` in the project's root.)

Things done
- Built with `sandbox = true` set in `nix.conf`? (See Nix manual)
- Tested via `nix-shell -p nixpkgs-review --run "nixpkgs-review wip"`
- Tested execution of all binary files (usually in `./result/bin/`)