boot.initrd.network.ssh.hostRSAKey breaks activation if removed #34262

Open
Baughn opened this issue Jan 25, 2018 · 11 comments

Comments

@Baughn
Contributor

Baughn commented Jan 25, 2018

Issue description

After first setting hostRSAKey and rebuilding the system, if the file is subsequently removed (and the setting commented out) then activation will fail.

It appears that all generations use the same initrd, instead of creating a separate file for each. This is true even when they should be separate. My best guess would be that the hostRSAKey is not included in the hash.

personal> closures copied successfully
saya...> cp: cannot stat '/run/keys/hostRSAKey': No such file or directory
saya...> Traceback (most recent call last):
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 210, in <module>
saya...>     main()
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 197, in main
saya...>     write_entry(*gen, machine_id)
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 85, in write_entry
saya...>     subprocess.check_call([append_initrd_secrets, "/boot%s" % (initrd)])
saya...>   File "/nix/store/53dyjh7xjhnbibqllr7j27lk2h98n7j7-python3-3.6.4/lib/python3.6/subprocess.py", line 291, in check_call
saya...>     raise CalledProcessError(retcode, cmd)
saya...> subprocess.CalledProcessError: Command '['/nix/store/0xfvmgbafj9xxzzvba2pckd1w0i83qrs-append-initrd-secrets/bin/append-initrd-secrets', '/boot/efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi']' returned non-zero exit status 1.
grep -r 7k38fm /boot
/boot/loader/entries/nixos-generation-66.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-67.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-68.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-69.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
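
The `grep` output above can be summarised by grouping loader entries by the initrd file they reference. This is a minimal sketch, assuming lines in the `path:initrd <file>` form shown above; the function name `group_by_initrd` is made up for illustration.

```python
import re
from collections import defaultdict

# The grep output from above, copied verbatim.
GREP_OUTPUT = """\
/boot/loader/entries/nixos-generation-66.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-67.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-68.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-69.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
"""


def group_by_initrd(grep_output):
    """Map each initrd path to the generation numbers that reference it."""
    pattern = re.compile(r"nixos-generation-(\d+)\.conf:initrd\s+(\S+)")
    groups = defaultdict(list)
    for line in grep_output.splitlines():
        match = pattern.search(line)
        if match:
            groups[match.group(2)].append(int(match.group(1)))
    return dict(groups)


shared = group_by_initrd(GREP_OUTPUT)
for initrd, generations in shared.items():
    print(initrd, "->", generations)
```

Any initrd that maps to more than one generation is shared between them, which is the situation described here.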

As a side note, while fixing the problem (using nix-collect-garbage -d), I ended up in a situation where the most recent GRUB boot entry referred to a system configuration that no longer existed. I'm not sure how.

Technical details

  • system: "x86_64-linux"
  • host os: Linux 4.14.14, NixOS, 18.03.git.d492cdc789c (Impala)
  • multi-user?: yes
  • sandbox: relaxed
  • version: nix-env (Nix) 1.11.16
  • channels(root): "nixos-18.03pre126063.95880aaf062"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs
@WilliButz
Member

@Baughn could you paste the dropbear-specific config-section? :)

@Baughn
Contributor Author

Baughn commented Jan 25, 2018

Sure, but it's a little complicated.

We talked this over on IRC. For anyone following along, http://ix.io/EG7 has the relevant configuration with the failed bits commented out, in emergency-shell.nix.

@Baughn
Contributor Author

Baughn commented Jan 25, 2018

After a lot of digging, it seems the problem is here: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/boot/loader/systemd-boot/systemd-boot-builder.py#L84

Specifically, write_entry gets called in a loop, for every generation. This fails when the host key has already been removed from the system. It's theoretically fixable by not updating the initrd unnecessarily, but it might be easier to document it and wait for secrets-in-nix-store to exist.
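
To make the failure mode concrete, here is a minimal sketch of the pattern, not the actual systemd-boot-builder.py code. The names `append_secrets` and `write_all_entries` and the generation tuples are hypothetical; the stand-in `append_secrets` simply fails when the key file is missing, as append-initrd-secrets does in the traceback. It contrasts the fail-fast loop with a variant that records which generations could not be refreshed and carries on.

```python
import os
import subprocess
import tempfile


def append_secrets(key_path):
    """Stand-in for append-initrd-secrets (hypothetical, for illustration)."""
    if not os.path.exists(key_path):
        # Mimic the non-zero exit status seen in the traceback.
        raise subprocess.CalledProcessError(1, ["append-initrd-secrets", key_path])


def write_all_entries(generations, tolerant=False):
    """Refresh every generation; return the generations whose secrets failed."""
    failed = []
    for number, key_path in generations:
        try:
            append_secrets(key_path)
        except subprocess.CalledProcessError:
            if not tolerant:
                raise  # fail-fast: one old generation aborts the whole run
            failed.append(number)
    return failed


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "hostRSAKey")
    open(present, "w").close()
    missing = os.path.join(tmp, "removed-key")
    # Generation 66 still references a key that has since been deleted.
    skipped = write_all_entries([(66, missing), (67, present)], tolerant=True)
    print("generations with missing secrets:", skipped)
```

The trade-off is that a skipped generation may later boot without its secrets, so a tolerant loop is not obviously better than failing.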

There's another bug which would block that fix. Assuming this initrd secret is the only difference between the configurations, their respective initrds will have the same hash -- and so the same filename in /boot. That wouldn't cause trouble for initrd ssh, but should be fixed anyway.
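
A toy model of the collision, assuming (as guessed in this thread) that the initrd filename depends only on inputs the build can see, while the secret is appended afterwards from outside the store. `initrd_name` and the generation dictionaries are hypothetical, and this is not how Nix actually computes store hashes.

```python
import hashlib
import json


def initrd_name(store_inputs):
    """Toy name derivation: hash only the inputs visible to the build."""
    encoded = json.dumps(store_inputs, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()[:32] + "-initrd"


# In this model the secret is not part of the hashed inputs; it only
# matters when secrets are appended to the initrd at activation time.
gen_with_key = {
    "inputs": {"kernel": "4.14.14", "initrd_ssh": True},
    "secret": "/run/keys/hostRSAKey",
}
gen_key_removed = {
    "inputs": {"kernel": "4.14.14", "initrd_ssh": True},
    "secret": None,
}

same_name = initrd_name(gen_with_key["inputs"]) == initrd_name(gen_key_removed["inputs"])
print("generations share one initrd file:", same_name)
```

Under that assumption, two generations that differ only in their secret end up pointing at the same file in /boot.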

@Baughn
Contributor Author

Baughn commented Jan 26, 2018

From the looks of #8, we won't be getting a perfect solution anytime soon. That leaves the options of "fixing" systemd-boot (which would make it more fragile), or simply documenting the bug in hostRSAKey and related attributes. I'm inclined towards the latter.

@ghost

ghost commented Feb 18, 2018

I ran into this issue while setting up a new NixOS machine and was completely puzzled by the error until I figured out what was going on. I think at the very least the error could be more helpful.

@stale

stale bot commented Jun 5, 2020

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 5, 2020
@ejpcmac
Contributor

ejpcmac commented Jun 5, 2020

What’s the state of this issue today? I have not modified this setting recently on any of my systems, but hitting this issue while re-installing a system could lead to confusion.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 5, 2020
@ejpcmac
Contributor

ejpcmac commented Jul 14, 2020

Self-reply: yesterday I reorganised the files in my config repository, moving my dropbear host RSA key and correctly updating the relevant configuration line, but forgetting about this issue. I lost something like 40 minutes trying to remember how to fix it. For others (and maybe my future self) who encounter this issue, here is a procedure that works to move the key file:

  1. Keep the file at the same place, otherwise the initrd builder is unhappy.
  2. Comment out the boot.initrd.network.ssh section.
  3. sudo nixos-rebuild switch
  4. sudo nix-collect-garbage -d (yes, all your generations are gone…)
  5. sudo nixos-rebuild switch => this updates the /boot partition, effectively updating boot entries and the initrd.
  6. Move the key file (doing a new sudo nixos-rebuild switch without changing the config should then work, as the initrd referring to it is now gone).
  7. Uncomment the boot.initrd.network.ssh section and update hostRSAKey.
  8. sudo nixos-rebuild switch
  9. You should be fine.

@stale

stale bot commented Jan 10, 2021

I marked this as stale due to inactivity.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 10, 2021
@sorpaas
Member

sorpaas commented Oct 14, 2021

(For stalebot) This issue is still relevant.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Oct 14, 2021
@Baughn
Contributor Author

Baughn commented Feb 12, 2022

I've resolved this to my satisfaction for my own systems, using Agenix to handle the SSH keys. It's not a true fix, but if you come across this bug in 2022 you might try that.
