suc-script needs to mount /sys for UKI upgrades #2586

Closed
Tracked by #1792
mudler opened this issue May 24, 2024 · 11 comments

@mudler
Member

mudler commented May 24, 2024

Seems we have a discrepancy in the docs: https://kairos.io/docs/upgrade/trustedboot/

We create a separate suc-script there, and it seems that /sys also needs to be mounted before performing the upgrade.

This card is about unifying the suc-script so it works in both UKI and non-UKI environments, or alternatively having two separate scripts that are documented, installed in the system, and available in the packages repository.
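
As a rough illustration of the /sys step mentioned above, here is a minimal sketch. It assumes the host root is available at /host inside the upgrade pod (as in the plans further down this thread) and that the reason for the mount is to expose the host's EFI data under /sys/firmware/efi; both the function name `ensure_sys` and that reasoning are assumptions, not something confirmed in this issue. A later comment in this thread reports that the mount was not needed in one test.

```shell
#!/bin/sh
# Sketch only: make the host's /sys visible before a UKI upgrade.
# ensure_sys is a hypothetical helper; /host as the host root is an assumption.
ensure_sys() {
    host_root="${1:-/host}"
    target="${2:-/sys}"

    # If the EFI directory is already visible, there is nothing to mount.
    if [ -d "$target/firmware/efi" ]; then
        echo "already-visible"
        return 0
    fi

    # Otherwise bind-mount the host's /sys over the target.
    mount --rbind "$host_root/sys" "$target" && echo "mounted"
}
```

Calling `ensure_sys` with no arguments before `kairos-agent upgrade` would use /host and /sys; the arguments exist mainly so the helper can be tried against a scratch directory.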

@mudler mudler mentioned this issue May 24, 2024
27 tasks
@ci-robbot ci-robbot added the question Further information is requested label May 24, 2024
@mudler mudler added uki and removed question Further information is requested labels May 24, 2024
@mudler mudler mentioned this issue May 24, 2024
33 tasks
@jimmykarily jimmykarily added the triage Add this label to issues that should be triaged and prioretized in the next planning call label May 24, 2024
@jimmykarily
Contributor

jimmykarily commented May 27, 2024

The suc scripts:

Let's make sure our docs and script work for both the UKI and non-UKI cases.

@nianyush you also have a version of the suc script that works with UKI. Can you link to it here?

@jimmykarily jimmykarily moved this to In Progress 🏃 in 🧙Issue tracking board May 27, 2024
@jimmykarily jimmykarily moved this from In Progress 🏃 to Todo 🖊 in 🧙Issue tracking board May 27, 2024
@nianyush

Sure, I can help add an example.

@nianyush

For UKI, we built an image based on Ubuntu and added the EFI files to it. So basically it's an Ubuntu image plus a /trusted-boot folder, which looks like this:

/trusted-boot
├── EFI
│   ├── BOOT
│   │   └── BOOTX64.EFI
│   └── kairos
│       └── norole.efi
└── loader
    ├── entries
    │   └── norole.conf
    ├── keys
    │   └── auto
    │       ├── db.auth
    │       ├── db.der
    │       ├── KEK.auth
    │       ├── KEK.der
    │       ├── PK.auth
    │       └── PK.der
    └── loader.conf
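
A quick pre-flight check of that layout could look like the sketch below. Which files are strictly required by `kairos-agent` is an assumption here: the list is simply taken from the tree above, and `check_trusted_boot_layout` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: verify that a directory resembles the /trusted-boot tree above.
# The set of "required" paths is an assumption derived from that tree.
check_trusted_boot_layout() {
    dir="${1:-/trusted-boot}"
    missing=0

    # Fixed-name files shown in the tree.
    for f in EFI/BOOT/BOOTX64.EFI loader/loader.conf; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done

    # At least one boot entry and one EFI image (names vary, e.g. norole.*).
    if ! ls "$dir"/loader/entries/*.conf >/dev/null 2>&1; then
        echo "missing: loader/entries/*.conf" >&2
        missing=1
    fi
    if ! ls "$dir"/EFI/kairos/*.efi >/dev/null 2>&1; then
        echo "missing: EFI/kairos/*.efi" >&2
        missing=1
    fi

    return "$missing"
}
```

Running `check_trusted_boot_layout /trusted-boot || exit 1` at the top of the upgrade script would stop early if the image was built without the EFI artifacts.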

The SUC plan that works with this image:

---
apiVersion: v1
kind: Secret
metadata:
  name: upgrade
  namespace: system-upgrade
type: Opaque
stringData:
  upgrade.sh: |
    #!/bin/sh
    rm -rf /host/usr/local/trusted-boot
    mkdir -p /host/usr/local/trusted-boot
    mount --rbind /trusted-boot /host/usr/local/trusted-boot
    chroot /host kairos-agent --debug upgrade --source dir:/usr/local/trusted-boot
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: os-upgrade
  namespace: system-upgrade
  labels:
    k3s-upgrade: server
spec:
  concurrency: 1
  version: "<CONTAINER_IMAGE_TAG>"
  nodeSelector:
    matchExpressions:
      - {key: kubernetes.io/hostname, operator: Exists}
  serviceAccountName: system-upgrade
  secrets:
    - name: upgrade
      path: /host/run/system-upgrade/secrets/upgrade
  cordon: false
  drain:
    force: false
    disableEviction: true
  upgrade:
    image: "<CONTAINER_IMAGE>"
    command: ["/bin/bash"]
    args: ["/run/system-upgrade/secrets/upgrade/upgrade.sh"]

@nianyush

For non-UKI, we just use the Kairos rootfs image as the plan image, instead of an Ubuntu image as the Kairos docs currently do. The benefit is that we can leverage k8s functionality to manage registry secrets, and containerd for caching.

---
apiVersion: v1
kind: Secret
metadata:
  name: upgrade
  namespace: system-upgrade
type: Opaque
stringData:
  upgrade.sh: |
    #!/bin/sh
    mount --rbind /host/dev /dev
    mount --rbind /host/run /run
    mkdir -p /etc/rancher/k3s # remove if you are not using k3s
    mount --rbind /host/etc/rancher/k3s /etc/rancher/k3s # remove if you are not using k3s
    kairos-agent --debug upgrade --source dir:/
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: os-upgrade
  namespace: system-upgrade
  labels:
    k3s-upgrade: server
spec:
  concurrency: 1
  version: "<CONTAINER_IMAGE_TAG>"
  nodeSelector:
    matchExpressions:
      - {key: kubernetes.io/hostname, operator: Exists}
  serviceAccountName: system-upgrade
  secrets:
    - name: upgrade
      path: /host/run/system-upgrade/secrets/upgrade
  cordon: false
  drain:
    force: false
    disableEviction: true
  upgrade:
    image: "<CONTAINER_IMAGE>"
    command: ["/bin/bash"]
    args: ["/run/system-upgrade/secrets/upgrade/upgrade.sh"]

@nianyush

Since /proc/cmdline is available by default in the SUC plan pod, we can unify them into a single script:

apiVersion: v1
kind: Secret
metadata:
  name: upgrade
  namespace: system-upgrade
type: Opaque
stringData:
  upgrade.sh: |
    #!/bin/bash
    if grep -q "rd.immucore.uki" /proc/cmdline; then
      rm -rf /host/usr/local/trusted-boot
      mkdir -p /host/usr/local/trusted-boot
      mount --rbind /trusted-boot /host/usr/local/trusted-boot
      chroot /host kairos-agent --debug upgrade --source dir:/usr/local/trusted-boot
    else
      mount --rbind /host/dev /dev
      mount --rbind /host/run /run
      mkdir -p /etc/rancher/k3s # remove if you are not using k3s
      mount --rbind /host/etc/rancher/k3s /etc/rancher/k3s # remove if you are not using k3s
      kairos-agent --debug upgrade --source dir:/
    fi
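
The detection step in the unified script can be pulled out into a small function so it can be exercised against a sample file instead of the real /proc/cmdline. This is only a sketch: `is_uki_boot` and `upgrade_mode` are hypothetical names, and the `rd.immucore.uki` marker is the one used in the script above.

```shell
#!/bin/sh
# Sketch only: decide between the UKI and non-UKI upgrade paths.
# Succeeds when the given kernel command line contains the UKI marker.
is_uki_boot() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -q "rd.immucore.uki" "$cmdline_file"
}

# Prints "uki" or "non-uki" for the given command line file.
upgrade_mode() {
    if is_uki_boot "$1"; then
        echo "uki"
    else
        echo "non-uki"
    fi
}
```

In the script itself the branch would then read `if is_uki_boot; then ... else ... fi`, with the two bodies unchanged.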

@jimmykarily jimmykarily moved this from Todo 🖊 to In Progress 🏃 in 🧙Issue tracking board Jun 3, 2024
@jimmykarily jimmykarily moved this from In Progress 🏃 to Todo 🖊 in 🧙Issue tracking board Jun 3, 2024
@jimmykarily jimmykarily moved this from Todo 🖊 to In Progress 🏃 in 🧙Issue tracking board Jun 3, 2024
@jimmykarily jimmykarily self-assigned this Jun 3, 2024
@kairos-io kairos-io deleted a comment from ci-robbot Jun 3, 2024
@jimmykarily
Contributor

I created a draft PR for docs to add all the missing bits from this ticket: https://github.com/kairos-io/kairos-docs/pull/207/files (won't merge until everything is done).

@jimmykarily
Contributor

The system-upgrade-controller tries to drain the node if a drain block is defined. For some reason we define it in our examples: https://kairos.io/docs/upgrade/trustedboot/#upgrades-with-kubernetes, which makes the upgrade fail with errors about the system-upgrade service account not having permissions to delete Pods. I even tried editing the ClusterRole of the drainer to add those permissions, but that simply makes things worse: it starts draining the node that is running the current upgrade. I guess draining is not desired; with the drain: block removed from the Plan, the upgrade seems to work.
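
For a Plan that has already been applied, the drain block can also be dropped in place with a JSON Patch rather than by editing the manifest. The sketch below assumes the Plan name `os-upgrade` and namespace `system-upgrade` from the examples earlier in this thread; the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: remove spec.drain from an existing SUC Plan.
# JSON Patch document that deletes the drain block.
drain_removal_patch() {
    printf '%s' '[{"op":"remove","path":"/spec/drain"}]'
}

# Applies the patch; defaults match the Plan shown earlier in this thread.
remove_drain() {
    plan="${1:-os-upgrade}"
    namespace="${2:-system-upgrade}"
    kubectl patch plans.upgrade.cattle.io "$plan" \
        --namespace "$namespace" \
        --type=json \
        --patch "$(drain_removal_patch)"
}
```

A JSON Patch `remove` fails if the path does not exist, so this is only meant for Plans that currently define `drain:`. Deleting the `drain:` lines from the manifest and re-applying it has the same effect.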

Also, the system-upgrade-controller installation instructions mention this command:

kubectl apply -k github.com/rancher/system-upgrade-controller

which doesn't work because the latest tag does not exist. I had to edit the controller's Deployment and change the tags to an existing one (v0.9.1, as in our tests, or the latest, v0.13.4). I'm not sure what the state of that project is. Users have found workarounds (see the ticket with the missing tag), but there is no proper fix from the project's maintainers.
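
One way to script that tag change is sketched below. The Deployment name `system-upgrade-controller`, the `system-upgrade` namespace, and the `rancher/system-upgrade-controller` image repository are assumptions based on the upstream defaults and should be checked against the cluster; the helper names are hypothetical. It only touches the Deployment's container image, so any other tags that had to be edited by hand are not covered.

```shell
#!/bin/sh
# Sketch only: pin the controller Deployment to an existing image tag.
# Builds the image reference; the repository name is an assumption.
controller_image() {
    tag="${1:-v0.13.4}"
    echo "rancher/system-upgrade-controller:$tag"
}

# Sets the image on every container of the Deployment ("*" wildcard).
pin_controller_image() {
    kubectl --namespace system-upgrade set image \
        deployment/system-upgrade-controller \
        "*=$(controller_image "$1")"
}
```

For example, `pin_controller_image v0.9.1` would match the version used in the tests mentioned above.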

@jimmykarily
Contributor

jimmykarily commented Jun 7, 2024

I only had to remove drain from the Plan in the secure boot docs page. I didn't have to mount /sys (as in the ticket description). I will also check if the same script works in the non-UKI case.

I'm not sure if we want to change the script to also work for @nianyush's case. I don't see how "k8s functionalities to manage registry secrets and containerd for caching" are actually used. Is this something the provider does (and, if so, something the default Kairos provider doesn't do)?

@jimmykarily
Contributor

Clarified in planning: what we want to avoid is running kairos-agent with an oci: source, because in order to pull from private registries this command can't use ImagePullSecrets and needs its own authentication.

So we need an image that can be started as a Pod and already has the artifacts for the upgrade.

@nianyush

Sorry, I missed your message earlier @jimmykarily. This is exactly what I was talking about: kairos-agent with a remote OCI image won't work in the private registry case. And when there are multiple upgrades, containerd would probably cache some layers of the images, so it might be faster and more efficient.

@jimmykarily jimmykarily moved this from In Progress 🏃 to Under review 🔍 in 🧙Issue tracking board Jun 11, 2024
@jimmykarily
Contributor

Went with docs. It's now merged. Closing.

@github-project-automation github-project-automation bot moved this from Under review 🔍 to Done ✅ in 🧙Issue tracking board Jun 17, 2024
@jimmykarily jimmykarily removed the triage Add this label to issues that should be triaged and prioretized in the next planning call label Jun 18, 2024