Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A more easy way to setup nydus snapshotter by DaemonSet #570

Merged
merged 12 commits into from
Jan 26, 2024

Conversation

ChengyuZhu6
Copy link

@ChengyuZhu6 ChengyuZhu6 commented Dec 19, 2023

Confidential Containers leverage nydus snapshotter to pull images within the guest using fs_driver=proxy and share images on the host using fs_driver=blockdev to ensure data confidentiality. However, when running related tests for Kata CI with nydus snapshotter, we faced challenges installing and uninstalling nydus snapshotter on various architectures and configuring the containerd settings. We envisioned a smoother experience, where nydus snapshotter provides a daemonset, enabling users to seamlessly run and clean up nydus snapshotter and making it more easy.

Fixes #565

Add comments for fs_driver on snapshotter config.

Signed-off-by: ChengyuZhu6 <[email protected]>
@ChengyuZhu6
Copy link
Author

@ChengyuZhu6 ChengyuZhu6 changed the title A more generic way to setup nydus snapshotter by DaemonSet A more easy way to setup nydus snapshotter by DaemonSet Dec 20, 2023
@adamqqqplay
Copy link
Contributor

@ChengyuZhu6 Thanks for your great work! It will take some time to review.

@imeoer
Copy link
Collaborator

imeoer commented Dec 25, 2023

It seems to break some tests, let's take a look first.

@imeoer
Copy link
Collaborator

imeoer commented Dec 29, 2023

Cloud we enhance the helm chart? We can provide the options to configure the snapshotter fs_driver.

@ChengyuZhu6
Copy link
Author

ChengyuZhu6 commented Dec 29, 2023

Cloud we enhance the helm chart? We can provide the options to configure the snapshotter fs_driver.

Yes, I plan to use helm for installing snapshotter, but it requires a different repo. So I have proposed an alternative method that does not rely on helm. Next, I will submit a PR to the helm repo.

@imeoer
Copy link
Collaborator

imeoer commented Dec 29, 2023

@@ -26,6 +26,14 @@ COPY containerd-nydus-grpc /usr/local/bin/
RUN chmod +x /usr/local/bin/containerd-nydus-grpc
COPY nydusd-config.fusedev.json /etc/nydus/config.json
COPY nydusd-config-localfs.json /etc/nydus/localfs.json
COPY nydusd-config.fusedev.json /etc/nydus/nydusd-fusedev.json
COPY nydusd-config-localfs.json /etc/nydus/nydusd-localfs.json
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nydusd-config-localfs.json is copied twice (likewise nydusd-config.fusedev.json);

Maybe it can be copied to /etc/nydus/nydusd-localfs.json only, then create a /etc/nydus/localfs.json symlink

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

COPY config-proxy.toml /etc/nydus/config-proxy.toml
COPY config-fscache.toml /etc/nydus/config-fscache.toml
COPY config-fusedev.toml /etc/nydus/config-fusedev.toml

COPY entrypoint.sh /

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the renaming of config-coco-host-sharing.toml → misc/snapshotter/config-blockdev.toml and ...napshotter/config-coco-guest-pulling.toml → misc/snapshotter/config-proxy.toml , would make sense to add the kata- filename prefix because they should be used with kata containers?

fscache) SNAPSHOTTER_CONFIG="${NYDUS_CONFIG_DIR}/config-fscache.toml" ;;
blockdev) SNAPSHOTTER_CONFIG="${NYDUS_CONFIG_DIR}/config-blockdev.toml" ;;
proxy) SNAPSHOTTER_CONFIG="${NYDUS_CONFIG_DIR}/config-proxy.toml" ;;
*) die "invalid fs driver" ;;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to log the name of the invalid driver, e.g., invalid fs driver: ${FS_DRIVER}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


FROM alpine:3.17.0

WORKDIR /root/

RUN apk add --no-cache libc6-compat
RUN apk add --no-cache libc6-compat bash jq
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ChengyuZhu6 ! I'm not sure if you are aware of but we have found an issue while using toml-cli on kata containers that was corrupting the containerd's configuration file. @fidencio gave some details in kata-containers/kata-containers#8679 (comment) but it turns out you must ensure jq > 1.7 is installed. Here is how it was solved on kata side: kata-containers/kata-containers@41320c5

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @wainersm. Thanks for your review. Here we use toml-cli , not tomlq.

Copy link

@wainersm wainersm Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I was wondering why the binary is named toml rather than tomlq and that explains :)

local key="$1"
local value="$2"
local file="${3:-$SNAPSHOTTER_CONFIG}"
toml set --overwrite "$file" "$key" $value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"$value"

update_toml_config "experimental.tarfs.mount_tarfs_on_host" "$MOUNT_TARFS_ON_HOST"
update_toml_config "experimental.tarfs.export_mode" "$EXPORT_MODE"
;;
*) die "invalid fs driver in options_handler" ;;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here again print the $FS_DRIVER value

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

elif toml check $CONTAINER_RUNTIME_CONFIG "proxy_plugins.nydus" >/dev/null; then
echo "the config has configured the nydus proxy plugin!"
else
die "there is another proxy plugin in the containerd!"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't it be multiple proxy plugins?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your reminder, I have fixed it.

if toml check $CONTAINER_RUNTIME_CONFIG "proxy_plugins" >/dev/null; then
if [ "$(toml get $CONTAINER_RUNTIME_CONFIG proxy_plugins | jq length)" == "0" ]; then
echo "there is no proxy plugin!"
cat <<EOF >>"$CONTAINER_RUNTIME_CONFIG"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to add the proxy_plugins entire section with toml instead of cat?

Maybe have a variable to hold /run/containerd-nydus/containerd-nydus-grpc.sock?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have a variable to hold /run/containerd-nydus/containerd-nydus-grpc.sock?
Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to add the proxy_plugins entire section with toml instead of cat?

I don't think that's possible due to a limitation on the toml-cli.

EOF
fi

update_toml_config "plugins.\"io.containerd.grpc.v1.cri\".containerd.discard_unpacked_layers" false "${CONTAINER_RUNTIME_CONFIG}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind to document why nydus-snapshotter needs those two configurations enabled?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

update_toml_config "plugins.\"io.containerd.grpc.v1.cri\".containerd.snapshotter" "nydus" "${CONTAINER_RUNTIME_CONFIG}"
fi

nsenter -t 1 -m systemctl -- restart containerd.service
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it wait containerd to restart then check it is running properly again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please, I second Wainer's request on this, as we faced issues where containerd would simply be off after some changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -154,3 +154,21 @@ function deploy_snapshotter() {
configure_snapshotter
${COMMANDLINE} &
}

function cleanup_snapshotter() {
pid=$(ps -ef | grep containerd-nydus-grpc | grep -v grep | awk '{print $1}')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that you can use $SNAPSHOTTER_BINARY. Something like:

pid=$(pidof -s "$SNAPSHOTTER_BINARY" || true)

BTW, I'm a little confused here... is the snapshotter daemon running on the host or within the daemonset container? If the former then it needs to run the command with nsenter, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would running in the daemonset.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the logic so that the snapshotter runs on the host and can be as a systemd service.

pid=$(ps -ef | grep containerd-nydus-grpc | grep -v grep | awk '{print $1}')
if [ ! -z "$pid" ]; then
for i in $(nsenter -t 1 -m ctr -n k8s.io snapshot --snapshotter nydus list | grep -v KEY | cut -d' ' -f1); do
nsenter -t 1 -m ctr -n k8s.io snapshot --snapshotter nydus rm $i || true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this operation can fail if a running pod has used the snapshot. Here we ignore the fail and proceed to remove the other snapshots then finally remove nydus-snapshotter completely. I'm wondering what happens to the pod that is still using the snapshot. Does it keep running normally?

@wainersm
Copy link

wainersm commented Jan 9, 2024

@ChengyuZhu6 I've finished my review and left a couple of comments. I didn't find any bug but I'm afraid I won't have time to give it a try. Thanks for working on this && amazing work!

Copy link

@beraldoleal beraldoleal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ChengyuZhu6 , thanks for this. Just left a few comments here.


Next, we can configure access control for nydus snapshotter.
```bash
kubectl apply -f nydus-snapshotter-rbac.yaml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not present at the nydus-snapshotter root folder:

kubectl apply -f misc/snapshotter/nydus-snapshotter-rbac.yaml

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks for your review @beraldoleal !

Afterward, we can deploy a DaemonSet for nydus snapshotter.

```bash
kubectl apply -f nydus-snapshotter.yaml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here:

kubectl apply -f misc/snapshotter/nydus-snapshotter.yaml

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


```bash
$ kubectl delete -f nydus-snapshotter.yaml
$ kubectl delete -f nydus-snapshotter-rbac.yaml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

RUN mkdir -p /usr/bin/ /opt/nydus/ /var/lib/containerd-nydus/cache /tmp/blobs/
COPY --from=sourcer /nydus* /usr/bin/
COPY --from=sourcer /toml /usr/bin/
COPY containerd-nydus-grpc /usr/bin/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this repository, so not sure if I'm missing something (probably I am).... but I was not able to test the entire solution because I can't build the image:

"/containerd-nydus-grpc": not found

Do you mind to share what I'm missing?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you should build this project binaries and copy containerd-nydus-grpc over misc/snaphotter. Thanks for giving it a try @beraldoleal !

Copy link

@beraldoleal beraldoleal Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wainersm I tried that, but the binary was still missing because is a wrong path. @ChengyuZhu6 do you mind sharing if we need any extra step to make this passing? Or maybe a README.md here with a quick/basic instructions?

Copy link
Author

@ChengyuZhu6 ChengyuZhu6 Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your informations. The following are steps to build the snapshotter docker image:

$ make  # build snapshotter binaries
$ cp bin/* misc/snapshotter/
$ pushd misc/snapshotter/
$ docker build -t ghcr.io/containerd/nydus-snapshotter:latest .
$ popd

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice then to add those instructions to a README on misc/snapshotter, not fancy, just this. Alternatively, add those steps as part of a make snapshotter target.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice then to add those instructions to a README on misc/snapshotter, not fancy, just this. Alternatively, add those steps as part of a make snapshotter target.

Done.

@ChengyuZhu6 ChengyuZhu6 force-pushed the payload branch 3 times, most recently from 5060b74 to 045a661 Compare January 11, 2024 08:27
subjects:
- kind: ServiceAccount
name: nydus-snapshotter-sa
namespace: nydus-snapshotter
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "No newline at end of file"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

log_to_stdout = true
# Snapshotter's log level
level = "info"

[snapshot]
# Insert Kata volume information to `Mount.Options`
enable_kata_volume = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "No newline at end of file"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

# Print logs to stdout rather than logging files
log_to_stdout = true
# Snapshotter's log level
level = "info"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "No newline at end of file"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

# Print logs to stdout rather than logging files
log_to_stdout = true
# Snapshotter's log level
level = "info"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "No newline at end of file"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

ChengyuZhu6 added 4 commits January 25, 2024 21:37
This can be made transparent to users, eliminating the need for them to understand
how to configure snapshotter in Containerd and reducing errors caused by
misconfiguration.

Signed-off-by: ChengyuZhu6 <[email protected]>
This can be seamlessly handled for users, eliminating the requirement for them
to restore Containerd to its state prior to running the snapshotter.
This approach reduces errors caused by misconfiguration or misbehavior.

Signed-off-by: ChengyuZhu6 <[email protected]>
introduce main function to handle different commandline.

Signed-off-by: ChengyuZhu6 <[email protected]>
Use configmap to decouple configuration artifacts from image content
to keep containerized applications portable and inject configuration
files and variables to the daemonset.

Signed-off-by: ChengyuZhu6 <[email protected]>
misc/snapshotter/config-blockdev.toml Outdated Show resolved Hide resolved
docs/setup_snapshotter_by_daemonset.md Outdated Show resolved Hide resolved
@imeoer
Copy link
Collaborator

imeoer commented Jan 26, 2024

We'd better to standardize all the binary paths to /usr/local/bin/* and all the configuration paths to /etc/nydus/*.json, others LGTM, thanks! Let's provide a helm chart to further reduce the difficulty of the deploy steps.

ChengyuZhu6 added 4 commits January 26, 2024 13:22
Add document to setup snapshotter by daemonset.

Signed-off-by: ChengyuZhu6 <[email protected]>
We no longer need the entrypoint because snapshotter.sh handles
both the deployment and the cleanup of the snapshotter.

Signed-off-by: ChengyuZhu6 <[email protected]>
Check the status of the systemd service after restarting it.

Signed-off-by: ChengyuZhu6 <[email protected]>
@ChengyuZhu6 ChengyuZhu6 force-pushed the payload branch 2 times, most recently from b5822db to 518d37c Compare January 26, 2024 05:23
update the e2e tests.

Signed-off-by: ChengyuZhu6 <[email protected]>
@imeoer
Copy link
Collaborator

imeoer commented Jan 26, 2024

Thanks @ChengyuZhu6 for the work and all reviewers @fidencio @wainersm @zvonkok @beraldoleal !

@imeoer imeoer merged commit 918ca3b into containerd:main Jan 26, 2024
16 checks passed
@ChengyuZhu6 ChengyuZhu6 deleted the payload branch January 26, 2024 06:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a daemon-set to ease the deployment of nydus-snapshotter
7 participants