Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podvm: Use local files for build and reference versions file #1448

Merged

Conversation

tumberino
Copy link
Contributor

This introduces new Make targets in the root Makefile to create the podvm image.

If doing all from scratch make docker-builder docker-binaries docker-image. Else if binaries and builder already exist for current versions.yaml file then make docker-image will be enough.

I haven't update the rhel or centos versions yet

@stevenhorsman
Copy link
Member

Sorry for the basic question - is this supposed to replace the current direct docker approach? If so I think we need to update the podvm README.md doc to reflect this?

@tumberino
Copy link
Contributor Author

Sorry for the basic question - is this supposed to replace the current direct docker approach? If so I think we need to update the podvm README.md doc to reflect this?

It is a helper rather than a complete replacement, but you're right the docs will need updating too.

@wainersm
Copy link
Member

hi @jtumber-ibm , I hope to have some time today or tomorrow to review this. In meanwhile I was trying a different solution to allow caching kata-agent/AA to speed up the CI jobs, I am glad you sent this because I was having a really hard time to solve some cases :)

.dockerignore Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Show resolved Hide resolved
Makefile Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@tumberino tumberino force-pushed the docker-default-podvm-build branch 2 times, most recently from cd26974 to b54c73f Compare October 2, 2023 12:56
@wainersm
Copy link
Member

wainersm commented Oct 4, 2023

@jtumber-ibm how does this PR is related with #1018 ? does this supersedes PR 1018?

@wainersm
Copy link
Member

wainersm commented Oct 4, 2023

@jtumber-ibm still trying to understand your changes, I just glanced at the code. One question that I have is about f binaries and builder already exist for current versions.yaml file then make docker-image will be enough...

Apart from versions.yaml,it seems that build flags and build scripts themselves that would invalidate a cache. For example, suppose we decide to change ttrc to false in https://github.com/confidential-containers/cloud-api-adaptor/blob/main/podvm/Makefile.inc#L149 while keeping the same version of guest components.

Makefile Outdated Show resolved Hide resolved
@tumberino
Copy link
Contributor Author

@jtumber-ibm how does this PR is related with #1018 ? does this supersedes PR 1018?

It does supersede #1018, I'll ask @yoheiueda if it is okay to close #1018

@tumberino
Copy link
Contributor Author

@jtumber-ibm still trying to understand your changes, I just glanced at the code. One question that I have is about f binaries and builder already exist for current versions.yaml file then make docker-image will be enough...

Apart from versions.yaml,it seems that build flags and build scripts themselves that would invalidate a cache. For example, suppose we decide to change ttrc to false in https://github.com/confidential-containers/cloud-api-adaptor/blob/main/podvm/Makefile.inc#L149 while keeping the same version of guest components.

That is true, and a good point. I don't have a great answer for how to solve it though. Do you have a recommendation for how to track those changes / invalidate the cache when the build changes? My thought was to add a build-version entry into the versions.yaml but that isn't easy to track against the actual changes

@katexochen
Copy link
Contributor

katexochen commented Oct 5, 2023

jtumber-ibm still trying to understand your changes, I just glanced at the code. One question that I have is about f binaries and builder already exist for current versions.yaml file then make docker-image will be enough...
Apart from versions.yaml,it seems that build flags and build scripts themselves that would invalidate a cache. For example, suppose we decide to change ttrc to false in https://github.com/confidential-containers/cloud-api-adaptor/blob/main/podvm/Makefile.inc#L149 while keeping the same version of guest components.

That is true, and a good point. I don't have a great answer for how to solve it though. Do you have a recommendation for how to track those changes / invalidate the cache when the build changes? My thought was to add a build-version entry into the versions.yaml but that isn't easy to track against the actual changes

I don't think we should build an entire build system from scratch. If we want such features, we will have to use a real build system like Nix or Bazel.

@wainersm
Copy link
Member

wainersm commented Oct 5, 2023

jtumber-ibm still trying to understand your changes, I just glanced at the code. One question that I have is about f binaries and builder already exist for current versions.yaml file then make docker-image will be enough...
Apart from versions.yaml,it seems that build flags and build scripts themselves that would invalidate a cache. For example, suppose we decide to change ttrc to false in https://github.com/confidential-containers/cloud-api-adaptor/blob/main/podvm/Makefile.inc#L149 while keeping the same version of guest components.

That is true, and a good point. I don't have a great answer for how to solve it though. Do you have a recommendation for how to track those changes / invalidate the cache when the build changes? My thought was to add a build-version entry into the versions.yaml but that isn't easy to track against the actual changes

I don't think we should build an entire build system from scratch. If we want such features, we will have to use a real build system like Nix or Bazel.

Yes, sure, don't want to re-invent the wheels. I am not sure about Nix, but even with bazel we would still have problems because the main concern is the build inside containers.

okay, back to the problem... ignore it? i.e. assume in a rare situation we might need to change build parameters but not versions, then defer the solution?

If not ignore then I can think of either:

  • Add a new property to versions to version the build configuration (e.g. build_version=N) that we can bump in case of we want to force the rebuild for whatever reason.
  • Apart from versions.yaml, compute the build files (Makefiles, etc) in the VERSION_HASH. This is the approach used by kata containers, btw.

Both approach are error prone: one might forget to update the build_version. One might forget to include new build files on the computation of the VERSION_HASH.

BTW, @fidencio has recently added (kata-containers/kata-containers#8100) the infra to allow building kata-agent separated from the entire rootfs and, IIUC, the idea is to cache the kata-agent binary just like the other components. In other words, soon we might be able to consume kata-agent binaries... In Fabiano we trust, In fabiano we rest our hopes!

@wainersm
Copy link
Member

wainersm commented Oct 5, 2023

@jtumber-ibm perhaps one last thing not relate with code properly saying. When we (@bpradipt and I) wrote the first implementation, the idea of pulling the sources from git tree was to enable reproducibility. Switching to copying the local sources to the image we lose reproducibility if I am not mistaken. As a developer I love the simplicity that this solution brings to my life (I can even feel the inner peace on my heart growing...) but I'm not sure how it impacts on peer pods' distributors. Is this (loosing reproducibility) something already discussed on the community? I missed many discussions recently so it might be case....

@katexochen
Copy link
Contributor

katexochen commented Oct 5, 2023

When we (bpradipt and I) wrote the first implementation, the idea of pulling the sources from git tree was to enable reproducibility.

@wainersm IMO pulling code from GitHub has nothing to do with reproducibility. You can totally have reproducible builds from a local checkout. If you pull things from remote, you have to pin everything you pull in (by hash!). We never did that, and the way we pulled things, and still pull things, is not reproducible at all. Some examples from current main:

We don't even pin the base image by hash. The meaning of the tag changes multiple times a week:

We pull CAA from main. If I build two times and you merge a PR in between, I get different results:

ARG CAA_SRC_REF="main"

Kata isn't pinned as well, as is any other of the many tools we are building:

ARG KATA_SRC_BRANCH="CCv0"

Currently, the only reproducible bit in our builds yq (and the stuff we already pinned in versions.yaml):

ARG YQ_VERSION="v4.35.1"
ARG YQ_CHECKSUM="sha256:bd695a6513f1196aeda17b174a15e9c351843fb1cef5f9be0af170f2dd744f08"

@katexochen
Copy link
Contributor

katexochen commented Oct 5, 2023

I am not sure about Nix, but even with bazel we would still have problems because the main concern is the build inside containers.

I don't get that point. You wouldn't use those tools within a container, they come with their own sandbox.

Maybe we continue the discussion in the issue. I was trying to bring up reproducible builds some time ago, but there wasn't much feedback #1035 (comment) (beside the excellent work of @jtumber-ibm to at least implement the versions.yaml thing)

@tumberino tumberino force-pushed the docker-default-podvm-build branch from b54c73f to 817b5ff Compare October 10, 2023 12:47
Copy link
Contributor

@katexochen katexochen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Local build was successful, full build took 38 min.

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not had a chance to look through all the code, but I've run it locally with ARCH=s390x make podvm-builder podvm-binaries podvm-image and it created an image at least. I hit one problem on the way and I think we might need to update the documentation of how to build the podvm based on this as an easier way to run it locally. I hope to review it more fully next week. Thanks and sorry for the delays in the review


ifneq (4, $(YQ_MAJOR_VERSION))
$(error "yq major version should be 4, is: $(YQ_MAJOR_VERSION)")
MINIMUM_YQ_MAJOR_VERSION := 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an minimum minor version we need as well? I tried this with 4.2 (as I'm lazy and it's the example in the doc: https://github.com/mikefarah/yq#wget) and got:

~/go/src/github.com/confidential-containers/cloud-api-adaptor# make docker-builder docker-binaries docker-image
Error: unknown command ".cloudimg.ubuntu.focal.amd64.url | select(. != null)" for "yq"
Run 'yq --help' for usage.

This went away when I used v4.35.1 like you reference above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of those problems we could solve with a reproducible dev environment provided by Nix...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mikefarah/yq#1158 (comment)

We could specify eval (e) in our commands instead to ensure we are compatible with all the 4.x minor versions

@surajssd
Copy link
Member

What is blocking this PR?

@katexochen
Copy link
Contributor

What is blocking this PR?

It needs another review (and a rebase).

@stevenhorsman
Copy link
Member

What is blocking this PR?

I think JT needs to rebase once they are back from vacation and it needs another review, which I intend to do once it's rebased to test the images that are created, and run it on s390x and x86, rather than just built a single one to test out the process. I think some doc updates might be important as well as I mention in my comment last week.

Requires the podvm directory for podvm build in a container.
Also requires the .git folder for calculating the commit

Signed-off-by: James Tumber <[email protected]>
Copy the other third-party references for kata-containers too

Signed-off-by: James Tumber <[email protected]>
Move PHONY force into Makefile.inc and add always target
Let FORCE=true rebuild external-src binaries.
Use order-only dependancies to avoid rebuilding binaries that already exist

Signed-off-by: James Tumber <[email protected]>
Removes the remote caa clone and adds local copy of the podvm directory

Signed-off-by: James Tumber <[email protected]>
Use the local files as source of truth for external references
and let the makefile targets handle cloning the git repos

Signed-off-by: James Tumber <[email protected]>
Used for podvm builds

Signed-off-by: James Tumber <[email protected]>
With local podvm builds we no longer need this reference

Signed-off-by: James Tumber <[email protected]>
Use the Add Dockerfile feature for remote sources instead of curl
Remove caa & kata-containers repo cloning

Signed-off-by: James Tumber <[email protected]>
Add targets for creating the podvm images in docker as default
Use the versions.yaml file hash as the image tags

` make docker-builder docker-binaries docker-image` to build everything

Signed-off-by: James Tumber <[email protected]>
Remove caa/kata references from Dockerfile for centos

Signed-off-by: James Tumber <[email protected]>
Remove caa/kata references from Dockerfiles for rhel

Signed-off-by: James Tumber <[email protected]>
Only use versions parsed in as arguments

Signed-off-by: James Tumber <[email protected]>
Single point of reference for yq version/checksum in Makefile.defaults

Signed-off-by: James Tumber <[email protected]>
Added quickstart guide for building podvm

Signed-off-by: James Tumber <[email protected]>
@tumberino tumberino force-pushed the docker-default-podvm-build branch from 817b5ff to 3aa95ca Compare October 26, 2023 12:36
@tumberino
Copy link
Contributor Author

@surajssd @stevenhorsman rebased and doc update added

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recreated an s390x podvm based on the documentation and tried it out locally and it seems to pass the tests, so LGTM. Thanks for the updates!

@tumberino tumberino merged commit b2b64c3 into confidential-containers:main Oct 27, 2023
@tumberino tumberino deleted the docker-default-podvm-build branch October 27, 2023 09:59
tumberino added a commit to tumberino/cloud-api-adaptor that referenced this pull request Oct 27, 2023
After the changes in confidential-containers#1448 the root Makefile now references Makefile.defaults, this file and the yq binary need to be included for the build

Signed-off-by: James Tumber <[email protected]>
huoqifeng pushed a commit that referenced this pull request Oct 30, 2023
After the changes in #1448 the root Makefile now references Makefile.defaults, this file and the yq binary need to be included for the build

Signed-off-by: James Tumber <[email protected]>
lysliu pushed a commit to lysliu/cloud-api-adaptor that referenced this pull request Nov 9, 2023
After the changes in confidential-containers#1448 the root Makefile now references Makefile.defaults, this file and the yq binary need to be included for the build

Signed-off-by: James Tumber <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants