Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Kaniko to 0.24.0, use new snapshotMode #286

Merged

Conversation

SaschaSchwarze0
Copy link
Member

Kaniko released v0.24.0 last week. I am updating the cluster build strategy to use the new version. I am also changing the argument set to use --snapshotMode=redo. Kaniko v0.24.0 introduces two new performance-related command line flags that are described here: Kaniko Performance Testing.

We assessed this internally, copying my assessment here:

The problem that the enhancement address go directly against the biggest issue that Kaniko has: Kaniko works by extracting the FROM image specified in the Dockerfile into its (otherwise empty) root file system. Then it does through all the commands in the Dockerfile and performs them. Some of these commands make changes to the file system. In those cases, Kaniko needs to build the appropriate layer with the delta. For ADD and COPY, it is an easy job because Kaniko knows which files it is copying around. The difficulty is RUN. This command can run arbitrary things like installations that add or modify files at any place. To determine changes, Kaniko does file system snapshotting. The comparison of two snapshots allows to see what changed.

The default snapshot mode (full) includes the hash of the file content which is expensive. There already existed the time mode that only checked mtime and is probably very fast, but has a problem in its design (see https://github.com/GoogleContainerTools/kaniko#mtime-and-snapshotting).

The new redo mode tries it more efficiently. It rents ideas from build systems that - to only build deltas - also need to know which files have changed. Specifically, the redo tool is mentioned. The file hash then does not anymore include the file content but not only the mtime.

I think it is obvious that an algorithm that had to read the file content is way more expensive than anything else that can live without it. What is good is that the proposal uses an algorithm that has been in use elsewhere already which makes me give it the trust that it works relyably.

--

The second flag, --use-new-run=true, I find critical. It tries to live without any snapshots. Instead, before a RUN command starts, it writes a file and captures its mtime. Then it runs everything from the RUN command. After that, it iterates all the files and assumes that all files with an mtime after the captured have changed or are new. Conceptually I see there three issues:

--

My conclusion on that is that we are getting a great improvement in the run time here but have to be careful to still create correct images. The existing issues related to this enhancement, GoogleContainerTools/kaniko#1316 and GoogleContainerTools/kaniko#1317, both go in the direction of my assessment as they report issues with the --use-new-run=true flag. And specifically the npm install related issue is completely clear to me because npm hosts compressed tar files and extracts them into the correct node_modules location. And those files have their old mtime from when they were written before getting put into the tarball.

So, at the moment, there is no Kaniko release that contains these new flags. Once there is one, we should pick that up and I recommend to set --snapshotMode=redo in the Kaniko build strategy, but NOT --use-new-run=true. Based on how I understand those fixes, --snapshotMode=redo is the big performance improvement, --use-new-run=true in addition would NOT be another big step forward but only a small one if at all measurable.

In the meantime, there is the v0.24.0 release. I brought up my concerns on use-new-run here but have not received feedback yet.

@SaschaSchwarze0
Copy link
Member Author

@otaviof I included you in the review as you will need to make the same update in your runtime image pr.

@SaschaSchwarze0 SaschaSchwarze0 added the Release 0.1.x Label for release 0.1.x label Jul 6, 2020
@SaschaSchwarze0
Copy link
Member Author

/test 4.4-unit

1 similar comment
@SaschaSchwarze0
Copy link
Member Author

/test 4.4-unit

@zhangtbj
Copy link
Contributor

zhangtbj commented Jul 6, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 6, 2020
@zhangtbj
Copy link
Contributor

zhangtbj commented Jul 6, 2020

Hi @SaschaSchwarze0 ,

There is a conflict, pls help fix

@SaschaSchwarze0
Copy link
Member Author

Okay, one second.

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 6, 2020
@SaschaSchwarze0
Copy link
Member Author

@zhangtbj done.

@zhangtbj
Copy link
Contributor

zhangtbj commented Jul 7, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 7, 2020
@zhangtbj
Copy link
Contributor

zhangtbj commented Jul 7, 2020

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zhangtbj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 7, 2020
@openshift-merge-robot openshift-merge-robot merged commit 9653182 into shipwright-io:master Jul 7, 2020
@SaschaSchwarze0 SaschaSchwarze0 deleted the sascha-kaniko-0.24 branch August 19, 2020 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. Release 0.1.x Label for release 0.1.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants