Fix handling of the volume directive #334
Conversation
Hi @peter-evans. Thanks for your PR. I'm waiting for a GoogleContainerTools member to verify that this patch is reasonable to test. If it is, they should reply with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The integration test for `Dockerfile_test_volume` might fail because of this offset. I'll try and run the integration tests locally to fix if necessary.
Thanks for contributing! It seems like there is a bug with the VOLUME command, but I'm not sure if completely removing it from the snapshot will fix the error.
The tests aren't failing right now because snapshotting is still happening. You'll need to return an empty string array here to indicate that the filesystem shouldn't be snapshotted.
With just that change the test should fail because of the offset, but let's verify that before moving forward.
If removing the snapshot completely causes issues for subsequent layers, an alternative might be to set a property on the snapshot layer indicating that it shouldn't be written to the container.
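If it helps, the nil-versus-empty distinction implied here can be sketched as follows (the helper name and convention below are my assumptions for illustration, not kaniko's actual API):

```go
package main

import "fmt"

// describeSnapshot is a hypothetical helper illustrating the assumed
// convention: a nil slice from FilesToSnapshot() means "snapshot the whole
// filesystem", an empty slice means "snapshot nothing", and a non-empty
// slice means "snapshot only these files".
func describeSnapshot(files []string) string {
	if files == nil {
		return "full filesystem snapshot"
	}
	if len(files) == 0 {
		return "no snapshot"
	}
	return fmt.Sprintf("snapshot %d specific files", len(files))
}

func main() {
	fmt.Println(describeSnapshot(nil))
	fmt.Println(describeSnapshot([]string{}))
	fmt.Println(describeSnapshot([]string{"/foo"}))
}
```

Under this convention, having the VOLUME command return `[]string{}` rather than nil would stop the full-filesystem snapshot without changing the command interface.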
Yup, as expected the layers test failed:
But the filesystem test also failed:
Meaning that I think this will require further exploration into how Docker handles this. One theory could be that Docker treats snapshotting of volumes differently.
Sure. I'll do some further investigation.
I can't find any official documentation to back this up, but this is my theory of how Docker treats volumes. The Dockerfile directives seem to be split into two categories. Below is an example to demonstrate, using the following Dockerfile:

```dockerfile
FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /volume1
RUN mkdir /foo1
VOLUME /volume2
VOLUME /volume3
RUN echo "foo2"
VOLUME /volume4
RUN mkdir /foo3
VOLUME /volume5
```
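A minimal sketch of that two-category theory (the categorization map below is my guess for illustration; it isn't documented Docker behavior):

```go
package main

import "fmt"

// changesFilesystem is an assumed split of Dockerfile directives into the
// two categories described above: directives that modify the filesystem
// (and therefore produce a layer with content) versus directives that only
// record image metadata.
var changesFilesystem = map[string]bool{
	"run":    true,
	"copy":   true,
	"add":    true,
	"volume": false,
	"env":    false,
	"expose": false,
}

// producesContentLayer reports whether a directive is expected to produce
// a layer with filesystem content under this theory.
func producesContentLayer(directive string) bool {
	return changesFilesystem[directive]
}

func main() {
	for _, d := range []string{"run", "volume", "copy", "env"} {
		fmt.Printf("%-6s -> content layer: %v\n", d, producesContentLayer(d))
	}
}
```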
Hey @peter-evans, thanks for investigating this! Your findings match what I've found testing locally as well, so I think we can go ahead with these assumptions for now. Right now, kaniko snapshots in two ways: (1) snapshotting the entire filesystem, and (2) snapshotting specific files for certain commands.

The code for this is here. In the case where VOLUME is followed by one of the commands in (2), you'd need some way of remembering which files were created by VOLUME and appending them to the list of files to be snapshotted here. Right now, we maintain a whitelist of directories created by volumes, which are added here. We could repurpose this to store volumes that haven't been snapshotted and add these files to the snapshot list. This, along with a few more integration tests (you'd just have to add a couple of test Dockerfiles in the dockerfiles directory), should be good for now. And again, thanks so much for looking into this!
Another (probably better) option is to remove the volumeWhitelist and instead track volumes in the build loop, something like this:

```go
// This should live outside the command execution loop
var volumes []string
...
// This check should happen after a command has been executed
if dockerCommand.Name() == "volume" {
	volumes = append(volumes, snapshotFiles...)
	continue
}
...
if snapshotFiles != nil {
	if len(snapshotFiles) > 0 {
		snapshotFiles = append(snapshotFiles, volumes...)
		volumes = []string{}
	}
	contents, err = snapshotter.TakeSnapshot(snapshotFiles)
} else {
	contents, err = snapshotter.TakeSnapshotFS()
	volumes = []string{}
}
```

Then, before snapshotting specific files, you can append the contents of the volumes list.
@priyawadhwa Thanks for the pointers. I'll see what I can do.
Force-pushed from 1c1d3c9 to 5cac31e
pkg/executor/build.go
Outdated
```diff
@@ -167,6 +168,10 @@ func (s *stageBuilder) build(opts *config.KanikoOptions) error {
 		return err
 	}
 	files := command.FilesToSnapshot()
+	if command.Name() == "volume" {
```
I think `cmd.Name()` should work here, that way you won't need to add the function to every command. Let's also add `volume` to constants.go.
Fixed. Thanks!
```go
// AddVolumePathToWhitelist adds the given path to the whitelist with
// PrefixMatchOnly set to true. Snapshotting will ignore paths prefixed
// with the volume, but the volume itself will not be ignored.
func AddVolumePathToWhitelist(path string) error {
```
In what case would we need to whitelist volumes? I'm not totally sure we need to whitelist them at all.
Yes, volumes do need to be whitelisted. This is because any file-system changes inside a volume should not be added to the snapshot. Consider this example:

```dockerfile
FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /foo
RUN echo "hello world" > /foo/hello
```

After the volume `/foo` is created, any file-system changes within that volume should not be in the snapshot. The file `hello` will not be in the layer created at the `RUN` directive. However, the directory `foo` itself should be included in the snapshot. Due to this complication I felt that the best way to handle it was by being able to flag whitelist entries as `PrefixMatchOnly=true`. This only allows matches where the whitelisted directory is a prefix of the path being checked; if it's an exact match it will not be whitelisted. This allows the volume directories to be snapshotted, but anything inside them to be whitelisted.
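To illustrate, here is a rough sketch of the prefix-only matching rule (the `WhitelistEntry` type and `Ignores` method are hypothetical names for illustration, not the actual kaniko implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// WhitelistEntry is a hypothetical whitelist entry. With PrefixMatchOnly
// set, only paths strictly inside the whitelisted directory are ignored;
// the directory itself is still eligible for snapshotting.
type WhitelistEntry struct {
	Path            string
	PrefixMatchOnly bool
}

// Ignores reports whether a path should be excluded from the snapshot.
func (w WhitelistEntry) Ignores(path string) bool {
	inside := strings.HasPrefix(path, w.Path+"/")
	if w.PrefixMatchOnly {
		return inside // an exact match is NOT ignored
	}
	return inside || path == w.Path
}

func main() {
	vol := WhitelistEntry{Path: "/foo", PrefixMatchOnly: true}
	fmt.Println(vol.Ignores("/foo"))       // false: the volume dir itself is snapshotted
	fmt.Println(vol.Ignores("/foo/hello")) // true: contents of the volume are ignored
}
```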
After I wrote the above comment I went back to double-check. It turns out it's more complicated still. See the following example built with Docker and the container-diff analysis.

```dockerfile
FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /foo
RUN echo "hello world" > /foo/hello
COPY run.sh /foo/run.sh
VOLUME /bar
COPY run.sh /bar/run.sh
VOLUME /baz
ADD run.sh /baz/run.sh
WORKDIR /baz/bat
RUN echo "hello world" > /bar/hello
RUN echo "hello world" > /baz/hello
RUN echo "hello world" > /tmp/hello
```
```
Analysis for kaniko-test Layer 1:
FILE          SIZE
/foo          0

Analysis for kaniko-test Layer 2:
FILE          SIZE
/foo          24B
/foo/run.sh   24B

Analysis for kaniko-test Layer 3:
FILE          SIZE
/bar          24B
/bar/run.sh   24B

Analysis for kaniko-test Layer 4:
FILE          SIZE
/baz          24B
/baz/run.sh   24B

Analysis for kaniko-test Layer 5:
FILE          SIZE
/baz          0
/baz/bat      0

Analysis for kaniko-test Layer 6:
FILE          SIZE
/tmp          12B
/tmp/hello    12B
```
File-system changes within volumes are whitelisted during system-wide snapshots. None of the `hello` files created within volumes appear in the layers. The only `hello` file that does appear is contained in `/tmp`, which isn't a volume. However, directives that specify files (ADD/COPY/WORKDIR) override the whitelist.

At the moment, `TakeSnapshot(files)` is checking the whitelist. I will try removing it and improve the volume integration test to cover these cases.
I removed checking the whitelist for `TakeSnapshot(files)` and it now works as expected. I added to `Dockerfile_test_volume_2` to provide coverage of these specific file directive cases.
Force-pushed from 5cac31e to ab4b65e
```diff
 if err != nil {
 	return err
 }

 // Only create and snapshot the dir if it didn't exist already
 if _, err := os.Stat(volume); os.IsNotExist(err) {
 	logrus.Infof("Creating directory %s", volume)
-	v.snapshotFiles = []string{volume}
+	v.snapshotFiles = append(v.snapshotFiles, volume)
```
Just a comment about this change. This is a bug I found where it is not respecting multiple volumes created by the same `VOLUME` directive. The `Dockerfile_test_volume` test did not catch it because `tmp` is already a directory and so `v.snapshotFiles` did not get overridden. I added an extra volume for the directive to create to provide test coverage:

```dockerfile
VOLUME /foo/bar /tmp /qux/quux
```
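The difference between the two lines can be sketched in isolation (the loop shape and the `exists` check below are assumptions based on the diff above, not the exact kaniko code):

```go
package main

import "fmt"

// collectSnapshotFiles mimics the assumed shape of the VOLUME command's
// execution loop. Appending collects every newly created volume directory;
// the buggy assignment (snapshotFiles = []string{volume}) would instead
// keep only the last volume that didn't already exist.
func collectSnapshotFiles(volumes []string, exists func(string) bool) []string {
	var snapshotFiles []string
	for _, volume := range volumes {
		// Only create and snapshot the dir if it didn't exist already.
		if !exists(volume) {
			snapshotFiles = append(snapshotFiles, volume)
		}
	}
	return snapshotFiles
}

func main() {
	// /tmp already exists in the base image, so only the other two
	// volume directories need to be created and snapshotted.
	exists := func(path string) bool { return path == "/tmp" }
	fmt.Println(collectSnapshotFiles([]string{"/foo/bar", "/tmp", "/qux/quux"}, exists))
}
```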
Force-pushed from ab4b65e to 046b3ac
@priyawadhwa I can see that the kokoro run failed. Any way I can see the details for that?
@peter-evans looks like the number of layers was incorrect:
@priyawadhwa I just spun up a new VM in GCP (Ubuntu 16.04.5 LTS) and ran the integration tests again, and they all pass. It would be great if you could let me know some details about kokoro's build environment, such as distro, Docker version, etc.
I also checked the layers in
Force-pushed from 046b3ac to b1e28dd
I rebased onto master and ran the integration tests again. All pass.
Cool, looks like all tests are passing now, just had one question.
```diff
 	return false, fmt.Errorf("Error checking if parent dir %s can be snapshotted: %s", file, err)
 }
-if !shouldSnapshot {
+if val, ok := snapshottedFiles[file]; ok && val {
```
Why did you remove shouldSnapshot() and isBuildFile()? I think those functions are necessary so that kaniko is able to build itself.
For the specific files cases (ADD, COPY, WORKDIR) that call `snapshotFiles()`, it is correct that they are added to the snapshot even if one of their parent directories is a volume. That is the main reason I removed checking the whitelist, which is what `shouldSnapshot()` was doing. Now that we aren't checking the whitelist there is no need for `isBuildFile()` either. That function's job was to make sure that build files that were whitelisted were added to the snapshot anyway.

The key thing to consider about this change is: is it OK to not check the whitelist at all for the specific files cases, directives ADD, COPY and WORKDIR? (VOLUME is included too, but only if it's snapshotted with another specific file case and not a system-wide snapshot.) Up to now, kaniko has prevented adding whitelisted files and directories with those directives. It doesn't seem to be necessary to me. The main purpose of the whitelist seems to be to ignore files and directories when there is no way to know if the file was mounted or came from the base image. If a Dockerfile is explicitly trying to add files and directories with ADD, COPY and WORKDIR, I don't think there is a reason to prevent that. Let me know if you think I've missed something!
@peter-evans that makes sense, thanks for explaining!
LGTM, thanks so much for looking into this and fixing this bug!
This is an attempt to fix #313 by not adding the created directories to the `VOLUME` command snapshot. Adding to the snapshot creates a layer in the container that doesn't exist when building with Docker.