Hardlinks are not (well) supported by image unpacker #282

Closed
vito opened this issue Aug 9, 2018 · 5 comments

vito commented Aug 9, 2018

Hi there! Very excited for Kaniko - I'm modeling Concourse's new Registry Image resource on Kaniko's image extraction process, and noticed one gotcha that affects Kaniko too.

The problem is rooted in the fact that Kaniko extracts image layers in reverse order.

Try having Kaniko fetch vito/broken-base and run git from the resulting rootfs.

Here's a Dockerfile for that:

FROM vito/broken-base
RUN ls -al /usr/libexec/git-core/git /usr/bin/git
RUN git --version

And here's the Dockerfile for vito/broken-base:

FROM alpine
RUN apk --no-cache add git
RUN rm /usr/bin/git && ln -s /usr/libexec/git-core/git /usr/bin/git

I see this output:

INFO[0000] Unpacking filesystem of vito/broken-base...  
2018/08/09 01:18:50 No matching credentials found for index.docker.io, falling back on anonymous
INFO[0000] Mounted directories: [/kaniko /var/run /sys /dev/shm /dev/pts /tmp/garden-init /proc /scratch /tmp/build/e55deab7 /tmp/build/e55deab7/source /tmp/build/e55deab7/image /etc/hosts /etc/resolv.conf] 
INFO[0000] Unpacking layer: 2                           
INFO[0000] Unpacking layer: 1                           
INFO[0000] Not adding /usr because it was added by a prior layer 
INFO[0000] Not adding /usr/bin because it was added by a prior layer 
INFO[0000] Not adding /usr/bin/git because it was added by a prior layer 
INFO[0001] Unpacking layer: 0                           
INFO[0001] Not adding /etc because it was added by a prior layer 
INFO[0001] Not adding /etc/apk because it was added by a prior layer 
INFO[0001] Not adding /etc/apk/protected_paths.d because it was added by a prior layer 
INFO[0001] Not adding /etc/apk/world because it was added by a prior layer 
INFO[0001] Not adding /etc/hosts because it is whitelisted 
INFO[0001] Not adding /etc/ssl because it was added by a prior layer 
INFO[0001] Not adding /etc/ssl/certs because it was added by a prior layer 
INFO[0001] Not adding /lib because it was added by a prior layer 
INFO[0001] Not adding /lib/apk because it was added by a prior layer 
INFO[0001] Not adding /lib/apk/db because it was added by a prior layer 
INFO[0001] Not adding /lib/apk/db/installed because it was added by a prior layer 
INFO[0001] Not adding /lib/apk/db/scripts.tar because it was added by a prior layer 
INFO[0001] Not adding /lib/apk/db/triggers because it was added by a prior layer 
INFO[0001] Not adding /proc because it is whitelisted   
INFO[0001] Not adding /sys because it is whitelisted    
INFO[0001] Not adding /usr because it was added by a prior layer 
INFO[0001] Not adding /usr/bin because it was added by a prior layer 
INFO[0001] Not adding /usr/lib because it was added by a prior layer 
INFO[0001] Not adding /usr/local because it was added by a prior layer 
INFO[0001] Not adding /usr/local/share because it was added by a prior layer 
INFO[0001] Not adding /usr/sbin because it was added by a prior layer 
INFO[0001] Not adding /usr/share because it was added by a prior layer 
INFO[0001] Not adding /var because it was added by a prior layer 
INFO[0001] Not adding /var/run because it is whitelisted 
INFO[0001] Taking snapshot of full filesystem...        
INFO[0014] cmd: /bin/sh                                 
INFO[0014] args: [-c ls -al /usr/bin/git /usr/libexec/git-core/git && git --version] 
lrwxrwxrwx    1 root     root            25 Aug  9 01:18 /usr/bin/git -> /usr/libexec/git-core/git
lrwxrwxrwx    1 root     root            12 Aug  9 01:18 /usr/libexec/git-core/git -> /usr/bin/git
ERRO[0014] exit status 127

So, it has /usr/bin/git -> /usr/libexec/git-core/git, but then it has /usr/libexec/git-core/git pointing back to /usr/bin/git. That's a circular symlink, with no actual git binary anymore.
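To see that failure mode in isolation, here is a minimal Go sketch that recreates the two links from the listing above in a temporary directory and checks whether the git path still resolves. The helper name `cycleResolves` and the scratch directory are made up for illustration; only the two link targets come from the log.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cycleResolves is a hypothetical helper: it recreates the two links from
// the log in a scratch directory and reports whether usr/bin/git can still
// be followed to a real file.
func cycleResolves() (bool, error) {
	root, err := os.MkdirTemp("", "symlink-cycle")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	bin := filepath.Join(root, "usr", "bin")
	core := filepath.Join(root, "usr", "libexec", "git-core")
	for _, dir := range []string{bin, core} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return false, err
		}
	}

	binGit := filepath.Join(bin, "git")
	coreGit := filepath.Join(core, "git")

	// Each link points at the other, so neither leads to a regular file.
	if err := os.Symlink(coreGit, binGit); err != nil {
		return false, err
	}
	if err := os.Symlink(binGit, coreGit); err != nil {
		return false, err
	}

	// os.Stat follows symlinks, so it returns an error when the chain loops.
	_, statErr := os.Stat(binGit)
	return statErr == nil, nil
}

func main() {
	ok, err := cycleResolves()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("git resolves to a real file:", ok)
}
```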

Running this same thing with Docker works fine:

~/w/r/tasks (master) $ docker run -it vito/broken-base git --version
git version 2.18.0

Here's what's happening: the layer that installs git actually has /usr/bin/git as a hardlink to /usr/libexec/git-core/git. A later layer then replaces /usr/bin/git with a symlink to that same target. But when the layers are processed in reverse, a symlink is first created at /usr/bin/git, and then the hardlink entry in the prior layer ends up pointing /usr/libexec/git-core/git back to the already-created symlink.

This "hardlink" entry is actually created as a symlink, as implemented in fs_util.go, which is also kinda weird but I don't think that's too important here.
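The ordering problem can be reproduced with a small in-memory model. This is only a sketch under stated assumptions, not Kaniko's actual code: the types and function names (`node`, `entry`, `extractReverse`, `extractInOrder`, `resolve`) are hypothetical, the two layers are a hand-written guess at what vito/broken-base contains, and the reverse extractor assumes hardlink entries are written out as symlinks, as described above.

```go
package main

import "fmt"

type kind int

const (
	regular kind = iota
	hardlink
	symlink
)

// node is what ends up on the simulated filesystem.
type node struct {
	kind   kind
	target string // only meaningful for symlinks
}

// entry is one tar entry in a layer (hypothetical, simplified).
type entry struct {
	path   string
	kind   kind
	target string // link target for hardlink and symlink entries
}

// exampleLayers is a guess at the relevant entries of the two layers.
func exampleLayers() [][]entry {
	return [][]entry{
		{ // layer that installs git
			{path: "/usr/bin/git", kind: regular},
			{path: "/usr/libexec/git-core/git", kind: hardlink, target: "/usr/bin/git"},
		},
		{ // layer that replaces /usr/bin/git with a symlink
			{path: "/usr/bin/git", kind: symlink, target: "/usr/libexec/git-core/git"},
		},
	}
}

// extractReverse walks layers newest first and skips paths that already
// exist. Assumption: hardlink entries are created as symlinks.
func extractReverse(layers [][]entry) map[string]node {
	fs := map[string]node{}
	for i := len(layers) - 1; i >= 0; i-- {
		for _, e := range layers[i] {
			if _, seen := fs[e.path]; seen {
				continue // "Not adding ... because it was added by a prior layer"
			}
			if e.kind == hardlink {
				fs[e.path] = node{kind: symlink, target: e.target}
				continue
			}
			fs[e.path] = node{kind: e.kind, target: e.target}
		}
	}
	return fs
}

// extractInOrder walks layers oldest first; later entries overwrite earlier
// ones, and a hardlink copies whatever its target is at that moment.
func extractInOrder(layers [][]entry) map[string]node {
	fs := map[string]node{}
	for _, l := range layers {
		for _, e := range l {
			if e.kind == hardlink {
				if src, ok := fs[e.target]; ok {
					fs[e.path] = src
				}
				continue
			}
			fs[e.path] = node{kind: e.kind, target: e.target}
		}
	}
	return fs
}

// resolve follows symlinks and reports whether path reaches a regular file.
func resolve(fs map[string]node, path string) bool {
	for hops := 0; hops < 40; hops++ {
		n, ok := fs[path]
		if !ok {
			return false
		}
		if n.kind != symlink {
			return true
		}
		path = n.target
	}
	return false // too many hops: treat as a cycle
}

func main() {
	layers := exampleLayers()
	fmt.Println("reverse order resolves:", resolve(extractReverse(layers), "/usr/bin/git"))
	fmt.Println("in order resolves:", resolve(extractInOrder(layers), "/usr/bin/git"))
	// prints:
	// reverse order resolves: false
	// in order resolves: true
}
```

In the reverse pass the only regular-file entry for git is the one that gets skipped, which is why nothing real is left behind.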

To be honest this feels like a pretty fundamental problem with the reverse order approach. :/ It's nice that it can do it all in one pass and that it doesn't need to write removed/changed files in earlier layers, but hardlinks kind of throw a wrench into things. I ended up just switching to chronological order in this commit: concourse/registry-image-resource@ecf520b

@priyawadhwa priyawadhwa added the kind/bug Something isn't working label Aug 9, 2018
@priyawadhwa priyawadhwa self-assigned this Aug 9, 2018

dlorenc commented Aug 10, 2018

Thanks for the detailed description @vito! We're thinking this one through to see if there's anything we can do here.

dlorenc commented Aug 10, 2018

Hmm,

So I was able to easily repro this with your broken base. I then took your same initial Dockerfile (the one that installs/symlinks git) and pushed it to my registry:

FROM alpine
RUN apk --no-cache add git
RUN rm /usr/bin/git && ln -s /usr/libexec/git-core/git /usr/bin/git
RUN git --version

And then built the second image, referencing my registry:

FROM $MYREGISTRY
RUN ls -al /usr/libexec/git-core/git /usr/bin/git
RUN git --version

and couldn't repro this:

lrwxrwxrwx    1 root     root            25 Aug 10 15:27 /usr/bin/git -> /usr/libexec/git-core/git
lrwxrwxrwx    1 root     root            25 Aug 10 15:27 /usr/libexec/git-core/git -> /usr/bin/git-receive-pack

Any other info on how that broken image got created in the first place?

vito commented Aug 10, 2018

@dlorenc I think this might be subject to chance. When the layer .tars are created, it's up to the tool writing the tar entries to decide which file is actually recorded as a regular file entry and which entries are recorded as hard-links to the extracted file. From a filesystem perspective they're all just files with the same inode, so there's no way to tell who's "linked" to who (as far as I can tell with e.g. stat and ls).

It looks like, in your case, git-receive-pack ended up being recorded as the "regular file" entry, and /usr/libexec/git-core/git hard-linked to it. So it worked. In my case, I'm guessing the earlier layer ended up with /usr/bin/git as the "regular file" entry that it linked to.

I'm not sure how the layer archives are created and whether it's using e.g. GNU tar or a native Go implementation. Either way this seems like something that could be subject to however the directory tree is traversed while building the archive, which may be non-deterministic.
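One way to check which path a given layer records as the regular file and which as the hard-link is to walk the tar headers. The sketch below uses Go's standard archive/tar package on a tiny in-memory archive; the archive contents and the helper names (`sampleLayer`, `describeEntries`, `describeSampleLayer`) are assumptions for illustration, and a real layer would be read from the image instead.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// sampleLayer builds a made-up layer: one regular file plus a hard-link
// entry that refers to it by name.
func sampleLayer() (*bytes.Buffer, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("fake git binary")

	if err := tw.WriteHeader(&tar.Header{
		Name: "usr/bin/git", Typeflag: tar.TypeReg,
		Mode: 0o755, Size: int64(len(body)),
	}); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	// A hard-link entry carries no data, only the name of the file it links to.
	if err := tw.WriteHeader(&tar.Header{
		Name: "usr/libexec/git-core/git", Typeflag: tar.TypeLink,
		Linkname: "usr/bin/git", Mode: 0o755,
	}); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

// describeEntries lists each tar entry with its type and link target.
func describeEntries(r io.Reader) ([]string, error) {
	var out []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		switch hdr.Typeflag {
		case tar.TypeReg:
			out = append(out, "regular  "+hdr.Name)
		case tar.TypeLink:
			out = append(out, "hardlink "+hdr.Name+" -> "+hdr.Linkname)
		case tar.TypeSymlink:
			out = append(out, "symlink  "+hdr.Name+" -> "+hdr.Linkname)
		default:
			out = append(out, "other    "+hdr.Name)
		}
	}
}

// describeSampleLayer ties the two helpers together.
func describeSampleLayer() ([]string, error) {
	buf, err := sampleLayer()
	if err != nil {
		return nil, err
	}
	return describeEntries(buf)
}

func main() {
	lines, err := describeSampleLayer()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Running the same `describeEntries` over both of our layer tarballs should show whether /usr/bin/git or git-receive-pack was picked as the regular-file entry in each case.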

@priyawadhwa

Hey @vito, we ended up switching to extracting the fs in order in #325, so this should work now!

I'm going to go ahead and close this issue, but please comment or open another if you face any more problems.

vito commented Aug 30, 2018

Makes sense! Thanks for following up.
