Building with --use-new-run=true generates an incomplete image #1317

Closed
oohnoitz opened this issue Jun 14, 2020 · 5 comments · Fixed by #1379 or #1383
Labels
area/performance issues related to kaniko performance enhancement

Comments

@oohnoitz
Actual behavior
The RUN layers appear to be empty. I've tried a few different scenarios, and it seems to affect all RUN layers after the first RUN command.

❯ docker image history 172.17.0.2:5000/kaniko-test:run-1
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
622d50f678d4        292 years ago       COPY --from=builder /opt/app/_build/prod/rel…   39.9MB
<missing>           292 years ago       WORKDIR /app                                    0B
<missing>           292 years ago       RUN apt-get update &&   apt-get install -y -…   0B
<missing>           292 years ago       RUN apt-get -qq install -y locales locales-a…   0B
<missing>           5 days ago          /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>           5 days ago          /bin/sh -c #(nop) ADD file:4d35f6c8bbbe6801c…   69.2MB

I also ran a test without a multi-stage build, and the results (https://gist.github.com/oohnoitz/6d46b79619767f0c89bdee89be49689b) also show missing RUN layers.

Expected behavior

RUN layers aren't empty.

❯ docker image history 172.17.0.2:5000/kaniko-test:snapshot-1
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
d12179c82454        292 years ago       COPY --from=builder /opt/app/_build/prod/rel…   39.9MB
<missing>           292 years ago       WORKDIR /app                                    0B
<missing>           292 years ago       RUN apt-get update &&   apt-get install -y -…   11.9MB
<missing>           292 years ago       RUN seq 1 8 | xargs -I{} mkdir -p /usr/share…   0B
<missing>           292 years ago       RUN apt-get -qq install -y locales locales-a…   238MB
<missing>           292 years ago       RUN apt-get -qq update                          17.4MB
<missing>           5 days ago          /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>           5 days ago          /bin/sh -c #(nop) ADD file:4d35f6c8bbbe6801c…   69.2MB

To Reproduce
Steps to reproduce the behavior:

  1. Clone https://github.com/oohnoitz/kaniko-test
  2. Build with kaniko, passing only the --use-new-run=true flag

Additional Information

  • Dockerfile

    # Set the Docker image you want to base your image off.
    FROM elixir:1.10.2 as builder
    
    ENV MIX_ENV="prod" \
      PORT="5000"
    
    # Add nodejs
    RUN curl -sL https://deb.nodesource.com/setup_13.x | bash -
    
    # Install other stable dependencies that don't change often
    RUN apt-get update && \
      apt-get install -y --no-install-recommends \
      apt-utils nodejs postgresql-client && \
      rm -rf /var/lib/apt/lists/*
    
    WORKDIR /opt/app
    
    # install hex + rebar
    RUN mix local.hex --force && \
      mix local.rebar --force
    
    # install mix dependencies
    COPY mix.exs mix.lock ./
    COPY config config
    RUN mix deps.get
    RUN mix deps.compile
    
    # build assets
    COPY assets assets
    RUN cd assets && npm install && npm run deploy
    RUN mix phx.digest
    
    # build project
    COPY priv priv
    COPY lib lib
    RUN mix compile
    
    # build release
    COPY rel rel
    RUN mix release
    
    FROM debian:buster-slim
    
    RUN apt-get -qq update
    RUN apt-get -qq install -y locales locales-all
    
    # Set LOCALE to UTF8
    RUN locale-gen en_US.UTF-8
    ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'
    
    ENV MIX_ENV="prod" \
      PORT="5000"
    
    # Exposes this port from the docker container to the host machine
    EXPOSE 5000
    
    # Because these dirs were stripped from the slim package and
    # caused issues installing postgres-client
    RUN seq 1 8 | xargs -I{} mkdir -p /usr/share/man/man{}
    
    # Install other stable dependencies that don't change often
    RUN apt-get update && \
      apt-get install -y --no-install-recommends \
      postgresql-client && \
      rm -rf /var/lib/apt/lists/*
    
    WORKDIR /app
    COPY --from=builder /opt/app/_build/prod/rel/kaniko_test ./
    
    # The command to run when this image starts up
    CMD ["./bin/kaniko_test", "start"]
    
  • Build Context
    See repository link provided above in the reproduction step.

  • Kaniko Image (fully qualified with digest)

    gcr.io/kaniko-project/executor:pref
    gcr.io/kaniko-project/executor@sha256:87706fb134bff87dd3ae062948ed32a4ae614397ea48c3706b0aa7dda30dd492
    

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@tejal29 tejal29 added the area/performance issues related to kaniko performance enhancement label Jun 22, 2020
@tejal29
Contributor

tejal29 commented Jun 22, 2020

Thanks @oohnoitz. This is high on my priority list, and I will take a look this Friday.

@tejal29
Contributor

tejal29 commented Aug 12, 2020

@oohnoitz Thanks for the project!

docker image history 172.17.0.2:5000/kaniko-test:snapshot-1

@oohnoitz I had a very simple bug in the code. I tried the image with the fix in #1379, and you can now see non-empty layers.

Can you please give this a try:

gcr.io/kaniko-project/executor:latest-1317
gcr.io/kaniko-project/executor:debug-1317

@oohnoitz
Author

oohnoitz commented Aug 13, 2020

@tejal29 I think there might be another issue with it when using multi-stage builds, which results in some files missing from the final image. Specifically, the sizes of the RUN layers in the history output don't match the sizes reported when building without the --use-new-run=true flag.

(pasting the output again for reference)

d12179c82454        292 years ago       COPY --from=builder /opt/app/_build/prod/rel…   39.9MB
<missing>           292 years ago       WORKDIR /app                                    0B
<missing>           292 years ago       RUN apt-get update &&   apt-get install -y -…   11.9MB
<missing>           292 years ago       RUN seq 1 8 | xargs -I{} mkdir -p /usr/share…   0B
<missing>           292 years ago       RUN apt-get -qq install -y locales locales-a…   238MB
<missing>           292 years ago       RUN apt-get -qq update                          17.4MB
<missing>           5 days ago          /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>           5 days ago          /bin/sh -c #(nop) ADD file:4d35f6c8bbbe6801c…   69.2MB

Due to the mismatch, the image is still missing some dependencies that should have been installed by the RUN apt-get command.

@tejal29
Contributor

tejal29 commented Aug 13, 2020

Thanks! Sorry for missing the size discrepancy earlier.
In #1305, @sachaos correctly identified that relying only on ModTime could be one of the causes of this discrepancy.
In #1383, I have tried a very naive but effective approach.
During performance testing, we saw that creating hashes was one of the major bottlenecks as the number of files increased.
To provide the same performance gain, I have got rid of hashing completely and rely on plain old file info stats taken before and after the run command.
We compare the following parameters:

  1. File size
  2. File modification time
  3. File mode bits
  4. uid and gid

This is essentially what RedoHasher does; however, it computes a hash for every file in a layer, and that hash gets re-used next time.

In this naive approach, we compute the stats before and after every run. This could be repetitive work if there are many run commands.
We do some optimization by adding layerFileMap instead of layerHashCache (https://github.com/GoogleContainerTools/kaniko/blob/master/pkg/snapshot/layered_map.go#L35) and storing this information for subsequent layers.
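
For readers who want to see the idea in isolation, here is a minimal sketch of the before/after stat comparison described above. It is written in Python purely for illustration and is not kaniko's code (kaniko is written in Go); `FileStat`, `snapshot`, `changed_paths`, and `demo` are hypothetical names, and the `chmod` plus file creation only stand in for a RUN command. It assumes a POSIX filesystem. The demo also illustrates why ModTime alone may not be enough: a `chmod` changes the mode bits without touching the modification time.

```python
import os
import tempfile
from typing import Dict, NamedTuple, Set


class FileStat(NamedTuple):
    """The attributes compared before and after a command (hypothetical type)."""
    size: int
    mtime_ns: int
    mode: int
    uid: int
    gid: int


def snapshot(root: str) -> Dict[str, FileStat]:
    """Record size, mtime, mode, uid and gid for every entry below root."""
    stats: Dict[str, FileStat] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: describe symlinks themselves, not their targets
            stats[path] = FileStat(
                st.st_size, st.st_mtime_ns, st.st_mode, st.st_uid, st.st_gid
            )
    return stats


def changed_paths(before: Dict[str, FileStat], after: Dict[str, FileStat]) -> Set[str]:
    """Return paths that are new, or whose recorded attributes differ."""
    return {path for path, stat in after.items() if before.get(path) != stat}


def demo() -> Set[str]:
    """Simulate one 'RUN' step and report which files it touched."""
    with tempfile.TemporaryDirectory() as root:
        kept = os.path.join(root, "kept.txt")
        tweaked = os.path.join(root, "tweaked.sh")
        for path in (kept, tweaked):
            with open(path, "w") as f:
                f.write("hello\n")
        os.chmod(tweaked, 0o644)

        before = snapshot(root)

        # Stand-in for a RUN command: one mode change, one new file.
        os.chmod(tweaked, 0o755)  # mode bits change, modification time does not
        with open(os.path.join(root, "added.txt"), "w") as f:
            f.write("new\n")

        after = snapshot(root)
        return {os.path.relpath(p, root) for p in changed_paths(before, after)}


if __name__ == "__main__":
    print(sorted(demo()))
```

In this sketch, `kept.txt` is not reported because none of its recorded attributes changed, while `tweaked.sh` is reported only because the mode bits are part of the comparison. Deleted files are not handled here; a real snapshotter would also need to record removals.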

@tejal29
Contributor

tejal29 commented Aug 13, 2020

I have published updated images

gcr.io/kaniko-project/executor:latest-1317
gcr.io/kaniko-project/executor:debug-1317

And the resulting image built with this executor is:

docker run -it --entrypoint /busybox/sh -v /usr/local/google/home/tejaldesai/.config/gcloud:/root/.config/gcloud -v /usr/local/google/home/tejaldesai/workspace/example/kaniko-test:/workspace  gcr.io/kaniko-project/executor:debug-1317

/kaniko/executor -f Dockerfile --context=dir://workspace --destination=gcr.io/tejal-test/kaniko-test-1317 --use-new-run=true

docker pull  gcr.io/tejal-test/kaniko-test-1317 &&  docker history gcr.io/tejal-test/kaniko-test-1317
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
2a62ec2c15d4        292 years ago       COPY --from=builder /opt/app/_build/prod/rel…   39.9MB              
<missing>           292 years ago       WORKDIR /app                                    0B                  
<missing>           292 years ago       RUN apt-get update &&   apt-get install -y -…   11.9MB              
<missing>           292 years ago       RUN seq 1 8 | xargs -I{} mkdir -p /usr/share…   0B                  
<missing>           292 years ago       RUN apt-get -qq install -y locales locales-a…   238MB               
<missing>           292 years ago       RUN apt-get -qq update                          17.5MB              
<missing>           9 days ago          /bin/sh -c #(nop)  CMD ["bash"]                 0B                  
<missing>           9 days ago          /bin/sh -c #(nop) ADD file:3af3091e7d2bb40bc…   69.2MB              
