Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeated builds using cache produce broken images #1162

Closed
swist opened this issue Mar 27, 2020 · 38 comments · Fixed by #1214
Closed

Repeated builds using cache produce broken images #1162

swist opened this issue Mar 27, 2020 · 38 comments · Fixed by #1214
Assignees
Labels
area/caching For all bugs related to cache issues

Comments

@swist
Copy link

swist commented Mar 27, 2020

Actual behavior
Hey we’re having some problems with kaniko 0.19 and python. We have a dockerfile that looks vaguely like this:

FROM python:3.7.4-slim AS service-base
RUN apt-get update \
  && apt-get --assume-yes install curl gpg  libcurl4-openssl-dev libssl-dev gcc git build-essential netcat libsnappy-dev

RUN pip install poetry=="1.0.2" 
COPY ./_packages /api/_packages
COPY ./config /config

WORKDIR /service/cat
COPY . /service/cat
RUN rm -rf _packages
RUN rm -rf config

RUN \
  poetry config virtualenvs.create false && \
  poetry install

For reasons independent of us, we can’t actually re-organize the sourcecode to remove the RUN rm lines.
What we’re seeing is that if our cache ECR repo is empty, then the image is fine.
We then build an image using the cache, but change one of later layers (for example adding a file to /service/cat)
The build then completes, but upon pulling the image from ECR to a kubelet, we see:
Failed to pull image

"{repo_url}": rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): file exists  

Expected behavior
Subsequent builds using cache should produce an image that can be ran on kubernetes

To Reproduce
See above

Additional Information

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@gilbsgilbs
Copy link
Contributor

gilbsgilbs commented Mar 27, 2020

I can confirm that this issue also happens randomly to me when caching is enabled. It happens in non-python project as well and the only workaround I found so far is to remove every cache image and re-build the image. The error message when pulling the image is different on OSX and on Linux:

  • Linux: ApplyLayer duplicates of file paths not supported stdout: {"layerSize":283880101}
  • OSX (described by OP): ERROR: failed to register layer: Error processing tar file(exit status 1): failed to mknod("/etc/alternatives/pager.1.gz", S_IFCHR, 0): file exists (file may vary)

I think it started occurring in v0.17.0 for me, but can't tell for sure.

@tejal29
Copy link
Contributor

tejal29 commented Mar 27, 2020

Thanks @gilbsgilbs I will take a look into this.

@tejal29 tejal29 added the area/caching For all bugs related to cache issues label Mar 27, 2020
@tejal29 tejal29 added this to the Release v1.0.0 milestone Mar 27, 2020
@hochreitercorpuls
Copy link

seems related/equal to #1158 -
we are also experiencing this issue and we are building our containers like this:

FROM continuumio/miniconda3:latest

COPY ./environment.yml /app/environment.yml
RUN conda env update -n base -f /app/environment.yml && conda clean --all -y -q && rm -rf /opt/conda/pkgs

COPY ./ourproject.a /app/ourproject.a
COPY ./ourproject.b /app/ourproject.b
COPY ./ourproject.c /app/ourproject.c
COPY ./ourproject.d /app/ourproject.d
COPY ./ourproject.e /app/ourproject.e
COPY ./ourproject.f /app/ourproject.f

ARG VERSION=docker.dev
ENV DOCKER_BUILD_VERSION $VERSION

RUN SETUPTOOLS_SCM_PRETEND_VERSION=$DOCKER_BUILD_VERSION pip install -e /app/ourproject.a && SETUPTOOLS_SCM_PRETEND_VERSION=$DOCKER_BUILD_VERSION pip install -e /app/ourproject.b -e /app/ourproject.c -e /app/ourproject.d -e /app/ourproject.e -e /app/ourproject.f

WORKDIR /work

our kaniko is invoked like this in our .gitlab-ci.yml (basically like it is recommended by gitlab)

.kaniko_job_template: &kaniko_job
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  timeout: 1h

setup:anaconda docker image:
  stage: setup
  <<: *kaniko_job
  script:
    - 'echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json'
    - '/kaniko/warmer --cache-dir=/cache --image=continuumio/miniconda3:latest --verbosity=debug || true'
    - '/kaniko/executor version'
    - '/kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/dockerfiles/conda/Dockerfile --cache=true --cache-dir=/cache --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG-latest --verbosity=debug --cleanup'
  • We use the warmer to prepopulate the cache with the base image. Which works fine.
  • We used it to prepopulate the cache with an earlier version of the to be built image too - right now it is disabled due to previous debugging efforts.
  • What we experience with cache being enabled is, that the first build (without any existant cache) works perfectly fine.
  • Also subsequent builds (with cache enabled) work just as expected, as long as those do not contain any changes within the code - if it's rather adding a tag to an existant image instead of building a new one.
  • Any subsequent run with a new commit containing code changes in any of the copied and pip-installed project parts leads to a corrupted image with its size doubled due to a corrupted layer (excluding base image size).
  • Trying to download the "additional" layer from cache repository registry results in an error just like @gilbsgilbs metioned. failed to register layer: Error processing tar file(exit status 1): failed to mknod("/opt/conda/conda-meta/certifi-2019.11.28-py37_0.json", S_IFCHR, 0): file exists

@swist
Copy link
Author

swist commented Mar 28, 2020

We're also seeing that corrupted images increase in size by O(duplicate of one of the layers).

Have you managed to work out which docker command the additional layer corresponds to? We have suspected that it has to do with layers that don't change the image size (run rm or run mv ) but haven't managed to isolate the issue yet.

Re version: We have used 0.16 briefly, but haven't noticed this issue until 0.17 and up (we upgraded due to other bugs). We assumed that this particular issue is due to our dockerfiles being a bit "loose" (haven't linted them) so we just tried to upgrade our way out of it, but now we're close to target date for our project and it's the last blocker.

Also for context.
It's not entirely clear to me how to cause this to fail reliably.
I've noticed that given a good image and its layers in cache, sometimes adding/removing a file causes the subsequent one to build or not. Tag/envvar changes don't cause problems as @hochreitercorpuls
pointed out.

This is how we invoke kaniko:


/kaniko/executor "--context" "dir:///garden-build/catproj/" "--dockerfile" "cat.Dockerfile" "--destination" "myaccount.dkr.ecr.eu-west-2.amazonaws.com/garden/cat-27f4aa058f" "--cache=true" "--build-arg" "BASE_IMAGE=myaccount.dkr.ecr.eu-west-2.amazonaws.com/base-image:v-8d42ac5948"; 

What we're seeing is that even though the broken layer is in the cache registry, kaniko doesn't detect it and doesn't fail to build. Instead pushes an image with that broken layer in place

@cx1111
Copy link

cx1111 commented Mar 30, 2020

Also running into the error:

Failed to pull image "<imagename>": rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): file exists

Building with image: gcr.io/kaniko-project/executor:debug-v0.18.0

Command:

/kaniko/executor --cache=true \
                 --context ${WORKSPACE} \
                 --destination=${imageName} \
                 --dockerfile ${WORKSPACE}/${dockerfileName} \
                 --log-format text

Dockerfile:

# https://github.com/ContinuumIO/docker-images/blob/master/miniconda3/debian/Dockerfile
FROM continuumio/miniconda3:4.7.12

SHELL ["/bin/bash", "-euo", "pipefail", "-c"]

USER root

WORKDIR /source/model

COPY ./environment.yml .

RUN conda env update -f environment.yml \
 && conda clean --all

COPY . .

CMD ["python3", "model.py"]

@swist
Copy link
Author

swist commented Apr 1, 2020

Is there anything else we can provide to help with this?

@tejal29
Copy link
Contributor

tejal29 commented Apr 2, 2020

@swist Sorry with the current situation I have been able to get to this sooner.
I spent a lot of time trying to duplicate this bug at my end with the docker file you provided.
I did the following actions

  1. Build a kaniko image using --cache=true and push to GCR registry
  2. I modified the environment.yml to add a simple dependency.
name: stats-1
dependencies:
  - numpy

  1. Re-building the image. I verified the size of the cached layers and did not notice any duplication.
  2. Added a python file in to the source context and rebuild. No changes in the size of cached layer or the image produced.

Is there anything else i am missing?

I also deployed a pod with the image produced on GKE and the pod was running fine. I am looking into code now to see what more can i do.

@swist
Copy link
Author

swist commented Apr 2, 2020

We're pretty sure that adding a dependency is going to invalidate the broken layer - it is going force a reinstall. What we have found has sometimes fails is when you add an extra empty file in the context so that it gets copied in just before the last layer

I'll try to get you a minimal reproducible example, unfortunately the only thing I have currently is my org's code.

@swist
Copy link
Author

swist commented Apr 2, 2020

According to community slack, something like this should do:

  • run the same multistage build twice where the first non-target stage results in the same files

@jeffbehl
Copy link

jeffbehl commented Apr 6, 2020

I'm having this issue with a non-multistage build for a python application with --cache=true set and no other cache variables present. The resulting image cannot be pulled:

...
e645b11c20c7: Pull complete 
f99c55bbe43f: Pull complete 
a601f4aae6a4: Pull complete 
0eab66b1bfee: Extracting [==================================================>]  658.4MB/658.4MB
1a62f58fd1c4: Download complete 
c885b0b9491c: Download complete 
1bb9ef84f671: Download complete 
7d64f9e9bf99: Download complete 
b09c7a2df20f: Download complete 
55407e0fe961: Download complete 
failed to register layer: Error processing tar file(exit status 1): failed to mknod("/var/cache/apt/pkgcache.bin", S_IFCHR, 0): file exists

It's uncleared what triggered the issue, though clearing the cache 'fixes' it. if additional information helps please ask and I can try and provide.

@MaesterZ
Copy link

MaesterZ commented Apr 7, 2020

Only solution to still use cache layers in a Docker registry is to rollback to 0.16.0 from any version from 0.17.0 to 0.19.0, this is a major regression and should be p0.

@swist
Copy link
Author

swist commented Apr 8, 2020

Still happens for us. Take any multistage build. Add/remove line that sets ENV=$BUILDARG

@tejal29 do you have any more context as to how this is happening? Were you able to reproduce on your setup?

LukeShu added a commit to datawire/aes-project-builder that referenced this issue Apr 8, 2020
@rhs
Copy link

rhs commented Apr 8, 2020

We're seeing this problem also. It seems to happen whenever we build this Dockerfile:

FROM ubuntu:18.04
MAINTAINER "XXX"
RUN apt-get update
RUN apt-get install -y python3 python3-dev python3-pip
COPY ./ ./app
WORKDIR ./app
RUN pip3 install -r requirements.txt
ENTRYPOINT [ "python3" ]
CMD [ "main.py" ]

@tejal29
Copy link
Contributor

tejal29 commented Apr 14, 2020

Still happens for us. Take any multistage build. Add/remove line that sets ENV=$BUILDARG

@tejal29 do you have any more context as to how this is happening? Were you able to reproduce on your setup?

yes. i was able to reproduce this when i ran the above dockerfile 3rd time.

@caalberts
Copy link

We are also hitting this problem

c935c89cdb3c: Pull complete
59bb444f4818: Pull complete
f14fac6c6413: Extracting [==================================================>]  467.1MB/467.1MB
2431d9999311: Download complete

image: gcr.io/kaniko-project/executor:debug

command:
/kaniko/executor --context=${context} --dockerfile=${Dockerfile} --destination=${image} --cache=true --cache-repo=${IMAGE_CACHE}

More detail in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/28988

@dani29
Copy link
Contributor

dani29 commented Apr 18, 2020

Having previously ran into the same problem (Python based, multi-step Dockerfile), i've tried to use the files here to reproduce the issue but in vain.

@cx1111 - I've tried your Dockerfile (the one based on continuumio/miniconda3:4.7.12) on both v0.18.0 and v0.19.0, with an environment.yml file of a single dependency pytz.
The bug did not reproduce, but something strange I did notice was that in version v0.18.0 - inserting a dummy command just before the CMD (RUN python3 --version), caused the final image size to grow by few hundreds of MB. I'm guessing that's an unrelated bug that was fixed.
Anyway, could you share your environments.yml file?

@rhs - same here, didn't manage to reproduce it. Any chance you can share the requirements.txt file? I tried it with a pretty small set of Python requirements, so I don't know if it might be related.

@dani29
Copy link
Contributor

dani29 commented Apr 18, 2020

@swist I have tried to use your Dockerfile with a very basic pyproject.toml file (copied from here), and Kaniko (v0.18.0+) crashes with a panic. This seem to be related to a very recent issue with Go 1.14 based images (like Kaniko), discussed here: golang/go#37436 https://github.com/golang/go/wiki/LinuxKernelSignalVectorBug

I'll created a separate issue here: #1202

@TimShilov
Copy link

TimShilov commented Apr 18, 2020

I also encountered this problem. I'm using Google Cloud Build. It builds and works completely fine with docker build-docker push or if there's no cache. If there is a cache then it fails with following error:

rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): failed to mknod("/tmp/v8-compile-cache-0", S_IFCHR, 0): file exists

Dockerfile:

FROM node:12-slim
WORKDIR /app
EXPOSE 8080
CMD ["node", "app-client.js"]
RUN apt-get update && apt-get install --yes --no-install-recommends \
    ca-certificates \
    curl \
    gnupg \
  && echo "deb http://packages.cloud.google.com/apt gcsfuse-stretch main" \
    | tee /etc/apt/sources.list.d/gcsfuse.list \
  && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
  && apt-get update \
  && apt-get install --yes gcsfuse \
  && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
COPY package*.json ./
RUN npm ci
COPY . .

cloudbuild.yaml

steps:
  - id: Build and push container
    name: 'gcr.io/kaniko-project/executor:latest'
    args: [
      '--dockerfile=Dockerfile',
      '--destination=$_IMAGE_NAME:$COMMIT_SHA',
      '--destination=$_IMAGE_NAME:$BRANCH_NAME',
      '--cache=true',
    ]

options:
  substitutionOption: ALLOW_LOOSE
substitutions:
  _IMAGE_NAME: gcr.io/$PROJECT_ID/project_name
  _DOCKERFILE_NAME: Dockerfile
  _DOCKERFILE_DIR: ''

@dani29
Copy link
Contributor

dani29 commented Apr 18, 2020

Finally managed to reproduce it consistently and figured out that it has to do with incorrect whiteout of certain files. I am still not very familiar with the logic in that area, but seeing that it was recently refactored in #1069, perhaps @tejal29 or @cvgw have a better intuition on what may have gone wrong?

To Reproduce:

If you run on certain Linux kernels, you may fail to build it with the latest kaniko image (see issue 1202).
Run make images and build a local image from the source code. That will fix it.

Dockerfile

FROM python:3.7.4-slim AS service-base
RUN apt-get update \
  && apt-get --assume-yes install curl gpg  libcurl4-openssl-dev libssl-dev gcc git build-essential netcat libsnappy-dev

RUN pip install poetry=="1.0.2"
COPY ./_packages /api/_packages
COPY ./config /config

WORKDIR /service/cat
COPY . /service/cat
RUN rm -rf _packages
RUN rm -rf config

RUN \
  poetry config virtualenvs.create false && \
  poetry install

Context:

_packages/
    foo.txt
config/
    foo.txt
pyproject.toml

pyproject.toml is the only file whose contents are meaningful. The text files are just "hello world".

[tool.poetry]
name = "poetry-demo"
version = "0.1.0"
description = ""
authors = ["Sébastien Eustace <[email protected]>"]

[tool.poetry.dependencies]
python = "*"

[tool.poetry.dev-dependencies]
pytest = "^3.4"

Build command to GCR:

docker run --rm -it \
 --volume "$HOME"/.config/gcloud:/root/.config/gcloud \
 --volume $PWD:/build gcr.io/kaniko-project/executor:latest \
 --context=/build \
 --dockerfile=Dockerfile \
 --destination=us.gcr.io/$YOUR_PROJECT/kaniko-bug \
 --cache=true \

Build it once, and you'll see an image on GCR with size of ~190MB that can be pulled.
edit pyproject.toml and add a new dependency f.e pytz = "2020.1".
Build it again, and you'll see a new image, this time with size of ~320MB, that fails to be pulled:

ERROR: failed to register layer: Error processing tar file(exit status 1): failed to mknod("/etc/alternatives/pager.1.gz", S_IFCHR, 0): file exists 

Some observations:

  • If run this script w/ debug verbosity (I recommend to comment out the noisy log on resolve.go), you can see that at some stage, pager.1.gz is whited-out and appears as .wh.pager.1.gz when files are resolved. IIUC this is what the eventual error says - it tries to create a file that already exists on the lower layers of the overlay FS.
  • If you add this funny hack in L185 of snapshot.go, making it skip pager.1.gz:
if s.l.MaybeAddWhiteout(path) {
      if path != "/etc/alternatives/pager.1.gz" {
	   logrus.Debugf("Adding whiteout for %s", path)
	   filesToWhiteOut = append(filesToWhiteOut, path)
}

The produced image is still with the wrong size (320MB, instead of 190MB), but it actually can be pulled normally. Which indicates that the bug that doubled layer's size is still not resolved.

terrchen pushed a commit to terrchen/gitlab that referenced this issue Apr 20, 2020
@mitchfriedman
Copy link
Contributor

We're continuously bumping into this as well and it makes the cache completely unusable. Is there anything we can do to help prioritise this?

Thanks for all your effort.

@nethunter
Copy link

I'm experiencing the same issue on 0.19.0. It works fine on a small Node.js image, but on a more complicated PHP image it dies. Trying now to revert to 0.16.0 to see if that will solve it for me.

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

Hey folks, sorry for the delay. I spent yesterday, debugging this issue and here are my findings.

  1. Run kaniko/executor -f Dockerfile --cache=true --destination=<your_prj>
  2. Now the cache is populated,
  3. Make a change in your source files or dependencies.
  4. Re-run . In this case, the image size gets doubled and docker image is not pullable.

Note: This does not happen consistently.

After doing some debugging and from error logs, the cases where docker pull <image> failed due

Error processing tar file(exit status 1): failed to  XXXXXFILE : file exists

There was 2 entries for the file appeared in the layer twice in the first uncached layer or across layers.
The reason for this, kaniko takes a snapshot of entire FS before running any commands.
However, in case when --cache=true, this caused a bug where if with a dockerfile with 5 layers, the command corresponding to layer3 was changed, Kaniko

  • Step 0: will first snapshot the entire FS
  • Step 1, 2: will pull layers 1 and 2 from cache and extract them
  • Step 3: execute command 2 and try to find changed files since last snapshot.
    In this case, the snapshotter recognizes all files from layer1, layer2 and layer 3 as changed.

I fixed this in PR #1214 by moving the Step 0 i.e. initial snapshot when there is first layer which is not chached.

I verified Dockerfile given by @cx1111 and example provided by @dani29

I have pushed this fix in the following images

docker push gcr.io/kaniko-project/executor:debug-edge
docker push gcr.io/kaniko-project/executor:edge

Can someone give it a try?

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

I'm experiencing the same issue on 0.19.0. It works fine on a small Node.js image, but on a more complicated PHP image it dies. Trying now to revert to 0.16.0 to see if that will solve it for me.

@nethunter Can you please try:

docker push gcr.io/kaniko-project/executor:debug-edge
docker push gcr.io/kaniko-project/executor:edge

@mitchfriedman Sorry things have been slow. But please try

docker push gcr.io/kaniko-project/executor:debug-edge
docker push gcr.io/kaniko-project/executor:edge

@IlyaSemenov
Copy link

IlyaSemenov commented May 1, 2020

gcr.io/kaniko-project/executor:edge seems to have fixed it for me. (However, I'm not sure how to invalidate the older layers which are still cached.)

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

@IlyaSemenov, thanks for verifying. Would have to delete the remote cache.

@TimShilov
Copy link

TimShilov commented May 1, 2020

Unfortunately gcr.io/kaniko-project/executor:edge haven't solved the problem for me.
I removed remote cache and switched to edge. After couple of builds I'm getting the same error:

rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): failed to mknod("/tmp/v8-compile-cache-0", S_IFCHR, 0): file exists

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

@TimShilov are you making code changes before running each build? Also are you by any chance reusing the same kaniko pod?

@TimShilov
Copy link

@tejal29 I was making dummy changes (just whitespace) in one file between commits.
I'm building images with Google Cloud Build so I'm not 100% sure if it gets reused or not but I assume the answer is no.

Just tried one more time just to make sure I didn't missed anything. Removed remote cache, first image worked fine. After second build I got the error again.
Version 0.16.0 works without any problems.

Is there anything else I can do or try to help?

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

ok. let me try with GCB build. I am trying to find a small package-lock.json to reproduce your issue.

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

@TimShilov, i tried your dockerfile and it did work for me.
This is what i did:

  1. Run kaniko build on this sample node js app -> https://github.com/heroku/node-js-sample
  2. Changed a file index.js
  3. Re-ran kaniko build.
    I did a docker pull on the 2nd image.

Can you please check if there is a substantial increase in the size of the 2 images (like almost doubling)?

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

I am going to do the build one more time!

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

confirming, i saw the error on 4th build.

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

@TimShilov I finally got your issue fixed. the condition to detect if initial snapshot should be taked was not complete.

  1. In my earlier branch, i took the initial snapshot for first non cached command.
  2. However, for metadata commands this condition would return true. In a dockerfile if you have a metadata command before a cached command, then the code went back to previous case.
    Your dockerfile demonstrated this case.
    I have the changed the conditions to
  3. First non cached command which
  • not a metadata manipulation command
  • not a command which exactly knows which files changed e.g. "COPY"

I will do some more testing and give you an image to try soon

@tejal29
Copy link
Contributor

tejal29 commented May 1, 2020

@TimShilov , Can you try with this image:

 gcr.io/kaniko-project/executor:debug-edge
gcr.io/kaniko-project/executor:edge

@TimShilov
Copy link

TimShilov commented May 2, 2020

@tejal29 Tried again with gcr.io/kaniko-project/executor:edge. I made 4 builds and had no problems so far!
I will keep using edge for a couple of days to see if problem will return but so far seems fixed to me.

Thanks for your help! 👍

@tejal29
Copy link
Contributor

tejal29 commented May 4, 2020

Thanks @TimShilov for your help verifying this build.
This will in v0.20.0 release

@caalberts
Copy link

@tejal29 Thanks for fixing this, really appreciate it! I wanted to try using gcr.io/kaniko-project/executor:debug-edge, but it is now failing when I attempt to create a /kaniko/.docker/config.json

 $ echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
 /busybox/sh: eval: line 147: can't create /kaniko/.docker/config.json: nonexistent directory

Previously I was using gcr.io/kaniko-project/executor:debug-v0.16.0 and there was no nonexistent directory error. Is this an issue specific to edge images, or something has changed between 0.16.0 and edge

@hochreitercorpuls
Copy link

hochreitercorpuls commented May 4, 2020

@tejal29 Thanks for fixing this, really appreciate it! I wanted to try using gcr.io/kaniko-project/executor:debug-edge, but it is now failing when I attempt to create a /kaniko/.docker/config.json

 $ echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
 /busybox/sh: eval: line 147: can't create /kaniko/.docker/config.json: nonexistent directory

Previously I was using gcr.io/kaniko-project/executor:debug-v0.16.0 and there was no nonexistent directory error. Is this an issue specific to edge images, or something has changed between 0.16.0 and edge

It is NOT fixed in debug-v0.20.0
so you'll need a

   $ mkdir -p /kaniko/.docker

before your echo. And gitlab needs to change its documentation.

hatsuyuki15 added a commit to hatsuyuki15/drone-kaniko that referenced this issue May 4, 2021
bump executor version to fix this bug:
GoogleContainerTools/kaniko#1162
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/caching For all bugs related to cache issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.