Taskrun status incorrectly assigned #3412

Closed
jtama opened this issue Oct 20, 2020 · 4 comments · Fixed by #3571
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


jtama commented Oct 20, 2020

Expected Behavior

When a build step is waiting with reason CreateContainerConfigError, it should be marked as failed, which would in turn mark the whole pipeline as failed.

Actual Behavior

The build step stays in the waiting state, the taskrun status is Unknown, and the pipelinerun stays in the Running state forever.

Steps to Reproduce the Problem

  1. Run a pipeline that uses kaniko with a securityContext set in the podTemplate:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: go-build
spec:
  description: |
    Takes a git repository and a branch name then go build
  params:
    - name: repo-url
      type: string
      description: The git repository URL to clone from.
    - name: branch-name
      type: string
      description: The git branch to clone.
  workspaces:
    - name: shared-data
  tasks:
    - name: fetch-repo
      taskRef:
        name: git-clone
      workspaces:
        - name: output
          workspace: shared-data
      params:
        - name: url
          value: $(params.repo-url)
        - name: revision
          value: $(params.branch-name)
    - name: image-build-and-push
      taskRef:
        name: kaniko
      runAfter:
        - fetch-repo
      workspaces:
        - name: source
          workspace: shared-data

---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: issue-reproduction-
spec:
  podTemplate:
    securityContext: # Otherwise git checks out files as a user that mvn has no rights for in directory /workspace
      runAsNonRoot: true
      runAsUser: 1001
      runAsGroup: 3000
      fsGroup: 2000
  pipelineRef:
    name: go-build
  workspaces:
    - name: shared-data
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
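
For context on why this reproduction gets stuck: the message reported later in this thread, "container's runAsUser breaks non-root policy", is the error the kubelet reports when the effective security context asks for runAsNonRoot but resolves to user ID 0. Below is a minimal sketch of that check. It assumes (this is not confirmed anywhere in this issue) that the kaniko Task's step sets its own `runAsUser: 0`, which would override the pod-level `runAsUser: 1001`. `SecurityContext`, `effective` and `verifyNonRoot` are hypothetical stand-ins, not the kubelet's or Tekton's API.

```go
package main

import (
	"errors"
	"fmt"
)

// SecurityContext is a hypothetical, trimmed-down stand-in for the
// pod-level and container-level security context fields used in this issue.
type SecurityContext struct {
	RunAsNonRoot *bool
	RunAsUser    *int64
}

// effective merges pod-level and container-level settings.
// Container-level values win whenever they are set.
func effective(pod, container SecurityContext) SecurityContext {
	out := pod
	if container.RunAsNonRoot != nil {
		out.RunAsNonRoot = container.RunAsNonRoot
	}
	if container.RunAsUser != nil {
		out.RunAsUser = container.RunAsUser
	}
	return out
}

// verifyNonRoot returns an error when runAsNonRoot is requested but the
// effective user ID is 0 (root). This sketch skips the case where no user ID
// is set and the image's default user would have to be inspected.
func verifyNonRoot(sc SecurityContext) error {
	if sc.RunAsNonRoot == nil || !*sc.RunAsNonRoot {
		return nil
	}
	if sc.RunAsUser != nil && *sc.RunAsUser == 0 {
		return errors.New("container's runAsUser breaks non-root policy")
	}
	return nil
}

func boolPtr(b bool) *bool    { return &b }
func int64Ptr(i int64) *int64 { return &i }

func main() {
	// Pod-level settings taken from the PipelineRun podTemplate above.
	pod := SecurityContext{RunAsNonRoot: boolPtr(true), RunAsUser: int64Ptr(1001)}
	// Assumed step-level override (not shown in this issue).
	step := SecurityContext{RunAsUser: int64Ptr(0)}

	fmt.Println(verifyNonRoot(effective(pod, step)))
}
```

If that assumption holds, the container can never start with this podTemplate, so the step would stay in the waiting state no matter how long the pipelinerun runs.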

Additional Info

  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-21T14:51:23Z", GoVersion:"go1.14.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.401", GitCommit:"eb94c181eea5290e9da1238db02cfef263542f5f", GitTreeState:"clean", BuildDate:"2020-09-09T00:57:35Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
  • Tekton Pipeline version:
Client version: 0.13.0
Pipeline version: v0.16.3
Triggers version: v0.8.1
jtama added the kind/bug label Oct 20, 2020
@pritidesai
Member

I was able to reproduce this on master:

{"level":"info","ts":"2020-11-30T22:39:24.962Z","logger":"tekton.github.com-tektoncd-pipeline-pkg-reconciler-taskrun.Reconciler","caller":"taskrun/taskrun.go:449","msg":"Successfully reconciled taskrun issue-reproduction-94245-image-build-and-push-8vh25/default with status: &apis.Condition{Type:\"Succeeded\", Status:\"Unknown\", Severity:\"\", LastTransitionTime:apis.VolatileTime{Inner:v1.Time{Time:time.Time{wall:0xbfe97b07395e40dc, ext:6587113560701, loc:(*time.Location)(0x2f9cf60)}}}, Reason:\"CreateContainerConfigError\", Message:\"build step \\\"step-build-and-push\\\" is pending with reason \\\"container's runAsUser breaks non-root policy\\\"\"}","commit":"478479d","knative.dev/traceid":"abd9d0fa-2062-49ac-ba5d-79a46d1d1800","knative.dev/key":"default/issue-reproduction-94245-image-build-and-push-8vh25"}

@pritidesai
Member

pritidesai commented Nov 30, 2020

It's because a taskRun is marked as running with Reason "CreateContainerConfigError":

pipeline/pkg/pod/status.go

Lines 294 to 301 in ce564ca

	case IsPodHitConfigError(pod):
		reason = ReasonCreateContainerConfigError
		msg = getWaitingMessage(pod)
	default:
		reason = ReasonPending
		msg = getWaitingMessage(pod)
	}
	MarkStatusRunning(trs, reason, msg)

Should this kind of error be marked as fatal and exit the pipelineRun?

The reason description says this must indicate taskRun failure 🤔

// ReasonCreateContainerConfigError indicates that the TaskRun failed to create a pod due to
// config error of container
ReasonCreateContainerConfigError = "CreateContainerConfigError"
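
To make the question concrete, here is a minimal sketch of the alternative behavior: a container waiting with CreateContainerConfigError yields a Succeeded condition with status False, and any other waiting state stays Unknown. This is an illustration only, not the change that was eventually merged. `WaitingContainer`, `hitConfigError` and `conditionFor` are hypothetical names; they are not Tekton's `IsPodHitConfigError` or `MarkStatusRunning`.

```go
package main

import "fmt"

// ConditionStatus mirrors the three values a Succeeded condition can take.
type ConditionStatus string

const (
	StatusTrue    ConditionStatus = "True"
	StatusFalse   ConditionStatus = "False"
	StatusUnknown ConditionStatus = "Unknown"
)

const (
	ReasonCreateContainerConfigError = "CreateContainerConfigError"
	ReasonPending                    = "Pending"
)

// WaitingContainer is a hypothetical stand-in for a container status that is
// in the waiting state, reduced to the two fields this sketch needs.
type WaitingContainer struct {
	Name   string
	Reason string
}

// hitConfigError reports whether any container is waiting because of a
// container config error.
func hitConfigError(containers []WaitingContainer) bool {
	for _, c := range containers {
		if c.Reason == ReasonCreateContainerConfigError {
			return true
		}
	}
	return false
}

// conditionFor picks the status and reason for the Succeeded condition.
// A config error is treated as a failure; anything else is still pending.
func conditionFor(containers []WaitingContainer) (ConditionStatus, string) {
	if hitConfigError(containers) {
		return StatusFalse, ReasonCreateContainerConfigError
	}
	return StatusUnknown, ReasonPending
}

func main() {
	containers := []WaitingContainer{
		{Name: "step-build-and-push", Reason: ReasonCreateContainerConfigError},
	}
	status, reason := conditionFor(containers)
	fmt.Println(status, reason)
}
```

One trade-off to keep in mind: some config errors can clear up on their own, for example when a missing ConfigMap or Secret is created later, so failing immediately gives up on those cases.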

@pritidesai
Member

Looks like we tried to fix it in the past with e5b2530#diff-1d8c8017631ff12ab411b4fba64245aa3706a8593d19580a3d421a2c96de2e18

@pritidesai
Member

Should this taskRun status be marked as false instead of unknown:

 "issue-reproduction-94245-image-build-and-push-8vh25": {
                "pipelineTaskName": "image-build-and-push",
                "status": {
                    "conditions": [
                        {
                            "lastTransitionTime": "2020-11-30T22:39:24Z",
                            "message": "build step \"step-build-and-push\" is pending with reason \"container's runAsUser breaks non-root policy\"",
                            "reason": "CreateContainerConfigError",
                            "status": "Unknown",
                            "type": "Succeeded"
                        }
                    ],
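
Until the status is changed, a stuck run can be spotted from the condition itself. The sketch below parses a single condition shaped like the one quoted above and flags the combination seen in this issue: type Succeeded, status Unknown, reason CreateContainerConfigError. `Condition` and `isStuckOnConfigError` are hypothetical helpers written for this example, not part of Tekton.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition holds only the fields of a status condition that this check uses.
// Other fields in the JSON, such as lastTransitionTime, are ignored.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// isStuckOnConfigError reports whether the condition says the run is still
// "running" while its container cannot be created.
func isStuckOnConfigError(c Condition) bool {
	return c.Type == "Succeeded" &&
		c.Status == "Unknown" &&
		c.Reason == "CreateContainerConfigError"
}

func main() {
	// The condition object copied from the taskRun status quoted above.
	raw := []byte(`{
		"lastTransitionTime": "2020-11-30T22:39:24Z",
		"message": "build step \"step-build-and-push\" is pending with reason \"container's runAsUser breaks non-root policy\"",
		"reason": "CreateContainerConfigError",
		"status": "Unknown",
		"type": "Succeeded"
	}`)

	var c Condition
	if err := json.Unmarshal(raw, &c); err != nil {
		panic(err)
	}
	fmt.Println(isStuckOnConfigError(c))
}
```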


2 participants