"internal server error" on the UI when viewing the logs ("no template found" error) #9644

Closed
2 of 3 tasks
gattma opened this issue Sep 21, 2022 · 3 comments

gattma commented Sep 21, 2022

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Upgraded to version 3.4.0 with the new helm chart version 0.18.0.

After upgrading, we can see the logs of a step's main container while the step is running. As soon as the step finishes, the UI shows "internal server error" when we view the logs for the main container (the logs for the init and wait containers are still visible).

This only happens when you call a cluster workflow template which in turn calls another cluster workflow template (see my example).
For example, if you call the cluster workflow template cwt-2 directly, without cwt-1, you can see the logs in the UI.

Viewing the logs works even if the workflow fails. For example, if you remove "serviceAccountName: argo-wf-sa", the workflow is marked as failed (Error (exit code 1): pods ... is forbidden: User "...:default" cannot patch resource "pods" in API group "" in the namespace ...), but you can then see the logs in the UI.

Log output from the workflow server pod:

time="2022-09-21T07:47:12.626Z" level=info msg="Error while SSO Delegation" error="no service account rule matches"
time="2022-09-21T07:47:12.626Z" level=info msg="selected SSO RBAC service account for user" loginServiceAccount=workflows-super-admin serviceAccount=workflows-super-admin ssoDelegated=false ssoDelegationAllowed=true subject=48549dc2-b9df-4320-a26a-b2a0c73828db
time="2022-09-21T07:47:12.627Z" level=info msg="Get artifact file" artifactName=main-logs namespace=test nodeId=cwt-hello-world-xsz72-3907235591 workflowName=cwt-hello-world-xsz72
time="2022-09-21T07:47:12.634Z" level=error msg="Artifact Server returned internal error" error="no template found by the name of '' (which is the template associated with nodeId 'cwt-hello-world-xsz72-3907235591'??"
time="2022-09-21T07:47:12.634Z" level=info duration=11.287553ms method=GET path=/artifact-files/test/workflows/cwt-hello-world-xsz72/cwt-hello-world-xsz72-3907235591/outputs/main-logs size=22 status=500
time="2022-09-21T07:47:13.806Z" level=info msg="Alloc=28496 TotalAlloc=1653887 Sys=72209 NumGC=766 Goroutines=488"
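
When scanning a longer server log for this failure, it can help to pull out the requests that returned a 5xx status and the identifiers in their path. The following is only a minimal Python sketch: the helper names parse_logfmt and parse_artifact_path are made up for illustration, and the path layout is inferred from the single log line above, not taken from Argo's API documentation.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict; double-quoted values may contain spaces."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def parse_artifact_path(path: str) -> dict:
    """Extract identifiers from a path shaped like the one in the server log.

    Assumed shape (inferred from the log line, not from documentation):
    artifact-files/<namespace>/workflows/<workflow>/<nodeId>/outputs/<artifact>
    """
    parts = path.strip("/").split("/")
    shape_ok = (
        len(parts) == 7
        and parts[0] == "artifact-files"
        and parts[2] == "workflows"
        and parts[5] == "outputs"
    )
    if not shape_ok:
        raise ValueError(f"unexpected artifact path: {path}")
    return {
        "namespace": parts[1],
        "workflow": parts[3],
        "node_id": parts[4],
        "artifact": parts[6],
    }


# The request line from the server log above, split only for readability.
line = (
    'time="2022-09-21T07:47:12.634Z" level=info duration=11.287553ms method=GET '
    "path=/artifact-files/test/workflows/cwt-hello-world-xsz72/"
    "cwt-hello-world-xsz72-3907235591/outputs/main-logs size=22 status=500"
)

fields = parse_logfmt(line)
failing = None
if int(fields["status"]) >= 500:
    failing = parse_artifact_path(fields["path"])
```

Here failing holds the namespace, workflow name, node ID and artifact name of the request that failed, which are the same values the "Get artifact file" log entry reports.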

Version

3.4.0

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-workflow
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - patch
  - list
  - create
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get
  - watch
  - list
  - create
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-wf-sa
  namespace: test
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-default-workflow
  namespace: test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: argo-workflow
subjects:
- kind: ServiceAccount
  name: argo-wf-sa
  namespace: test
---
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: cwt-0
spec:
  entrypoint: pipeline
  templates:
  - name: pipeline
    dag:
      tasks:
      - name: callClusterWorkflowTemplate
        templateRef:
          clusterScope: true
          name: cwt-1
          template: whalesay-template
---
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: cwt-1
spec:
  entrypoint: whalesay-template
  templates:
    - name: whalesay-template
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["Hello World"]
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: cwt-hello-world-
  namespace: test
spec:
  serviceAccountName: argo-wf-sa
  workflowTemplateRef:
    name: cwt-0
    clusterScope: true

Logs from the workflow controller

time="2022-09-21T07:50:55.501Z" level=info msg="node cwt-hello-world-m8dkb phase Running -> Succeeded" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="node cwt-hello-world-m8dkb finished: 2022-09-21 07:50:55.501216294 +0000 UTC" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Checking daemoned children of cwt-hello-world-m8dkb" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="TaskSet Reconciliation" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg=reconcileAgentPod namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Updated phase Running -> Succeeded" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Marking workflow completed" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Marking workflow as pending archiving" namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Checking daemoned children of " namespace=test workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.501Z" level=info msg="Workflow to be dehydrated" Workflow Size=2161
time="2022-09-21T07:50:55.507Z" level=info msg="cleaning up pod" action=deletePod key=test/cwt-hello-world-m8dkb-1340600742-agent/deletePod
time="2022-09-21T07:50:55.516Z" level=info msg="Create events 201"
time="2022-09-21T07:50:55.528Z" level=info msg="Update workflows 200"
time="2022-09-21T07:50:55.528Z" level=info msg="Workflow update successful" namespace=test phase=Succeeded resourceVersion=149311216 workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.529Z" level=info msg="Delete pods 404"
time="2022-09-21T07:50:55.537Z" level=info msg="DeleteCollection workflowtaskresults 200"
time="2022-09-21T07:50:55.538Z" level=info msg="archiving workflow" namespace=test uid=436a8fc3-2060-4954-8ecf-3c5676a959b0 workflow=cwt-hello-world-m8dkb
time="2022-09-21T07:50:55.543Z" level=info msg="cleaning up pod" action=labelPodCompleted key=test/cwt-hello-world-m8dkb-1426523485/labelPodCompleted
time="2022-09-21T07:50:55.603Z" level=info msg="Queueing Succeeded workflow test/cwt-hello-world-m8dkb for delete in 1h0m0s due to TTL"

Logs from your workflow's wait container

time="2022-09-21T07:21:35.998Z" level=info msg="Starting Workflow Executor" version=v3.4.0
time="2022-09-21T07:21:36.003Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2022-09-21T07:21:36.004Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=test podName=cwt-hello-world-8ltr6-2614080844 template="{\"name\":\"whalesay-template\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"docker/whalesay\",\"command\":[\"cowsay\"],\"args\":[\"Hello World\"],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"s3\":{\"endpoint\":\"minio-artifact-repository.gepaplexx-cicd-tools:9000\",\"bucket\":\"argo-workflows\",\"insecure\":true,\"accessKeySecret\":{\"name\":\"minio-artifact-repository\",\"key\":\"accesskey\"},\"secretKeySecret\":{\"name\":\"minio-artifact-repository\",\"key\":\"secretkey\"},\"key\":\"cwt-hello-world-8ltr6/cluster-workflow-template-hello-world-8ltr6-2614080844\"}}}" version="&Version{Version:v3.4.0,BuildDate:2022-09-19T03:47:58Z,GitCommit:047952afd539d06cae2fd6ba0b608b19c1194bba,GitTag:v3.4.0,GitTreeState:clean,GoVersion:go1.18.6,Compiler:gc,Platform:linux/amd64,}"
time="2022-09-21T07:21:36.005Z" level=info msg="Starting deadline monitor"
time="2022-09-21T07:21:38.007Z" level=info msg="Main container completed" error="<nil>"
time="2022-09-21T07:21:38.007Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2022-09-21T07:21:38.007Z" level=info msg="No output parameters"
time="2022-09-21T07:21:38.007Z" level=info msg="No output artifacts"
time="2022-09-21T07:21:38.008Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: cwt-hello-world-8ltr6/cwt-hello-world-8ltr6-2614080844/main.log"
time="2022-09-21T07:21:38.008Z" level=info msg="Creating minio client using static credentials" endpoint="minio-artifact-repository.gepaplexx-cicd-tools:9000"
time="2022-09-21T07:21:38.008Z" level=info msg="Saving file to s3" bucket=argo-workflows endpoint="minio-artifact-repository.gepaplexx-cicd-tools:9000" key=cwt-hello-world-8ltr6/cluster-workflow-template-hello-world-8ltr6-2614080844/main.log path=/tmp/argo/outputs/logs/main.log
time="2022-09-21T07:21:38.074Z" level=info msg="Save artifact" artifactName=main-logs duration=66.561346ms error="<nil>" key=cwt-hello-world-8ltr6/cwt-hello-world-8ltr6-2614080844/main.log
time="2022-09-21T07:21:38.074Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2022-09-21T07:21:38.074Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2022-09-21T07:21:38.089Z" level=info msg="Create workflowtaskresults 403"
time="2022-09-21T07:21:38.090Z" level=warning msg="failed to patch task set, falling back to legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" error="workflowtaskresults.argoproj.io is forbidden: User \"system:serviceaccount:test:argo-wf-sa\" cannot create resource \"workflowtaskresults\" in API group \"argoproj.io\" in the namespace \"test\""
time="2022-09-21T07:21:38.198Z" level=info msg="Patch pods 200"
time="2022-09-21T07:21:38.203Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2022-09-21T07:21:38.203Z" level=info msg="Deadline monitor stopped"
time="2022-09-21T07:21:38.204Z" level=info msg="Alloc=6879 TotalAlloc=13250 Sys=23506 NumGC=4 Goroutines=7"
@Transmitt0r
Contributor

I think this also relates to #9631

@juliev0
Contributor

juliev0 commented Sep 21, 2022

@brianloss are you able to assign yourself this? It looks like I can't assign it to you. (if you add a comment here I believe I can, though) Thank you for looking into it!

@brianloss
Contributor

@juliev0 No, I don't have permissions to assign, but let's see if this comment helps.

terrytangyuan pushed a commit that referenced this issue Sep 21, 2022
…9644. (#9648)

* fix: Fixed artifact retrieval when templateRef in use. Fixes #9631.

Signed-off-by: Brian Loss <[email protected]>

* chore: Address review feedback - use util method

* Expose getTemplateFromNode in workflow/util/util.go
* Update uses of getTemplateFromNode to GetTemplateFromNode
* Call GetTemplateFromNode in artifact_server.go

Signed-off-by: Brian Loss <[email protected]>

Signed-off-by: Brian Loss <[email protected]>
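
For readers who want to see why a template lookup can fail with an empty name, here is a hedged Python sketch of the failure mode that the server error message and the commit title suggest. It is not the Argo code: every class and function name below (TemplateRef, Node, Workflow, lookup_by_name_only, lookup_from_node, resolved_refs) is hypothetical, and it assumes that a node created through a templateRef carries the reference in place of a template name.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class TemplateRef:
    """Pointer to a template defined in another (Cluster)WorkflowTemplate."""
    name: str                    # e.g. "cwt-1"
    template: str                # e.g. "whalesay-template"
    cluster_scope: bool = False


@dataclass
class Node:
    node_id: str
    template_name: str = ""      # assumed empty when the node came from a templateRef
    template_ref: Optional[TemplateRef] = None


@dataclass
class Workflow:
    # Templates declared inline in the workflow spec, keyed by name.
    templates: Dict[str, dict] = field(default_factory=dict)
    # Hypothetical store of templates already resolved from references.
    resolved_refs: Dict[TemplateRef, dict] = field(default_factory=dict)


def lookup_by_name_only(wf: Workflow, node: Node) -> dict:
    """Naive lookup that only consults the node's template name."""
    tmpl = wf.templates.get(node.template_name)
    if tmpl is None:
        raise LookupError(
            f"no template found by the name of '{node.template_name}' "
            f"(which is the template associated with nodeId '{node.node_id}'??"
        )
    return tmpl


def lookup_from_node(wf: Workflow, node: Node) -> dict:
    """Follow a templateRef when present, otherwise fall back to the name."""
    if node.template_ref is not None:
        tmpl = wf.resolved_refs.get(node.template_ref)
        if tmpl is None:
            raise LookupError(f"unresolved templateRef for nodeId '{node.node_id}'")
        return tmpl
    return lookup_by_name_only(wf, node)


# Mirror the reproduction: the pod node points at cwt-1/whalesay-template by reference.
ref = TemplateRef(name="cwt-1", template="whalesay-template", cluster_scope=True)
wf = Workflow(resolved_refs={ref: {"name": "whalesay-template", "image": "docker/whalesay"}})
node = Node(node_id="cwt-hello-world-xsz72-3907235591", template_ref=ref)

try:
    lookup_by_name_only(wf, node)
    naive_error = None
except LookupError as exc:
    naive_error = str(exc)

resolved = lookup_from_node(wf, node)
```

In this sketch the name-only lookup raises an error with the same wording as the server log, while the lookup that considers the reference returns the whalesay template.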
juchaosong pushed a commit to juchaosong/argo-workflows that referenced this issue Nov 3, 2022
…#9631, argoproj#9644. (argoproj#9648)

* fix: Fixed artifact retrieval when templateRef in use. Fixes argoproj#9631.

Signed-off-by: Brian Loss <[email protected]>

* chore: Address review feedback - use util method

* Expose getTemplateFromNode in workflow/util/util.go
* Update uses of getTemplateFromNode to GetTemplateFromNode
* Call GetTemplateFromNode in artifact_server.go

Signed-off-by: Brian Loss <[email protected]>

Signed-off-by: Brian Loss <[email protected]>
Signed-off-by: juchao <[email protected]>