Workflow-level exit lifecycle hook ignores expression #8742

Closed
3 tasks done
roofurmston opened this issue May 12, 2022 · 16 comments · Fixed by #8744
Labels
area/exit-handler, area/hooks, type/support (User support issue - likely not a bug)

Comments

@roofurmston
Contributor

Checklist

  • Double-checked my configuration.
  • Tested using the latest version.
  • Used the Emissary executor.

Summary

What happened/what you expected to happen?

Expected Behaviour:
I have a workflow with a workflow-level exit lifecycle hook. I expect that setting expression on the hook means the hook runs only when the expression evaluates to true.

Actual Behaviour:

The exit hook runs regardless of the expression.

What version are you running?

3.3.4

Diagnostics

Paste the smallest workflow that reproduces the bug. We must be able to run the workflow.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: lifecycle-hook-
spec:
  entrypoint: main
  hooks:
    exit:
      expression: workflow.status != "Succeeded"
      template: tails
  templates:
    - name: main
      steps:
      - - name: step1
          template: heads

    - name: heads
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was heads\""]

    - name: tails
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was tails\""]
# Logs from the workflow controller:
kubectl logs -n argo deploy/workflow-controller | grep ${workflow} 

time="2022-05-12T09:50:43.371Z" level=info msg="Processing workflow" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.382Z" level=info msg="Updated phase  -> Running" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.387Z" level=info msg="Steps node lifecycle-hook-w2nr9 initialized Running" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.388Z" level=info msg="StepGroup node lifecycle-hook-w2nr9-1006668038 initialized Running" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.388Z" level=info msg="Pod node lifecycle-hook-w2nr9-2182753681 initialized Pending" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.414Z" level=info msg="Created pod: lifecycle-hook-w2nr9[0].step1 (lifecycle-hook-w2nr9-2182753681)" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.414Z" level=info msg="Workflow step group node lifecycle-hook-w2nr9-1006668038 not yet completed" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.414Z" level=info msg="TaskSet Reconciliation" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.414Z" level=info msg=reconcileAgentPod namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:43.432Z" level=info msg="Workflow update successful" namespace=mlplatform-example phase=Running resourceVersion=902 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.422Z" level=info msg="Processing workflow" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.423Z" level=info msg="Task-result reconciliation" namespace=mlplatform-example numObjs=0 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.424Z" level=info msg="node changed" new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=lifecycle-hook-w2nr9-2182753681 old.message= old.phase=Pending old.progress=0/1
time="2022-05-12T09:50:53.425Z" level=info msg="Workflow step group node lifecycle-hook-w2nr9-1006668038 not yet completed" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.425Z" level=info msg="TaskSet Reconciliation" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.425Z" level=info msg=reconcileAgentPod namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:50:53.442Z" level=info msg="Workflow update successful" namespace=mlplatform-example phase=Running resourceVersion=918 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:03.409Z" level=info msg="Processing workflow" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:03.410Z" level=info msg="Task-result reconciliation" namespace=mlplatform-example numObjs=0 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:03.410Z" level=info msg="node unchanged" nodeID=lifecycle-hook-w2nr9-2182753681
time="2022-05-12T09:51:03.410Z" level=info msg="Workflow step group node lifecycle-hook-w2nr9-1006668038 not yet completed" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:03.410Z" level=info msg="TaskSet Reconciliation" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:03.410Z" level=info msg=reconcileAgentPod namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.292Z" level=info msg="Processing workflow" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.294Z" level=info msg="Task-result reconciliation" namespace=mlplatform-example numObjs=0 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.295Z" level=info msg="node changed" new.message= new.phase=Succeeded new.progress=0/1 nodeID=lifecycle-hook-w2nr9-2182753681 old.message=PodInitializing old.phase=Pending old.progress=0/1
time="2022-05-12T09:51:42.295Z" level=info msg="Step group node lifecycle-hook-w2nr9-1006668038 successful" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.296Z" level=info msg="node lifecycle-hook-w2nr9-1006668038 phase Running -> Succeeded" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.296Z" level=info msg="node lifecycle-hook-w2nr9-1006668038 finished: 2022-05-12 09:51:42.2962086 +0000 UTC" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.296Z" level=info msg="Outbound nodes of lifecycle-hook-w2nr9-2182753681 is [lifecycle-hook-w2nr9-2182753681]" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.296Z" level=info msg="Outbound nodes of lifecycle-hook-w2nr9 is [lifecycle-hook-w2nr9-2182753681]" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.297Z" level=info msg="node lifecycle-hook-w2nr9 phase Running -> Succeeded" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.297Z" level=info msg="node lifecycle-hook-w2nr9 finished: 2022-05-12 09:51:42.2975573 +0000 UTC" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.297Z" level=info msg="Checking daemoned children of lifecycle-hook-w2nr9" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.297Z" level=info msg="TaskSet Reconciliation" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.297Z" level=info msg=reconcileAgentPod namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.298Z" level=info msg="Running OnExit handler: " namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.298Z" level=info msg="Pod node lifecycle-hook-w2nr9-225539071 initialized Pending" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.323Z" level=info msg="Created pod: lifecycle-hook-w2nr9.onExit (lifecycle-hook-w2nr9-225539071)" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.349Z" level=info msg="Workflow update successful" namespace=mlplatform-example phase=Running resourceVersion=980 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:42.357Z" level=info msg="cleaning up pod" action=labelPodCompleted key=mlplatform-example/lifecycle-hook-w2nr9-2182753681/labelPodCompleted
time="2022-05-12T09:51:52.324Z" level=info msg="Processing workflow" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.325Z" level=info msg="Task-result reconciliation" namespace=mlplatform-example numObjs=0 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.325Z" level=info msg="node changed" new.message= new.phase=Succeeded new.progress=0/1 nodeID=lifecycle-hook-w2nr9-225539071 old.message= old.phase=Pending old.progress=0/1
time="2022-05-12T09:51:52.326Z" level=info msg="TaskSet Reconciliation" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.326Z" level=info msg=reconcileAgentPod namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.326Z" level=info msg="Running OnExit handler: " namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.326Z" level=info msg="Updated phase Running -> Succeeded" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.326Z" level=info msg="Marking workflow completed" namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.326Z" level=info msg="Checking daemoned children of " namespace=mlplatform-example workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.332Z" level=info msg="cleaning up pod" action=deletePod key=mlplatform-example/lifecycle-hook-w2nr9-1340600742-agent/deletePod
time="2022-05-12T09:51:52.344Z" level=info msg="Queueing Succeeded workflow mlplatform-example/lifecycle-hook-w2nr9 for delete in 720h0m0s due to TTL"
time="2022-05-12T09:51:52.345Z" level=info msg="Workflow update successful" namespace=mlplatform-example phase=Succeeded resourceVersion=1013 workflow=lifecycle-hook-w2nr9
time="2022-05-12T09:51:52.362Z" level=info msg="cleaning up pod" action=labelPodCompleted key=mlplatform-example/lifecycle-hook-w2nr9-225539071/labelPodCompleted

# If the workflow's pods have not been created, you can skip the rest of the diagnostics.

# The workflow's pods that are problematic:
kubectl get pod -o yaml -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

# Logs from your workflow's wait container, something like:
kubectl logs -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@sandeepk8s
Contributor

@roofurmston @terrytangyuan The issue is not resolved yet.
I was trying to set up failure conditions on our cron workflows and tried an exit hook with an expression.
The lifecycle hook named 'exit' ignores the expression.

Version running - v3.3.5

@sandeepk8s
Contributor

Hi @alexec this issue needs to be reopened 😅

@roofurmston
Contributor Author

Indeed, I have tested it on master and the issue is still present.

@terrytangyuan
Member

It's not in 3.3.5. Please try again once it's included in the next release, e.g. 3.4.

@roofurmston
Contributor Author

I tested the fix directly from the master branch. It is slightly confusing as the unit test was removed from the master branch.

Are you sure that the fix is working in master as expected?

@terrytangyuan
Member

terrytangyuan commented Jun 7, 2022

The fix and the removed test case need to be revisited when I get a chance.

terrytangyuan reopened this Jun 7, 2022
sarabala1979 mentioned this issue Jun 20, 2022
55 tasks
@stale

This comment was marked as resolved.

stale bot added the problem/stale label (This has not had a response in some time) Jun 23, 2022
@sandeepk8s
Contributor

Just tested with the latest image. The issue still exists:
the exit hook ignores the expression.

stale bot removed the problem/stale label Jun 27, 2022
@stale

stale bot commented Jul 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

stale bot added the problem/stale label Jul 12, 2022
@stale

stale bot commented Aug 13, 2022

This issue has been closed due to inactivity. Feel free to re-open if you still encounter this issue.

stale bot closed this as completed Aug 13, 2022
@djaustin

djaustin commented Sep 16, 2022

I came across this issue when searching for the same problem. This is still an issue in 3.3.9.

@brendanstennett

brendanstennett commented Oct 7, 2022

+1

Seeing it as well on v3.3.9. I have not tried v3.4.X yet.

spec:
  entrypoint: coinflip
  hooks:
    exit:
      expression: workflow.status == "Failed"
      template: notify

The hook always executes, whether the workflow succeeds or fails.

@rdean150

+1

I'm still seeing this issue on v3.4.4 as well. Exit hook ignores expression and always executes.

@agilgur5

If I'm reading this correctly, this sounds like correct behavior. Specifically, the Lifecycle Hooks docs say that you should not name a hook exit, as that will make it behave as a regular exit handler.
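
For readers who do want a regular exit handler that acts only on some outcomes, the usual pattern is to put the condition inside the handler template rather than on the hook. The following is a minimal sketch, not a tested fix from this thread: it reuses the heads/tails containers from the reproduction workflow, the template and step names `exit-handler` and `notify-failure` are hypothetical, and it assumes a steps-based handler can branch on `{{workflow.status}}` with a `when` clause, as the Argo Workflows exit-handler examples do.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: exit-handler-when-
spec:
  entrypoint: main
  # The exit handler itself always runs; the condition lives inside it.
  onExit: exit-handler
  templates:
    - name: main
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was heads\""]

    # Hypothetical handler: a steps template whose only step is gated by `when`.
    - name: exit-handler
      steps:
        - - name: notify-failure
            template: tails
            # Skipped when the workflow succeeded.
            when: "{{workflow.status}} != Succeeded"

    - name: tails
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was tails\""]
```

With this layout the handler node is still created on every run, but the `tails` pod should only be started when the workflow did not succeed.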

agilgur5 added the type/support label (User support issue - likely not a bug) and removed the problem/stale and type/bug labels Sep 17, 2023
@brendanstennett

@agilgur5 I think the general issue here is that the expression section seems to be ignored. The hook is being labelled as exit but with workflow.status == "Failed" which is outlined in the docs you linked to.

This expression does not seem to be evaluated: the hook runs whether workflow.status == "Failed" or workflow.status != "Failed".

@agilgur5

The hook is being labelled as exit but with workflow.status == "Failed"

Yes, an exit handler does not use an expression. Name your hook anything other than exit and the expression should work fine.
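
As a rough illustration of that suggestion, the reproduction workflow from the top of this issue could be adapted as below. This is a sketch under assumptions rather than a verified fix: the hook name `failed` is arbitrary (chosen only because it is not `exit`), the rest is copied from the original report, and it has not been checked against any particular Argo Workflows release.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: lifecycle-hook-
spec:
  entrypoint: main
  hooks:
    # Assumption: any name other than `exit`, so the hook is not treated
    # as a regular exit handler and its expression is evaluated.
    failed:
      expression: workflow.status == "Failed"
      template: tails
  templates:
    - name: main
      steps:
      - - name: step1
          template: heads

    - name: heads
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was heads\""]

    - name: tails
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["echo \"it was tails\""]
```

If the hook behaves as described in the comment above, a successful run of this workflow should not start the `tails` pod.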

7 participants