OnFailurePolicy docs & endtoend tests (#356)
* Retryable Workflow

* Update image

* wip

* wip

* wip

* wip

* add run to completion tests

* Update deps

* Update admin

* Update admin

* Make kustomize

* Really update admin-prop

* Update admin

* update flytetester

* Update flytester

* Bigger run-to-completion-wf

* wip

* Update flytetester

* update admin

* Update admin

* Fix endtoend tests

* fix trigger

* update propeller and admin

* Add docs for OnFailurePolicy
EngHabu authored Jun 23, 2020
1 parent 8f32851 commit d52f0cb
Showing 9 changed files with 75 additions and 30 deletions.
23 changes: 11 additions & 12 deletions .github/workflows/tests.yml
@@ -4,25 +4,24 @@ on:
branches:
- master
pull_request:
branches:
- master
jobs:
end-to-end:
runs-on: ubuntu-latest
strategy:
max-parallel: 4
matrix:
go-version: [1.10]
steps:
- uses: actions/checkout@v1
- name: Kustomize and diff
run: DELTA_CHECK=true make kustomize
- name: Set up Go@${{ matrix.go-version }}
uses: actions/setup-go@v1
with:
go-version: ${{ matrix.go-version }}
- name: Run end-to-end tests
run: make end2end
- uses: engineerd/[email protected]
- name: End2End
env:
DOCKER_USERNAME: ${{ github.actor }}
DOCKER_PASSWORD: "${{ secrets.GITHUB_TOKEN }}"
run: |
kubectl cluster-info
kubectl get pods -n kube-system
echo "current-context:" $(kubectl config current-context)
echo "environment-kubeconfig:" ${KUBECONFIG}
make end2end_execute
docs:
runs-on: ubuntu-latest
strategy:
8 changes: 4 additions & 4 deletions deployment/eks/flyte_generated.yaml
@@ -594,7 +594,7 @@ spec:
labels:
app: flyteadmin
app.kubernetes.io/name: flyteadmin
-app.kubernetes.io/version: 0.2.8
+app.kubernetes.io/version: 0.2.9
spec:
containers:
- command:
@@ -603,7 +603,7 @@ spec:
- --config
- /etc/flyte/config/flyteadmin_config.yaml
- serve
-image: docker.io/lyft/flyteadmin:v0.2.8
+image: docker.io/lyft/flyteadmin:v0.2.9
imagePullPolicy: IfNotPresent
name: flyteadmin
ports:
@@ -754,7 +754,7 @@ spec:
labels:
app: flytepropeller
app.kubernetes.io/name: flytepropeller
-app.kubernetes.io/version: 0.2.45
+app.kubernetes.io/version: 0.2.63
spec:
containers:
- args:
@@ -767,7 +767,7 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.name
-image: docker.io/lyft/flytepropeller:v0.2.45
+image: docker.io/lyft/flytepropeller:v0.2.63
imagePullPolicy: IfNotPresent
name: flytepropeller
ports:
8 changes: 4 additions & 4 deletions deployment/sandbox/flyte_generated.yaml
@@ -1024,7 +1024,7 @@ spec:
labels:
app: flyteadmin
app.kubernetes.io/name: flyteadmin
-app.kubernetes.io/version: 0.2.8
+app.kubernetes.io/version: 0.2.9
spec:
containers:
- command:
@@ -1033,7 +1033,7 @@ spec:
- --config
- /etc/flyte/config/flyteadmin_config.yaml
- serve
-image: docker.io/lyft/flyteadmin:v0.2.8
+image: docker.io/lyft/flyteadmin:v0.2.9
imagePullPolicy: IfNotPresent
name: flyteadmin
ports:
@@ -1191,7 +1191,7 @@ spec:
labels:
app: flytepropeller
app.kubernetes.io/name: flytepropeller
-app.kubernetes.io/version: 0.2.45
+app.kubernetes.io/version: 0.2.63
spec:
containers:
- args:
@@ -1206,7 +1206,7 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.name
-image: docker.io/lyft/flytepropeller:v0.2.45
+image: docker.io/lyft/flytepropeller:v0.2.63
imagePullPolicy: IfNotPresent
name: flytepropeller
ports:
8 changes: 4 additions & 4 deletions deployment/test/flyte_generated.yaml
@@ -644,7 +644,7 @@ spec:
labels:
app: flyteadmin
app.kubernetes.io/name: flyteadmin
-app.kubernetes.io/version: 0.2.8
+app.kubernetes.io/version: 0.2.9
spec:
containers:
- command:
@@ -653,7 +653,7 @@ spec:
- --config
- /etc/flyte/config/flyteadmin_config.yaml
- serve
-image: docker.io/lyft/flyteadmin:v0.2.8
+image: docker.io/lyft/flyteadmin:v0.2.9
imagePullPolicy: IfNotPresent
name: flyteadmin
ports:
@@ -772,7 +772,7 @@ spec:
labels:
app: flytepropeller
app.kubernetes.io/name: flytepropeller
-app.kubernetes.io/version: 0.2.45
+app.kubernetes.io/version: 0.2.63
spec:
containers:
- args:
@@ -785,7 +785,7 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.name
-image: docker.io/lyft/flytepropeller:v0.2.45
+image: docker.io/lyft/flytepropeller:v0.2.63
imagePullPolicy: IfNotPresent
name: flytepropeller
ports:
2 changes: 1 addition & 1 deletion end2end/tests/endtoend.yaml
@@ -11,7 +11,7 @@ spec:
command:
- bash
- -c
-image: docker.io/lyft/flytetester:v0.1.6
+image: docker.io/lyft/flytetester:e29ac562f053741213efcead5950b4b8bc28cfcf
imagePullPolicy: IfNotPresent
name: flytetester
resources:
6 changes: 3 additions & 3 deletions kustomize/base/admindeployment/deployment.yaml
@@ -16,7 +16,7 @@ spec:
labels:
app: flyteadmin
app.kubernetes.io/name: flyteadmin
-app.kubernetes.io/version: 0.2.8
+app.kubernetes.io/version: 0.2.9
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "10254"
@@ -31,15 +31,15 @@ spec:
name: flyte-admin-config
initContainers:
- name: run-migrations
-image: docker.io/lyft/flyteadmin:v0.2.8
+image: docker.io/lyft/flyteadmin:v0.2.9
imagePullPolicy: IfNotPresent
command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml", "migrate", "run"]
volumeMounts:
- name: config-volume
mountPath: /etc/flyte/config
containers:
- name: flyteadmin
-image: docker.io/lyft/flyteadmin:v0.2.8
+image: docker.io/lyft/flyteadmin:v0.2.9
imagePullPolicy: IfNotPresent
command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml", "serve"]
ports:
4 changes: 2 additions & 2 deletions kustomize/base/propeller/deployment.yaml
@@ -15,7 +15,7 @@ spec:
labels:
app: flytepropeller
app.kubernetes.io/name: flytepropeller
-app.kubernetes.io/version: 0.2.45
+app.kubernetes.io/version: 0.2.63
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "10254"
@@ -31,7 +31,7 @@ spec:
name: flyte-plugin-config
containers:
- name: flytepropeller
-image: docker.io/lyft/flytepropeller:v0.2.45
+image: docker.io/lyft/flytepropeller:v0.2.63
command:
- flytepropeller
args:
1 change: 1 addition & 0 deletions rsts/user/features/index.rst
@@ -15,3 +15,4 @@ Flyte Features
task_cache
roles
single_task_execution
+on_failure_policy
45 changes: 45 additions & 0 deletions rsts/user/features/on_failure_policy.rst
@@ -0,0 +1,45 @@
.. _on-failuire-policy:

What is it
==========

By default, when a node fails in a workflow, the entire workflow is aborted immediately. The reasoning
is to avoid wasting resources, since the workflow will end up failing anyway. In certain cases, however, it is desirable for the
workflow to carry on executing the branches it can still execute.

One example is when the remaining tasks are marked as :ref:`cacheable <features-task_cache>`.
Once the failure has been fixed and the workflow is relaunched, the cached tasks are bypassed quickly.

How to use it
-------------

Set the ``on_failure`` attribute on ``workflow_class``.

.. code:: python

    from flytekit.models.core.workflow import WorkflowMetadata
    from flytekit.sdk.workflow import workflow_class

    # Keep running every node that can still execute after a failure.
    @workflow_class(on_failure=WorkflowMetadata.OnFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE)
    class RunToCompletionWF(object):
        pass

Available values in the policy:

.. code:: python

    # Assumed import for the _core_workflow alias (flyteidl's generated protobuf module).
    from flyteidl.core import workflow_pb2 as _core_workflow

    class OnFailurePolicy(object):
        """
        Defines the execution behavior of the workflow when a failure is detected.

        Attributes:
            FAIL_IMMEDIATELY                        Instructs the system to fail as soon as a node fails in the
                                                    workflow. It'll automatically abort all currently running nodes and
                                                    clean up resources before finally marking the workflow execution as failed.
            FAIL_AFTER_EXECUTABLE_NODES_COMPLETE    Instructs the system to make as much progress as it can. The system
                                                    will not alter the dependencies of the execution graph, so any node
                                                    that depends on the failed node will not be run. Other nodes will
                                                    be executed to completion before cleaning up resources and marking
                                                    the workflow execution as failed.
        """
        FAIL_IMMEDIATELY = _core_workflow.WorkflowMetadata.FAIL_IMMEDIATELY
        FAIL_AFTER_EXECUTABLE_NODES_COMPLETE = _core_workflow.WorkflowMetadata.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE
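
To make the difference between the two policies concrete, here is a minimal, self-contained sketch that simulates them on a toy graph. It does not use flytekit or the Flyte engine: the ``OnFailurePolicy`` enum below is a local stand-in that only borrows the documented names, and ``simulate``, the node names, and the sequential execution order are all assumptions made for illustration (the real system runs nodes concurrently and aborts nodes that are already running).

```python
from enum import Enum


class OnFailurePolicy(Enum):
    # Local stand-in: names mirror the documented policy values, nothing more.
    FAIL_IMMEDIATELY = 0
    FAIL_AFTER_EXECUTABLE_NODES_COMPLETE = 1


def simulate(dag, order, failing, policy):
    """Return (succeeded, failed, skipped) node lists for a toy workflow.

    dag maps each node to the nodes it depends on, order is a topological
    order of the nodes, and failing is the set of nodes that fail when run.
    """
    succeeded, failed, skipped = [], [], []
    aborted = False
    for node in order:
        # A node cannot run if any upstream node failed or was itself skipped.
        blocked = any(dep in failed or dep in skipped for dep in dag[node])
        if aborted or blocked:
            skipped.append(node)
        elif node in failing:
            failed.append(node)
            if policy is OnFailurePolicy.FAIL_IMMEDIATELY:
                aborted = True  # nothing else is allowed to start
        else:
            succeeded.append(node)
    return succeeded, failed, skipped


# Hypothetical workflow: b and c depend on a, d depends on b, e depends on c.
dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b"], "e": ["c"]}
order = ["a", "b", "c", "d", "e"]

immediate = simulate(dag, order, {"b"}, OnFailurePolicy.FAIL_IMMEDIATELY)
run_to_completion = simulate(
    dag, order, {"b"}, OnFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE
)
print(immediate)
print(run_to_completion)
```

In this sketch, when ``b`` fails under ``FAIL_IMMEDIATELY`` nothing after it runs, whereas under ``FAIL_AFTER_EXECUTABLE_NODES_COMPLETE`` the independent branch ``c`` then ``e`` still completes and only ``d``, which depends on ``b``, is skipped. In both cases the workflow as a whole still ends up failed.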
