Skip to content

Commit

Permalink
docs(kfp): choose a workflow executor. Part of kubeflow/pipelines#5718 (
Browse files Browse the repository at this point in the history
#2857)

* docs(kfp): choose a workflow executor

* clean up

* address feedback

* add deprecation pointers from different pages
  • Loading branch information
Bobgy authored Aug 11, 2021
1 parent 8939db0 commit 2d910be
Show file tree
Hide file tree
Showing 2 changed files with 214 additions and 0 deletions.
207 changes: 207 additions & 0 deletions content/en/docs/components/pipelines/installation/choose-executor.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,207 @@
+++
title = "Argo Workflow Executors"
description = "How to choose and configure the Argo Workflow Executor?"
weight = 40
+++

An Argo workflow executor is a process that conforms to a specific interface that allows Argo to perform certain actions like monitoring pod logs, collecting artifacts, managing container lifecycles, etc.

Kubeflow Pipelines runs on [Argo Workflows](https://argoproj.github.io/workflows/) as the workflow engine, so Kubeflow Pipelines users need to choose a workflow executor.

## Choosing the Workflow Executor

1. Some users may value stability and backward compatibility. For example, if you
are running Kubeflow Pipelines in a production cluster or you maintain production
pipelines that you don't want to break or migrate.

In this case, we recommend you use [docker executor](#docker-executor) and configure your Kubernetes nodes to use docker container runtime.

However, Kubernetes is deprecating docker as a container runtime, so we recommend
starting to try out emissary and prepare for a migration when it's stable.

1. For users less concerned with stability and backwards compatibility, we
recommend trying out the new [emissary executor](#emissary-executor).

Note that Argo Workflows support other workflow executors, but the Kubeflow Pipelines
team only recommend choosing between docker executor and emissary executor.

### Docker Executor

Docker executor is the **default** workflow executor.

{{% alert title="Warning" color="warning" %}}
Docker executor depends on docker container runtime, which will be deprecated on Kubernetes 1.20+.
{{% /alert %}}

* Container Runtime: docker only. However, [Kubernetes is deprecating Docker as a container runtime after v1.20](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/).
On Google Kubernetes Engine (GKE) 1.19+, container runtime already defaults to containerd.
* Reliability: most well-tested and most popular argo workflows executor
* Security: least secure
* It requires `privileged` access to `docker.sock` of the host to be mounted which.
Often rejected by Open Policy Agent (OPA) or your Pod Security Policy (PSP).
GKE autopilot mode also rejects it, because [No privileged Pods](https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#no_privileged_pods).
* It can escape the privileges of the pod's service account.

#### Prepare a GKE cluster for Docker Executor

For GKE, the node image decides which container runtime is used. To use docker
container runtime, you need to [specify a node image](https://cloud.google.com/kubernetes-engine/docs/how-to/node-images) with Docker.

You must use one of the following node images:

* Container-Optimized OS with Docker (cos)
* Ubuntu with Docker (ubuntu)

If your nodes are not using docker as container runtime, when you run pipelines
you will always find error messages like:

> This step is in Error state with this message: failed to save outputs: Error response from daemon: No such container: XXXXXX
### Emissary Executor

{{% alpha-status feedbacklink="https://github.com/kubeflow/pipelines/issues/6249" %}}

Emissary executor is a new workflow executor. It was first released in Argo Workflows v3.1 (June 2021).
However, the Kubeflow Pipelines team believe that its architectural and portability
improvements can make it the default executor that most people should use in the
future.

Therefore, the team makes a commitment to actively collect feedback and fix bugs
for the emissary executor, so that we can stablize it faster. So far, Kubeflow
Pipelines test infrastructure has been running stably with the emissary executor.

* Container Runtime: any
* Reliability: not yet well-tested and not yet popular, but the Kubeflow Pipelines
team supports it.
* Security: more secure
* No `privileged` access.
* Cannot escape the privileges of the pod's service account.
* Migration: `command` must be specified in [Kubeflow Pipelines component specification](https://www.kubeflow.org/docs/components/pipelines/reference/component-spec/).

Note, the same migration requirement is required by [Kubeflow Pipelines v2 compatible mode](https://www.kubeflow.org/docs/components/pipelines/sdk/v2/v2-compatibility/), refer to
[known caveats & breaking changes](https://github.com/kubeflow/pipelines/issues/6133).

#### Migrate to Emissary Executor

Prerequisite: emissary executor is only available in Kubeflow Pipelines backend version 1.7+.
To upgrade, refer to [upgrading Kubeflow Pipelines](/docs/components/pipelines/upgrade/).

##### Configure an existing Kubeflow Pipelines cluster to use emissary executor

1. Install [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl).
1. Connect to your cluster via kubectl.
1. Switch to the namespace you installed Kubeflow Pipelines:

```bash
kubectl config set-context --current --namespace <your-kfp-namespace>
```

Note, usually it's `kubeflow` or `default`.
1. Confirm current workflow executor:
```bash
kubectl describe configmap workflow-controller-configmap | grep -A 2 containerRuntimeExecutor
```
You'll see output like the following when using docker executor:

```text
containerRuntimeExecutor:
----
docker
```

1. Configure workflow executor to emissary:

```bash
kubectl patch configmap workflow-controller-configmap --patch '{"data":{"containerRuntimeExecutor":"emissary"}}'
```

1. Confirm workflow executor is changed successfully:

```bash
kubectl describe configmap workflow-controller-configmap | grep -A 2 containerRuntimeExecutor
```

You'll see output like the following:
```text
containerRuntimeExecutor:
----
emissary
```
##### Deploy a new Kubeflow Pipelines cluster with emissary executor
For [AI Platform Pipelines](https://cloud.google.com/ai-platform/pipelines/docs), check the "Use emissary executor" checkbox during installation.
For [Kubeflow Pipelines Standalone](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/), install `env/platform-agnostic-emissary`:
```bash
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-emissary?ref=$PIPELINE_VERSION"
```
When in doubt, you can always deploy your Kubeflow Pipelines cluster first and
configure workflow executor after installation using the instructions for
existing clusters.
##### Migrate pipeline components to run on emissary executor
Some pipeline components require manual updates to run on emissary executor.
For [Kubeflow Pipelines component specification](https://www.kubeflow.org/docs/components/pipelines/reference/component-spec/) YAML,
the `command` field must be specified.
Step by step component migration tutorial:
1. There is a hello world component:
```yaml
name: hello-world
implementation:
container:
image: hello-world
```
1. We can run the container without command/args:
```bash
$ docker run hello-world
Hello from Docker!
...
```
1. Find out what the default ENTRYPOINT and CMD is in the image:
```bash
$ docker image inspect -f '{{.Config.Entrypoint}} {{.Config.Cmd}}' hello-world
[] [/hello]
```
So ENTRYPOINT is not specified, and CMD is ["/hello"].
Note, ENTRYPOINT roughly means `command` and CMD roughly
means `arguments`. `command` and `arguments` are concatenated as the user
command.
1. Update the component YAML:
```yaml
name: hello-world
implementation:
container:
image: hello-world
command: ["/hello"]
```
1. The updated component can run on emissary executor now.
Note: Kubeflow Pipelines SDK compiler always specifies a command for
[python function based components](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/).
Therefore, these components will continue to work on emissary executor without
modifications.
## References
* [Argo Workflow Executors documentation](https://argoproj.github.io/argo-workflows/workflow-executors/)
* KFP docker executor doesn't support Kubernetes 1.19 or above [kubeflow/pipelines#5714](https://github.com/kubeflow/pipelines/issues/5714)
* Feature request - default to emissary executor [kubeflow/pipelines#5718](https://github.com/kubeflow/pipelines/issues/5718)
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,13 @@ See the Google Kubernetes Engine (GKE) guide to
**Note**: `kubectl apply -k` accepts local paths and paths that are formatted as [hashicorp/go-getter URLs](https://github.com/kubernetes-sigs/kustomize/blob/master/examples/remoteBuild.md#url-format). While the paths in the preceding commands look like URLs, the paths are not valid URLs.
{{% alert title="Deprecation Notice" color="warning" %}}
Kubeflow Pipelines default to the docker executor, but docker executor will be
deprecated on Kubernetes 1.20+.
To prepare for the deprecation, refer to [Argo Workflow Executors](/docs/components/pipelines/installation/choose-executor)
for migration instructions.
{{% /alert %}}
1. Get the public URL for the Kubeflow Pipelines UI and use it to access the Kubeflow Pipelines UI:
```
Expand Down

0 comments on commit 2d910be

Please sign in to comment.