diff --git a/README.md b/README.md index d708bef88..6fac6f228 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,15 @@ # Kubeflow Spark Operator + [![Go Report Card](https://goreportcard.com/badge/github.com/kubeflow/spark-operator)](https://goreportcard.com/report/github.com/kubeflow/spark-operator) -## Overview +## What is Kubeflow Spark Operator? + The Kubernetes Operator for Apache Spark aims to make specifying and running [Spark](https://github.com/apache/spark) applications as easy and idiomatic as running other workloads on Kubernetes. It uses -[Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) -for specifying, running, and surfacing status of Spark applications. For a complete reference of the custom resource definitions, please refer to the [API Definition](docs/api-docs.md). For details on its design, please refer to the [design doc](docs/design.md). It requires Spark 2.3 and above that supports Kubernetes as a native scheduler backend. +[Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) for specifying, running, and surfacing status of Spark applications. + +## Overview + + For a complete reference of the custom resource definitions, please refer to the [API Definition](docs/api-docs.md). For details on its design, please refer to the [design doc](docs/design.md). It requires Spark 2.3 or above, which supports Kubernetes as a native scheduler backend. The Kubernetes Operator for Apache Spark currently supports the following list of features: @@ -36,61 +41,41 @@ Customization of Spark pods, e.g., mounting arbitrary volumes and setting pod af * Version >= 1.16 of Kubernetes to use the `MutatingWebhook` and `ValidatingWebhook` of `apiVersion: admissionregistration.k8s.io/v1`. -## Installation - -The easiest way to install the Kubernetes Operator for Apache Spark is to use the Helm [chart](charts/spark-operator-chart/). +## Getting Started -```bash -$ helm repo add spark-operator https://kubeflow.github.io/spark-operator +To get started with the Spark operator, please refer to [Getting Started](https://www.kubeflow.org/docs/components/spark-operator/getting-started/). -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --create-namespace -``` +## User Guide -This will install the Kubernetes Operator for Apache Spark into the namespace `spark-operator`. The operator by default watches and handles `SparkApplication`s in every namespaces. If you would like to limit the operator to watch and handle `SparkApplication`s in a single namespace, e.g., `default` instead, add the following option to the `helm install` command: +For a detailed user guide, please refer to the [User Guide](https://www.kubeflow.org/docs/components/spark-operator/user-guide/). -``` ---set "sparkJobNamespaces={default}" -``` +For API documentation, please refer to the [API Specification](docs/api-docs.md). -For configuration options available in the Helm chart, please refer to the chart's [README](charts/spark-operator-chart/README.md). +If you are running the Spark operator on Google Kubernetes Engine (GKE) and want to use Google Cloud Storage (GCS) and/or BigQuery for reading/writing data, also refer to the [GCP guide](https://www.kubeflow.org/docs/components/spark-operator/user-guide/gcp/). ## Version Matrix The following table lists the most recent few versions of the operator. 
-| Operator Version | API Version | Kubernetes Version | Base Spark Version | Operator Image Tag | -| ------------- | ------------- | ------------- | ------------- | ------------- | -| `latest` (master HEAD) | `v1beta2` | 1.13+ | `3.0.0` | `latest` | -| `v1beta2-1.3.3-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` | `v1beta2-1.3.3-3.1.1` | -| `v1beta2-1.3.2-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` | `v1beta2-1.3.2-3.1.1` | -| `v1beta2-1.3.0-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` | `v1beta2-1.3.0-3.1.1` | -| `v1beta2-1.2.3-3.1.1` | `v1beta2` | 1.13+ | `3.1.1` | `v1beta2-1.2.3-3.1.1` | -| `v1beta2-1.2.0-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` | `v1beta2-1.2.0-3.0.0` | -| `v1beta2-1.1.2-2.4.5` | `v1beta2` | 1.13+ | `2.4.5` | `v1beta2-1.1.2-2.4.5` | -| `v1beta2-1.0.1-2.4.4` | `v1beta2` | 1.13+ | `2.4.4` | `v1beta2-1.0.1-2.4.4` | -| `v1beta2-1.0.0-2.4.4` | `v1beta2` | 1.13+ | `2.4.4` | `v1beta2-1.0.0-2.4.4` | -| `v1beta1-0.9.0` | `v1beta1` | 1.13+ | `2.4.0` | `v2.4.0-v1beta1-0.9.0` | - -When installing using the Helm chart, you can choose to use a specific image tag instead of the default one, using the following option: - -``` ---set image.tag= -``` - -## Get Started - -Get started quickly with the Kubernetes Operator for Apache Spark using the [Quick Start Guide](docs/quick-start-guide.md). - -If you are running the Kubernetes Operator for Apache Spark on Google Kubernetes Engine and want to use Google Cloud Storage (GCS) and/or BigQuery for reading/writing data, also refer to the [GCP guide](docs/gcp.md). - -For more information, check the [Design](docs/design.md), [API Specification](docs/api-docs.md) and detailed [User Guide](docs/user-guide.md). +| Operator Version | API Version | Kubernetes Version | Base Spark Version | +| ------------- | ------------- | ------------- | ------------- | +| `v1beta2-1.6.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` | +| `v1beta2-1.5.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` | +| `v1beta2-1.4.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` | +| `v1beta2-1.3.x-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` | +| `v1beta2-1.2.3-3.1.1` | `v1beta2` | 1.13+ | `3.1.1` | +| `v1beta2-1.2.2-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` | +| `v1beta2-1.2.1-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` | +| `v1beta2-1.2.0-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` | +| `v1beta2-1.1.x-2.4.5` | `v1beta2` | 1.13+ | `2.4.5` | +| `v1beta2-1.0.x-2.4.4` | `v1beta2` | 1.13+ | `2.4.4` | ## Contributing -Please check [CONTRIBUTING.md](CONTRIBUTING.md) and the [Developer Guide](docs/developer-guide.md) out. +For contributing, please refer to [CONTRIBUTING.md](CONTRIBUTING.md) and [Developer Guide](https://www.kubeflow.org/docs/components/spark-operator/developer-guide/). ## Community -* Join the [CNCF Slack Channel](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels) and then join ```#kubeflow-spark-operator``` Channel. +* Join the [CNCF Slack Channel](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels) and then join `#kubeflow-spark-operator` Channel. * Check out our blog post [Announcing the Kubeflow Spark Operator: Building a Stronger Spark on Kubernetes Community](https://blog.kubeflow.org/operators/2024/04/15/kubeflow-spark-operator.html) -* Check out [who is using the Kubernetes Operator for Apache Spark](docs/who-is-using.md). \ No newline at end of file +* Check out [who is using the Spark Operator](docs/adopters.md). 
diff --git a/docs/who-is-using.md b/docs/adopters.md similarity index 91% rename from docs/who-is-using.md rename to docs/adopters.md index 04ed6cf75..bf7df2a03 100644 --- a/docs/who-is-using.md +++ b/docs/adopters.md @@ -1,48 +1,50 @@ -## Who Is Using the Kubernetes Operator for Apache Spark? +# Adopters of Kubeflow Spark Operator + +Below are the adopters of project Spark Operator. If you are using Spark Operator please add yourself into the following list by a pull request. Please keep the list in alphabetical order. | Organization | Contact (GitHub User Name) | Environment | Description of Use | | ------------- | ------------- | ------------- | ------------- | -| [Caicloud](https://intl.caicloud.io/) | @gaocegege | Production | Cloud-Native AI Platform | -| Microsoft (MileIQ) | @dharmeshkakadia | Production | AI & Analytics | -| Lightbend | @yuchaoran2011 | Production | Data Infrastructure & Operations | -| StackTome | @emiliauskas-fuzzy | Production | Data pipelines | -| Salesforce | @khogeland | Production | Data transformation | +| [Beeline](https://beeline.ru) | @spestua | Evaluation | ML & Data Infrastructure | | Bringg | @EladDolev | Production | ML & Analytics Data Platform | -| [Siigo](https://www.siigo.com) | @Juandavi1 | Production | Data Migrations & Analytics Data Platform | +| [Caicloud](https://intl.caicloud.io/) | @gaocegege | Production | Cloud-Native AI Platform | +| Carrefour | @AliGouta | Production | Data Platform | | CERN|@mrow4a| Evaluation | Data Mining & Analytics | -| Lyft |@kumare3| Evaluation | ML & Data Infrastructure | -| MapR Technologies |@sarjeet2013| Evaluation | ML/AI & Analytics Data Platform | -| Uber| @chenqin| Evaluation| Spark / ML | -| HashmapInc| @prem0132 | Evaluation | Analytics Data Platform | -| Tencent | @runzhliu | Evaluation | ML Analytics Platform | -| Exacaster | @minutis | Evaluation | Data pipelines | -| Riskified | @henbh | Evaluation | Analytics Data Platform | +| [CloudPhysics](https://www.cloudphysics.com) | @jkleckner | Production | ML/AI & Analytics | | CloudZone | @iftachsc | Evaluation | Big Data Analytics Consultancy | | Cyren | @avnerl | Evaluation | Data pipelines | -| Shell (Agile Hub) | @TomLous | Production | Data pipelines | -| Nielsen Identity Engine | @roitvt | Evaluation | Data pipelines | +| [C2FO](https://www.c2fo.com/) | @vanhoale | Production | Data Platform / Data Infrastructure | | [Data Mechanics](https://www.datamechanics.co) | @jrj-d | Production | Managed Spark Platform | -| [PUBG](https://careers.pubg.com/#/en/) | @jacobhjkim | Production | ML & Data Infrastructure | -| [Beeline](https://beeline.ru) | @spestua | Evaluation | ML & Data Infrastructure | -| [Stitch Fix](https://multithreaded.stitchfix.com/) | @nssalian | Evaluation | Data pipelines | -| [Typeform](https://typeform.com/) | @afranzi | Production | Data & ML pipelines | -| incrmntal(https://incrmntal.com/) | @scravy | Production | ML & Data Infrastructure | -| [CloudPhysics](https://www.cloudphysics.com) | @jkleckner | Production | ML/AI & Analytics | -| [MongoDB](https://www.mongodb.com) | @chickenpopcorn | Production | Data Infrastructure | -| [MavenCode](https://www.mavencode.com) | @charlesa101 | Production | MLOps & Data Infrastructure | -| [Gojek](https://www.gojek.io/) | @pradithya | Production | Machine Learning Platform | -| Fossil | @duyet | Production | Data Platform | -| Carrefour | @AliGouta | Production | Data Platform | -| Scaling Smart | @tarek-izemrane | Evaluation | Data Platform | -| [Tongdun](https://www.tongdun.net/) | 
@lomoJG | Production | AI/ML & Analytics | -| [Totvs Labs](https://www.totvslabs.com) | @luizm | Production | Data Platform | -| [DiDi](https://www.didiglobal.com) | @Run-Lin | Evaluation | Data Infrastructure | | [DeepCure](https://www.deepcure.ai) | @mschroering | Production | Spark / ML | -| [C2FO](https://www.c2fo.com/) | @vanhoale | Production | Data Platform / Data Infrastructure | -| [Timo](https://timo.vn) | @vanducng | Production | Data Platform | +| [DiDi](https://www.didiglobal.com) | @Run-Lin | Evaluation | Data Infrastructure | +| Exacaster | @minutis | Evaluation | Data pipelines | +| Fossil | @duyet | Production | Data Platform | +| [Gojek](https://www.gojek.io/) | @pradithya | Production | Machine Learning Platform | +| HashmapInc| @prem0132 | Evaluation | Analytics Data Platform | +| [incrmntal](https://incrmntal.com/) | @scravy | Production | ML & Data Infrastructure | +| [Inter&Co](https://inter.co/) | @ignitz | Production | Data pipelines | | [Kognita](https://kognita.com.br/) | @andreclaudino | Production | MLOps, Data Platform / Data Infrastructure, ML/AI | +| Lightbend | @yuchaoran2011 | Production | Data Infrastructure & Operations | +| Lyft |@kumare3| Evaluation | ML & Data Infrastructure | +| MapR Technologies |@sarjeet2013| Evaluation | ML/AI & Analytics Data Platform | +| [MavenCode](https://www.mavencode.com) | @charlesa101 | Production | MLOps & Data Infrastructure | +| Microsoft (MileIQ) | @dharmeshkakadia | Production | AI & Analytics | | [Molex](https://www.molex.com/) | @AshishPushpSingh | Evaluation/Production | Data Platform | +| [MongoDB](https://www.mongodb.com) | @chickenpopcorn | Production | Data Infrastructure | +| Nielsen Identity Engine | @roitvt | Evaluation | Data pipelines | +| [PUBG](https://careers.pubg.com/#/en/) | @jacobhjkim | Production | ML & Data Infrastructure | | [Qualytics](https://www.qualytics.co/) | @josecsotomorales | Production | Data Quality Platform | +| Riskified | @henbh | Evaluation | Analytics Data Platform | | [Roblox](https://www.roblox.com/) | @matschaffer-roblox | Evaluation | Data Infrastructure | | [Rokt](https://www.rokt.com) | @jacobsalway | Production | Data Infrastructure | -| [Inter&Co](https://inter.co/) | @ignitz | Production | Data pipelines | +| Salesforce | @khogeland | Production | Data transformation | +| Scaling Smart | @tarek-izemrane | Evaluation | Data Platform | +| Shell (Agile Hub) | @TomLous | Production | Data pipelines | +| [Siigo](https://www.siigo.com) | @Juandavi1 | Production | Data Migrations & Analytics Data Platform | +| StackTome | @emiliauskas-fuzzy | Production | Data pipelines | +| [Stitch Fix](https://multithreaded.stitchfix.com/) | @nssalian | Evaluation | Data pipelines | +| Tencent | @runzhliu | Evaluation | ML Analytics Platform | +| [Timo](https://timo.vn) | @vanducng | Production | Data Platform | +| [Tongdun](https://www.tongdun.net/) | @lomoJG | Production | AI/ML & Analytics | +| [Totvs Labs](https://www.totvslabs.com) | @luizm | Production | Data Platform | +| [Typeform](https://typeform.com/) | @afranzi | Production | Data & ML pipelines | +| Uber| @chenqin| Evaluation| Spark / ML | diff --git a/docs/developer-guide.md b/docs/developer-guide.md index 846762ea2..edaa176cc 100644 --- a/docs/developer-guide.md +++ b/docs/developer-guide.md @@ -27,7 +27,7 @@ pre-commit install-hooks In case you want to build the operator from the source code, e.g., to test a fix or a feature you write, you can do so following the instructions below. 
-The easiest way to build the operator without worrying about its dependencies is to just build an image using the [Dockerfile](../Dockerfile). +The easiest way to build the operator without worrying about its dependencies is to just build an image using the [Dockerfile](https://github.com/kubeflow/spark-operator/Dockerfile). ```bash docker build -t . @@ -39,8 +39,6 @@ The operator image is built upon a base Spark image that defaults to `spark:3.5. docker build --build-arg SPARK_IMAGE= -t . ``` -If you want to use the operator on OpenShift clusters, first make sure you have Docker version 18.09.3 or above, then build your operator image using the [OpenShift-specific Dockerfile](../Dockerfile.rh). - ```bash export DOCKER_BUILDKIT=1 docker build -t -f Dockerfile.rh . diff --git a/docs/quick-start-guide.md b/docs/getting-started.md similarity index 65% rename from docs/quick-start-guide.md rename to docs/getting-started.md index e7045c082..241a7e52a 100644 --- a/docs/quick-start-guide.md +++ b/docs/getting-started.md @@ -1,145 +1,102 @@ -# Quick Start Guide +# Getting Started For a more detailed guide on how to use, compose, and work with `SparkApplication`s, please refer to the -[User Guide](user-guide.md). If you are running the Kubernetes Operator for Apache Spark on Google Kubernetes Engine and want to use Google Cloud Storage (GCS) and/or BigQuery for reading/writing data, also refer to the [GCP guide](gcp.md). The Kubernetes Operator for Apache Spark will simply be referred to as the operator for the rest of this guide. - -## Table of Contents -- [Quick Start Guide](#quick-start-guide) - - [Table of Contents](#table-of-contents) - - [Installation](#installation) - - [Running the Examples](#running-the-examples) - - [Configuration](#configuration) - - [Upgrade](#upgrade) - - [About the Spark Job Namespace](#about-the-spark-job-namespace) - - [About the Service Account for Driver Pods](#about-the-service-account-for-driver-pods) - - [About the Service Account for Executor Pods](#about-the-service-account-for-executor-pods) - - [Enable Metric Exporting to Prometheus](#enable-metric-exporting-to-prometheus) - - [Spark Application Metrics](#spark-application-metrics) - - [Work Queue Metrics](#work-queue-metrics) - - [Driver UI Access and Ingress](#driver-ui-access-and-ingress) - - [About the Mutating Admission Webhook](#about-the-mutating-admission-webhook) - - [Mutating Admission Webhooks on a private GKE or EKS cluster](#mutating-admission-webhooks-on-a-private-gke-or-eks-cluster) +User Guide. If you are running the Kubernetes Operator for Apache Spark on Google Kubernetes Engine and want to use Google Cloud Storage (GCS) and/or BigQuery for reading/writing data, also refer to the [GCP guide](user-guide/gcp.md). The Kubernetes Operator for Apache Spark will simply be referred to as the operator for the rest of this guide. + +## Prerequisites + +- Helm >= 3 +- Kubernetes >= 1.16 ## Installation -To install the operator, use the Helm [chart](../charts/spark-operator-chart). +### Add Helm Repo -```bash -$ helm repo add spark-operator https://kubeflow.github.io/spark-operator +```shell +helm repo add spark-operator https://kubeflow.github.io/spark-operator -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --create-namespace +helm repo update ``` -Installing the chart will create a namespace `spark-operator` if it doesn't exist, and helm will set up RBAC for the operator to run in the namespace. 
It will also set up RBAC in the `default` namespace for driver pods of your Spark applications to be able to manipulate executor pods. In addition, the chart will create a Deployment in the namespace `spark-operator`. The chart's [Spark Job Namespace](#about-the-spark-job-namespace) is set to `release namespace` by default. The chart by default does not enable [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) for Spark pod customization. When enabled, a webhook service and a secret storing the x509 certificate called `spark-webhook-certs` are created for that purpose. To install the operator **with** the mutating admission webhook on a Kubernetes cluster, install the chart with the flag `webhook.enable=true`: +See [helm repo](https://helm.sh/docs/helm/helm_repo) for command documentation. -```bash -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --set webhook.enable=true +### Install the chart + +```shell +helm install [RELEASE_NAME] spark-operator/spark-operator ``` -Due to a [known issue](https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control#defining_permissions_in_a_role) in GKE, you will need to first grant yourself cluster-admin privileges before you can create custom roles and role bindings on a GKE cluster versioned 1.6 and up. Run the following command before installing the chart on GKE: +For example, if you want to create a release with name `spark-operator` in the `spark-operator` namespace: -```bash -$ kubectl create clusterrolebinding -cluster-admin-binding --clusterrole=cluster-admin --user=@ +```shell +helm install spark-operator spark-operator/spark-operator \ + --namespace spark-operator \ + --create-namespace ``` -Now you should see the operator running in the cluster by checking the status of the Helm release. +See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation. -```bash -$ helm status --namespace spark-operator my-release +Installing the chart will create a namespace `spark-operator` if it doesn't exist, and helm will set up RBAC for the operator to run in the namespace. It will also set up RBAC in the `default` namespace for driver pods of your Spark applications to be able to manipulate executor pods. In addition, the chart will create a Deployment in the namespace `spark-operator`. The chart by default does not enable [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) for Spark pod customization. When enabled, a webhook service and a secret storing the x509 certificate called `spark-webhook-certs` are created for that purpose. To install the operator with the mutating admission webhook on a Kubernetes cluster, install the chart with the flag `webhook.enable=true`: + +```shell +helm install my-release spark-operator/spark-operator \ + --namespace spark-operator \ + --create-namespace \ + --set webhook.enable=true ``` -### Installation using kustomize -You can also install `spark-operator` using [kustomize](https://github.com/kubernetes-sigs/kustomize). Run +If you want to deploy the chart to GKE cluster, you will first need to [grant yourself cluster-admin privileges](https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control#defining_permissions_in_a_role) before you can create custom roles and role bindings on a GKE cluster versioned 1.6 and up. 
Run the following command before installing the chart on GKE: +```shell +kubectl create clusterrolebinding -cluster-admin-binding --clusterrole=cluster-admin --user=@ ``` -kubectl apply -k {manifest_directory} -``` -Kustomize default manifest directory is part of the repo [here](https://github.com/kubeflow/spark-operator/tree/master/manifest/spark-operator-with-webhook-install) - -The manifest directory contains primarily the `crds` and `spark-operator-with-webhook.yaml` which holds configurations of spark operator init job, a webhook service and finally a deployemnt. -Spark operator with above manifest installs `spark-operator` in default namespace `spark-operator` with default webhook service `spark-webhook`. If you wish to install `spark-operator` in a namespace other than `spark-opertor` and webhook service name other than `spark-webhook`, `Job` manifest in `spark-operator-with-webhook.yaml` should look like below. You need to pass the desired namespace name and service name as arguements in `command` field in `containers`. +Now you should see the operator running in the cluster by checking the status of the Helm release. +```shell +helm status --namespace spark-operator my-release ``` -apiVersion: batch/v1 -kind: Job -metadata: - name: sparkoperator-init - namespace: myorg-spark-operator - labels: - app.kubernetes.io/name: sparkoperator - app.kubernetes.io/version: v2.4.0-v1beta1 -spec: - backoffLimit: 3 - template: - metadata: - labels: - app.kubernetes.io/name: sparkoperator - app.kubernetes.io/version: v2.4.0-v1beta1 - spec: - serviceAccountName: sparkoperator - restartPolicy: Never - containers: - - name: main - image: gcr.io/spark-operator/spark-operator:v2.4.0-v1beta1-latest - imagePullPolicy: IfNotPresent - command: ["/usr/bin/gencerts.sh", "-p", "--namespace", "myorg-spark-operator", "--service", "myorg-spark-webhook"] -``` -And Service will be -``` -kind: Service -apiVersion: v1 -metadata: - name: myorg-spark-webhook -... +### Upgrade the Chart + +```shell +helm upgrade [RELEASE_NAME] spark-operator/spark-operator [flags] ``` -And `args` in `Deployement` will look like: +See [helm upgrade](https://helm.sh/docs/helm/helm_upgrade) for command documentation. -``` -apiVersion: apps/v1 -kind: Deployment -metadata: - name: sparkoperator -... - - args: - - -logtostderr - - -enable-webhook=true - - -v=2 - - webhook-svc-namespace=myorg-spark-operator - - webhook-svc-name=myorg-spark-webhook +### Uninstall the Chart + +```shell +helm uninstall [RELEASE_NAME] ``` -This will install `spark-operator` in `myorg-spark-operator` namespace and the webhook service will be called `myorg-spark-webhook`. +This removes all the Kubernetes resources associated with the chart and deletes the release, except for the `crds`, those will have to be removed manually. + +See [helm uninstall](https://helm.sh/docs/helm/helm_uninstall) for command documentation. -To unintall operator, run -``` -kustomize build '{manifest_directory}' | kubectl delete -f - -``` ## Running the Examples -To run the Spark Pi example, run the following command: +To run the Spark PI example, run the following command: -```bash -$ kubectl apply -f examples/spark-pi.yaml +```shell +kubectl apply -f examples/spark-pi.yaml ``` Note that `spark-pi.yaml` configures the driver pod to use the `spark` service account to communicate with the Kubernetes API server. You might need to replace it with the appropriate service account before submitting the job. 
If you installed the operator using the Helm chart and overrode `sparkJobNamespaces`, the service account name ends with `-spark` and starts with the Helm release name. For example, if you would like to run your Spark jobs to run in a namespace called `test-ns`, first make sure it already exists, and then install the chart with the command: -```bash -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --set "sparkJobNamespaces={test-ns}" +```shell +helm install my-release spark-operator/spark-operator --namespace spark-operator --set "sparkJobNamespaces={test-ns}" ``` Then the chart will set up a service account for your Spark jobs to use in that namespace. -See the section on the [Spark Job Namespace](#about-the-spark-job-namespace) for details on the behavior of the default Spark Job Namespace. +See the section on the [Spark Job Namespace](#about-spark-job-namespaces) for details on the behavior of the default Spark Job Namespace. Running the above command will create a `SparkApplication` object named `spark-pi`. Check the object by running the following command: -```bash -$ kubectl get sparkapplications spark-pi -o=yaml +```shell +kubectl get sparkapplication spark-pi -o=yaml ``` This will show something similar to the following: @@ -194,13 +151,13 @@ status: To check events for the `SparkApplication` object, run the following command: -```bash -$ kubectl describe sparkapplication spark-pi +```shell +kubectl describe sparkapplication spark-pi ``` This will show the events similarly to the following: -``` +```text Events: Type Reason Age From Message ---- ------ ---- ---- ------- @@ -226,17 +183,17 @@ By default, the operator will manage custom resource objects of the managed CRD ## Upgrade -To upgrade the the operator, e.g., to use a newer version container image with a new tag, run the following command with updated parameters for the Helm release: +To upgrade the operator, e.g., to use a newer version container image with a new tag, run the following command with updated parameters for the Helm release: -```bash -$ helm upgrade --set image.repository=org/image --set image.tag=newTag +```shell +helm upgrade --set image.repository=org/image --set image.tag=newTag ``` Refer to the Helm [documentation](https://helm.sh/docs/helm/helm_upgrade/) for more details on `helm upgrade`. ## About Spark Job Namespaces -The Spark Job Namespaces value defines the namespaces where `SparkApplications` can be deployed. The Helm chart value for the Spark Job Namespaces is `sparkJobNamespaces`, and its default value is `[]`. As defined in the Helm chart's [README](../charts/spark-operator-chart/README.md), when the list of namespaces is empty the Helm chart will create a service account in the namespace where the spark-operator is deployed. +The Spark Job Namespaces value defines the namespaces where `SparkApplications` can be deployed. The Helm chart value for the Spark Job Namespaces is `sparkJobNamespaces`, and its default value is `[]`. When the list of namespaces is empty the Helm chart will create a service account in the namespace where the spark-operator is deployed. If you installed the operator using the Helm chart and overrode the `sparkJobNamespaces` to some other, pre-existing namespace, the Helm chart will create the necessary service account and RBAC in the specified namespace. 
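For illustration, the following is a minimal sketch of a `SparkApplication` running in such an overridden job namespace, with the driver pointed at the service account the chart creates there. The namespace `test-ns` and release name `my-release` are just the examples used above; the image and jar path are illustrative and should match your Spark version.

```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: test-ns                     # must be listed in sparkJobNamespaces
spec:
  type: Scala
  mode: cluster
  image: spark:3.5.0                     # illustrative; use the Spark image matching your setup
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
  sparkVersion: "3.5.0"
  driver:
    cores: 1
    memory: 512m
    serviceAccount: my-release-spark     # "<release name>-spark", created by the chart
  executor:
    instances: 1
    cores: 1
    memory: 512m
  restartPolicy:
    type: Never
```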
@@ -244,7 +201,7 @@ The Spark Operator uses the Spark Job Namespace to identify and filter relevant ## About the Service Account for Driver Pods -A Spark driver pod need a Kubernetes service account in the pod's namespace that has permissions to create, get, list, and delete executor pods, and create a Kubernetes headless service for the driver. The driver will fail and exit without the service account, unless the default service account in the pod's namespace has the needed permissions. To submit and run a `SparkApplication` in a namespace, please make sure there is a service account with the permissions in the namespace and set `.spec.driver.serviceAccount` to the name of the service account. Please refer to [spark-rbac.yaml](../manifest/spark-rbac.yaml) for an example RBAC setup that creates a driver service account named `spark` in the `default` namespace, with a RBAC role binding giving the service account the needed permissions. +A Spark driver pod need a Kubernetes service account in the pod's namespace that has permissions to create, get, list, and delete executor pods, and create a Kubernetes headless service for the driver. The driver will fail and exit without the service account, unless the default service account in the pod's namespace has the needed permissions. To submit and run a `SparkApplication` in a namespace, please make sure there is a service account with the permissions in the namespace and set `.spec.driver.serviceAccount` to the name of the service account. Please refer to [spark-rbac.yaml](https://github.com/kubeflow/spark-operator/blob/master/manifest/spark-application-rbac/spark-application-rbac.yaml) for an example RBAC setup that creates a driver service account named `spark` in the `default` namespace, with a RBAC role binding giving the service account the needed permissions. ## About the Service Account for Executor Pods @@ -254,13 +211,17 @@ A Spark executor pod may be configured with a Kubernetes service account in the The operator exposes a set of metrics via the metric endpoint to be scraped by `Prometheus`. The Helm chart by default installs the operator with the additional flag to enable metrics (`-enable-metrics=true`) as well as other annotations used by Prometheus to scrape the metric endpoint. If `podMonitor.enable` is enabled, the helm chart will submit a pod monitor for the operator's pod. To install the operator **without** metrics enabled, pass the appropriate flag during `helm install`: -```bash -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --set metrics.enable=false +```shell +helm install my-release spark-operator/spark-operator \ + --namespace spark-operator \ + --create-namespace \ + --set metrics.enable=false ``` If enabled, the operator generates the following metrics: -#### Spark Application Metrics +### Spark Application Metrics + | Metric | Description | | ------------- | ------------- | | `spark_app_count` | Total number of SparkApplication handled by the Operator.| @@ -277,6 +238,7 @@ If enabled, the operator generates the following metrics: | `spark_app_executor_running_count` | Total number of Spark Executors which are currently running. 
| #### Work Queue Metrics + | Metric | Description | | ------------- | ------------- | | `spark_application_controller_depth` | Current depth of workqueue | @@ -287,10 +249,9 @@ If enabled, the operator generates the following metrics: | `spark_application_controller_unfinished_work_seconds` | Unfinished work in seconds | | `spark_application_controller_longest_running_processor_microseconds` | Longest running processor in microseconds | - The following is a list of all the configurations the operators supports for metrics: -```bash +```shell -enable-metrics=true -metrics-port=10254 -metrics-endpoint=/metrics @@ -298,12 +259,12 @@ The following is a list of all the configurations the operators supports for met -metrics-label=label1Key -metrics-label=label2Key ``` + All configs except `-enable-metrics` are optional. If port and/or endpoint are specified, please ensure that the annotations `prometheus.io/port`, `prometheus.io/path` and `containerPort` in `spark-operator-with-metrics.yaml` are updated as well. -A note about `metrics-labels`: In `Prometheus`, every unique combination of key-value label pair represents a new time series, which can dramatically increase the amount of data stored. Hence labels should not be used to store dimensions with high cardinality with potentially a large or unbounded value range. +A note about `metrics-labels`: In `Prometheus`, every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Hence, labels should not be used to store dimensions with high cardinality with potentially a large or unbounded value range. -Additionally, these metrics are best-effort for the current operator run and will be reset on an operator restart. Also some of these metrics are generated by listening to pod state updates for the driver/executors -and deleting the pods outside the operator might lead to incorrect metric values for some of these metrics. +Additionally, these metrics are best-effort for the current operator run and will be reset on an operator restart. Also, some of these metrics are generated by listening to pod state updates for the driver/executors and deleting the pods outside the operator might lead to incorrect metric values for some of these metrics. ## Driver UI Access and Ingress @@ -331,8 +292,8 @@ The Kubernetes Operator for Spark ships with a tool at `hack/gencerts.sh` for ge Run the following command to create the secret with a certificate and key files using a batch Job, and install the operator Deployment with the mutating admission webhook: -```bash -$ kubectl apply -f manifest/spark-operator-with-webhook.yaml +```shell +kubectl apply -f manifest/spark-operator-with-webhook.yaml ``` This will create a Deployment named `sparkoperator` and a Service named `spark-webhook` for the webhook in namespace `spark-operator`. @@ -342,10 +303,15 @@ This will create a Deployment named `sparkoperator` and a Service named `spark-w If you are deploying the operator on a GKE cluster with the [Private cluster](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters) setting enabled, or on an enterprise AWS EKS cluster and you wish to deploy the cluster with the [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/), then make sure to change the `webhookPort` to `443`. Alternatively you can choose to allow connections to the default port (8080). 
> By default, firewall rules restrict your cluster master to only initiate TCP connections to your nodes on ports 443 (HTTPS) and 10250 (kubelet). For some Kubernetes features, you might need to add firewall rules to allow access on additional ports. For example, in Kubernetes 1.9 and older, kubectl top accesses heapster, which needs a firewall rule to allow TCP connections on port 8080. To grant such access, you can add firewall rules. -[From the docs](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules) +For GCP, refer to [this link](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules) To install the operator with a custom port, pass the appropriate flag during `helm install`: -```bash -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --set "sparkJobNamespaces={spark}" --set webhook.enable=true --set webhook.port=443 +```shell +helm install my-release spark-operator/spark-operator \ + --namespace spark-operator \ + --create-namespace \ + --set "sparkJobNamespaces={spark}" \ + --set webhook.enable=true \ + --set webhook.port=443 ``` diff --git a/docs/architecture-diagram.png b/docs/overview/architecture-diagram.png similarity index 100% rename from docs/architecture-diagram.png rename to docs/overview/architecture-diagram.png diff --git a/docs/design.md b/docs/overview/index.md similarity index 62% rename from docs/design.md rename to docs/overview/index.md index 737e8bd66..2a852ef6c 100644 --- a/docs/design.md +++ b/docs/overview/index.md @@ -1,54 +1,65 @@ -# Kubernetes Operator for Apache Spark Design +# An overview for Spark Operator -## Table of Contents -* [Introduction](#introduction) -* [Architecture](#architecture) -* [The CRD Controller](#the-crd-controller) -* [Handling Application Restart and Failures](#handling-application-restart-and-failures) -* [Mutating Admission Webhook](#mutating-admission-webhook) -* [Command-line Tool: Sparkctl](#command-line-tool-sparkctl) +## What is Kubeflow Spark Operator? + +The Kubernetes Operator for Apache Spark aims to make specifying and running Spark applications as easy and idiomatic as running other workloads on Kubernetes. It uses Kubernetes custom resources for specifying, running, and surfacing status of Spark applications. ## Introduction In Spark 2.3, Kubernetes becomes an official scheduler backend for Spark, additionally to the standalone scheduler, Mesos, and Yarn. Compared with the alternative approach of deploying a standalone Spark cluster on top of Kubernetes and submit applications to run on the standalone cluster, having Kubernetes as a native scheduler backend offers some important benefits as discussed in [SPARK-18278](https://issues.apache.org/jira/browse/SPARK-18278) and is a huge leap forward. However, the way life cycle of Spark applications are managed, e.g., how applications get submitted to run on Kubernetes and how application status is tracked, are vastly different from that of other types of workloads on Kubernetes, e.g., Deployments, DaemonSets, and StatefulSets. The Kubernetes Operator for Apache Spark reduces the gap and allow Spark applications to be specified, run, and monitored idiomatically on Kubernetes. -Specifically, the Kubernetes Operator for Apache Spark follows the recent trend of leveraging the [operator](https://coreos.com/blog/introducing-operators.html) pattern for managing the life cycle of Spark applications on a Kubernetes cluster. 
The operator allows Spark applications to be specified in a declarative manner (e.g., in a YAML file) and run without the need to deal with the spark submission process. It also enables status of Spark applications to be tracked and presented idiomatically like other types of workloads on Kubernetes. This document discusses the design and architecture of the operator. For documentation of the [CustomResourceDefinition](https://kubernetes.io/docs/concepts/api-extension/custom-resources/) for specification of Spark applications, please refer to [API Definition](api-docs.md) +Specifically, the Kubernetes Operator for Apache Spark follows the recent trend of leveraging the [operator](https://coreos.com/blog/introducing-operators.html) pattern for managing the life cycle of Spark applications on a Kubernetes cluster. The operator allows Spark applications to be specified in a declarative manner (e.g., in a YAML file) and run without the need to deal with the spark submission process. It also enables status of Spark applications to be tracked and presented idiomatically like other types of workloads on Kubernetes. This document discusses the design and architecture of the operator. For documentation of the [CustomResourceDefinition](https://kubernetes.io/docs/concepts/api-extension/custom-resources/) for specification of Spark applications, please refer to [API Definition](https://github.com/kubeflow/spark-operator/blob/master/docs/api-docs.md). + +The Kubernetes Operator for Apache Spark currently supports the following list of features: + +- Supports Spark 2.3 and up. +- Enables declarative application specification and management of applications through custom resources. +- Automatically runs `spark-submit` on behalf of users for each `SparkApplication` eligible for submission. +- Provides native cron support for running scheduled applications. +- Supports customization of Spark pods beyond what Spark natively is able to do through the mutating admission webhook, e.g., mounting ConfigMaps and volumes, and setting pod affinity/anti-affinity. +- Supports automatic application re-submission for updated `SparkApplication` objects with updated specification. +- Supports automatic application restart with a configurable restart policy. +- Supports automatic retries of failed submissions with optional linear back-off. +- Supports mounting local Hadoop configuration as a Kubernetes ConfigMap automatically via `sparkctl`. +- Supports automatically staging local application dependencies to Google Cloud Storage (GCS) via `sparkctl`. +- Supports collecting and exporting application-level metrics and driver/executor metrics to Prometheus. ## Architecture The operator consists of: -* a `SparkApplication` controller that watches events of creation, updates, and deletion of + +- a `SparkApplication` controller that watches events of creation, updates, and deletion of `SparkApplication` objects and acts on the watch events, -* a *submission runner* that runs `spark-submit` for submissions received from the controller, -* a *Spark pod monitor* that watches for Spark pods and sends pod status updates to the controller, -* a [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) that handles customizations for Spark driver and executor pods based on the annotations on the pods added by the controller, -* and also a command-line tool named `sparkctl` for working with the operator. 
+- a *submission runner* that runs `spark-submit` for submissions received from the controller, +- a *Spark pod monitor* that watches for Spark pods and sends pod status updates to the controller, +- a [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) that handles customizations for Spark driver and executor pods based on the annotations on the pods added by the controller, +- and also a command-line tool named `sparkctl` for working with the operator. The following diagram shows how different components interact and work together. ![Architecture Diagram](architecture-diagram.png) -Specifically, a user uses the `sparkctl` (or `kubectl`) to create a `SparkApplication` object. The `SparkApplication` controller receives the object through a watcher from the API server, creates a submission carrying the `spark-submit` arguments, and sends the submission to the *submission runner*. The submission runner submits the application to run and creates the driver pod of the application. Upon starting, the driver pod creates the executor pods. While the application is running, the *Spark pod monitor* watches the pods of the application and sends status updates of the pods back to the controller, which then updates the status of the application accordingly. +Specifically, a user uses the `sparkctl` (or `kubectl`) to create a `SparkApplication` object. The `SparkApplication` controller receives the object through a watcher from the API server, creates a submission carrying the `spark-submit` arguments, and sends the submission to the *submission runner*. The submission runner submits the application to run and creates the driver pod of the application. Upon starting, the driver pod creates the executor pods. While the application is running, the *Spark pod monitor* watches the pods of the application and sends status updates of the pods back to the controller, which then updates the status of the application accordingly. ## The CRD Controller The `SparkApplication` controller, or CRD controller in short, watches events of creation, updates, and deletion of `SparkApplication` objects in any namespaces in a Kubernetes cluster, and acts on the watch events. When a new `SparkApplication` object is added (i.e., when the `AddFunc` callback function of the `ResourceEventHandlerFuncs` is called), it enqueues the object into an internal work queue, from which a worker picks it up prepares a submission and sends the submission to the submission runner, which actually submits the application to run in the Kubernetes cluster. The submission includes the list of arguments for the `spark-submit` command. The submission runner has a configurable number of workers for submitting applications to run in the cluster. When a `SparkApplication` object is deleted, the object is dequeued from the internal work queue and all the Kubernetes resources associated with the application get deleted or garbage collected. -When a `SparkApplication` object gets updated (i.e., when the `UpdateFunc` callback function of the `ResourceEventHandlerFuncs` is called), e.g., from the user who used `kubectl apply` to apply the update. The controller checks if the application specification in `SparkApplicationSpec` has changed. If the application specification remains the same, the controller simply ignores the update. 
This ensures that updates without application specification changes, e.g., those triggered by cache re-synchronization, won't result in application a re-submission. If otherwise the update was made to the application specification, the controller cancels the current run of the application by deleting the driver pod of the current run, and submits a new run of the application with the updated specification. Note that deleting the driver pod of the old run of the application effectively kills the run and causes the executor pods to be deleted as well because the driver is the owner of the executor pods. +When a `SparkApplication` object gets updated (i.e., when the `UpdateFunc` callback function of the `ResourceEventHandlerFuncs` is called), e.g., from the user who used `kubectl apply` to apply the update. The controller checks if the application specification in `SparkApplicationSpec` has changed. If the application specification remains the same, the controller simply ignores the update. This ensures that updates without application specification changes, e.g., those triggered by cache re-synchronization, won't result in application a re-submission. If otherwise the update was made to the application specification, the controller cancels the current run of the application by deleting the driver pod of the current run, and submits a new run of the application with the updated specification. Note that deleting the driver pod of the old run of the application effectively kills the run and causes the executor pods to be deleted as well because the driver is the owner of the executor pods. -The controller is also responsible for updating the status of a `SparkApplication` object with the help of the Spark pod monitor, which watches Spark pods and update the `SparkApplicationStatus` field of corresponding `SparkApplication` objects based on the status of the pods. The Spark pod monitor watches events of creation, updates, and deletion of Spark pods, creates status update messages based on the status of the pods, and sends the messages to the controller to process. When the controller receives a status update message, it gets the corresponding `SparkApplication` object from the cache store and updates the the `Status` accordingly. +The controller is also responsible for updating the status of a `SparkApplication` object with the help of the Spark pod monitor, which watches Spark pods and update the `SparkApplicationStatus` field of corresponding `SparkApplication` objects based on the status of the pods. The Spark pod monitor watches events of creation, updates, and deletion of Spark pods, creates status update messages based on the status of the pods, and sends the messages to the controller to process. When the controller receives a status update message, it gets the corresponding `SparkApplication` object from the cache store and updates the the `Status` accordingly. -As described in [API Definition](api-docs.md), the `Status` field (of type `SparkApplicationStatus`) records the overall state of the application as well as the state of each executor pod. Note that the overall state of an application is determined by the driver pod state, except when submission fails, in which case no driver pod gets launched. Particularly, the final application state is set to the termination state of the driver pod when applicable, i.e., `COMPLETED` if the driver pod completed or `FAILED` if the driver pod failed. If the driver pod gets deleted while running, the final application state is set to `FAILED`. 
If submission fails, the application state is set to `FAILED_SUBMISSION`. There are two terminal states: `COMPLETED` and `FAILED` which means that any Application in these states will never be retried by the Operator. All other states are non-terminal and based on the State as well as RestartPolicy (discussed below) can be retried. +As described in [API Definition](https://github.com/kubeflow/spark-operator/blob/master/docs/api-docs.md), the `Status` field (of type `SparkApplicationStatus`) records the overall state of the application as well as the state of each executor pod. Note that the overall state of an application is determined by the driver pod state, except when submission fails, in which case no driver pod gets launched. Particularly, the final application state is set to the termination state of the driver pod when applicable, i.e., `COMPLETED` if the driver pod completed or `FAILED` if the driver pod failed. If the driver pod gets deleted while running, the final application state is set to `FAILED`. If submission fails, the application state is set to `FAILED_SUBMISSION`. There are two terminal states: `COMPLETED` and `FAILED` which means that any Application in these states will never be retried by the Operator. All other states are non-terminal and based on the State as well as RestartPolicy (discussed below) can be retried. As part of preparing a submission for a newly created `SparkApplication` object, the controller parses the object and adds configuration options for adding certain annotations to the driver and executor pods of the application. The annotations are later used by the mutating admission webhook to configure the pods before they start to run. For example,if a Spark application needs a certain Kubernetes ConfigMap to be mounted into the driver and executor pods, the controller adds an annotation that specifies the name of the ConfigMap to mount. Later the mutating admission webhook sees the annotation on the pods and mount the ConfigMap to the pods. ## Handling Application Restart And Failures -The operator provides a configurable option through the `RestartPolicy` field of `SparkApplicationSpec` (see the [Configuring Automatic Application Restart and Failure Handling](user-guide.md#configuring-automatic-application-restart-and-failure-handling) for more details) for specifying the application restart policy. The operator determines if an application should be restarted based on its termination state and the restart policy. As discussed above, the termination state of an application is based on the termination state of the driver pod. So effectively the decision is based on the termination state of the driver pod and the restart policy. Specifically, one of the following conditions applies: +The operator provides a configurable option through the `RestartPolicy` field of `SparkApplicationSpec` (see the [Configuring Automatic Application Restart and Failure Handling](../user-guide/working-with-sparkapplication.md#configuring-automatic-application-restart-and-failure-handling) for more details) for specifying the application restart policy. The operator determines if an application should be restarted based on its termination state and the restart policy. As discussed above, the termination state of an application is based on the termination state of the driver pod. So effectively the decision is based on the termination state of the driver pod and the restart policy. 
Specifically, one of the following conditions applies: -* If the restart policy type is `Never`, the application is not restarted upon terminating. -* If the restart policy type is `Always`, the application gets restarted regardless of the termination state of the application. Please note that such an Application will never end up in a terminal state of `COMPLETED` or `FAILED`. -* If the restart policy type is `OnFailure`, the application gets restarted if and only if the application failed and the retry limit is not reached. Note that in case the driver pod gets deleted while running, the application is considered being failed as discussed above. In this case, the application gets restarted if the restart policy is `OnFailure`. +- If the restart policy type is `Never`, the application is not restarted upon terminating. +- If the restart policy type is `Always`, the application gets restarted regardless of the termination state of the application. Please note that such an Application will never end up in a terminal state of `COMPLETED` or `FAILED`. +- If the restart policy type is `OnFailure`, the application gets restarted if and only if the application failed and the retry limit is not reached. Note that in case the driver pod gets deleted while running, the application is considered being failed as discussed above. In this case, the application gets restarted if the restart policy is `OnFailure`. When the operator decides to restart an application, it cleans up the Kubernetes resources associated with the previous terminated run of the application and enqueues the `SparkApplication` object of the application into the internal work queue, from which it gets picked up by a worker who will handle the submission. Note that instead of restarting the driver pod, the operator simply re-submits the application and lets the submission client create a new driver pod. @@ -56,6 +67,6 @@ When the operator decides to restart an application, it cleans up the Kubernetes The operator comes with an optional mutating admission webhook for customizing Spark driver and executor pods based on certain annotations on the pods added by the CRD controller. The annotations are set by the operator based on the application specifications. All Spark pod customization needs except for those natively support by Spark on Kubernetes are handled by the mutating admission webhook. -## Command-line Tool: Sparkctl +## Command-line Tool: Sparkctl -[sparkctl](../sparkctl/README.md) is a command-line tool for working with the operator. It supports creating a `SparkApplication`object from a YAML file, listing existing `SparkApplication` objects, checking status of a `SparkApplication`, forwarding from a local port to the remote port on which the Spark driver runs, and deleting a `SparkApplication` object. For more details on `sparkctl`, please refer to [README](../sparkctl/README.md). +[sparkctl](https://github.com/kubeflow/spark-operator/blob/master/sparkctl/README.md) is a command-line tool for working with the operator. It supports creating a `SparkApplication`object from a YAML file, listing existing `SparkApplication` objects, checking status of a `SparkApplication`, forwarding from a local port to the remote port on which the Spark driver runs, and deleting a `SparkApplication` object. For more details on `sparkctl`, please refer to [README](https://github.com/kubeflow/spark-operator/blob/master/sparkctl/README.md). 
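As a quick illustration of the workflow described above, the commands below sketch a typical `sparkctl` session. The subcommands mirror the capabilities listed in the preceding paragraph; the exact flag names (e.g., for port forwarding) should be checked against the sparkctl README.

```shell
sparkctl create examples/spark-pi.yaml       # create a SparkApplication from a YAML file
sparkctl list                                # list existing SparkApplication objects
sparkctl status spark-pi                     # check the status of an application
sparkctl forward spark-pi --local-port 4040  # forward a local port to the remote driver UI port
sparkctl delete spark-pi                     # delete the SparkApplication object
```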
diff --git a/docs/user-guide/customizing-spark-operator.md b/docs/user-guide/customizing-spark-operator.md new file mode 100644 index 000000000..355cb97a2 --- /dev/null +++ b/docs/user-guide/customizing-spark-operator.md @@ -0,0 +1,9 @@ +# Customizing Spark Operator + +To customize the operator, you can follow the steps below: + +1. Compile Spark distribution with Kubernetes support as per [Spark documentation](https://spark.apache.org/docs/latest/building-spark.html#building-with-kubernetes-support). +2. Create docker images to be used for Spark with [docker-image tool](https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images). +3. Create a new operator image based on the above image. You need to modify the `FROM` tag in the [Dockerfile](https://github.com/kubeflow/spark-operator/blob/master/Dockerfile) with your Spark image. +4. Build and push your operator image built above. +5. Deploy the new image by modifying the [/manifest/spark-operator-install/spark-operator.yaml](https://github.com/kubeflow/spark-operator/blob/master/manifest/spark-operator-install/spark-operator.yaml) file and specifying your operator image. diff --git a/docs/gcp.md b/docs/user-guide/gcp.md similarity index 75% rename from docs/gcp.md rename to docs/user-guide/gcp.md index 9e6f0f3e5..af1103785 100644 --- a/docs/gcp.md +++ b/docs/user-guide/gcp.md @@ -1,45 +1,45 @@ # Integration with Google Cloud Storage and BigQuery -This document describes how to use Google Cloud services, e.g., Google Cloud Storage (GCS) and BigQuery as data sources -or sinks in `SparkApplication`s. For a detailed tutorial on building Spark applications that access GCS and BigQuery, +This document describes how to use Google Cloud services, e.g., Google Cloud Storage (GCS) and BigQuery as data sources +or sinks in `SparkApplication`s. For a detailed tutorial on building Spark applications that access GCS and BigQuery, please refer to [Using Spark on Kubernetes Engine to Process Data in BigQuery](https://cloud.google.com/solutions/spark-on-kubernetes-engine). -A Spark application requires the [GCS](https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage) and -[BigQuery](https://cloud.google.com/dataproc/docs/concepts/connectors/bigquery) connectors to access GCS and BigQuery -using the Hadoop `FileSystem` API. One way to make the connectors available to the driver and executors is to use a +A Spark application requires the [GCS](https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage) and +[BigQuery](https://cloud.google.com/dataproc/docs/concepts/connectors/bigquery) connectors to access GCS and BigQuery +using the Hadoop `FileSystem` API. One way to make the connectors available to the driver and executors is to use a custom Spark image with the connectors built-in, as this example [Dockerfile](https://github.com/GoogleCloudPlatform/spark-on-k8s-gcp-examples/blob/master/dockerfiles/spark-gcs/Dockerfile) shows. -An image built from this Dockerfile is located at `gcr.io/ynli-k8s/spark:v2.3.0-gcs`. +An image built from this Dockerfile is located at `gcr.io/ynli-k8s/spark:v2.3.0-gcs`. -The connectors require certain Hadoop properties to be set properly to function. Setting Hadoop properties can be done -both through a custom Hadoop configuration file, namely, `core-site.xml` in a custom image, or via the `spec.hadoopConf` -section in a `SparkApplication`. 
The example Dockerfile mentioned above shows the use of a custom `core-site.xml` and a -custom `spark-env.sh` that points the environment variable `HADOOP_CONF_DIR` to the directory in the container where -`core-site.xml` is located. The example `core-site.xml` and `spark-env.sh` can be found +The connectors require certain Hadoop properties to be set properly to function. Setting Hadoop properties can be done +both through a custom Hadoop configuration file, namely, `core-site.xml` in a custom image, or via the `spec.hadoopConf` +section in a `SparkApplication`. The example Dockerfile mentioned above shows the use of a custom `core-site.xml` and a +custom `spark-env.sh` that points the environment variable `HADOOP_CONF_DIR` to the directory in the container where +`core-site.xml` is located. The example `core-site.xml` and `spark-env.sh` can be found [here](https://github.com/GoogleCloudPlatform/spark-on-k8s-gcp-examples/tree/master/conf). The GCS and BigQuery connectors need to authenticate with the GCS and BigQuery services before they can use the services. -The connectors support using a [GCP service account JSON key file](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) -for authentication. The service account must have the necessary IAM roles for access GCS and/or BigQuery granted. The -[tutorial](https://cloud.google.com/solutions/spark-on-kubernetes-engine) has detailed information on how to create an -service account, grant it the right roles, furnish a key, and download a JSON key file. To tell the connectors to use +The connectors support using a [GCP service account JSON key file](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) +for authentication. The service account must have the necessary IAM roles for access GCS and/or BigQuery granted. The +[tutorial](https://cloud.google.com/solutions/spark-on-kubernetes-engine) has detailed information on how to create a +service account, grant it the right roles, furnish a key, and download a JSON key file. To tell the connectors to use a service JSON key file for authentication, the following Hadoop configuration properties must be set: -``` +```properties google.cloud.auth.service.account.enable=true google.cloud.auth.service.account.json.keyfile= -``` +``` The most common way of getting the service account JSON key file into the driver and executor containers is mount the key -file in through a Kubernetes secret volume. Detailed information on how to create a secret can be found in the +file in through a Kubernetes secret volume. Detailed information on how to create a secret can be found in the [tutorial](https://cloud.google.com/solutions/spark-on-kubernetes-engine). -Below is an example `SparkApplication` using the custom image at `gcr.io/ynli-k8s/spark:v2.3.0-gcs` with the GCS/BigQuery -connectors and the custom Hadoop configuration files above built-in. Note that some of the necessary Hadoop configuration -properties are set using `spec.hadoopConf`. Those Hadoop configuration properties are additional to the ones set in the -built-in `core-site.xml`. They are set here instead of in `core-site.xml` because of their application-specific nature. -The ones set in `core-site.xml` apply to all applications using the image. Also note how the Kubernetes secret named -`gcs-bg` that stores the service account JSON key file gets mounted into both the driver and executors. 
The environment +Below is an example `SparkApplication` using the custom image at `gcr.io/ynli-k8s/spark:v2.3.0-gcs` with the GCS/BigQuery +connectors and the custom Hadoop configuration files above built-in. Note that some of the necessary Hadoop configuration +properties are set using `spec.hadoopConf`. Those Hadoop configuration properties are additional to the ones set in the +built-in `core-site.xml`. They are set here instead of in `core-site.xml` because of their application-specific nature. +The ones set in `core-site.xml` apply to all applications using the image. Also note how the Kubernetes secret named +`gcs-bg` that stores the service account JSON key file gets mounted into both the driver and executors. The environment variable `GCS_PROJECT_ID` must be set when using the image at `gcr.io/ynli-k8s/spark:v2.3.0-gcs`. ```yaml diff --git a/docs/user-guide/leader-election.md b/docs/user-guide/leader-election.md new file mode 100644 index 000000000..0bd8fdd21 --- /dev/null +++ b/docs/user-guide/leader-election.md @@ -0,0 +1,12 @@ +# Enabling Leader Election for High Availability + +The operator supports a high-availability (HA) mode, in which there can be more than one replicas of the operator, with only one of the replicas (the leader replica) actively operating. If the leader replica fails, the leader election process is engaged again to determine a new leader from the replicas available. The HA mode can be enabled through an optional leader election process. Leader election is disabled by default but can be enabled via a command-line flag. The following table summarizes the command-line flags relevant to leader election: + +| Flag | Default Value | Description | +| ------------- | ------------- | ------------- | +| `leader-election` | `false` | Whether to enable leader election (or the HA mode) or not. | +| `leader-election-lock-namespace` | `spark-operator` | Kubernetes namespace of the lock resource used for leader election. | +| `leader-election-lock-name` | `spark-operator-lock` | Name of the lock resource used for leader election. | +| `leader-election-lease-duration` | 15 seconds | Leader election lease duration. | +| `leader-election-renew-deadline` | 14 seconds | Leader election renew deadline. | +| `leader-election-retry-period` | 4 seconds | Leader election retry period. | diff --git a/docs/user-guide/resource-quota-enforcement.md b/docs/user-guide/resource-quota-enforcement.md new file mode 100644 index 000000000..436ba341f --- /dev/null +++ b/docs/user-guide/resource-quota-enforcement.md @@ -0,0 +1,5 @@ +# Enabling Resource Quota Enforcement + +The Spark Operator provides limited support for resource quota enforcement using a validating webhook. It will count the resources of non-terminal-phase SparkApplications and Pods, and determine whether a requested SparkApplication will fit given the remaining resources. ResourceQuota scope selectors are not supported, any ResourceQuota object that does not match the entire namespace will be ignored. Like the native Pod quota enforcement, current usage is updated asynchronously, so some overscheduling is possible. + +If you are running Spark applications in namespaces that are subject to resource quota constraints, consider enabling this feature to avoid driver resource starvation. Quota enforcement can be enabled with the command line arguments `-enable-resource-quota-enforcement=true`. It is recommended to also set `-webhook-fail-on-error=true`. 
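
To make the leader-election and quota-enforcement flags above concrete, the following is a minimal, illustrative sketch of how they might be passed to the operator when it runs as a Kubernetes Deployment. The Deployment layout, labels, and image tag are assumptions made for illustration only; the flags themselves come from the table and text above.

```yaml
# Illustrative only: wiring the leader-election and quota-enforcement flags
# into a hypothetical operator Deployment. Names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-operator
  namespace: spark-operator
spec:
  replicas: 2                                  # more than one replica only makes sense with leader election enabled
  selector:
    matchLabels:
      app.kubernetes.io/name: spark-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/name: spark-operator
    spec:
      containers:
        - name: spark-operator
          image: spark-operator:latest          # placeholder image
          args:
            - "-leader-election=true"
            - "-leader-election-lock-namespace=spark-operator"
            - "-leader-election-lock-name=spark-operator-lock"
            - "-enable-resource-quota-enforcement=true"
            - "-webhook-fail-on-error=true"
```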
diff --git a/docs/user-guide/running-multiple-instances-of-the-operator.md b/docs/user-guide/running-multiple-instances-of-the-operator.md
new file mode 100644
index 000000000..b85d581e9
--- /dev/null
+++ b/docs/user-guide/running-multiple-instances-of-the-operator.md
@@ -0,0 +1,28 @@
+# Running Multiple Instances Of The Operator Within The Same K8s Cluster
+
+If you need to run multiple instances of the operator within the same Kubernetes cluster, you need to make sure that the running instances do not compete for the same custom resources or pods. You can achieve this in one of the following ways.
+
+Either:
+
+- By specifying a different `namespace` flag for each instance of the operator.
+
+Or, if you want each instance of the operator to watch specific resources that may exist in different namespaces:
+
+- Add custom labels to resources by defining a different set of labels in `-label-selector-filter` (e.g. `env=dev,app-type=spark`) for each instance of the operator.
+- Run different `webhook` instances by specifying a different `-webhook-config-name` flag for each deployment of the operator.
+- Specify a different `webhook-svc-name` and/or `webhook-svc-namespace` for each instance of the operator.
+- Edit the job `webhook-init` that generates the certificates by specifying the namespace and the service name of each instance of the operator, e.g. `command: ["/usr/bin/gencerts.sh", "-n", "ns-op1", "-s", "spark-op1-webhook", "-p"]`, where `spark-op1-webhook` should match what you have specified in `webhook-svc-name`. For instance, if you use the following [helm chart](https://github.com/helm/charts/tree/master/incubator/sparkoperator) to deploy the operator, you may specify a different `--namespace` and `--name-template` argument for each instance of the operator to make sure a different certificate is generated for each instance, e.g.:
+
+```shell
+helm install spark-op1 incubator/sparkoperator --namespace ns-op1
+helm install spark-op2 incubator/sparkoperator --namespace ns-op2
+```
+
+This will run two `webhook-init` jobs. Each job executes respectively the command:
+
+```yaml
+command: ["/usr/bin/gencerts.sh", "-n", "ns-op1", "-s", "spark-op1-webhook", "-p"]
+command: ["/usr/bin/gencerts.sh", "-n", "ns-op2", "-s", "spark-op2-webhook", "-p"]
+```
+
+- Although resources are already filtered with respect to the labels specified on them, you may also specify different labels in `-webhook-namespace-selector` and attach these labels to the namespaces on which you want the webhook to listen.
diff --git a/docs/user-guide/running-sparkapplication-on-schedule.md b/docs/user-guide/running-sparkapplication-on-schedule.md
new file mode 100644
index 000000000..9e2e1a130
--- /dev/null
+++ b/docs/user-guide/running-sparkapplication-on-schedule.md
@@ -0,0 +1,43 @@
+# Running Spark Applications on a Schedule using a ScheduledSparkApplication
+
+The operator supports running a Spark application on a standard [cron](https://en.wikipedia.org/wiki/Cron) schedule using objects of the `ScheduledSparkApplication` custom resource type. A `ScheduledSparkApplication` object specifies a cron schedule on which the application should run and a `SparkApplication` template from which a `SparkApplication` object for each run of the application is created.
The following is an example `ScheduledSparkApplication`: + +```yaml +apiVersion: "sparkoperator.k8s.io/v1beta2" +kind: ScheduledSparkApplication +metadata: + name: spark-pi-scheduled + namespace: default +spec: + schedule: "@every 5m" + concurrencyPolicy: Allow + successfulRunHistoryLimit: 1 + failedRunHistoryLimit: 3 + template: + type: Scala + mode: cluster + image: gcr.io/spark/spark:v3.1.1 + mainClass: org.apache.spark.examples.SparkPi + mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar + driver: + cores: 1 + memory: 512m + executor: + cores: 1 + instances: 1 + memory: 512m + restartPolicy: + type: Never +``` + +The concurrency of runs of an application is controlled by `.spec.concurrencyPolicy`, whose valid values are `Allow`, `Forbid`, and `Replace`, with `Allow` being the default. The meanings of each value is described below: + +* `Allow`: more than one run of an application are allowed if for example the next run of the application is due even though the previous run has not completed yet. +* `Forbid`: no more than one run of an application is allowed. The next run of the application can only start if the previous run has completed. +* `Replace`: no more than one run of an application is allowed. When the next run of the application is due, the previous run is killed and the next run starts as a replacement. + +A scheduled `ScheduledSparkApplication` can be temporarily suspended (no future scheduled runs of the application will be triggered) by setting `.spec.suspend` to `true`. The schedule can be resumed by removing `.spec.suspend` or setting it to `false`. A `ScheduledSparkApplication` can have names of `SparkApplication` objects for the past runs of the application tracked in the `Status` section as discussed below. The numbers of past successful runs and past failed runs to keep track of are controlled by field `.spec.successfulRunHistoryLimit` and field `.spec.failedRunHistoryLimit`, respectively. The example above allows 1 past successful run and 3 past failed runs to be tracked. + +The `Status` section of a `ScheduledSparkApplication` object shows the time of the last run and the proposed time of the next run of the application, through `.status.lastRun` and `.status.nextRun`, respectively. The names of the `SparkApplication` object for the most recent run (which may or may not be running) of the application are stored in `.status.lastRunName`. The names of `SparkApplication` objects of the past successful runs of the application are stored in `.status.pastSuccessfulRunNames`. Similarly, the names of `SparkApplication` objects of the past failed runs of the application are stored in `.status.pastFailedRunNames`. + +Note that certain restart policies (specified in `.spec.template.restartPolicy`) may not work well with the specified schedule and concurrency policy of a `ScheduledSparkApplication`. For example, a restart policy of `Always` should never be used with a `ScheduledSparkApplication`. In most cases, a restart policy of `OnFailure` may not be a good choice as the next run usually picks up where the previous run left anyway. For these reasons, it's often the right choice to use a restart policy of `Never` as the example above shows. 
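
As an illustration of the status fields described above, the following is a hypothetical `status` section for the `spark-pi-scheduled` example; the timestamps and run names are invented for illustration, and the exact rendering may vary between operator versions.

```yaml
# Hypothetical status for the spark-pi-scheduled example above; all values are illustrative.
status:
  lastRun: "2024-05-01T10:00:00Z"            # time of the most recent run
  nextRun: "2024-05-01T10:05:00Z"            # proposed time of the next run (@every 5m schedule)
  lastRunName: spark-pi-scheduled-27483920   # SparkApplication created for the most recent run
  pastSuccessfulRunNames:                    # bounded by successfulRunHistoryLimit (1 above)
    - spark-pi-scheduled-27483915
  pastFailedRunNames: []                     # bounded by failedRunHistoryLimit (3 above)
```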
diff --git a/docs/user-guide/using-sparkapplication.md b/docs/user-guide/using-sparkapplication.md new file mode 100644 index 000000000..7af07d894 --- /dev/null +++ b/docs/user-guide/using-sparkapplication.md @@ -0,0 +1,25 @@ +# Using SparkApplications + +The operator runs Spark applications specified in Kubernetes objects of the `SparkApplication` custom resource type. The most common way of using a `SparkApplication` is store the `SparkApplication` specification in a YAML file and use the `kubectl` command or alternatively the `sparkctl` command to work with the `SparkApplication`. The operator automatically submits the application as configured in a `SparkApplication` to run on the Kubernetes cluster and uses the `SparkApplication` to collect and surface the status of the driver and executors to the user. + +As with all other Kubernetes API objects, a `SparkApplication` needs the `apiVersion`, `kind`, and `metadata` fields. For general information about working with manifests, see [object management using kubectl](https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/). + +A `SparkApplication` also needs a [`.spec` section](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status). This section contains fields for specifying various aspects of an application including its type (`Scala`, `Java`, `Python`, or `R`), deployment mode (`cluster` or `client`), main application resource URI (e.g., the URI of the application jar), main class, arguments, etc. Node selectors are also supported via the optional field `.spec.nodeSelector`. + +It also has fields for specifying the unified container image (to use for both the driver and executors) and the image pull policy, namely, `.spec.image` and `.spec.imagePullPolicy` respectively. If a custom init-container (in both the driver and executor pods) image needs to be used, the optional field `.spec.initContainerImage` can be used to specify it. If set, `.spec.initContainerImage` overrides `.spec.image` for the init-container image. Otherwise, the image specified by `.spec.image` will be used for the init-container. It is invalid if both `.spec.image` and `.spec.initContainerImage` are not set. + +Below is an example showing part of a `SparkApplication` specification: + +```yaml +apiVersion: sparkoperator.k8s.io/v1beta2 +kind: SparkApplication +metadata: + name: spark-pi + namespace: default +spec: + type: Scala + mode: cluster + image: spark:3.5.1 + mainClass: org.apache.spark.examples.SparkPi + mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.1.jar +``` diff --git a/docs/volcano-integration.md b/docs/user-guide/volcano-integration.md similarity index 75% rename from docs/volcano-integration.md rename to docs/user-guide/volcano-integration.md index 7d67276a9..dbfe32fc1 100644 --- a/docs/volcano-integration.md +++ b/docs/user-guide/volcano-integration.md @@ -4,7 +4,6 @@ currently missing from Kubernetes that are commonly required by many classes of batch & elastic workloads. With the integration with Volcano, Spark application pods can be scheduled for better scheduling efficiency. 
-# Requirements ## Volcano components @@ -14,16 +13,22 @@ same environment, please refer [Quick Start Guide](https://github.com/volcano-sh ## Install Kubernetes Operator for Apache Spark with Volcano enabled Within the help of Helm chart, Kubernetes Operator for Apache Spark with Volcano can be easily installed with the command below: -```bash -$ helm repo add spark-operator https://kubeflow.github.io/spark-operator -$ helm install my-release spark-operator/spark-operator --namespace spark-operator --set batchScheduler.enable=true --set webhook.enable=true + +```shell +helm repo add spark-operator https://kubeflow.github.io/spark-operator + +helm install my-release spark-operator/spark-operator \ + --namespace spark-operator \ + --set webhook.enable=true \ + --set batchScheduler.enable=true ``` -# Run Spark Application with Volcano scheduler +## Run Spark Application with Volcano scheduler + +Now, we can run an updated version of spark application (with `batchScheduler` configured), for instance: -Now, we can run a updated version of spark application (with `batchScheduler` configured), for instance: ```yaml -apiVersion: "sparkoperator.k8s.io/v1beta2" +apiVersion: sparkoperator.k8s.io/v1beta2 kind: SparkApplication metadata: name: spark-pi @@ -31,40 +36,42 @@ metadata: spec: type: Scala mode: cluster - image: "gcr.io/spark-operator/spark:v3.1.1" + image: spark:3.5.1 imagePullPolicy: Always mainClass: org.apache.spark.examples.SparkPi - mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-v3.1.1.jar" - sparkVersion: "3.1.1" - batchScheduler: "volcano" #Note: the batch scheduler name must be specified with `volcano` + mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-v3.5.1.jar + sparkVersion: 3.5.1 + batchScheduler: volcano # Note: the batch scheduler name must be specified with `volcano` restartPolicy: type: Never volumes: - - name: "test-volume" + - name: test-volume hostPath: - path: "/tmp" + path: /tmp type: Directory driver: cores: 1 - coreLimit: "1200m" - memory: "512m" + coreLimit: 1200m + memory: 512m labels: - version: 3.1.1 + version: 3.5.1 serviceAccount: spark volumeMounts: - - name: "test-volume" - mountPath: "/tmp" + - name: test-volume + mountPath: /tmp executor: cores: 1 instances: 1 - memory: "512m" + memory: 512m labels: - version: 3.1.1 + version: 3.5.1 volumeMounts: - - name: "test-volume" + - name: test-volume mountPath: "/tmp" ``` + When running, the Pods Events can be used to verify that whether the pods have been scheduled via Volcano. + ``` Type Reason Age From Message ---- ------ ---- ---- ------- @@ -76,13 +83,12 @@ Normal Scheduled 23s volcano Successfully assigned def If SparkApplication is configured to run with Volcano, there are some details underground that make the two systems integrated: 1. Kubernetes Operator for Apache Spark's webhook will patch pods' `schedulerName` according to the `batchScheduler` in SparkApplication Spec. -2. Before submitting spark application, Kubernetes Operator for Apache Spark will create a Volcano native resource +2. Before submitting spark application, Kubernetes Operator for Apache Spark will create a Volcano native resource `PodGroup`[here](https://github.com/volcano-sh/volcano/blob/a8fb05ce6c6902e366cb419d6630d66fc759121e/pkg/apis/scheduling/v1alpha2/types.go#L93) for the whole application. - and as a brief introduction, most of the Volcano's advanced scheduling features, such as pod delay creation, resource fairness and gang scheduling are all depend on this resource. 
- Also a new pod annotation named `scheduling.k8s.io/group-name` will be added. + and as a brief introduction, most of the Volcano's advanced scheduling features, such as pod delay creation, resource fairness and gang scheduling are all depend on this resource. + Also, a new pod annotation named `scheduling.k8s.io/group-name` will be added. 3. Volcano scheduler will take over all of the pods that both have schedulerName and annotation correctly configured for scheduling. - Kubernetes Operator for Apache Spark enables end user to have fine-grained controlled on batch scheduling via attribute `BatchSchedulerOptions`. `BatchSchedulerOptions` is a string dictionary that different batch scheduler can utilize it to expose different attributes. For now, volcano support these attributes below: @@ -91,4 +97,3 @@ For now, volcano support these attributes below: |-------|----------------------------------------------------------------------------|----------------------------------------------------------------| | queue | Used to specify which volcano queue will this spark application belongs to | batchSchedulerOptions:
    queue: "queue1" | | priorityClassName | Used to specify which priorityClass this spark application will use | batchSchedulerOptions:
    priorityClassName: "pri1" | - diff --git a/docs/user-guide/working-with-sparkapplication.md b/docs/user-guide/working-with-sparkapplication.md new file mode 100644 index 000000000..6e88a6cf5 --- /dev/null +++ b/docs/user-guide/working-with-sparkapplication.md @@ -0,0 +1,51 @@ +# Working with SparkApplications + +A `SparkApplication` can be created from a YAML file storing the `SparkApplication` specification using either the `kubectl apply -f ` command or the `sparkctl create ` command. Please refer to the `sparkctl` [README](../../sparkctl/README.md#create) for usage of the `sparkctl create` command. Once a `SparkApplication` is successfully created, the operator will receive it and submits the application as configured in the specification to run on the Kubernetes cluster. Please note, that `SparkOperator` submits `SparkApplication` in `Cluster` mode only. + +## Deleting a SparkApplication + +A `SparkApplication` can be deleted using either the `kubectl delete ` command or the `sparkctl delete ` command. Please refer to the `sparkctl` [README](../../sparkctl/README.md#delete) for usage of the `sparkctl delete` +command. Deleting a `SparkApplication` deletes the Spark application associated with it. If the application is running when the deletion happens, the application is killed and all Kubernetes resources associated with the application are deleted or garbage collected. + +## Updating a SparkApplication + +A `SparkApplication` can be updated using the `kubectl apply -f ` command. When a `SparkApplication` is successfully updated, the operator will receive both the updated and old `SparkApplication` objects. If the specification of the `SparkApplication` has changed, the operator submits the application to run, using the updated specification. If the application is currently running, the operator kills the running application before submitting a new run with the updated specification. There is planned work to enhance the way `SparkApplication` updates are handled. For example, if the change was to increase the number of executor instances, instead of killing the currently running application and starting a new run, it is a much better user experience to incrementally launch the additional executor pods. + +## Checking a SparkApplication + +A `SparkApplication` can be checked using the `kubectl describe sparkapplications ` command. The output of the command shows the specification and status of the `SparkApplication` as well as events associated with it. The events communicate the overall process and errors of the `SparkApplication`. + +## Configuring Automatic Application Restart and Failure Handling + +The operator supports automatic application restart with a configurable `RestartPolicy` using the optional field +`.spec.restartPolicy`. The following is an example of a sample `RestartPolicy`: + +```yaml + restartPolicy: + type: OnFailure + onFailureRetries: 3 + onFailureRetryInterval: 10 + onSubmissionFailureRetries: 5 + onSubmissionFailureRetryInterval: 20 +``` + +The valid types of restartPolicy include `Never`, `OnFailure`, and `Always`. Upon termination of an application, +the operator determines if the application is subject to restart based on its termination state and the +`RestartPolicy` in the specification. If the application is subject to restart, the operator restarts it by +submitting a new run of it. For `OnFailure`, the Operator further supports setting limits on number of retries +via the `onFailureRetries` and `onSubmissionFailureRetries` fields. 
Additionally, if the maximum number of submission retries has not been reached,
+the operator retries submitting the application using a linear backoff with the interval specified by
+`onFailureRetryInterval` and `onSubmissionFailureRetryInterval`, which are required for both the `OnFailure` and `Always` `RestartPolicy`.
+Old resources such as the driver pod and UI service/ingress are deleted if they still exist before the new run is submitted, and a new driver pod is created by the submission
+client, so effectively the driver gets restarted.
+
+## Setting TTL for a SparkApplication
+
+The `v1beta2` version of the `SparkApplication` API introduces TTL support for `SparkApplication`s through an optional field named `.spec.timeToLiveSeconds`, which, if set, defines the Time-To-Live (TTL) duration in seconds for a `SparkApplication` after its termination. The `SparkApplication` object will be garbage collected if more than `.spec.timeToLiveSeconds` seconds have passed since its termination. The example below illustrates how to use the field:
+
+```yaml
+spec:
+  timeToLiveSeconds: 3600
+```
+
+Note that this feature requires informer cache resync to be enabled, which is true by default with a resync interval of 30 seconds. You can change the resync interval by setting the flag `-resync-interval=`.
diff --git a/docs/user-guide.md b/docs/user-guide/writing-sparkapplication.md
similarity index 56%
rename from docs/user-guide.md
rename to docs/user-guide/writing-sparkapplication.md
index 60354843b..894c1fc3b 100644
--- a/docs/user-guide.md
+++ b/docs/user-guide/writing-sparkapplication.md
@@ -1,61 +1,4 @@
-# User Guide
-
-For a quick introduction on how to build and install the Kubernetes Operator for Apache Spark, and how to run some example applications, please refer to the [Quick Start Guide](quick-start-guide.md). For a complete reference of the API definition of the `SparkApplication` and `ScheduledSparkApplication` custom resources, please refer to the [API Specification](api-docs.md).
-
-The Kubernetes Operator for Apache Spark ships with a command-line tool called `sparkctl` that offers additional features beyond what `kubectl` is able to do. Documentation on `sparkctl` can be found in [README](../sparkctl/README.md). If you are running the Spark Operator on Google Kubernetes Engine and want to use Google Cloud Storage (GCS) and/or BigQuery for reading/writing data, also refer to the [GCP guide](gcp.md). The Kubernetes Operator for Apache Spark will simply be referred to as the operator for the rest of this guide.
- -## Table of Contents - -- [User Guide](#user-guide) - - [Table of Contents](#table-of-contents) - - [Using a SparkApplication](#using-a-sparkapplication) - - [Writing a SparkApplication Spec](#writing-a-sparkapplication-spec) - - [Specifying Deployment Mode](#specifying-deployment-mode) - - [Specifying Application Dependencies](#specifying-application-dependencies) - - [Specifying Spark Configuration](#specifying-spark-configuration) - - [Specifying Hadoop Configuration](#specifying-hadoop-configuration) - - [Writing Driver Specification](#writing-driver-specification) - - [Writing Executor Specification](#writing-executor-specification) - - [Specifying Extra Java Options](#specifying-extra-java-options) - - [Specifying Environment Variables](#specifying-environment-variables) - - [Requesting GPU Resources](#requesting-gpu-resources) - - [Host Network](#host-network) - - [Mounting Secrets](#mounting-secrets) - - [Mounting ConfigMaps](#mounting-configmaps) - - [Mounting a ConfigMap storing Spark Configuration Files](#mounting-a-configmap-storing-spark-configuration-files) - - [Mounting a ConfigMap storing Hadoop Configuration Files](#mounting-a-configmap-storing-hadoop-configuration-files) - - [Mounting Volumes](#mounting-volumes) - - [Using Secrets As Environment Variables](#using-secrets-as-environment-variables) - - [Using Image Pull Secrets](#using-image-pull-secrets) - - [Using Pod Affinity](#using-pod-affinity) - - [Using Tolerations](#using-tolerations) - - [Using Security Context](#using-security-context) - - [Using Sidecar Containers](#using-sidecar-containers) - - [Using Init-Containers](#using-init-containers) - - [Using DNS Settings](#using-dns-settings) - - [Using Volume For Scratch Space](#using-volume-for-scratch-space) - - [Using Termination Grace Period](#using-termination-grace-period) - - [Using Container LifeCycle Hooks](#using-container-lifecycle-hooks) - - [Python Support](#python-support) - - [Monitoring](#monitoring) - - [Dynamic Allocation](#dynamic-allocation) - - [Working with SparkApplications](#working-with-sparkapplications) - - [Creating a New SparkApplication](#creating-a-new-sparkapplication) - - [Deleting a SparkApplication](#deleting-a-sparkapplication) - - [Updating a SparkApplication](#updating-a-sparkapplication) - - [Checking a SparkApplication](#checking-a-sparkapplication) - - [Configuring Automatic Application Restart and Failure Handling](#configuring-automatic-application-restart-and-failure-handling) - - [Setting TTL for a SparkApplication](#setting-ttl-for-a-sparkapplication) - - [Running Spark Applications on a Schedule using a ScheduledSparkApplication](#running-spark-applications-on-a-schedule-using-a-scheduledsparkapplication) - - [Enabling Leader Election for High Availability](#enabling-leader-election-for-high-availability) - - [Enabling Resource Quota Enforcement](#enabling-resource-quota-enforcement) - - [Running Multiple Instances Of The Operator Within The Same K8s Cluster](#running-multiple-instances-of-the-operator-within-the-same-k8s-cluster) - - [Customizing the Operator](#customizing-the-operator) - -## Using a SparkApplication -The operator runs Spark applications specified in Kubernetes objects of the `SparkApplication` custom resource type. The most common way of using a `SparkApplication` is store the `SparkApplication` specification in a YAML file and use the `kubectl` command or alternatively the `sparkctl` command to work with the `SparkApplication`. 
The operator automatically submits the application as configured in a `SparkApplication` to run on the Kubernetes cluster and uses the `SparkApplication` to collect and surface the status of the driver and executors to the user. - -## Writing a SparkApplication Spec +# Writing a SparkApplication As with all other Kubernetes API objects, a `SparkApplication` needs the `apiVersion`, `kind`, and `metadata` fields. For general information about working with manifests, see [object management using kubectl](https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/). @@ -74,17 +17,16 @@ metadata: spec: type: Scala mode: cluster - image: gcr.io/spark/spark:v3.1.1 + image: spark:3.5.1 mainClass: org.apache.spark.examples.SparkPi - mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar + mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.1.jar ``` -### Specifying Deployment Mode +## Specifying Deployment Mode -A `SparkApplication` should set `.spec.deployMode` to `cluster`, as `client` is not currently implemented. The driver pod will then run `spark-submit` in `client` mode internally to run the driver program. Additional details of how `SparkApplication`s are run can be found in the [design documentation](design.md#architecture). +A `SparkApplication` should set `.spec.deployMode` to `cluster`, as `client` is not currently implemented. The driver pod will then run `spark-submit` in `client` mode internally to run the driver program. Additional details of how `SparkApplication`s are run can be found in the [design documentation](../overview/index.md#architecture). - -### Specifying Application Dependencies +## Specifying Application Dependencies Often Spark applications need additional files additionally to the main application resource to run. Such application dependencies can include for example jars and data files the application needs at runtime. When using the `spark-submit` script to submit a Spark application, such dependencies are specified using the `--jars` and `--files` options. To support specification of application dependencies, a `SparkApplication` uses an optional field `.spec.deps` that in turn supports specifying jars and files, respectively. More specifically, the optional fields `.spec.deps.jars` and`.spec.deps.files` correspond to the `--jars` and `--files` options of the `spark-submit` script, respectively. @@ -105,6 +47,7 @@ spec: It's also possible to specify additional jars to obtain from a remote repository by adding maven coordinates to `.spec.deps.packages`. Conflicting transitive dependencies can be addressed by adding to the exclusion list with `.spec.deps.excludePackages`. Additional repositories can be added to the `.spec.deps.repositories` list. These directly translate to the `spark-submit` parameters `--packages`, `--exclude-packages`, and `--repositories`. NOTE: + - Each package in the `packages` list must be of the form "groupId:artifactId:version" - Each package in the `excludePackages` list must be of the form "groupId:artifactId" @@ -121,7 +64,7 @@ spec: - com.example:other-package ``` -### Specifying Spark Configuration +## Specifying Spark Configuration There are two ways to add Spark configuration: setting individual Spark configuration properties using the optional field `.spec.sparkConf` or mounting a special Kubernetes ConfigMap storing Spark configuration files (e.g. `spark-defaults.conf`, `spark-env.sh`, `log4j.properties`) using the optional field `.spec.sparkConfigMap`. 
If `.spec.sparkConfigMap` is used, additionally to mounting the ConfigMap into the driver and executors, the operator additionally sets the environment variable `SPARK_CONF_DIR` to point to the mount path of the ConfigMap. @@ -133,7 +76,7 @@ spec: spark.eventLog.dir: "hdfs://hdfs-namenode-1:8020/spark/spark-events" ``` -### Specifying Hadoop Configuration +## Specifying Hadoop Configuration There are two ways to add Hadoop configuration: setting individual Hadoop configuration properties using the optional field `.spec.hadoopConf` or mounting a special Kubernetes ConfigMap storing Hadoop configuration files (e.g. `core-site.xml`) using the optional field `.spec.hadoopConfigMap`. The operator automatically adds the prefix `spark.hadoop.` to the names of individual Hadoop configuration properties in `.spec.hadoopConf`. If `.spec.hadoopConfigMap` is used, additionally to mounting the ConfigMap into the driver and executors, the operator additionally sets the environment variable `HADOOP_CONF_DIR` to point to the mount path of the ConfigMap. @@ -148,7 +91,7 @@ spec: "google.cloud.auth.service.account.json.keyfile": /mnt/secrets/key.json ``` -### Writing Driver Specification +## Writing Driver Specification The `.spec` section of a `SparkApplication` has a `.spec.driver` field for configuring the driver. It allows users to set the memory and CPU resources to request for the driver pod, and the container image the driver should use. It also has fields for optionally specifying labels, annotations, and environment variables for the driver pod. By default, the driver pod name of an application is automatically generated by the Spark submission client. If instead you want to use a particular name for the driver pod, the optional field `.spec.driver.podName` can be used. The driver pod by default uses the `default` service account in the namespace it is running in to talk to the Kubernetes API server. The `default` service account, however, may or may not have sufficient permissions to create executor pods and the headless service used by the executors to connect to the driver. If it does not and a custom service account that has the right permissions should be used instead, the optional field `.spec.driver.serviceAccount` can be used to specify the name of the custom service account. When a custom container image is needed for the driver, the field `.spec.driver.image` can be used to specify it. This overrides the image specified in `.spec.image` if it is also set. It is invalid if both `.spec.image` and `.spec.driver.image` are not set. @@ -168,7 +111,7 @@ spec: serviceAccount: spark ``` -### Writing Executor Specification +## Writing Executor Specification The `.spec` section of a `SparkApplication` has a `.spec.executor` field for configuring the executors. It allows users to set the memory and CPU resources to request for the executor pods, and the container image the executors should use. It also has fields for optionally specifying labels, annotations, and environment variables for the executor pods. By default, a single executor is requested for an application. If more than one executor are needed, the optional field `.spec.executor.instances` can be used to specify the number of executors to request. When a custom container image is needed for the executors, the field `.spec.executor.image` can be used to specify it. This overrides the image specified in `.spec.image` if it is also set. It is invalid if both `.spec.image` and `.spec.executor.image` are not set. 
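
For symmetry with the driver example shown earlier, a minimal executor section might look like the sketch below; the values are placeholders, and only the field names come from the description above.

```yaml
# Illustrative executor section; values are placeholders.
spec:
  executor:
    cores: 1
    instances: 3        # .spec.executor.instances: request more than the default single executor
    memory: "512m"
    labels:
      version: 3.5.1
    image: spark:3.5.1  # optional .spec.executor.image: overrides .spec.image for the executors
```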
@@ -188,7 +131,7 @@ spec: serviceAccount: spark ``` -### Specifying Extra Java Options +## Specifying Extra Java Options A `SparkApplication` can specify extra Java options for the driver or executors, using the optional field `.spec.driver.javaOptions` for the driver and `.spec.executor.javaOptions` for executors. Below is an example: @@ -200,9 +143,9 @@ spec: Values specified using those two fields get converted to Spark configuration properties `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions`, respectively. **Prefer using the above two fields over configuration properties `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions`** as the fields work well with other fields that might modify what gets set for `spark.driver.extraJavaOptions` or `spark.executor.extraJavaOptions`. -### Specifying Environment Variables +## Specifying Environment Variables -There are two fields for specifying environment variables for the driver and/or executor containers, namely `.spec.driver.env` (or `.spec.executor.env` for the executor container) and `.spec.driver.envFrom` (or `.spec.executor.envFrom` for the executor container). Specifically, `.spec.driver.env` (and `.spec.executor.env`) takes a list of [EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#envvar-v1-core), each of which specifies an environment variable or the source of an environment variable, e.g., a name-value pair, a ConfigMap key, a Secret key, etc. Alternatively, `.spec.driver.envFrom` (and `.spec.executor.envFrom`) takes a list of [EnvFromSource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#envfromsource-v1-core) and allows [using all key-value pairs in a ConfigMap or Secret as environment variables](https://v1-15.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables). The `SparkApplication` snippet below shows the use of both fields: +There are two fields for specifying environment variables for the driver and/or executor containers, namely `.spec.driver.env` (or `.spec.executor.env` for the executor container) and `.spec.driver.envFrom` (or `.spec.executor.envFrom` for the executor container). Specifically, `.spec.driver.env` (and `.spec.executor.env`) takes a list of [EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#envvar-v1-core), each of which specifies an environment variable or the source of an environment variable, e.g., a name-value pair, a ConfigMap key, a Secret key, etc. Alternatively, `.spec.driver.envFrom` (and `.spec.executor.envFrom`) takes a list of [EnvFromSource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#envfromsource-v1-core) and allows [using all key-value pairs in a ConfigMap or Secret as environment variables](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables). The `SparkApplication` snippet below shows the use of both fields: ```yaml spec: @@ -252,7 +195,7 @@ spec: **Note: legacy field `envVars` that can also be used for specifying environment variables is deprecated and will be removed in a future API version.** -### Requesting GPU Resources +## Requesting GPU Resources A `SparkApplication` can specify GPU resources for the driver or executor pod, using the optional field `.spec.driver.gpu` or `.spec.executor.gpu`. 
Below is an example: @@ -277,9 +220,10 @@ spec: name: "nvidia.com/gpu" quantity: 1 ``` -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. -### Host Network +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. + +## Host Network A `SparkApplication` can specify `hostNetwork` for the driver or executor pod, using the optional field `.spec.driver.hostNetwork` or `.spec.executor.hostNetwork`. When `hostNetwork` is `true`, the operator sets pods' `spec.hostNetwork` to `true` and sets pods' `spec.dnsPolicy` to `ClusterFirstWithHostNet`. Below is an example: @@ -298,10 +242,10 @@ spec: instances: 1 memory: "512m" ``` -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Mounting Secrets +## Mounting Secrets As mentioned above, both the driver specification and executor specification have an optional field `secrets` for configuring the list of Kubernetes Secrets to be mounted into the driver and executors, respectively. The field is a map with the names of the Secrets as keys and values specifying the mount path and type of each Secret. For instance, the following example shows a driver specification with a Secret named `gcp-svc-account` of type `GCPServiceAccount` to be mounted to `/mnt/secrets` in the driver pod. @@ -318,9 +262,9 @@ The type of a Secret as specified by the `secretType` field is a hint to the ope [Getting Started with Authentication](https://cloud.google.com/docs/authentication/getting-started) for more information on how to authenticate with GCP services using a service account JSON key file. Note that the operator assumes that the key of the service account JSON key file in the Secret data map is **`key.json`** so it is able to set the environment variable automatically. Similarly, if the type of a Secret is **`HadoopDelegationToken`**, the operator additionally sets the environment variable **`HADOOP_TOKEN_FILE_LOCATION`** to point to the file storing the Hadoop delegation token. In this case, the operator assumes that the key of the delegation token file in the Secret data map is **`hadoop.token`**. The `secretType` field should have the value `Generic` if no extra configuration is required. -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Mounting ConfigMaps +## Mounting ConfigMaps Both the driver specification and executor specifications have an optional field for configuring the list of Kubernetes ConfigMaps to be mounted into the driver and executors, respectively. The field is a map with keys being the names of the ConfigMaps and values specifying the mount path of each ConfigMap. 
For instance, the following example shows a driver specification with a ConfigMap named `configmap1` to be mounted to `/mnt/config-maps` in the driver pod. @@ -333,25 +277,25 @@ spec: path: /mnt/config-maps ``` -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -#### Mounting a ConfigMap storing Spark Configuration Files +## Mounting a ConfigMap storing Spark Configuration Files A `SparkApplication` can specify a Kubernetes ConfigMap storing Spark configuration files such as `spark-env.sh` or `spark-defaults.conf` using the optional field `.spec.sparkConfigMap` whose value is the name of the ConfigMap. The ConfigMap is assumed to be in the same namespace as that of the `SparkApplication`. The operator mounts the ConfigMap onto path `/etc/spark/conf` in both the driver and executors. Additionally, it also sets the environment variable `SPARK_CONF_DIR` to point to `/etc/spark/conf` in the driver and executors. Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -#### Mounting a ConfigMap storing Hadoop Configuration Files +## Mounting a ConfigMap storing Hadoop Configuration Files A `SparkApplication` can specify a Kubernetes ConfigMap storing Hadoop configuration files such as `core-site.xml` using the optional field `.spec.hadoopConfigMap` whose value is the name of the ConfigMap. The ConfigMap is assumed to be in the same namespace as that of the `SparkApplication`. The operator mounts the ConfigMap onto path `/etc/hadoop/conf` in both the driver and executors. Additionally, it also sets the environment variable `HADOOP_CONF_DIR` to point to `/etc/hadoop/conf` in the driver and executors. -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Mounting Volumes +## Mounting Volumes The operator also supports mounting user-specified Kubernetes volumes into the driver and executors. A -`SparkApplication` has an optional field `.spec.volumes` for specifying the list of [volumes](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#volume-v1-core) the driver and the executors need collectively. Then both the driver and executor specifications have an optional field `volumeMounts` that specifies the [volume mounts](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#volumemount-v1-core) for the volumes needed by the driver and executors, respectively. The following is an example showing a `SparkApplication` with both driver and executor volume mounts. +`SparkApplication` has an optional field `.spec.volumes` for specifying the list of [volumes](https://kubernetes.io/docs/reference/kubernetes-api/config-and-storage-resources/volume/) the driver and the executors need collectively. 
Then both the driver and executor specifications have an optional field `volumeMounts` that specifies the [volume mounts](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#volumes-1) for the volumes needed by the driver and executors, respectively. The following is an example showing a `SparkApplication` with both driver and executor volume mounts. ```yaml spec: @@ -375,9 +319,9 @@ spec: ``` -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Using Secrets As Environment Variables +## Using Secrets As Environment Variables **Note: `envSecretKeyRefs` is deprecated and will be removed in a future API version.** @@ -396,7 +340,7 @@ spec: key: password ``` -### Using Image Pull Secrets +## Using Image Pull Secrets **Note that this feature requires an image based on the latest Spark master branch.** @@ -409,7 +353,7 @@ spec: - secret2 ``` -### Using Pod Affinity +## Using Pod Affinity A `SparkApplication` can specify an `Affinity` for the driver or executor pod, using the optional field `.spec.driver.affinity` or `.spec.executor.affinity`. Below is an example: @@ -427,9 +371,9 @@ spec: ... ``` -Note that the mutating admission webhook is needed to use this feature. Please refer to the [Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +Note that the mutating admission webhook is needed to use this feature. Please refer to the [Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Using Tolerations +## Using Tolerations A `SparkApplication` can specify an `Tolerations` for the driver or executor pod, using the optional field `.spec.driver.tolerations` or `.spec.executor.tolerations`. Below is an example: @@ -450,9 +394,9 @@ spec: ``` Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Using Security Context +## Using Security Context A `SparkApplication` can specify a `SecurityContext` for the driver or executor containers, using the optional field `.spec.driver.securityContext` or `.spec.executor.securityContext`. `SparkApplication` can also specify a `PodSecurityContext` for the driver or executor pod, using the optional field `.spec.driver.podSecurityContext` or `.spec.executor.podSecurityContext`. Below is an example: @@ -474,11 +418,11 @@ spec: ``` Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Using Sidecar Containers +## Using Sidecar Containers -A `SparkApplication` can specify one or more optional sidecar containers for the driver or executor pod, using the optional field `.spec.driver.sidecars` or `.spec.executor.sidecars`. The specification of each sidecar container follows the [Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.14/#container-v1-core) API definition. 
Below is an example: +A `SparkApplication` can specify one or more optional sidecar containers for the driver or executor pod, using the optional field `.spec.driver.sidecars` or `.spec.executor.sidecars`. The specification of each sidecar container follows the [Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#container-v1-core) API definition. Below is an example: ```yaml spec: @@ -495,11 +439,11 @@ spec: ``` Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. -### Using Init-Containers +## Using Init-Containers -A `SparkApplication` can optionally specify one or more [init-containers](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) for the driver or executor pod, using the optional field `.spec.driver.initContainers` or `.spec.executor.initContainers`, respectively. The specification of each init-container follows the [Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.14/#container-v1-core) API definition. Below is an example: +A `SparkApplication` can optionally specify one or more [init-containers](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) for the driver or executor pod, using the optional field `.spec.driver.initContainers` or `.spec.executor.initContainers`, respectively. The specification of each init-container follows the [Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#container-v1-core) API definition. Below is an example: ```yaml spec: @@ -516,9 +460,10 @@ spec: ``` Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. + +## Using DNS Settings -### Using DNS Settings A `SparkApplication` can define DNS settings for the driver and/or executor pod, by adding the standard [DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config) kubernetes settings. Fields to add such configuration are `.spec.driver.dnsConfig` and `.spec.executor.dnsConfig`. Example: ```yaml @@ -537,15 +482,15 @@ spec: ``` Note that the mutating admission webhook is needed to use this feature. Please refer to the -[Quick Start Guide](quick-start-guide.md) on how to enable the mutating admission webhook. +[Getting Started](../getting-started.md) on how to enable the mutating admission webhook. + +## Using Volume For Scratch Space -### Using Volume For Scratch Space By default, Spark uses temporary scratch space to spill data to disk during shuffles and other operations. The scratch directory defaults to `/tmp` of the container. If that storage isn't enough or you want to use a specific path, you can use one or more volumes. The volume names should start with `spark-local-dir-`. - ```yaml spec: volumes: @@ -569,7 +514,6 @@ Environment: SPARK_CONF_DIR: /opt/spark/conf ``` - > Note: Multiple volumes can be used together ```yaml @@ -604,7 +548,7 @@ spec: mountPath: "/tmp/dir1" ``` -### Using Termination Grace Period +## Using Termination Grace Period A Spark Application can optionally specify a termination grace Period seconds to the driver and executor pods. 
More [info](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods) @@ -614,7 +558,8 @@ spec: terminationGracePeriodSeconds: 60 ``` -### Using Container LifeCycle Hooks +## Using Container LifeCycle Hooks + A Spark Application can optionally specify a [Container Lifecycle Hooks](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks) for a driver. It is useful in cases where you need a PreStop or PostStart hooks to driver and executor. ```yaml @@ -628,10 +573,10 @@ spec: - -c - touch /var/run/killspark && sleep 65 ``` -In cases like Spark Streaming or Spark Structured Streaming applications, you can test if a file exists to start a graceful shutdown and stop all streaming queries manually. +In cases like Spark Streaming or Spark Structured Streaming applications, you can test if a file exists to start a graceful shutdown and stop all streaming queries manually. -### Python Support +## Python Support Python support can be enabled by setting `.spec.mainApplicationFile` with path to your python application. Optionally, the `.spec.pythonVersion` field can be used to set the major Python version of the docker image used to run the driver and executor containers. Below is an example showing part of a `SparkApplication` specification: @@ -654,20 +599,20 @@ spec: In order to use the dependencies that are hosted remotely, the following PySpark code can be used in Spark 2.4. -``` +```python python_dep_file_path = SparkFiles.get("python-dep.zip") spark.sparkContext.addPyFile(dep_file_path) ``` Note that Python binding for PySpark is available in Apache Spark 2.4. -### Monitoring +## Monitoring -The operator supports using the Spark metric system to expose metrics to a variety of sinks. Particularly, it is able to automatically configure the metric system to expose metrics to [Prometheus](https://prometheus.io/). Specifically, the field `.spec.monitoring` specifies how application monitoring is handled and particularly how metrics are to be reported. The metric system is configured through the configuration file `metrics.properties`, which gets its content from the field `.spec.monitoring.metricsProperties`. The content of [metrics.properties](../spark-docker/conf/metrics.properties) will be used by default if `.spec.monitoring.metricsProperties` is not specified. `.spec.monitoring.metricsPropertiesFile` overwrite the value `spark.metrics.conf` in spark.properties, and will not use content from `.spec.monitoring.metricsProperties`. You can choose to enable or disable reporting driver and executor metrics using the fields `.spec.monitoring.exposeDriverMetrics` and `.spec.monitoring.exposeExecutorMetrics`, respectively. +The operator supports using the Spark metric system to expose metrics to a variety of sinks. Particularly, it is able to automatically configure the metric system to expose metrics to [Prometheus](https://prometheus.io/). Specifically, the field `.spec.monitoring` specifies how application monitoring is handled and particularly how metrics are to be reported. The metric system is configured through the configuration file `metrics.properties`, which gets its content from the field `.spec.monitoring.metricsProperties`. The content of [metrics.properties](https://github.com/kubeflow/spark-operator/blob/master/spark-docker/conf/metrics.properties) will be used by default if `.spec.monitoring.metricsProperties` is not specified. 
-### Monitoring
+## Monitoring

-The operator supports using the Spark metric system to expose metrics to a variety of sinks. Particularly, it is able to automatically configure the metric system to expose metrics to [Prometheus](https://prometheus.io/). Specifically, the field `.spec.monitoring` specifies how application monitoring is handled and particularly how metrics are to be reported. The metric system is configured through the configuration file `metrics.properties`, which gets its content from the field `.spec.monitoring.metricsProperties`. The content of [metrics.properties](../spark-docker/conf/metrics.properties) will be used by default if `.spec.monitoring.metricsProperties` is not specified. `.spec.monitoring.metricsPropertiesFile` overwrite the value `spark.metrics.conf` in spark.properties, and will not use content from `.spec.monitoring.metricsProperties`. You can choose to enable or disable reporting driver and executor metrics using the fields `.spec.monitoring.exposeDriverMetrics` and `.spec.monitoring.exposeExecutorMetrics`, respectively.
+The operator supports using the Spark metric system to expose metrics to a variety of sinks. In particular, it is able to automatically configure the metric system to expose metrics to [Prometheus](https://prometheus.io/). The field `.spec.monitoring` specifies how application monitoring is handled and, in particular, how metrics are to be reported. The metric system is configured through the configuration file `metrics.properties`, which gets its content from the field `.spec.monitoring.metricsProperties`. The content of [metrics.properties](https://github.com/kubeflow/spark-operator/blob/master/spark-docker/conf/metrics.properties) will be used by default if `.spec.monitoring.metricsProperties` is not specified. `.spec.monitoring.metricsPropertiesFile` overrides the value of `spark.metrics.conf` in `spark.properties`, in which case the content of `.spec.monitoring.metricsProperties` is not used. You can choose to enable or disable reporting driver and executor metrics using the fields `.spec.monitoring.exposeDriverMetrics` and `.spec.monitoring.exposeExecutorMetrics`, respectively.

-Further, the field `.spec.monitoring.prometheus` specifies how metrics are exposed to Prometheus using the [Prometheus JMX exporter](https://github.com/prometheus/jmx_exporter). When `.spec.monitoring.prometheus` is specified, the operator automatically configures the JMX exporter to run as a Java agent. The only required field of `.spec.monitoring.prometheus` is `jmxExporterJar`, which specified the path to the Prometheus JMX exporter Java agent jar in the container. If you use the image `gcr.io/spark-operator/spark:v3.1.1-gcs-prometheus`, the jar is located at `/prometheus/jmx_prometheus_javaagent-0.11.0.jar`. The field `.spec.monitoring.prometheus.port` specifies the port the JMX exporter Java agent binds to and defaults to `8090` if not specified. The field `.spec.monitoring.prometheus.configuration` specifies the content of the configuration to be used with the JMX exporter. The content of [prometheus.yaml](../spark-docker/conf/prometheus.yaml) will be used by default if `.spec.monitoring.prometheus.configuration` is not specified.
+Further, the field `.spec.monitoring.prometheus` specifies how metrics are exposed to Prometheus using the [Prometheus JMX exporter](https://github.com/prometheus/jmx_exporter). When `.spec.monitoring.prometheus` is specified, the operator automatically configures the JMX exporter to run as a Java agent. The only required field of `.spec.monitoring.prometheus` is `jmxExporterJar`, which specifies the path to the Prometheus JMX exporter Java agent jar in the container. If you use the image `gcr.io/spark-operator/spark:v3.1.1-gcs-prometheus`, the jar is located at `/prometheus/jmx_prometheus_javaagent-0.11.0.jar`. The field `.spec.monitoring.prometheus.port` specifies the port the JMX exporter Java agent binds to and defaults to `8090` if not specified. The field `.spec.monitoring.prometheus.configuration` specifies the content of the configuration to be used with the JMX exporter. The content of [prometheus.yaml](https://github.com/kubeflow/spark-operator/blob/master/spark-docker/conf/prometheus.yaml) will be used by default if `.spec.monitoring.prometheus.configuration` is not specified.

-Below is an example that shows how to configure the metric system to expose metrics to Prometheus using the Prometheus JMX exporter. Note that the JMX exporter Java agent jar is listed as a dependency and will be downloaded to where `.spec.dep.jarsDownloadDir` points to in Spark 2.3.x, which is `/var/spark-data/spark-jars` by default. Things are different in Spark 2.4 as dependencies will be downloaded to the local working directory instead in Spark 2.4. A complete example can be found in [examples/spark-pi-prometheus.yaml](../examples/spark-pi-prometheus.yaml).
+Below is an example that shows how to configure the metric system to expose metrics to Prometheus using the Prometheus JMX exporter. Note that the JMX exporter Java agent jar is listed as a dependency and will be downloaded to where `.spec.dep.jarsDownloadDir` points to in Spark 2.3.x, which is `/var/spark-data/spark-jars` by default. In Spark 2.4, dependencies are instead downloaded to the local working directory. A complete example can be found in [examples/spark-pi-prometheus.yaml](https://github.com/kubeflow/spark-operator/blob/master/examples/spark-pi-prometheus.yaml).

```yaml
spec:
@@ -682,7 +627,7 @@ spec:
```

The operator automatically adds annotations such as `prometheus.io/scrape=true` on the driver and/or executor pods (depending on the values of `.spec.monitoring.exposeDriverMetrics` and `.spec.monitoring.exposeExecutorMetrics`) so the metrics exposed on the pods can be scraped by the Prometheus server in the same cluster.

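For instance, a `.spec.monitoring` block along these lines might look like the following sketch (the jar path matches the image mentioned above, and the port simply spells out the default):

```yaml
# Sketch only: expose driver and executor metrics via the Prometheus JMX exporter.
spec:
  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      jmxExporterJar: "/prometheus/jmx_prometheus_javaagent-0.11.0.jar"
      port: 8090
```
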
-### Dynamic Allocation
+## Dynamic Allocation

The operator supports a limited form of [Spark Dynamic Resource Allocation](http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation) through the shuffle tracking enhancement introduced in Spark 3.0.0 *without needing an external shuffle service* (which is not available in Kubernetes mode). See this [issue](https://issues.apache.org/jira/browse/SPARK-27963) for details on the enhancement. To enable this limited form of dynamic allocation, follow the example below:

@@ -696,153 +641,3 @@ spec:
```

Note that if dynamic allocation is enabled, the number of executors to request initially is set to the bigger of `.spec.dynamicAllocation.initialExecutors` and `.spec.executor.instances` if both are set.
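
For reference, a minimal `.spec.dynamicAllocation` block might be sketched as follows (the executor counts are illustrative, and `minExecutors`/`maxExecutors` are assumed from the API definition rather than described above):

```yaml
# Sketch only: shuffle-tracking-based dynamic allocation with illustrative bounds.
spec:
  dynamicAllocation:
    enabled: true
    initialExecutors: 2
    minExecutors: 2
    maxExecutors: 10
```
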
-
-## Working with SparkApplications
-
-### Creating a New SparkApplication
-
-A `SparkApplication` can be created from a YAML file storing the `SparkApplication` specification using either the `kubectl apply -f ` command or the `sparkctl create ` command. Please refer to the `sparkctl` [README](../sparkctl/README.md#create) for usage of the `sparkctl create` command. Once a `SparkApplication` is successfully created, the operator will receive it and submit the application, as configured in the specification, to run on the Kubernetes cluster.
-
-### Deleting a SparkApplication
-
-A `SparkApplication` can be deleted using either the `kubectl delete ` command or the `sparkctl delete ` command. Please refer to the `sparkctl` [README](../sparkctl/README.md#delete) for usage of the `sparkctl delete`
-command. Deleting a `SparkApplication` deletes the Spark application associated with it. If the application is running when the deletion happens, the application is killed and all Kubernetes resources associated with the application are deleted or garbage collected.
-
-### Updating a SparkApplication
-
-A `SparkApplication` can be updated using the `kubectl apply -f ` command. When a `SparkApplication` is successfully updated, the operator will receive both the updated and old `SparkApplication` objects. If the specification of the `SparkApplication` has changed, the operator submits the application to run, using the updated specification. If the application is currently running, the operator kills the running application before submitting a new run with the updated specification. There is planned work to enhance the way `SparkApplication` updates are handled. For example, if the change was to increase the number of executor instances, instead of killing the currently running application and starting a new run, it is a much better user experience to incrementally launch the additional executor pods.
-
-### Checking a SparkApplication
-
-A `SparkApplication` can be checked using the `kubectl describe sparkapplications ` command. The output of the command shows the specification and status of the `SparkApplication` as well as events associated with it. The events communicate the overall process and errors of the `SparkApplication`.
-
-### Configuring Automatic Application Restart and Failure Handling
-
-The operator supports automatic application restart with a configurable `RestartPolicy` using the optional field
-`.spec.restartPolicy`. The following is an example `RestartPolicy`:
-
- ```yaml
- restartPolicy:
-    type: OnFailure
-    onFailureRetries: 3
-    onFailureRetryInterval: 10
-    onSubmissionFailureRetries: 5
-    onSubmissionFailureRetryInterval: 20
-```
-The valid types of `restartPolicy` include `Never`, `OnFailure`, and `Always`. Upon termination of an application,
-the operator determines if the application is subject to restart based on its termination state and the
-`RestartPolicy` in the specification. If the application is subject to restart, the operator restarts it by
-submitting a new run of it. For `OnFailure`, the operator further supports setting limits on the number of retries
-via the `onFailureRetries` and `onSubmissionFailureRetries` fields. Additionally, if the limit on submission retries has not been reached,
-the operator retries submitting the application using a linear backoff with the interval specified by
-`onFailureRetryInterval` and `onSubmissionFailureRetryInterval`, which are required for both the `OnFailure` and `Always` `RestartPolicy`.
-The old resources like the driver pod, UI service/ingress, etc. are deleted if they still exist before the new run is submitted, and a new driver pod is created by the submission
-client, so effectively the driver gets restarted.
-
-### Setting TTL for a SparkApplication
-
-The `v1beta2` version of the `SparkApplication` API adds TTL support for `SparkApplication`s through a new optional field named `.spec.timeToLiveSeconds`, which, if set, defines the Time-To-Live (TTL) duration in seconds for a `SparkApplication` after its termination. The `SparkApplication` object will be garbage collected if the current time is more than `.spec.timeToLiveSeconds` seconds past its termination. The example below illustrates how to use the field:
-
-```yaml
-spec:
-  timeToLiveSeconds: 3600
-```
-
-Note that this feature requires informer cache resync to be enabled, which is true by default with a resync interval of 30 seconds. You can change the resync interval by setting the flag `-resync-interval=`.
-
-## Running Spark Applications on a Schedule using a ScheduledSparkApplication
-
-The operator supports running a Spark application on a standard [cron](https://en.wikipedia.org/wiki/Cron) schedule using objects of the `ScheduledSparkApplication` custom resource type. A `ScheduledSparkApplication` object specifies a cron schedule on which the application should run and a `SparkApplication` template from which a `SparkApplication` object for each run of the application is created.
The following is an example `ScheduledSparkApplication`:
-
-```yaml
-apiVersion: "sparkoperator.k8s.io/v1beta2"
-kind: ScheduledSparkApplication
-metadata:
-  name: spark-pi-scheduled
-  namespace: default
-spec:
-  schedule: "@every 5m"
-  concurrencyPolicy: Allow
-  successfulRunHistoryLimit: 1
-  failedRunHistoryLimit: 3
-  template:
-    type: Scala
-    mode: cluster
-    image: gcr.io/spark/spark:v3.1.1
-    mainClass: org.apache.spark.examples.SparkPi
-    mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
-    driver:
-      cores: 1
-      memory: 512m
-    executor:
-      cores: 1
-      instances: 1
-      memory: 512m
-    restartPolicy:
-      type: Never
-```
-
-The concurrency of runs of an application is controlled by `.spec.concurrencyPolicy`, whose valid values are `Allow`, `Forbid`, and `Replace`, with `Allow` being the default. The meaning of each value is described below:
-* `Allow`: more than one run of an application is allowed if, for example, the next run of the application is due even though the previous run has not completed yet.
-* `Forbid`: no more than one run of an application is allowed. The next run of the application can only start if the previous run has completed.
-* `Replace`: no more than one run of an application is allowed. When the next run of the application is due, the previous run is killed and the next run starts as a replacement.
-
-A `ScheduledSparkApplication` can be temporarily suspended (no future scheduled runs of the application will be triggered) by setting `.spec.suspend` to `true`. The schedule can be resumed by removing `.spec.suspend` or setting it to `false`. A `ScheduledSparkApplication` can have the names of `SparkApplication` objects for the past runs of the application tracked in the `Status` section as discussed below. The numbers of past successful runs and past failed runs to keep track of are controlled by the fields `.spec.successfulRunHistoryLimit` and `.spec.failedRunHistoryLimit`, respectively. The example above allows 1 past successful run and 3 past failed runs to be tracked.
-
-The `Status` section of a `ScheduledSparkApplication` object shows the time of the last run and the proposed time of the next run of the application, through `.status.lastRun` and `.status.nextRun`, respectively. The name of the `SparkApplication` object for the most recent run (which may or may not be running) of the application is stored in `.status.lastRunName`. The names of `SparkApplication` objects of the past successful runs of the application are stored in `.status.pastSuccessfulRunNames`. Similarly, the names of `SparkApplication` objects of the past failed runs of the application are stored in `.status.pastFailedRunNames`.
-
-Note that certain restart policies (specified in `.spec.template.restartPolicy`) may not work well with the specified schedule and concurrency policy of a `ScheduledSparkApplication`. For example, a restart policy of `Always` should never be used with a `ScheduledSparkApplication`. In most cases, a restart policy of `OnFailure` may not be a good choice either, as the next run usually picks up where the previous run left off anyway. For these reasons, it's often the right choice to use a restart policy of `Never` as the example above shows.
-
-## Enabling Leader Election for High Availability
-
-The operator supports a high-availability (HA) mode, in which there can be more than one replica of the operator, with only one of the replicas (the leader replica) actively operating.
If the leader replica fails, the leader election process is engaged again to determine a new leader from the replicas available. The HA mode can be enabled through an optional leader election process. Leader election is disabled by default but can be enabled via a command-line flag. The following table summarizes the command-line flags relevant to leader election:
-
-| Flag | Default Value | Description |
-| ------------- | ------------- | ------------- |
-| `leader-election` | `false` | Whether to enable leader election (or the HA mode) or not. |
-| `leader-election-lock-namespace` | `spark-operator` | Kubernetes namespace of the lock resource used for leader election. |
-| `leader-election-lock-name` | `spark-operator-lock` | Name of the lock resource used for leader election. |
-| `leader-election-lease-duration` | 15 seconds | Leader election lease duration. |
-| `leader-election-renew-deadline` | 14 seconds | Leader election renew deadline. |
-| `leader-election-retry-period` | 4 seconds | Leader election retry period. |
-
-## Enabling Resource Quota Enforcement
-
-The Spark Operator provides limited support for resource quota enforcement using a validating webhook. It will count the resources of non-terminal-phase SparkApplications and Pods, and determine whether a requested SparkApplication will fit given the remaining resources. ResourceQuota scope selectors are not supported; any ResourceQuota object that does not match the entire namespace will be ignored. Like the native Pod quota enforcement, current usage is updated asynchronously, so some overscheduling is possible.
-
-If you are running Spark applications in namespaces that are subject to resource quota constraints, consider enabling this feature to avoid driver resource starvation. Quota enforcement can be enabled with the command-line argument `-enable-resource-quota-enforcement=true`. It is recommended to also set `-webhook-fail-on-error=true`.
-
-## Running Multiple Instances Of The Operator Within The Same K8s Cluster
-
-If you need to run multiple instances of the operator within the same Kubernetes cluster, you need to make sure that the running instances do not compete for the same custom resources or pods. You can achieve this in one of two ways:
-
-Either:
-* By specifying a different `namespace` flag for each instance of the operator.
-
-Or, if you want your operator to watch specific resources that may exist in different namespaces:
-
-* Add custom labels to resources by defining a different set of labels in `-label-selector-filter` (e.g. `env=dev,app-type=spark`) for each instance of the operator.
-* Run different `webhook` instances by specifying a different `-webhook-config-name` flag for each deployment of the operator.
-* Specify different `webhook-svc-name` and/or `webhook-svc-namespace` values for each instance of the operator.
-* Edit the job that generates the certificates, `webhook-init`, by specifying the namespace and the service name of each instance of the operator, e.g. `command: ["/usr/bin/gencerts.sh", "-n", "ns-op1", "-s", "spark-op1-webhook", "-p"]`, where `spark-op1-webhook` should match what you have specified in `webhook-svc-name`.
For instance, if you use the following [helm chart](https://github.com/helm/charts/tree/master/incubator/sparkoperator) to deploy the operator, you may specify different `--namespace` and `--name-template` arguments for each instance of the operator to make sure you generate a different certificate for each instance, e.g.:
-```
-helm install spark-op1 incubator/sparkoperator --namespace ns-op1
-helm install spark-op2 incubator/sparkoperator --namespace ns-op2
-```
-This will run two `webhook-init` jobs. Each job executes its respective command:
-```
-command: ["/usr/bin/gencerts.sh", "-n", "ns-op1", "-s", "spark-op1-webhook", "-p"]
-command: ["/usr/bin/gencerts.sh", "-n", "ns-op2", "-s", "spark-op2-webhook", "-p"]
-```
-
-* Although resources are already filtered with respect to the labels specified on them, you may also specify different labels in `-webhook-namespace-selector` and attach these labels to the namespaces that you want the webhook to listen on.
-
-## Customizing the Operator
-
-To customize the operator, you can follow the steps below:
-
-1. Compile the Spark distribution with Kubernetes support as per the [Spark documentation](https://spark.apache.org/docs/latest/building-spark.html#building-with-kubernetes-support).
-2. Create Docker images to be used for Spark with the [docker-image tool](https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images).
-3. Create a new operator image based on the above image. You need to modify the `FROM` tag in the [Dockerfile](https://github.com/kubeflow/spark-operator/blob/master/Dockerfile) to point to your Spark image.
-4. Build and push the operator image built above.
-5. Deploy the new image by modifying the [/manifest/spark-operator-install/spark-operator.yaml](https://github.com/kubeflow/spark-operator/blob/master/manifest/spark-operator-install/spark-operator.yaml) file and specifying your operator image, as sketched below.
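
As a rough sketch of step 5 (the container name is illustrative and the image reference is a placeholder), the fragment to change in that manifest would look something like:

```yaml
# Sketch only: point the operator Deployment at the custom image built above.
spec:
  template:
    spec:
      containers:
        - name: spark-operator
          image: <your-registry>/spark-operator:<your-tag>
```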