diff --git a/docs/guidance/volcano-integration.md b/docs/guidance/volcano-integration.md
index 3b2420ebcb..10b08480d4 100644
--- a/docs/guidance/volcano-integration.md
+++ b/docs/guidance/volcano-integration.md
@@ -6,76 +6,90 @@ Note that this is a new feature. Feedback and contributions welcome.
 
 ## Setup
 
-### Install Volcano
+### Step 1: Create a Kubernetes cluster with KinD
+```shell
+kind create cluster
+```
 
-Volcano needs to be successfully installed in your Kubernetes cluster before enabling Volcano integration with KubeRay. Refer to the [Quick Start Guide](https://github.com/volcano-sh/volcano#quick-start-guide) for Volcano installation instructions.
+### Step 2: Install Volcano
 
-### Install KubeRay Operator with Batch Scheduling
+Volcano needs to be successfully installed in your Kubernetes cluster before enabling Volcano integration with KubeRay.
+Refer to the [Quick Start Guide](https://github.com/volcano-sh/volcano#quick-start-guide) for Volcano installation instructions.
 
-Deploy the KubeRay Operator with the `--enable-batch-scheduler` flag to enable Volcano batch scheduling support.
+### Step 3: Install KubeRay Operator with Batch Scheduling
 
-When installing via Helm, you can set the following in your `values.yaml` file:
+Deploy the KubeRay Operator with the `--enable-batch-scheduler` flag to enable Volcano batch scheduling support.
 
-```
+When installing KubeRay Operator via Helm, you should either set `batchScheduler.enabled` to `true` in your
+[`values.yaml`](https://github.com/ray-project/kuberay/blob/753dc05dbed5f6fe61db3a43b34a1b350f26324c/helm-chart/kuberay-operator/values.yaml#L48)
+file:
+```shell
+# values.yaml file
 batchScheduler:
   enabled: true
 ```
 
-Or on the command line:
-
-```
-# helm install kuberay-operator --set batchScheduler.enabled=true
+**or** pass `--set batchScheduler.enabled=true` flag when running on the command line:
+```shell
+# Install Helm chart with --enable-batch-scheduler flag set to true
+helm install kuberay-operator kuberay/kuberay-operator --version ${KUBERAY_VERSION} --set batchScheduler.enabled=true
 ```
 
-## Run Ray Cluster with Volcano scheduler
+Follow the [KubeRay installation documentation](https://github.com/ray-project/kuberay/blob/master/helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator.
 
-Add the `ray.io/scheduler-name: volcano` label to your RayCluster CR to submit the cluster pods to Volcano for scheduling.
+### Step 4: Install a RayCluster with Volcano scheduler
 
-Example:
+RayCluster custom resource must include label `ray.io/scheduler-name: volcano` to submit the cluster pods to Volcano for scheduling.
 
-```
-apiVersion: ray.io/v1alpha1
-kind: RayCluster
-metadata:
-  name: test-cluster
-  labels:
-    ray.io/scheduler-name: volcano
-    volcano.sh/queue-name: kuberay-test-queue
-spec:
-  rayVersion: '2.6.3'
-  headGroupSpec:
-    rayStartParams: {}
-    replicas: 1
-    template:
-      spec:
-        containers:
-        - name: ray-head
-          image: rayproject/ray:2.6.3
-          resources:
-            limits:
-              cpu: "1"
-              memory: "2Gi"
-            requests:
-              cpu: "1"
-              memory: "2Gi"
-  workerGroupSpecs: []
+```shell
+# Path: kuberay/ray-operator/config/samples
+# Includes label `ray.io/scheduler-name: volcano` in the metadata.labels
+kubectl apply -f ray-cluster.volcano-scheduler.yaml
+
+# Check RayCluster
+kubectl get pod -l ray.io/cluster=test-cluster-0
+# NAME                        READY   STATUS    RESTARTS   AGE
+# test-cluster-0-head-jj9bg   1/1     Running   0          36s
 ```
 
-The following labels can also be provided in the RayCluster metadata:
+In addition, the following labels can also be provided in the RayCluster metadata:
 
 - `ray.io/priority-class-name`: the cluster priority class as defined by Kubernetes [here](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass).
+  - This label will only work after the creation of a `PriorityClass` resource
+  - ```shell
+    labels:
+      ray.io/scheduler-name: volcano
+      ray.io/priority-class-name:
+    ```
 - `volcano.sh/queue-name`: the Volcano [queue](https://volcano.sh/en/docs/queue/) name the cluster will be submitted to.
+  - This label will only work after the creation of a `Queue` resource
+  - ```shell
+    labels:
+      ray.io/scheduler-name: volcano
+      volcano.sh/queue-name:
+    ```
 
 If autoscaling is enabled, `minReplicas` will be used for gang scheduling, otherwise the desired `replicas` will be used.
 
-### Example: Gang scheduling
+### Step 5: Use Volcano for batch scheduling
+
+If you need some guidance, check out [examples](https://github.com/volcano-sh/volcano/tree/master/example) available.
+
+## Example
+
+Before going through the example, remove any ray clusters running to ensure successful run through of the example below.
+```shell
+kubectl delete raycluster --all
+```
+
+### Gang scheduling
 
 In this example, we'll walk through how gang scheduling works with Volcano and KubeRay.
 
 First, let's create a queue with a capacity of 4 CPUs and 6Gi of RAM:
 
-```
-$ kubectl create -f - <