Add a document to outline the default settings for rayStartParams in Kuberay (ray-project#1057)

Add a document to outline the default settings for `rayStartParams` in Kuberay
Yicheng-Lu-llll authored May 22, 2023
1 parent a794d1f commit 676e99f
Showing 16 changed files with 159 additions and 27 deletions.
37 changes: 37 additions & 0 deletions docs/guidance/rayStartParams.md
@@ -0,0 +1,37 @@

## Default Ray Start Parameters for KubeRay

This document outlines the default settings for `rayStartParams` in KubeRay.


### Options Exclusive to the Head Pod

- `--dashboard-host`: Host for the dashboard server, either `localhost` (127.0.0.1) or `0.0.0.0`.
The latter setting exposes the Ray dashboard outside the Ray cluster, which is required when [ingress](https://github.com/ray-project/kuberay/blob/master/docs/guidance/ingress.md) is used to access the Ray cluster.
The default value for both Ray and KubeRay 0.5.0 is `localhost`. In KubeRay versions later than 0.5.0, the default will change to `0.0.0.0`.

- `--no-monitor`: This option disables the monitor and autoscaler in the **user's container**. KubeRay sets it automatically when [autoscaling](https://github.com/ray-project/kuberay/blob/master/docs/guidance/autoscaler.md) is enabled. The autoscaling feature runs the autoscaler as a sidecar container within the head pod, so the monitor and autoscaler in the **user's container** are no longer needed. See [PR #13505](https://github.com/ray-project/ray/pull/13505) for more details. Modification is not recommended.


- `--port`: Port for the GCS server. The port is set to `6379` by default. Please ensure that this value matches the `gcs-server` container port in the Ray head container.

- `--redis-password`: Redis password for an external Redis, necessary when [fault tolerance](https://github.com/ray-project/kuberay/blob/master/docs/guidance/gcs-ft.md) is enabled.
Starting with Ray 2.3.0, the default value is `""` (previously `"5241590000000000"`). See [#929](https://github.com/ray-project/kuberay/pull/929) for more details.
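
As an illustration of the head-only options above, the following fragment sketches a `headGroupSpec` that exposes the dashboard and keeps `port` in sync with the `gcs-server` container port. It is adapted from the sample manifests in this repository and is not a complete `RayCluster`; the image tag and the explicit `port` value are only examples.

```yaml
# Fragment of a RayCluster spec (not a complete manifest).
headGroupSpec:
  rayStartParams:
    # Expose the dashboard outside the Ray cluster, e.g. for ingress.
    dashboard-host: '0.0.0.0'
    # Optional: 6379 is already the default GCS server port.
    port: '6379'
  template:
    spec:
      containers:
      - name: ray-head
        image: rayproject/ray:2.4.0
        ports:
        # Must match the `port` value in rayStartParams above.
        - containerPort: 6379
          name: gcs-server
```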

### Options Exclusive to the Worker Pods

- `--address`: Address of the GCS server. Worker pods use this address to connect to the Ray cluster. By default, this address takes the form `<FQDN>:<GCS_PORT>`, where `GCS_PORT` is the value set in the `--port` option. For more details on the fully qualified domain name (FQDN), refer to [PR #938](https://github.com/ray-project/kuberay/pull/938) and [PR #951](https://github.com/ray-project/kuberay/pull/951).
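
In practice, this means a worker group can usually leave `rayStartParams` empty and rely on the default address. The fragment below is a sketch based on the sample manifests in this repository; the group name and replica bounds are illustrative.

```yaml
# Fragment of a RayCluster spec (not a complete manifest).
workerGroupSpecs:
- groupName: small-group
  minReplicas: 1
  maxReplicas: 10
  # `address` is intentionally not set here: by default the worker pods
  # receive an address of the form <FQDN>:<GCS_PORT>.
  rayStartParams: {}
```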

### Options Applicable to Both Head and Worker Pods

- `--block`: This option blocks the `ray start` command indefinitely. KubeRay sets it automatically. See [PR #675](https://github.com/ray-project/kuberay/pull/675) for more details. Modification is not recommended.

- `--memory`: Amount of memory on this Ray node. Default is determined by Ray container resource limits. Modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170).

- `--metrics-export-port`: Port for exposing Ray metrics through a Prometheus endpoint. The port is set to `8080` by default. Please ensure that this value matches the `metrics` container port if you need to customize it. See [PR #954](https://github.com/ray-project/kuberay/pull/954) and [prometheus-grafana doc](https://github.com/ray-project/kuberay/blob/master/docs/guidance/prometheus-grafana.md) for more details.

- `--num-cpus`: Number of logical CPUs on this Ray node. Default is determined by Ray container resource limits. In most cases, modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170). However, it is sometimes useful to override this autodetected value. For example, setting `num-cpus: "0"` for the Ray head pod will prevent Ray workloads with non-zero CPU requirements from being scheduled on the head.

- `--num-gpus`: Number of GPUs on this Ray node. Default is determined by Ray container resource limits. Modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170).
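
To illustrate the resource-related options, here is a hedged sketch of a head group that relies on the container limits for memory while overriding `num-cpus` to keep workloads off the head pod. It is a fragment, not a complete manifest; the resource values are placeholders, and the `metrics` port entry simply restates the default.

```yaml
# Fragment of a RayCluster spec (not a complete manifest).
headGroupSpec:
  rayStartParams:
    # Override the autodetected CPU count so that workloads with non-zero
    # CPU requirements are not scheduled on the head pod.
    num-cpus: '0'
  template:
    spec:
      containers:
      - name: ray-head
        image: rayproject/ray:2.4.0
        resources:
          # `memory` (and `num-cpus`/`num-gpus` when not overridden)
          # default to values derived from these limits.
          limits:
            cpu: "1"
            memory: "2Gi"
          requests:
            cpu: "200m"
            memory: "2Gi"
        ports:
        # Keep in sync with `metrics-export-port` (8080 by default).
        - containerPort: 8080
          name: metrics
```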


10 changes: 6 additions & 4 deletions ray-operator/config/samples/ray-cluster.autoscaler.large.yaml
@@ -57,11 +57,11 @@ spec:
memory: "512Mi"
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
# Flag "no-monitor" will be automatically set when autoscaling is enabled.
dashboard-host: '0.0.0.0'
# num-cpus: '14' # can be auto-completed from the limits
# Use `resources` to optionally specify custom resource annotations for the Ray node.
# The value of `resources` is a string-integer mapping.
# Currently, `resources` must be provided in the specific format demonstrated below:
@@ -112,7 +112,9 @@ spec:
# - raycluster-complete-worker-large-group-bdtwh
# - raycluster-complete-worker-large-group-hv457
# - raycluster-complete-worker-large-group-k8tj7
# the following params are used to complete the ray start: ray start --block --node-ip-address= ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
9 changes: 6 additions & 3 deletions ray-operator/config/samples/ray-cluster.autoscaler.yaml
@@ -49,10 +49,11 @@ spec:
memory: "512Mi"
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# num-cpus: '1' # can be auto-completed from the limits
# Use `resources` to optionally specify custom resource annotations for the Ray node.
# The value of `resources` is a string-integer mapping.
# Currently, `resources` must be provided in the specific format demonstrated below:
@@ -112,7 +113,9 @@ spec:
# - raycluster-complete-worker-small-group-bdtwh
# - raycluster-complete-worker-small-group-hv457
# - raycluster-complete-worker-small-group-k8tj7
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray-cluster.complete.large.yaml
@@ -22,7 +22,9 @@ spec:
# for the head group, replicas should always be 1.
# headGroupSpec.replicas is deprecated in KubeRay >= 0.3.0.
replicas: 1
# the following params are used to complete the ray start: ray start --head --block --dashboard-host: '0.0.0.0' ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -80,7 +82,9 @@ spec:
# - raycluster-complete-worker-large-group-bdtwh
# - raycluster-complete-worker-large-group-hv457
# - raycluster-complete-worker-large-group-k8tj7
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray-cluster.complete.yaml
@@ -16,7 +16,9 @@ spec:
# Kubernetes Service Type. This is an optional field, and the default value is ClusterIP.
# Refer to https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types.
serviceType: ClusterIP
# the following params are used to complete the ray start: ray start --head --block --dashboard-host: '0.0.0.0' ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -80,7 +82,9 @@ spec:
# - raycluster-complete-worker-small-group-bdtwh
# - raycluster-complete-worker-small-group-hv457
# - raycluster-complete-worker-small-group-k8tj7
# the following params are used to complete the ray start: ray start --block
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
12 changes: 11 additions & 1 deletion ray-operator/config/samples/ray-cluster.external-redis.yaml
@@ -90,9 +90,11 @@ spec:
rayVersion: '2.4.0'
headGroupSpec:
replicas: 1
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: "0.0.0.0"
num-cpus: "1" # can be auto-completed from the limits
# redis-password should match "requirepass" in redis.conf in the ConfigMap above.
# Ray 2.3.0 changes the default redis password from "5241590000000000" to "".
redis-password: $REDIS_PASSWORD
@@ -102,6 +104,11 @@ spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: "1"
requests:
cpu: "200m"
env:
# RAY_REDIS_ADDRESS can force ray to use external redis
- name: RAY_REDIS_ADDRESS
@@ -131,6 +138,9 @@ spec:
maxReplicas: 2
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
5 changes: 3 additions & 2 deletions ray-operator/config/samples/ray-cluster.head-command.yaml
@@ -11,10 +11,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
18 changes: 14 additions & 4 deletions ray-operator/config/samples/ray-cluster.heterogeneous.yaml
@@ -38,16 +38,22 @@ spec:
######################headGroupSpecs#################################
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from Ray container resource limits
#pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: "1"
requests:
cpu: "200m"
volumeMounts:
- mountPath: /opt
name: config
@@ -72,7 +78,9 @@ spec:
maxReplicas: 10
# logical group name, for this called small-group, also can be functional
groupName: small-group
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
@@ -106,7 +114,9 @@ spec:
# workersToDelete:
#- raycluster-heterogeneous-worker-medium-group-7bv5h
# - worker-4k2ih
# the following params are used to complete the ray start: ray start --block --node-ip-address= ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
5 changes: 3 additions & 2 deletions ray-operator/config/samples/ray-cluster.mini.yaml
@@ -13,10 +13,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
10 changes: 8 additions & 2 deletions ray-operator/config/samples/ray-cluster.separate-ingress.yaml
@@ -11,16 +11,22 @@ spec:
headGroupSpec:
serviceType: NodePort
replicas: 1
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
port: '6379'
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: "1"
requests:
cpu: "200m"
ports:
- containerPort: 6379
name: gcs-server
6 changes: 6 additions & 0 deletions ray-operator/config/samples/ray-cluster.tls.yaml
@@ -8,6 +8,9 @@ spec:
rayVersion: '2.4.0'
# Ray head pod configuration
headGroupSpec:
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -96,6 +99,9 @@ spec:
minReplicas: 1
maxReplicas: 10
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
6 changes: 6 additions & 0 deletions ray-operator/config/samples/ray-service.autoscaler.yaml
@@ -56,6 +56,9 @@ spec:
memory: "1000Mi"
######################headGroupSpecs#################################
headGroupSpec:
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {"num-cpus": "0"}
#pod template
template:
@@ -86,6 +89,9 @@ spec:
maxReplicas: 5
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray_v1alpha1_rayjob.yaml
@@ -19,10 +19,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
@@ -63,6 +64,9 @@ spec:
maxReplicas: 5
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template: