[SPARK-41514][K8S][DOCS] Add PVC-oriented executor pod allocation doc and revise config name #39058

Closed · wants to merge 4 commits
33 changes: 33 additions & 0 deletions docs/running-on-kubernetes.md
@@ -354,6 +354,27 @@ spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=false

For a complete list of available options for each supported type of volume, please refer to the [Spark Properties](#spark-properties) section below.

### PVC-oriented executor pod allocation

Since disks are one of the important resource types, the Spark driver provides fine-grained control
over them via a set of configurations. For example, by default, on-demand PVCs are owned by executors, and
the lifecycle of each PVC is tightly coupled with its owner executor.
However, with the following options, on-demand PVCs can be owned by the driver and reused by other executors
during the Spark job's lifetime, which reduces the overhead of PVC creation and deletion.

```
spark.kubernetes.driver.ownPersistentVolumeClaim=true
spark.kubernetes.driver.reusePersistentVolumeClaim=true
```
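
For instance, the two reuse options can be combined with an on-demand PVC executor volume in one submission. The sketch below is illustrative rather than part of this change; the volume name `data`, storage class, size, and mount path are placeholders.
```
spark.kubernetes.driver.ownPersistentVolumeClaim=true
spark.kubernetes.driver.reusePersistentVolumeClaim=true
spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand
spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=standard
spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=100Gi
spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/data
spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=false
```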

In addition, since Spark 3.4, the Spark driver is able to do PVC-oriented executor allocation, which means
Spark counts the total number of created PVCs which the job can have, and holds back new executor creation
if the driver already owns the maximum number of PVCs. This helps the transition of an existing PVC from one executor
to another executor.
```
spark.kubernetes.driver.waitToReusePersistentVolumeClaim=true
```

Review comment (Member): Is the maximum number of PVCs limited by cluster capacity?

Reply (@dongjoon-hyun, Member and Author, Dec 14, 2022): Oh, I used the cluster = the spark cluster = the spark job = the spark driver + executor. It could be misleading as the k8s cluster. Let me revise it. Thank you!
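
As a worked illustration (not part of this change), under dynamic allocation the three options can be set together so that PVC creation stays bounded by the executor maximum while claims are reused across executor restarts; the dynamic allocation values below are placeholders.
```
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.shuffleTracking.enabled=true
spark.dynamicAllocation.maxExecutors=10
spark.kubernetes.driver.ownPersistentVolumeClaim=true
spark.kubernetes.driver.reusePersistentVolumeClaim=true
spark.kubernetes.driver.waitToReusePersistentVolumeClaim=true
```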

## Local Storage

Spark supports using volumes to spill data during shuffles and other operations. To use a volume as local storage, the volume's name should start with `spark-local-dir-`, for example:
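
One possible configuration is sketched below (not taken from this diff); it assumes an on-demand PVC backs the executors' local storage, with placeholder storage class, size, and mount path, and a volume name carrying the required prefix:
```
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName=OnDemand
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass=standard
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit=100Gi
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path=/data
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly=false
```
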
@@ -1475,6 +1496,18 @@ See the [configuration page](configuration.html) for information on Spark config
</td>
<td>3.2.0</td>
</tr>
<tr>
<td><code>spark.kubernetes.driver.waitToReusePersistentVolumeClaim</code></td>
<td><code>false</code></td>
<td>
If true, driver pod counts the number of created on-demand persistent volume claims
and waits if the number is greater than or equal to the total number of volumes which
the Spark job is able to have. This config requires both
<code>spark.kubernetes.driver.ownPersistentVolumeClaim=true</code> and
<code>spark.kubernetes.driver.reusePersistentVolumeClaim=true</code>.
</td>
<td>3.4.0</td>
</tr>
<tr>
<td><code>spark.kubernetes.executor.disableConfigMap</code></td>
<td><code>false</code></td>
@@ -21,7 +21,7 @@ import java.util.concurrent.TimeUnit

import org.apache.spark.deploy.k8s.Constants._
import org.apache.spark.internal.Logging
import org.apache.spark.internal.config.{ConfigBuilder, DYN_ALLOCATION_MAX_EXECUTORS, EXECUTOR_INSTANCES, PYSPARK_DRIVER_PYTHON, PYSPARK_PYTHON}
import org.apache.spark.internal.config.{ConfigBuilder, PYSPARK_DRIVER_PYTHON, PYSPARK_PYTHON}

private[spark] object Config extends Logging {

@@ -101,12 +101,11 @@ private[spark] object Config extends Logging {
.createWithDefault(true)

val KUBERNETES_DRIVER_WAIT_TO_REUSE_PVC =
ConfigBuilder("spark.kubernetes.driver.waitToReusePersistentVolumeClaims")
ConfigBuilder("spark.kubernetes.driver.waitToReusePersistentVolumeClaim")
.doc("If true, driver pod counts the number of created on-demand persistent volume claims " +
s"and wait if the number is greater than or equal to the maximum which is " +
s"${EXECUTOR_INSTANCES.key} or ${DYN_ALLOCATION_MAX_EXECUTORS.key}. " +
s"This config requires both ${KUBERNETES_DRIVER_OWN_PVC.key}=true and " +
s"${KUBERNETES_DRIVER_REUSE_PVC.key}=true.")
"and wait if the number is greater than or equal to the total number of volumes which " +
"the Spark job is able to have. This config requires both " +
s"${KUBERNETES_DRIVER_OWN_PVC.key}=true and ${KUBERNETES_DRIVER_REUSE_PVC.key}=true.")
.version("3.4.0")
.booleanConf
.createWithDefault(false)