[Mesos] expand coarse-grained mode docs #14059

@@ -180,30 +180,53 @@ Note that jars or python files that are passed to spark-submit should be URIs re

# Mesos Run Modes

Spark can run over Mesos in two modes: "coarse-grained" (default) and
"fine-grained".

## Coarse-Grained

In "coarse-grained" mode, each Spark executor runs as a single Mesos
task. Spark executors are sized according to the following
configuration variables:

* Executor memory: `spark.executor.memory`
* Executor cores: `spark.executor.cores`
* Number of executors: `spark.cores.max`/`spark.executor.cores`

Please see the [Spark Configuration](configuration.html) page for
details and default values.
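
For illustration, here is a minimal sketch of sizing executors through
these properties in [SparkConf](configuration.html#spark-properties);
the specific values (`2g`, `4`, `24`) are hypothetical:

{% highlight scala %}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.memory", "2g") // memory per executor
  .set("spark.executor.cores", "4")   // cores per executor
  .set("spark.cores.max", "24")       // cap on total cores for the application
{% endhighlight %}

With these values, the formula above yields 24/4 = at most 6 executors.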

Executors are brought up eagerly when the application starts, until
`spark.cores.max` is reached. If you don't set `spark.cores.max`, the
Spark application will reserve all resources offered to it by Mesos,
so you should always set this variable in any sort of multi-tenant
cluster, including one that runs multiple concurrent Spark
applications.
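
For example, to cap the application at ten cores total:

{% highlight scala %}
conf.set("spark.cores.max", "10")
{% endhighlight %}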

The scheduler will start executors round-robin on the offers Mesos
gives it, but there are no spread guarantees, as Mesos does not
provide such guarantees on the offer stream.

The benefit of coarse-grained mode is much lower startup overhead, but
at the cost of reserving Mesos resources for the complete duration of
the application. To configure your job to dynamically adjust to its
resource requirements, look into
[Dynamic Allocation](#dynamic-resource-allocation-with-mesos).

## Fine-Grained

> Review comment: Fine-Grained Mode
In "fine-grained" mode, each Spark task inside the Spark executor runs | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you add a message saying this is deprecated as of spark 2.0? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just made a separate PR for that: #14078 Does that work? Also, will this be merged into Spark 2.0.0? I need to make the copy consistent with the version. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup both should go into 2.0. |
||
as a separate Mesos task. This allows multiple instances of Spark (and | ||
other frameworks) to share cores at a very fine granularity, where | ||
each application gets more or fewer cores as it ramps up and down, but | ||
it comes with an additional overhead in launching each task. This mode | ||
may be inappropriate for low-latency requirements like interactive | ||
queries or serving web requests. | ||

Note that while Spark tasks in fine-grained mode will relinquish cores
as they terminate, they will not relinquish memory, as the JVM does
not give memory back to the operating system. Neither will executors
terminate when they're idle.

To run in fine-grained mode, set the `spark.mesos.coarse` property to false in your
[SparkConf](configuration.html#spark-properties):

@@ -212,7 +235,9 @@ To run in fine-grained mode, set the `spark.mesos.coarse` property to false in y

{% highlight scala %}
conf.set("spark.mesos.coarse", "false")
{% endhighlight %}

You may also make use of `spark.mesos.constraints` to set
attribute-based constraints on Mesos resource offers. By default, all
resource offers will be accepted.

{% highlight scala %}
conf.set("spark.mesos.constraints", "os:centos7;us-east-1:false")
{% endhighlight %}

@@ -246,7 +271,7 @@ In either case, HDFS runs separately from Hadoop MapReduce, without being schedu

# Dynamic Resource Allocation with Mesos

Mesos supports dynamic allocation only with coarse-grained mode, which can resize the number of
executors based on statistics of the application. For general information,
see [Dynamic Resource Allocation](job-scheduling.html#dynamic-resource-allocation).
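
As a minimal sketch, the properties involved are shown below; this
assumes the external shuffle service (which dynamic allocation
requires) is already running on each agent node, typically launched
via `sbin/start-mesos-shuffle-service.sh`:

{% highlight scala %}
conf.set("spark.dynamicAllocation.enabled", "true") // let Spark resize the executor count
conf.set("spark.shuffle.service.enabled", "true")   // required by dynamic allocation
{% endhighlight %}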

> Review comment: Coarse-Grained Mode