SPARK-1713. Use a thread pool for launching executors. #663
Conversation
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14710/
  10, Integer.MAX_VALUE,
  1, TimeUnit.MINUTES,
  new LinkedBlockingQueue[Runnable](),
  Executors.defaultThreadFactory())
nit: use guava's ThreadFactoryBuilder and name these threads?
Updated patch uses Guava ThreadFactoryBuilder
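For reference, a minimal sketch of that suggestion, using Guava's ThreadFactoryBuilder to name the launcher threads (the thread-name format and variable names here are illustrative, not necessarily what the patch uses):

```scala
import java.util.concurrent.{LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

import com.google.common.util.concurrent.ThreadFactoryBuilder

// Named daemon threads make the launcher pool easy to identify in thread dumps.
val launcherThreadFactory = new ThreadFactoryBuilder()
  .setNameFormat("ContainerLauncher #%d")  // illustrative name format
  .setDaemon(true)
  .build()

val launcherPool = new ThreadPoolExecutor(
  10, Integer.MAX_VALUE,
  1, TimeUnit.MINUTES,
  new LinkedBlockingQueue[Runnable](),
  launcherThreadFactory)
```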
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
This doesn't dynamically change the thread pool size. It's basically limited to 10 since it's using a LinkedBlockingQueue, and the value of maximumPoolSize doesn't have any effect. The MR AM has code to change the size and also has a config for the max limit.
See also: capping the number of threads helps those two factors (why wouldn't the max pool size have an effect?), but at some level it means submission may block when the executor is busy. Right now the no-arg LinkedBlockingQueue constructor creates an unbounded queue. How about setting the max pool size, and the queue capacity, to a value that is deemed quite large? Like, you never want more than 1000 threads and 100K tasks queued? At least you're putting up some sane defenses against these situations rather than making 100K threads.
Ah, misunderstood how ThreadPoolExecutor worked. To take a step back, we expect a burst (max low thousands) of executors to launch at the beginning of an application. After that, the only time we'll use the thread pool is when executors fail and we need to start new ones. So the behavior we want is to be able to scale the pool up to a large number of threads, and then shrink it after the initial burst is done. We'd probably like to cap the number of threads at some number, but keep queueing up containers to start. I think the right solution is probably to create a pool with a large core size (500?), and then, after the initial containers are launched, bring it down to something more manageable like 10.
Or, actually, if we use allowCoreThreadTimeOut then we don't need to change the core size.
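For context, a minimal sketch of that idea against the pool shape from the diff above; allowCoreThreadTimeOut is the standard java.util.concurrent.ThreadPoolExecutor switch, and the sizes are just the ones already under discussion:

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

val pool = new ThreadPoolExecutor(
  10, Integer.MAX_VALUE,
  1, TimeUnit.MINUTES,
  new LinkedBlockingQueue[Runnable](),
  Executors.defaultThreadFactory())

// With this enabled, even core threads exit after the 1-minute keep-alive,
// so the pool shrinks back down once the initial launch burst is over
// without anyone having to adjust the core size afterwards.
pool.allowCoreThreadTimeOut(true)
```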
That sounds like a reasonable approach - will know more when the PR is updated. The initial idea behind a separate thread for each was exactly that - we expected low hundreds as the upper limit (at that time), and we needed to send requests to the RM asap while not being sure how long the RPC could take. This is, clearly, suboptimal, leading to an unbounded increase in threads during the initial allocation burst.
Merged build triggered.
Merged build started.
We don't necessarily need to start with 500 threads. At the very least it should be configurable. In our tests of MR we found that 500 threads didn't actually help; tens worked just fine (20-30), even when launching thousands of tasks. If you actually get 500 threads it can cause memory issues. I guess if we have them time out quicker it might not be as much of an issue (I believe MR's timeout is high), but I would like to see it as a config just in case. Also, 1 second might be a bit too short if you actually want to reuse them: the AM heartbeats to the RM every 5 seconds, so if you launch one round and heartbeat back in, it will have shut down those threads even though it could have reused them.

@srowen See the javadoc on ThreadPoolExecutor about LinkedBlockingQueue:

Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.

Also, perhaps adding a comment there to explain it would be good, as it's easy to miss that the max isn't used.
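To make the quoted javadoc concrete, a small illustration of the two queueing modes (the sizes are arbitrary, chosen only to show the contrast):

```scala
import java.util.concurrent.{ArrayBlockingQueue, LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

// Unbounded queue: once the 10 core threads are busy, extra tasks just queue,
// so the pool never grows and the Integer.MAX_VALUE maximum is never reached.
val unboundedQueuePool = new ThreadPoolExecutor(
  10, Integer.MAX_VALUE, 1, TimeUnit.MINUTES,
  new LinkedBlockingQueue[Runnable]())

// Bounded queue: when the 1000-slot queue fills, the pool grows past the core
// size up to the maximum (100 here), which is what maximumPoolSize controls.
val boundedQueuePool = new ThreadPoolExecutor(
  10, 100, 1, TimeUnit.MINUTES,
  new ArrayBlockingQueue[Runnable](1000))
```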
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14734/
By default, ThreadPoolExecutor doesn't start all the core threads at once. Though I just learned that when it gets tasks, it will start new threads even if others are idle, so I agree that 500 seems high. Uploading a patch that defaults to 25 and makes it configurable.
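A sketch of what that could look like, assuming a SparkConf in scope; the config key and default shown here are placeholders for whatever the patch actually introduces:

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

import org.apache.spark.SparkConf

def makeLauncherPool(sparkConf: SparkConf): ThreadPoolExecutor = {
  // Hypothetical config key and default; the real patch defines its own.
  val maxThreads = sparkConf.getInt("spark.yarn.containerLauncherMaxThreads", 25)

  // With an unbounded queue the core size is the effective cap, so core == max.
  val pool = new ThreadPoolExecutor(
    maxThreads, maxThreads,
    1, TimeUnit.MINUTES,
    new LinkedBlockingQueue[Runnable](),
    Executors.defaultThreadFactory())

  // Let idle threads expire so the pool shrinks after the initial launch burst.
  pool.allowCoreThreadTimeOut(true)
  pool
}
```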
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
@sryza can you update this to the latest master and also update the yarn alpha side? Since we are adding a config I think it should apply to both of them.
Also, I'm curious whether you ran any benchmarks to test startup time?
test this please
@sryza When you have a chance can you upmerge this to master and resolve the conflicts?
@andrewor14 fyi - I want to finish reviewing and merge in #2020, which will probably affect many of the existing PRs. So you might want to hold off on upmerging.
@tgravescs What is the approximate timeline on the refactor? We are cutting a release candidate for 1.1 soon, and at this point we are basically considering only bug fixes and other minor changes. The changes in #2020 seem a little big.
The changes in #2020 are meant only for master, not 1.1. At this point I don't see this one going into 1.1 either. Or did you hit some particular issue for which this is needed?
Note I'm hoping to get that in in the next few days.
Ah, I see. Yeah, I agree that the refactor should go into master but not 1.1, and I suppose the same goes for this PR since we still need to test it. Sounds good.
@sryza sorry these have gone on so long. Would you mind upmerging?
cf0ff3f to 036550d
Upmerged
QA tests have started for PR 663 at commit
QA tests have finished for PR 663 at commit
+1, looks good. Thanks @sryza!
This patch copies the approach used in the MapReduce application master for launching containers.