
SPARK-1713. Use a thread pool for launching executors. #663

Closed · wants to merge 1 commit

Conversation

@sryza (Contributor) commented May 6, 2014

This patch copies the approach used in the MapReduce application master for launching containers.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@sryza sryza changed the title YARN-1713. Use a thread pool for launching executors. SPARK-1713. Use a thread pool for launching executors. May 6, 2014
@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14710/

10, Integer.MAX_VALUE,
1, TimeUnit.MINUTES,
new LinkedBlockingQueue[Runnable](),
Executors.defaultThreadFactory())
A Contributor commented on the diff above:

nit: use Guava's ThreadFactoryBuilder and name these threads?

@sryza (Contributor, author) commented May 6, 2014

Updated patch uses Guava ThreadFactoryBuilder
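For reference, a plain-JDK sketch of what Guava's `ThreadFactoryBuilder` (via `setNameFormat` and `setDaemon`) provides: threads with a recognizable name and daemon status. The `ContainerLauncher` prefix here is just an illustrative name, not necessarily what the patch uses.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class Main {
    // Roughly equivalent to Guava's
    // new ThreadFactoryBuilder().setNameFormat("...-%d").setDaemon(true).build()
    static class NamedThreadFactory implements ThreadFactory {
        private final AtomicInteger count = new AtomicInteger(0);
        private final String prefix;
        NamedThreadFactory(String prefix) { this.prefix = prefix; }
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, prefix + "-" + count.incrementAndGet());
            t.setDaemon(true); // launcher threads shouldn't keep the JVM alive
            return t;
        }
    }

    public static void main(String[] args) {
        ThreadFactory factory = new NamedThreadFactory("ContainerLauncher");
        Thread t = factory.newThread(() -> {});
        System.out.println(t.getName());   // ContainerLauncher-1
        System.out.println(t.isDaemon());  // true
    }
}
```

Named threads make thread dumps of the AM far easier to read than the default `pool-N-thread-M` names.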

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14722/

@tgravescs (Contributor)

This doesn't dynamically change the thread pool size. It's effectively capped at 10, because with an unbounded LinkedBlockingQueue the pool never grows past corePoolSize, so the maximumPoolSize value doesn't have any effect. The MR AM has code to change the size and also has a config for the max limit.
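This behavior is easy to demonstrate. In the toy sketch below (illustrative sizes, not the patch's values), the pool is given a "max" of 1000 threads, yet with an unbounded queue it never creates more than the 2 core threads; every extra task just queues.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Main {
    public static void main(String[] args) throws Exception {
        // Core size 2, "max" 1000 -- but with an unbounded queue the pool
        // only ever creates core threads; everything else waits in the queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 1000, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
        CountDownLatch gate = new CountDownLatch(1);
        for (int i = 0; i < 50; i++) {
            pool.execute(() -> {
                try { gate.await(); } catch (InterruptedException ignored) {}
            });
        }
        Thread.sleep(200); // let the core threads spin up
        System.out.println(pool.getPoolSize());     // 2, not 50
        System.out.println(pool.getQueue().size()); // 48 tasks waiting
        gate.countDown();
        pool.shutdown();
    }
}
```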

@srowen (Member) commented May 6, 2014

See also Executors.newCachedThreadPool if the intent is to always create a new thread when no idle one is available. That's closer in behavior to always creating a new Thread, though you may still get loads of simultaneous threads and may not save many allocations.

Capping the number of threads helps with both of those (why wouldn't the max pool size have an effect?), but at some point it means submissions may block when the executor is busy. Right now the no-arg LinkedBlockingQueue will go on accepting work until it has 2 billion entries if the executor can't keep up. That's probably not ideal and should be capped at something more reasonable. Not sure whether the possibility of blocking the submitting thread is a problem or not.

How about setting the max pool size, and the queue capacity, to values deemed quite large? Say you never want more than 1000 threads or 100K queued tasks. At least then you're putting up some sane defenses against these situations rather than creating 100K threads.
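A sketch of that defensive setup, using the 1000/100K numbers from the suggestion above. With a bounded queue, maximumPoolSize does take effect: threads beyond the core are created once the queue fills. CallerRunsPolicy is one (assumed) way to apply back-pressure rather than drop work when both are saturated.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Main {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            10, 1000,                 // core 10, hard cap of 1000 threads
            1, TimeUnit.MINUTES,
            new LinkedBlockingQueue<Runnable>(100_000), // bounded queue
            // When queue and pool are both full, run the task in the
            // submitting thread instead of throwing RejectedExecutionException.
            new ThreadPoolExecutor.CallerRunsPolicy());
        System.out.println(pool.getMaximumPoolSize()); // 1000
        pool.shutdown();
    }
}
```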

@sryza (Contributor, author) commented May 6, 2014

Ah, I misunderstood how ThreadPoolExecutor worked.

To take a step back: we expect a burst of executor launches (at most in the low thousands) at the beginning of an application. After that, the thread pool is only used when executors fail and we need to start new ones.

So the behavior we want is a pool that can scale up to a large number of threads and then shrink back down after the initial burst is done. We'd probably like to cap the number of threads at some value, but keep queueing up containers to start.

I think the right solution is probably to create a pool with a large core size (500?) and then, after the initial containers are launched, bring it down to something more manageable like 10.

@sryza (Contributor, author) commented May 6, 2014

Or, actually, if we use allowCoreThreadTimeOut then we don't need to change the core size.
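A minimal sketch of that idea, with illustrative sizes and a deliberately short 1-second keep-alive so the shrink is visible (not the patch's actual values):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Main {
    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            25, 25, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        // With this flag, even core threads are reclaimed after the
        // keep-alive period, so the pool shrinks back once a burst is over.
        pool.allowCoreThreadTimeOut(true);
        for (int i = 0; i < 25; i++) {
            pool.execute(() -> {});
        }
        Thread.sleep(3000); // well past the 1s keep-alive
        System.out.println(pool.getPoolSize()); // burst threads have timed out
        pool.shutdown();
    }
}
```

This avoids having to resize the pool manually after the initial launch burst: idle threads simply expire.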

@mridulm (Contributor) commented May 6, 2014

That sounds like a reasonable approach - will know more when the PR is updated.

The initial idea behind a separate Thread for each launch was exactly that: we expected low hundreds as the upper limit (at that time), and we needed to send requests to the RM ASAP while not being sure how long the RPC could take. This is clearly suboptimal, leading to an unbounded increase in threads during the initial allocation burst.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@tgravescs (Contributor)

We don't necessarily need to start with 500 threads. At the very least it should be configurable. In our tests of MR we found that 500 threads didn't actually help; tens (20-30) worked just fine, even when launching thousands of tasks. If you actually get 500 threads it can cause memory issues. If we have them time out more quickly it might not be as much of an issue (I believe MR's timeout is high), but I would like to see it as a config just in case. Also, 1 second might be a bit too short if you actually want to reuse the threads: the AM heartbeats to the RM every 5 seconds, so if you launch one round of containers and then heartbeat back in, the pool will already have shut those threads down even though it could have reused them.

@srowen See the javadoc on ThreadPoolExecutor about LinkedBlockingQueue:

Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.

Also, perhaps adding a comment there to explain it would be good, as it's easy to miss that the max isn't used.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14734/

@sryza (Contributor, author) commented May 6, 2014

By default, ThreadPoolExecutor doesn't start all the core threads at once. Though I just learned that when tasks arrive, it will start new core threads even if existing ones are idle, so I agree that 500 seems high. Uploading a patch that defaults to 25 and makes it configurable.
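Under those constraints the pool setup might look like the sketch below. The `launcher.threads` system property is a hypothetical stand-in for illustration; the real patch reads its thread count from Spark configuration, not a system property.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Main {
    public static void main(String[] args) {
        // Hypothetical config lookup, defaulting to 25 launcher threads.
        int threads = Integer.getInteger("launcher.threads", 25);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            threads, threads, 1, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true); // shrink to zero when idle
        System.out.println(pool.getCorePoolSize()); // 25 (the default)
        pool.shutdown();
    }
}
```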

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14740/

@tgravescs (Contributor)

@sryza can you update this to the latest master and also update the yarn alpha side? Since we are adding a config, I think it should apply to both of them.

@tgravescs (Contributor)

Also, I'm curious whether you ran any benchmarks to test startup time.

@andrewor14 (Contributor)

test this please

@andrewor14 (Contributor)

@sryza When you have a chance can you upmerge this to master and resolve the conflicts?

@tgravescs (Contributor)

@andrewor14 fyi - I want to finish reviewing and merge in #2020 which will probably affect many of the existing PRs. So you might want to hold off on upmerging.

@andrewor14 (Contributor)

@tgravescs What is the approximate timeline on the refactor? We are cutting a release candidate for 1.1 soon and at this point we are basically considering only bug fixes and other minor changes. The changes in #2020 seem a little big.

@tgravescs (Contributor)

The changes in #2020 are meant only for master, not 1.1. At this point I don't see this one going into 1.1 either. Or did you hit some particular issue that makes it needed?

@tgravescs (Contributor)

Note: I'm hoping to get that in within the next few days.

@andrewor14 (Contributor)

Ah I see. Yeah I agree that the refactor should go into master but not 1.1, and I suppose the same goes for this PR since we still need to test it. Sounds good.

@tgravescs (Contributor)

@sryza, sorry these have gone so long. Would you mind upmerging?

@sryza (Contributor, author) commented Sep 10, 2014

Upmerged

@SparkQA

SparkQA commented Sep 10, 2014

QA tests have started for PR 663 at commit 036550d.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 10, 2014

QA tests have finished for PR 663 at commit 036550d.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs (Contributor)

+1, looks good. Thanks @sryza !

@asfgit asfgit closed this in 1f4a648 Sep 10, 2014