[QST] Working configuration for multi executor - multi GPU environment on Spark Standalone cluster #5366
-
Hello. The question is: what is the correct way to run a multi-GPU worker on a Spark Standalone cluster? As soon as I define, for example, a max cores configuration so that more than one executor can be started on the worker, all executors start failing with out-of-memory errors:
My SC configuration:
My worker configuration:
When I run the SC without the max cores config:
Only one executor (taking all the cores) is created and the job runs correctly.
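For context, a hedged sketch of the kind of setting being described (these are standard Spark standalone properties; the values are illustrative, not the original configuration from this report):

```
# Illustrative application settings, not the ones from this report.
# With spark.executor.cores set, a standalone worker can launch several executors
# for one application (up to spark.cores.max cores in total); each executor then
# needs its own spark.executor.memory, which is where out-of-memory can appear.
spark.cores.max        16
spark.executor.cores   4
spark.executor.memory  16g
```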
Replies: 6 comments
-
Did you follow the instructions at https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#spark-standalone-cluster ? I mostly want to know whether we need to update the docs because they are not clear, or whether you didn't see them. I think your problem is that Spark is somehow assigning multiple executors to a single GPU, so GPU scheduling is not working as expected. My guess is that you didn't request any GPUs for your executors; you need to set the executor GPU resource amount. Another thing to do is to go to the Spark cluster UI and look at the resources per worker to be sure that Spark sees the GPUs. You can also look at the application UI page to be sure that the application has been assigned GPUs.
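For reference, a minimal sketch of the GPU-related settings that the getting-started page covers, assuming one worker per node and one GPU per executor (the paths and amounts here are illustrative assumptions, not taken from this thread):

```
# Worker side (e.g. via SPARK_WORKER_OPTS in spark-env.sh, or a worker properties file):
# advertise the GPUs and tell the worker how to discover their addresses.
spark.worker.resource.gpu.amount           8
spark.worker.resource.gpu.discoveryScript  /opt/sparkRapidsPlugin/getGpusResources.sh

# Application side (spark-defaults.conf or --conf on spark-submit):
# request GPUs so that each executor is assigned exactly one.
spark.executor.resource.gpu.amount  1
# let several tasks share the executor's GPU; tune to match spark.executor.cores.
spark.task.resource.gpu.amount      0.125
```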
-
Thank you for the reply.
I checked the job configuration described on that website; thanks, it may get me a little closer to solving the issue. It seems that making the worker see the GPUs should resolve it. The question now is how to do that. As I wrote above, I set the needed config options for the worker from that page.
-
The worker should be able to figure it out from the discovery script. Could you try to run it and see what it reports?
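For example, run it by hand on the worker node (the script path here is the one mentioned later in this thread):

```
# Spark expects the discovery script to print a single JSON object of the form
# {"name": "gpu", "addresses": ["0", "1", ...]} listing the GPU addresses it can assign.
/spark/spark-3.0.1-bin-hadoop2.7/bin/getGpuResources.sh
```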
-
Are you starting 8 workers on the same node or on different nodes?
-
This is the output of my script: {"name": "gpu", "addresses":["0","1","2","3","4","5","6","7"]}
I am starting one worker on the DGX and configuring jobs to run more than one executor. I will let you know if I find a solution to this problem.
-
I found the problem. It was a very dumb thing: my spark-env file was not being loaded correctly. Instead, I now load the config with the --properties-file option when starting the worker, with spark.worker.resource.gpu.discoveryScript /spark/spark-3.0.1-bin-hadoop2.7/bin/getGpuResources.sh set in that file, and it works. It was my fault. Thank you for the help.
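In case it helps anyone else, a sketch of that arrangement (the start script name, master URL, and GPU count are assumptions for a Spark 3.0.x standalone setup; the discovery-script path is the one from this thread):

```
# worker.properties -- loaded explicitly instead of relying on spark-env.sh.
# The gpu.amount value is an assumption matching the 8 GPUs reported above.
spark.worker.resource.gpu.amount           8
spark.worker.resource.gpu.discoveryScript  /spark/spark-3.0.1-bin-hadoop2.7/bin/getGpuResources.sh

# Start the standalone worker with that file (Spark 3.0.x script name; master URL is a placeholder):
#   $SPARK_HOME/sbin/start-slave.sh spark://<master-host>:7077 --properties-file /path/to/worker.properties
```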