-
A few things: it sounds like you aren't using the Spark readers directly, like Parquet or ORC, but going through the Hudi reader. I'm assuming that reads on the CPU and then has to do a columnar-to-row conversion, which can be a lot of overhead. You can check the metrics in the Spark UI SQL tab to see how much time is spent there versus the rest of the query. The readers are one place we accelerate very well, and they go directly to GPU.
Unless you really need the timezone to be Asia/Shanghai, I suggest you run your Spark cluster with the timezone set to UTC.
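A minimal sketch of what that could look like on spark-submit, assuming the plugin is checking the JVM default zone (the "Actual default zone id: Asia/Shanghai" message below suggests it is), so both the SQL session timezone and the driver/executor JVM timezones get set to UTC:
# Sketch only: run both the SQL session and the driver/executor JVMs in UTC
--conf spark.sql.session.timeZone=UTC \
--conf spark.driver.extraJavaOptions=-Duser.timezone=UTC \
--conf spark.executor.extraJavaOptions=-Duser.timezone=UTC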
Which operation is it reporting as not supported?
So this is weird. What is the rest of this error? I assume it failed to load. Look in the executor logs for the failed executor to get more information. The other thing that would be useful is a screenshot of the SQL query (Spark UI SQL tab), if that is something you can share.
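In case it helps, a quick way to pull the relevant lines on a standalone worker, assuming the default work directory layout ($SPARK_HOME/work/&lt;app-id&gt;/&lt;executor-id&gt;/stderr) and with a hypothetical application id as a placeholder:
# APP_ID is hypothetical; substitute the application id shown in the Spark UI
APP_ID=app-20221028124646-0001
# Standalone executors write stdout/stderr under the worker's work dir by default
grep -iE "error|exception" $SPARK_HOME/work/$APP_ID/*/stderr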
-
@abellina I fixed it; I got the error because I had put the jar in the wrong folder on some worker nodes.
-
@tgravescs Thanks for your reply. I have tried setting concurrent GPU tasks to 2. Overall the job still runs in 20 minutes, the same as on CPU, but uses much less resource.
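For reference, a hedged sketch of how those knobs relate in the submit command further down in this thread: with 4 executor cores and a task GPU amount of 0.25, up to 4 tasks are scheduled on the one GPU per executor, and spark.rapids.sql.concurrentGpuTasks caps how many of them run GPU work at the same time.
# Sketch only, mirroring the submit command below:
# 4 cores * 0.25 GPU per task = 4 tasks scheduled per executor/GPU;
# concurrentGpuTasks then limits how many of those use the GPU concurrently.
--conf spark.executor.cores=4 \
--conf spark.task.resource.gpu.amount=0.25 \
--conf spark.rapids.sql.concurrentGpuTasks=2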
-
We have set up a standalone Spark cluster with four workers (each worker node has one GPU installed).
Spark version: 3.1.3
Hudi version: 0.9
RAPIDS jar: rapids-4-spark_2.12-22.10.0.jar
Below is what we added to spark-env.sh:
SPARK_WORKER_OPTS="-Dspark.worker.resource.gpu.amount=1 -Dspark.worker.resource.gpu.discoveryScript=/xxx/getGpusResources.sh"
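For completeness, the discovery script referenced above typically just reports the GPU indices that nvidia-smi sees, as JSON in the shape Spark expects. A minimal sketch along the lines of the example shipped with Spark (the actual contents of /xxx/getGpusResources.sh may differ):
#!/usr/bin/env bash
# Emit this worker's GPU addresses as a ResourceInformation JSON,
# e.g. {"name": "gpu", "addresses":["0"]}
ADDRS=$(nvidia-smi --query-gpu=index --format=csv,noheader | paste -sd, - | sed 's/,/","/g')
echo "{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}"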
Below are my spark-submit parameters (I have tried spark.sql.shuffle.partitions at 4/16/64/220/400 but saw no big difference):
$SPARK_HOME/bin/spark-submit \
--class DataWorkflowMain \
--master spark://${MASTER_HOST}:7077 \
--deploy-mode client \
--conf spark.executor.extraClassPath=${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.driver.extraClassPath=${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.rapids.sql.concurrentGpuTasks=4 \
--conf spark.kryo.registrator=com.nvidia.spark.rapids.GpuKryoRegistrator \
--driver-memory 10G \
--conf spark.executor.memory=40G \
--conf spark.executor.cores=4 \
--conf spark.executor.resource.gpu.amount=1 \
--conf spark.task.resource.gpu.amount=0.25 \
--conf spark.rapids.sql.explain=ALL \
--conf spark.rapids.memory.pinnedPool.size=4G \
--conf spark.locality.wait=0s \
--conf spark.sql.shuffle.partitions=64 \
--conf spark.sql.adaptive=true \
--conf spark.rapids.sql.enabled=true \
--conf spark.sql.adaptive.coalescePartitions.enabled=true \
--conf spark.sql.files.maxPartitionBytes=512m \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--conf 'spark.sql.legacy.parquet.datetimeRebaseModeInRead'=CORRECTED \
--conf 'spark.sql.legacy.parquet.datetimeRebaseModeInRead'=LEGACY \
--conf spark.sql.session.timeZone=UTC \
--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
--conf 'spark.hadoop.mapreduce.input.pathFilter.class=org.apache.hudi.hadoop.HoodieROTablePathFilter' \
--conf spark.worker.resourcesFile=${SPARK_RAPIDS_DIR}/bin/getGpusResources.sh \
--conf spark.executor.resourcesFile=${SPARK_RAPIDS_DIR}/bin/getGpusResources.sh \
--name $job_name \
/home/hadoop/xxx-jar-with-dependencies.jar
Below is the job info:
The ETL job calls the Hudi API to load the data, registers it as a temp table in Spark SQL, and the rest is pure table joins in Spark SQL.
Below are the "cannot run on GPU" messages I extracted from the log:
! cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.LocalTableScanExec
!Exec cannot run on GPU because unsupported data types in output: TimestampType
!Expression AppointmentDate#755 cannot run on GPU because expression AttributeReference AppointmentDate#755 produces an unsupported type TimestampType
!Expression cast(TradeInValueUpliftAmount#2054 as decimal(20,4)) cannot run on GPU because Only UTC zone id is supported. Actual default zone id: Asia/Shanghai
!Exec cannot run on GPU because the BroadcastHashJoin this feeds is not on the GPU
Is there any tuning parameter to enable GPU support for the above operations?
I have GPUs installed and I can see that each worker is using one GPU. Why do I still see error messages like the one below (many of them)?
2022-10-28 12:46:46,467 ERROR scheduler.TaskSchedulerImpl: Lost executor 999 on 10.120.39.19: Unable to create executor due to com.nvidia.spark.SQLPlugin