[QST] - Spark3 question #5335

Answered by jlowe
eyalhir74 asked this question in General

With the GPU being mostly idle, I'm wondering about two possibilities:

  • Is the entire query eligible to run on the GPU? Transitioning between CPU and GPU has a cost, and this could account for some of the slowdown.
  • Is the query mostly bound by filesystem reads?

To answer the first question, you could run with the config spark.rapids.sql.explain set to true, and then you should see log messages for any portions of the query that are not on the GPU (and why they're not on the GPU). Depending on how many rows are processed by nodes not on the GPU, this could contribute substantially to the slowdown you're seeing. Also, if there are portions of the query not running on the G…
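
As a rough illustration of the explain suggestion above, the sketch below builds the `--conf` arguments for a spark-submit command line. The helper name `explain_conf_args` is hypothetical. The answer says to set the config to true; the plugin's configuration documentation, as far as I recall, lists NONE, ALL and NOT_ON_GPU as values, so the value is left as a parameter and should be checked against the docs for the plugin version in use. The `spark.plugins` and `spark.rapids.sql.enabled` lines assume a standard RAPIDS Accelerator setup and are not taken from the answer.

```python
import shlex


def explain_conf_args(explain_value):
    """Return spark-submit --conf arguments that turn on RAPIDS explain logging.

    explain_value is passed through unchanged; which values are accepted
    depends on the plugin version (assumption: "NOT_ON_GPU" or "ALL").
    """
    confs = {
        # Assumed standard plugin setup, not stated in the answer.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # The config named in the answer.
        "spark.rapids.sql.explain": explain_value,
    }
    args = []
    for key, value in confs.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print a command-line fragment that can be pasted in front of the job arguments.
    args = explain_conf_args("NOT_ON_GPU")
    print("spark-submit " + " ".join(shlex.quote(a) for a in args))
```

With explain logging enabled, the driver log should report which operators stay on the CPU and why, which is the information needed to judge whether CPU/GPU transitions explain the slowdown.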

Replies: 7 comments

Answer selected by sameerz
Labels
question Further information is requested
3 participants

This discussion was converted from issue #4828 on April 27, 2022 16:35.