ScanExec metrics do not get reported in Spark UI for aggregate, join, sort, etc #1110

Closed
andygrove opened this issue Nov 21, 2024 · 0 comments · Fixed by #1111
@andygrove

Describe the bug

The native plan for a join is shown below. The join metrics build_time and join_time are reported in the Spark UI, but we do not report the metrics for fetching the input batches from the JVM (ScanExec) or for unpacking dictionaries and performing deep copies where needed (CopyExec).

In this example, that means we report a time of ~410ms when the actual time is closer to 600ms, and this is for just one partition.

HashJoinExec: metrics=[build_time=400.827077ms, join_time=8.557039ms]
  CopyExec [UnpackOrDeepCopy], metrics=[elapsed_compute=18.643737ms]
    ScanExec: source=[ShuffleQueryStage], metrics=[elapsed_compute=186.719525ms]
  CopyExec [UnpackOrDeepCopy], metrics=[..., elapsed_compute=293.113µs]
    ScanExec: source=[ShuffleQueryStage ...], metrics=[elapsed_compute=5.906924ms]
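
To make the gap concrete, here is a rough sketch that adds up the metrics from the plan above. It assumes the per-operator times are additive within a single partition (no overlap between operators), which the issue does not state. The dictionary keys are labels made up for this illustration, not names from the Comet codebase.

```python
# Metric values copied from the plan above, converted to milliseconds.
# Assumption: operator times are additive within one partition.

# Metrics that currently show up in the Spark UI.
reported_ms = {
    "HashJoinExec.build_time": 400.827077,
    "HashJoinExec.join_time": 8.557039,
}

# Metrics present in the native plan but not reported in the Spark UI.
# The keys are illustrative labels only.
unreported_ms = {
    "CopyExec (first input).elapsed_compute": 18.643737,
    "ScanExec (first input).elapsed_compute": 186.719525,
    "CopyExec (second input).elapsed_compute": 0.293113,  # 293.113µs
    "ScanExec (second input).elapsed_compute": 5.906924,
}

reported_total = sum(reported_ms.values())
unreported_total = sum(unreported_ms.values())
actual_total = reported_total + unreported_total

print(f"reported in Spark UI: {reported_total:.1f} ms")
print(f"missing from Spark UI: {unreported_total:.1f} ms")
print(f"total for this partition: {actual_total:.1f} ms")
```

Under that assumption, the scan and copy operators account for roughly a third of the time spent in this partition, none of which is visible in the Spark UI.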

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

@andygrove added the bug label on Nov 21, 2024
@andygrove added this to the 0.5.0 milestone on Nov 21, 2024
@andygrove self-assigned this on Nov 21, 2024