[SPARK-24613][SQL] Cache with UDF could not be matched with subsequent dependent caches #21602
Test build #92149 has finished for PR 21602 at commit
LGTM
@@ -132,4 +132,19 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
    df.unpersist()
    assert(df.storageLevel == StorageLevel.NONE)
  }

  test("SPARK-24613 Cache with UDF could not be matched with subsequent dependent caches") {
    val expensiveUDF = udf({x: Int => Thread.sleep(10000); x})
Can we use an accumulator and make sure this UDF runs only 10 times? Sleeping for 10 seconds is not good in a unit test.
Accumulators probably wouldn't work. I'll verify the plan instead, though.
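For readers following the thread, here is a rough sketch of what "verifying the plan" could look like without a sleeping UDF. It is not the test that was merged in this PR. It assumes Spark SQL is on the classpath, the object name `CacheWithUdfCheck` and the cheap UDF are hypothetical, and the string check on `treeString` is a simplification chosen to avoid depending on `InMemoryRelation` fields that differ between Spark versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.columnar.InMemoryRelation
import org.apache.spark.sql.functions.{sum, udf}

// Hypothetical stand-alone check, not the PR's actual test.
object CacheWithUdfCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cache-with-udf-check")
      .getOrCreate()
    import spark.implicits._

    // A cheap UDF is enough: the check inspects plans, not run time.
    val cheapUDF = udf((x: Long) => x + 1)
    val df = spark.range(0, 10).toDF("a").withColumn("b", cheapUDF($"a"))
    val df2 = df.agg(sum(df("b")))

    df.cache()
    df.count()   // materialize the first cache
    df2.cache()  // dependent cache, built on top of df

    // After df2.cache(), df2's plan with cached data substituted should be
    // the cache entry for df2 itself.
    val plan = df2.queryExecution.withCachedData
    val isCached = plan.isInstanceOf[InMemoryRelation]

    // Assumption: the relation's tree string also prints the physical plan it
    // was built from, so a nested InMemoryTableScan suggests df2's cache
    // reads from df's cache instead of recomputing the UDF.
    val reusesFirstCache = plan.treeString.contains("InMemoryTableScan")

    println(s"df2 resolved to a cache entry: $isCached")
    println(s"df2's cached plan scans df's cache: $reusesFirstCache")

    spark.stop()
  }
}
```

On a build that includes the fix, both flags would be expected to be true. On an affected build the second flag would be expected to be false, since the re-analyzed plan no longer matches the cached one.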
Test build #92158 has finished for PR 21602 at commit
retest this please
Test build #92174 has finished for PR 21602 at commit
Thanks! Merged to master.
This is also a regression. Backported to 2.3 branch too.
…t dependent caches
Wrap the logical plan with an `AnalysisBarrier` for execution plan compilation in CacheManager, in order to avoid the plan being analyzed again.
Add one test in `DatasetCacheSuite`
Author: Maryann Xue <[email protected]>
Closes #21602 from maryannxue/cache-mismatch.
What changes were proposed in this pull request?
Wrap the logical plan with an `AnalysisBarrier` for execution plan compilation in CacheManager, in order to avoid the plan being analyzed again.
How was this patch tested?
Add one test in `DatasetCacheSuite`.